
The CernVM File System#

Under Construction (Last updated: 22 October 2025)

This documentation page is currently being written and reviewed.

Please note:

  • Some sections may be incomplete
  • Examples might be missing
  • Links could be broken

We appreciate your patience!

The CernVM File System (CernVM-FS or CVMFS) is a specialized, scalable file system heavily used by High-Energy Physics (HEP) experiments and designed for the efficient, global distribution of software and data. Files are available read-only under the /cvmfs directory and are downloaded transparently, on demand, as they are accessed, so no full installation is required upfront.

Key operational details and advantages include:

  • Decoupled Architecture: It relies on a standard HTTP web server (like Apache or Nginx) to host the repository, while clients use a lightweight FUSE module to mount the remote repository as a local directory.

  • Optimization for Large-Scale Science: This design is exceptionally well-suited for large scientific collaborations (most famously, the LHC experiments at CERN) where thousands of computers worldwide need identical, often complex, software environments.

  • Efficiency and Performance: Because data is cached locally on the client and shared across all users and processes, repeated access to the same files incurs no additional network overhead. Furthermore, content deduplication ensures that only unique file chunks are stored and transferred.

  • Reliability and Integrity: All content is cryptographically hashed. This guarantees that users receive the exact, unaltered software versions intended for them, providing built-in integrity verification.
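The deduplication and integrity points above can be illustrated with a toy content-addressed store. This is only a sketch of the general idea, not CVMFS's actual storage format or hash algorithm: the `ChunkStore` class and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.chunks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self.chunks[digest]
        # Integrity check: re-hash the content and compare it with the requested key.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match its hash")
        return data


store = ChunkStore()
key_a = store.put(b"libfoo.so contents")
key_b = store.put(b"libfoo.so contents")  # same bytes, e.g. shipped in two releases
key_c = store.put(b"libbar.so contents")
print(len(store.chunks))  # 2
```

Because the key is derived from the content itself, a file that appears in several software releases occupies storage once, and any corruption shows up as a hash mismatch when the content is read back.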

CVMFS at NHPCC#

To provide access to CVMFS (CernVM File System), we offer it as a containerized OS application. This allows you to run a compatible environment through Open OnDemand and use your persistent Apptainer overlay for software installations.

Configuration Steps#

The process for using CVMFS is similar to launching any other OS app. The key step is to specify the CVMFS repositories you need in the launch form. As an example, suppose we want to use two repositories, sft.cern.ch and software.eessi.io:

[Image: cvmfs]

Repositories:#

  • cvmfs-config.cern.ch: This is a configuration repository that provides essential settings for many other repositories. Its inclusion is highly recommended and often mandatory.

  • Additional Repositories: You can add other repositories as needed for your research software.

Initial Cache Population#

The first time you launch the CVMFS app with a new set of repositories, you must select a node with internet access. This is required to populate the local cache.

[Image: internet-access]

Subsequent sessions can often run on internal nodes without internet, provided all necessary files were cached during the first run.

Persistent Image and Cache Warming

As with other OS apps, you must create a persistent image with sufficient space for your needs, within your allocated blkdir directory quota.

To ensure optimal performance, we recommend a process known as cache warming. During your first session, on a node with internet access, compile and run your code completely. This downloads all required libraries and data into the local cache, making future runs faster and more reliable, especially on nodes without direct internet access.
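As a complement to running your code, cache warming can also be sketched as reading every file in the part of a repository you depend on, since files are fetched when they are accessed. The helper below is a minimal sketch, not an official tool: the function name `warm_cache` and the placeholder path are hypothetical. Point it at a specific subdirectory rather than a whole repository, because everything it reads counts against the space in your persistent image.

```python
import os


def warm_cache(root, chunk_size=1 << 20):
    """Read every regular file under root once; return (files_read, bytes_read)."""
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            try:
                with open(path, "rb") as handle:
                    # Read in blocks so large files do not have to fit in memory.
                    while True:
                        block = handle.read(chunk_size)
                        if not block:
                            break
                        bytes_read += len(block)
                files_read += 1
            except OSError as err:
                print(f"skipped {path}: {err}")
    return files_read, bytes_read


if __name__ == "__main__":
    # Placeholder path (hypothetical): replace it with the directory your software uses.
    target = "/cvmfs/sft.cern.ch/your/software/subtree"
    if os.path.isdir(target):
        print(warm_cache(target))
```

Reading files this way does not replace a full compile-and-run pass, which is still the most reliable way to touch exactly what your workflow needs.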

In this example, once the session has launched, the three repositories above (cvmfs-config.cern.ch, sft.cern.ch, and software.eessi.io) should be mounted and available under the /cvmfs directory.

Accessing Repositories#

Once your CVMFS session is active, the repositories you specified will be mounted as read-only directories under /cvmfs. For the example configuration above, you would find the following paths available:

/cvmfs/cvmfs-config.cern.ch # Configuration files for CVMFS
/cvmfs/sft.cern.ch          # LCG Software Stacks
/cvmfs/software.eessi.io    # Compatible software stacks via EESSI

You can immediately navigate to these paths and use the software and data they contain, following each repository's own documentation.
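A quick way to confirm that the expected repositories are usable is to list each mount point. The following is a minimal sketch: the function name `check_repositories` is hypothetical, and treating an empty or unlistable directory as "not available" is an assumption about how a failed mount would appear in this environment.

```python
import os


def check_repositories(repositories, root="/cvmfs"):
    """Return {repository: bool}; True if its directory can be listed and is not empty."""
    status = {}
    for repo in repositories:
        path = os.path.join(root, repo)
        try:
            # Listing the path is a real access, which matters on setups
            # that mount a repository only when it is first used.
            status[repo] = len(os.listdir(path)) > 0
        except OSError:
            status[repo] = False
    return status


expected = ["cvmfs-config.cern.ch", "sft.cern.ch", "software.eessi.io"]
for repo, ok in check_repositories(expected).items():
    print(f"{repo}: {'available' if ok else 'NOT available'}")
```

If a repository is reported as not available, check that it was listed in the launch form and that the session is running on a node with internet access when the cache is still empty.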

Critical: Saving Your Cache#

The CVMFS service automatically maintains a local cache of accessed files to speed up future use. This cache, along with any other system changes in your session, is stored in your persistent Apptainer image.

Prevent data loss

To ensure your CVMFS cache and any other system changes are preserved, you must log out properly. Otherwise, all unsaved system changes (including the newly populated CVMFS cache) will be lost. Please also see the OSs manual.

Re-using your cached software#

To benefit from your cached software in a new session, please follow the process described in the Re-using your image at OSs section. Specifically, in the CVMFS app launch form:

  1. Select your previously created persistent image.
  2. Crucially, you must specify the exact same list of CVMFS repositories as you did when the image was first created and the cache was warmed. If you add a new repository, you will need to run a new session with internet access to populate its cache.

This workflow ensures that subsequent jobs can run without requiring internet access, as all necessary software is already available in the local cache on your image.
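Since the repository list has to match the one used when the cache was warmed, it can help to keep a note of that list and compare it before each launch. The sketch below assumes that order and letter case do not matter to the launch form, which is not confirmed here; `compare_repositories` is a hypothetical helper name.

```python
def compare_repositories(saved, requested):
    """Return (missing, extra) relative to the list used when the cache was warmed."""
    saved_set = {r.strip().lower() for r in saved if r.strip()}
    requested_set = {r.strip().lower() for r in requested if r.strip()}
    missing = sorted(saved_set - requested_set)  # warmed earlier, not requested now
    extra = sorted(requested_set - saved_set)    # requested now, never warmed
    return missing, extra


first_run = ["cvmfs-config.cern.ch", "sft.cern.ch", "software.eessi.io"]
this_run = ["sft.cern.ch", "software.eessi.io", "atlas.cern.ch"]
missing, extra = compare_repositories(first_run, this_run)
print("missing:", missing)  # missing: ['cvmfs-config.cern.ch']
print("extra:", extra)      # extra: ['atlas.cern.ch']
```

Any entry under `extra` is a repository whose cache has not been populated yet, so that session would need a node with internet access.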

Common Examples of Repositories#

Here is a detailed list of commonly used CVMFS repositories and their primary purposes:

CVMFS Repository Reference#

CERN Repositories:#

| Repository | Main Project | Description |
|---|---|---|
| alice.cern.ch | ALICE Experiment | Software and data for the ALICE heavy-ion physics experiment. |
| ams.cern.ch | AMS Experiment | Software for the Alpha Magnetic Spectrometer space station detector. |
| atlas.cern.ch | ATLAS Experiment | Primary software repository for the ATLAS collaboration. |
| atlas-condb.cern.ch | ATLAS Conditions | Conditions database for ATLAS detector calibration. |
| atlas-nightlies.cern.ch | ATLAS Development | Nightly builds of ATLAS software. |
| cernvm-prod.cern.ch | CernVM Project | CernVM-FS server and client software. |
| cms.cern.ch | CMS Experiment | Main repository for CMS software (CMSSW). |
| grid.cern.ch | WLCG | Worldwide LHC Computing Grid middleware. |
| lhcb.cern.ch | LHCb Experiment | Primary software for the LHCb experiment. |
| lhcb-condb.cern.ch | LHCb Conditions | LHCb conditions database. |
| sft.cern.ch | CERN SFT | LCG Software Stacks. |
| sft-nightlies.cern.ch | CERN SFT Development | Nightly builds of SFT software. |
| unpacked.cern.ch | CERN | Unpacked container images. See the CVMFS documentation for details. |

Open Science Grid (OSG) Repositories#

| Repository | Main Project | Description |
|---|---|---|
| config-osg.opensciencegrid.org | OSG Config | OSG CVMFS configuration repository. See the Open Science Grid website for details. |
| dunedaq.opensciencegrid.org | DUNE DAQ | Data Acquisition framework for the Deep Underground Neutrino Experiment (DUNE). |
| dune.opensciencegrid.org | DUNE Experiment | Deep Underground Neutrino Experiment. |
| fermilab.opensciencegrid.org | Fermilab | General-purpose Fermilab software. |
| gm2.opensciencegrid.org | Muon g-2 | Muon g-2 experiment. |
| icecube.opensciencegrid.org | IceCube | IceCube Neutrino Observatory. |
| larsoft.opensciencegrid.org | LArSoft | Liquid Argon TPC software toolkit. See the LArSoft website for details. |
| lz.opensciencegrid.org | LZ Experiment | LUX-ZEPLIN (LZ) dark matter detector. |
| minerva.opensciencegrid.org | MINERvA | MINERvA neutrino-nucleus scattering experiment. |
| mu2e.opensciencegrid.org | Mu2e | Mu2e muon-to-electron conversion experiment. |
| nova.opensciencegrid.org | NOvA Experiment | NuMI Off-Axis electron neutrino appearance experiment. |
| oasis.opensciencegrid.org | OSG OASIS | OSG Application Software Service. |
| spt.opensciencegrid.org | South Pole Telescope | Cosmic microwave background studies. |
| uboone.opensciencegrid.org | MicroBooNE | MicroBooNE liquid argon neutrino experiment. |

Other Collaborations#

| Repository | Main Project | Description |
|---|---|---|
| software.eessi.io | European Environment for Scientific Software Installations | EESSI provides a common stack of optimized scientific software installations that work on any Linux distribution, and currently supports both x86_64 (AMD/Intel) and aarch64 (Arm 64-bit) systems. |
| sw.lsst.eu | Vera Rubin Observatory | Large Synoptic Survey Telescope software. |

Example Application#

When configuring your CVMFS session, select the repositories that are relevant to your research field. For example:

For High-Energy Physics:#

  • LHC Experiments: Use your collaboration's repository (atlas.cern.ch, cms.cern.ch, alice.cern.ch, lhcb.cern.ch)
  • General software: Include sft.cern.ch for compilers and libraries
  • Configuration: Always include cvmfs-config.cern.ch

For Neutrino Physics:#

  • Long-baseline: dune.opensciencegrid.org, nova.opensciencegrid.org
  • Short-baseline: uboone.opensciencegrid.org, minerva.opensciencegrid.org
  • Toolkits: larsoft.opensciencegrid.org for LArTPC experiments

For Astrophysics/Cosmology:#

  • icecube.opensciencegrid.org (neutrino astronomy)
  • spt.opensciencegrid.org (cosmic microwave background)
  • sw.lsst.eu (optical astronomy)
  • lz.opensciencegrid.org (dark matter searches)

General Scientific Computing:#

  • sft.cern.ch - Comprehensive scientific software stack
  • oasis.opensciencegrid.org - OSG community software
  • grid.cern.ch - Grid computing tools

You can always add more repositories later by creating a new persistent image, but starting with the correct set will save you time and storage space.
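The suggestions above can be kept as a small lookup table for preparing the repository list before you fill in the launch form. This is a sketch only: the field keys (`hep`, `neutrino`, `astro`, `general`) and the helper `select_repositories` are hypothetical names, and the lists are candidates to trim down to what you actually need, not a required set.

```python
CONFIG_REPO = "cvmfs-config.cern.ch"

# Candidate repositories per research field, taken from the lists above.
REPOS_BY_FIELD = {
    "hep": ["sft.cern.ch"],  # add your collaboration's repository via `extra`
    "neutrino": [
        "dune.opensciencegrid.org",
        "nova.opensciencegrid.org",
        "uboone.opensciencegrid.org",
        "minerva.opensciencegrid.org",
        "larsoft.opensciencegrid.org",
    ],
    "astro": [
        "icecube.opensciencegrid.org",
        "spt.opensciencegrid.org",
        "sw.lsst.eu",
        "lz.opensciencegrid.org",
    ],
    "general": ["sft.cern.ch", "oasis.opensciencegrid.org", "grid.cern.ch"],
}


def select_repositories(*fields, extra=()):
    """Build an ordered list without duplicates, always starting with the config repository."""
    selected = [CONFIG_REPO]
    for field in fields:
        for repo in REPOS_BY_FIELD[field]:
            if repo not in selected:
                selected.append(repo)
    for repo in extra:
        if repo not in selected:
            selected.append(repo)
    return selected


print(select_repositories("hep", extra=["atlas.cern.ch"]))
# ['cvmfs-config.cern.ch', 'sft.cern.ch', 'atlas.cern.ch']
```

Keeping the resulting list alongside your persistent image makes it easier to enter the same repositories again in later sessions.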