The CernVM File System#
Under Construction (Last updated: 22 October 2025)
This documentation page is currently being written and reviewed.
Please note:
- Some sections may be incomplete
- Examples might be missing
- Links could be broken
We appreciate your patience!
The CernVM File System (CernVM-FS or CVMFS) is a specialized, scalable file system heavily used by High-Energy Physics (HEP) experiments and designed for the efficient, global distribution of software and data. Files are available read-only under the /cvmfs directory and downloaded transparently on demand as they are accessed, rather than requiring a full installation upfront.
Key operational details and advantages include:
- Decoupled Architecture: It relies on a standard HTTP web server (like Apache or Nginx) to host the repository, while clients use a lightweight FUSE module to mount the remote repository as a local directory.
- Optimization for Large-Scale Science: This design is exceptionally well-suited for large scientific collaborations (most famously, the LHC experiments at CERN) where thousands of computers worldwide need identical, often complex, software environments.
- Efficiency and Performance: Because data is cached locally on the client and shared across all users and processes, repeated access to the same files incurs no additional network overhead. Furthermore, content deduplication ensures that only unique file chunks are stored and transferred.
- Reliability and Integrity: All content is cryptographically hashed. This guarantees that users receive the exact, unaltered software versions intended for them, providing built-in integrity verification.
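The deduplication and integrity points above both follow from addressing content by its hash. The following is a minimal sketch of that idea only; it is not CVMFS's actual storage format or API. The `ContentStore` class, the choice of SHA-256, and the sample bytes are illustrative assumptions.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each chunk is keyed by its SHA-256 digest.

    Illustrative only; this is not how CVMFS lays out data on disk.
    """

    def __init__(self):
        self.chunks = {}  # digest (hex string) -> chunk bytes

    def put(self, data: bytes) -> str:
        """Store a chunk and return its digest; identical data is stored once."""
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return a chunk after checking that it still matches its digest."""
        data = self.chunks[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"chunk {digest} failed integrity verification")
        return data


store = ContentStore()
first = store.put(b"libexample.so contents")
second = store.put(b"libexample.so contents")  # same bytes, e.g. shipped in another release
print(first == second)    # True
print(len(store.chunks))  # 1
```

Because the key is derived from the content, storing the same bytes twice costs nothing extra, and any modification to a stored chunk is detected when it is read back.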
CVMFS at NHPCC#
To provide access to CVMFS (CernVM File System), we offer it as a containerized OS application. This allows you to run a compatible environment through Open OnDemand and use your persistent Apptainer overlay for software installations.
Configuration Steps#
The process for using CVMFS is similar to launching any other OS app. The key step is to specify the CVMFS repositories you need in the launch form. As an example, suppose we want to use two repositories, sft.cern.ch and software.eessi.io.
Repositories:#
- cvmfs-config.cern.ch: This is a configuration repository that provides essential settings for many other repositories. Its inclusion is highly recommended and often mandatory.
- Additional Repositories: You can add other repositories as needed for your research software.
Initial Cache Population#
The first time you launch the CVMFS app with a new set of repositories, you must select a node with internet access. This is required to populate the local cache.
Subsequent sessions can often run on internal nodes without internet, provided all necessary files were cached during the first run.
Persistent Image and Cache Warming
As with other OS apps, you must create a persistent image with sufficient space for your needs, within your allocated blkdir directory quota.
To ensure optimal performance, we recommend a process known as cache warming. After launching your first session on a node with internet access, compile and run your code completely. This downloads all required libraries and data into the local cache, making future runs faster and more reliable, especially on nodes without direct internet access.
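Compiling and running your code is the most reliable way to warm the cache, because it touches exactly the files you need. As a complement, the sketch below reads every file under a directory you choose so that its contents are fetched. The function name `warm_cache` and the `max_bytes` budget are hypothetical, and the sketch assumes that reading a file is enough to cache it locally.

```python
import os


def warm_cache(root, max_bytes=None):
    """Read every regular file under root so its contents are fetched and cached.

    Returns (files_read, bytes_read). Stops early once max_bytes is reached.
    """
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(1024 * 1024)  # read in 1 MiB blocks
                        if not block:
                            break
                        bytes_read += len(block)
            except OSError:
                continue  # unreadable file: skip it and keep going
            files_read += 1
            if max_bytes is not None and bytes_read >= max_bytes:
                return files_read, bytes_read
    return files_read, bytes_read
```

Point it at the specific subdirectory of the release you use, not at a whole repository: repositories can be far larger than your persistent image, so a byte budget such as `max_bytes` is a sensible safeguard.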
In this example, once the session has launched, the three repositories (cvmfs-config.cern.ch, sft.cern.ch, and software.eessi.io) should be mounted and available under the /cvmfs directory.
Accessing Repositories#
Once your CVMFS session is active, the repositories you specified will be mounted as read-only directories under /cvmfs. For the example configuration above, you would find the following paths available:
/cvmfs/cvmfs-config.cern.ch # Configuration files for CVMFS
/cvmfs/sft.cern.ch # LCG Software Stacks
/cvmfs/software.eessi.io # Compatible software stacks via EESSI
You can immediately navigate to these paths and use the software and data they contain, following each repository's own documentation.
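Before starting real work, it can help to confirm that every repository you requested is actually reachable. The following is a minimal sketch, assuming that a mounted repository appears as a non-empty directory under /cvmfs; the helper name `missing_repositories` is hypothetical.

```python
import os


def missing_repositories(repositories, root="/cvmfs"):
    """Return the repositories that are not readable, non-empty directories under root."""
    missing = []
    for repo in repositories:
        path = os.path.join(root, repo)
        try:
            # Assumption: a correctly mounted repository lists at least one entry.
            entries = os.listdir(path)
        except OSError:
            entries = []  # path is absent or cannot be read
        if not entries:
            missing.append(repo)
    return missing


expected = ["cvmfs-config.cern.ch", "sft.cern.ch", "software.eessi.io"]
print(missing_repositories(expected))
```

An empty list means all three example repositories are available. Any name that is printed should be checked against the repository list in the launch form.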
Critical: Saving Your Cache#
The CVMFS service automatically maintains a local cache of accessed files to speed up future use. This cache, along with any other system changes in your session, is stored in your persistent Apptainer image.
Prevent data loss
To ensure your CVMFS cache and any other system changes are preserved, you must log out properly. Otherwise, all unsaved system changes (including the newly populated CVMFS cache) will be lost. Please also see the OSs manual.
Re-using your cached software#
To benefit from your cached software in a new session, please follow the process described in the Re-using your image at OSs section. Specifically, in the CVMFS app launch form:
- Select your previously created persistent image.
- Crucially, you must specify the exact same list of CVMFS repositories as you did when the image was first created and the cache was warmed. If you add a new repository, you will need to run a new session with internet access to populate its cache.
This workflow ensures that subsequent jobs can run without requiring internet access, as all necessary software is already available in the local cache on your image.
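Since the repository list must match the one used when the cache was warmed, it is worth comparing the two lists before launching. The sketch below is one way to do that, assuming the order of repositories in the launch form does not matter; the function name `repository_changes` and the sample lists are hypothetical.

```python
def repository_changes(saved, requested):
    """Compare two repository lists, ignoring order and duplicates.

    Returns (added, removed) as sorted lists.
    """
    saved_set, requested_set = set(saved), set(requested)
    added = sorted(requested_set - saved_set)
    removed = sorted(saved_set - requested_set)
    return added, removed


saved = ["cvmfs-config.cern.ch", "sft.cern.ch", "software.eessi.io"]
requested = ["cvmfs-config.cern.ch", "sft.cern.ch", "atlas.cern.ch"]
added, removed = repository_changes(saved, requested)
print(added)    # ['atlas.cern.ch']
print(removed)  # ['software.eessi.io']
```

Any entry in `added` has no cached content yet, so that session needs a node with internet access. If both lists come back empty, the session can reuse the existing cache.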
Common Examples of Repositories#
Here is a detailed list of commonly used CVMFS repositories and their primary purposes:
CVMFS Repository Reference#
CERN Repositories:#
Repository | Main Project | Description |
---|---|---|
alice.cern.ch | ALICE Experiment | Software and data for the ALICE heavy-ion physics experiment. |
ams.cern.ch | AMS Experiment | Software for the Alpha Magnetic Spectrometer space station detector. |
atlas.cern.ch | ATLAS Experiment | Primary software repository for the ATLAS collaboration. |
atlas-condb.cern.ch | ATLAS Conditions | Conditions database for ATLAS detector calibration. |
atlas-nightlies.cern.ch | ATLAS Development | Nightly builds of ATLAS software. |
cernvm-prod.cern.ch | CernVM Project | CernVM-FS server and client software. |
cms.cern.ch | CMS Experiment | Main repository for CMS software (CMSSW). |
grid.cern.ch | WLCG | Worldwide LHC Computing Grid middleware. |
lhcb.cern.ch | LHCb Experiment | Primary software for the LHCb experiment. |
lhcb-condb.cern.ch | LHCb Conditions | LHCb conditions database. |
sft.cern.ch | CERN SFT | LCG Software Stacks. |
sft-nightlies.cern.ch | CERN SFT Development | Nightly builds of SFT software. |
unpacked.cern.ch | CERN | Unpacked container images. See the CVMFS documentation for details. |
Open Science Grid (OSG) Repositories#
Repository | Main Project | Description |
---|---|---|
config-osg.opensciencegrid.org | OSG Config | OSG CVMFS configuration repository. See the Open Science Grid website for details. |
dunedaq.opensciencegrid.org | DUNE DAQ | Data Acquisition framework for the Deep Underground Neutrino Experiment (DUNE). |
dune.opensciencegrid.org | DUNE Experiment | Deep Underground Neutrino Experiment. |
fermilab.opensciencegrid.org | Fermilab | General-purpose Fermilab software. |
gm2.opensciencegrid.org | Muon g-2 | Muon g-2 experiment. |
icecube.opensciencegrid.org | IceCube | IceCube Neutrino Observatory. |
larsoft.opensciencegrid.org | LArSoft | Liquid Argon TPC software toolkit. See the LArSoft website for details. |
lz.opensciencegrid.org | LZ Experiment | LUX-ZEPLIN (LZ) dark matter detector. |
minerva.opensciencegrid.org | MINERvA | MINERvA neutrino-nucleus scattering experiment. |
mu2e.opensciencegrid.org | Mu2e | Mu2e muon-to-electron conversion experiment. |
nova.opensciencegrid.org | NOvA Experiment | NuMI Off-Axis electron neutrino appearance experiment. |
oasis.opensciencegrid.org | OSG OASIS | OSG Application Software Service. |
spt.opensciencegrid.org | South Pole Telescope | Cosmic microwave background studies. |
uboone.opensciencegrid.org | MicroBooNE | MicroBooNE liquid argon neutrino experiment. |
Other Collaborations#
Repository | Main Project | Description |
---|---|---|
software.eessi.io | European Environment for Scientific Software Installations | EESSI provides a common stack of optimized scientific software installations that work on any Linux distribution. It currently supports both x86_64 (AMD/Intel) and aarch64 (Arm 64-bit) systems. |
sw.lsst.eu | Vera C. Rubin Observatory | Software for the Rubin Observatory's Legacy Survey of Space and Time (LSST). |
Example Application#
When configuring your CVMFS session, select the repositories that are relevant to your research field. For example:
For High-Energy Physics:#
- LHC Experiments: Use your collaboration's repository (atlas.cern.ch, cms.cern.ch, alice.cern.ch, lhcb.cern.ch)
- General software: Include sft.cern.ch for compilers and libraries
- Configuration: Always include cvmfs-config.cern.ch
For Neutrino Physics:#
- Long-baseline: dune.opensciencegrid.org, nova.opensciencegrid.org
- Short-baseline: uboone.opensciencegrid.org, minerva.opensciencegrid.org
- Toolkits: larsoft.opensciencegrid.org for LArTPC experiments
For Astrophysics/Cosmology:#
- icecube.opensciencegrid.org (neutrino astronomy)
- spt.opensciencegrid.org (cosmic microwave background)
- sw.lsst.eu (optical astronomy)
- lz.opensciencegrid.org (dark matter searches)
General Scientific Computing:#
- sft.cern.ch - Comprehensive scientific software stack
- oasis.opensciencegrid.org - OSG community software
- grid.cern.ch - Grid computing tools
You can always add more repositories later by creating a new persistent image, but starting with the correct set will save you time and storage space.
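To settle on a repository list before creating an image, the suggestions in this section can be written down as a small lookup. The following is a sketch only: the field keys (`"hep"`, `"astro"`, and so on), the `SUGGESTED` table, and the `build_repository_list` helper are hypothetical names, and the repository names are the ones listed above.

```python
CONFIG_REPO = "cvmfs-config.cern.ch"

# Hypothetical field keys mapped to the repositories suggested in this section.
SUGGESTED = {
    "hep": ["sft.cern.ch"],
    "neutrino-long-baseline": ["dune.opensciencegrid.org", "nova.opensciencegrid.org"],
    "neutrino-short-baseline": ["uboone.opensciencegrid.org", "minerva.opensciencegrid.org"],
    "lartpc": ["larsoft.opensciencegrid.org"],
    "astro": [
        "icecube.opensciencegrid.org",
        "spt.opensciencegrid.org",
        "sw.lsst.eu",
        "lz.opensciencegrid.org",
    ],
    "general": ["sft.cern.ch", "oasis.opensciencegrid.org", "grid.cern.ch"],
}


def build_repository_list(fields, extra=()):
    """Return the config repository plus the suggestions for each field, without duplicates."""
    repos = [CONFIG_REPO]
    for field in fields:
        repos.extend(SUGGESTED[field])
    repos.extend(extra)  # e.g. your collaboration's own repository
    return list(dict.fromkeys(repos))  # drop duplicates, keep first-seen order


print(build_repository_list(["hep", "general"], extra=["atlas.cern.ch"]))
# ['cvmfs-config.cern.ch', 'sft.cern.ch', 'oasis.opensciencegrid.org', 'grid.cern.ch', 'atlas.cern.ch']
```

Keeping the resulting list alongside your persistent image makes it easier to enter the same repositories again in later sessions.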