Filesystems overview#
Panthera provides access to three different file systems, each with distinct storage characteristics. These file systems are shared with other users and are subject to quota limits and, for some of them, purge policies (time-residency limits). The main filesystem at the core of Panthera is BeeGFS, which is made up of two pools: the fast pool (the default, known as wrkdir), intended for I/O-intensive jobs, and the bulk pool (known as blkdir), which is suitable for less I/O-demanding jobs. Our storage is currently sized at around 50 terabytes. It houses both user wrkdir and blkdir directories and is designed to scale as the need arises.
Info
The NHPCC filesystems have a long history. In the very early days of NHPCC (in 2009), a Lustre file system and GFS provided central storage; over time we added a newer file system, Gluster, which remained in use until 2018. At that point we decided to rebuild our storage infrastructure on BeeGFS, thanks to the features it offers, to get the most out of our hardware.
Please consider
NHPCC's storage resources are limited and are shared among many users. They are meant to store data and code associated with projects for which you are using Panthera's computational resources. This space is for work actively being computed on with Panthera, and should not be used as a target for backups from other systems.
Home directories#
All users have a home directory mounted under /home/<your-username> (e.g. /home/u111111). This is networked storage, accessible from all nodes in the cluster, and is backed up nightly. This space is deliberately small and limited to 10 GB per user. Avoid using it as a destination for data generated by cluster jobs, as home directories fill up quickly and exceeding your quota will cause your jobs to fail.
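To see how much of your home quota you are currently using, a quick check with standard tools is usually enough (a sketch; the exact quota-reporting commands on Panthera may differ):

```shell
# Summarize the total disk usage of your home directory.
# du walks the whole tree, so this may take a moment on large homes.
du -sh "$HOME"
```

Compare the reported size against the 10 GB limit before submitting jobs that write under your home directory.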
Working directory space - $WRKDIR#
Working directory space is intended for temporary storage of files. We recommend using it for data generated by cluster jobs; jobs will likely run faster thanks to the higher performance of this filesystem. The wrkdir folder is not backed up by the HPC team, so place only temporary, reproducible files in it.
Bulk directory space - $BLKDIR#
Sometimes a job runs out of space and fails, or a job completes but you need extra space for your post-processing tasks. When there is not enough room to continue, you can either remove old files to free up space, or move your input files to a larger area called the bulk directory and submit your new job from there. The blkdir behaves like the wrkdir but is somewhat slower, which makes it better suited to post-processing jobs.
Research Group storage space - $PROJECT#
Research Groups, Projects or Labs can have storage allocated to them: we define a namespace for users planning to work on a shared project as a group. Like home, this is networked storage, called project, accessible under each member's home directory.
Each Research Group gets 10 GB of shared space for this free of charge. Research Groups may have multiple projects, and groups may have multiple shares; however, the free 10 GB is allocated only once per Research Group.
Local scratch on nodes - $TMP#
There is temporary space available on the nodes that can be used when you submit a job to the cluster. The size of this type of storage per node is almost 600GB.
As this storage is physically located on the nodes, it cannot be shared between them, but it might provide better performance for I/O intensive tasks than the networked storage.
Tip
We would typically recommend using the wrkdir filesystem where possible; however, there are sometimes edge cases that perform badly on anything except local storage.
Backups and data retention#
The HPC team makes nightly backups of home and project directories; backups are kept for 3 days. Files in home and project are kept as long as your account is active (i.e. has active jobs). If your account remains inactive (no jobs) for 6 months, it will be locked, and after a 30-day grace period your files in home and project will be deleted.
The HPC team does not back up $TMP, $WRKDIR and $BLKDIR, and the cleanup rule above for home and project applies to these directories as well. Users are notified when their account is locked, and again one week before deletion.
Quotas and limits#
Quotas are applied both to volume (the amount of data stored, in bytes) and to inodes. An inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. In practice, each filesystem entry (file, directory, link) counts as one inode.
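As a quick illustration of how entries map to inodes, you can count them with find. The throwaway directory below is purely for demonstration; on Panthera you would point find at $WRKDIR or $BLKDIR instead:

```shell
# Build a tiny example tree: one directory, one subdirectory, two files
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a.txt" "$dir/sub/b.txt"
# Each entry (the top directory, the subdirectory and both files)
# consumes one inode; find lists every entry in the tree
find "$dir" | wc -l   # 4 entries in total
rm -rf "$dir"
```

This is why millions of tiny files can exhaust an inode quota long before the volume quota is reached.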
Entry limits#
| Name | Quota type | Volume quota | inode quota | Data retention |
| --- | --- | --- | --- | --- |
| $HOME | directory | 10 GB | - | time limited |
| $WRKDIR | directory | 100 GB | 30000 | time limited |
| $TMP | directory | 600 GB | - | job lifetime |
| $BLKDIR | directory | 200 GB | 50000 | time limited |
Retention types#
- time limited: files are kept for a fixed length of time after they've been last modified. Once the limit is reached, files expire and are automatically deleted.
- job lifetime: files are only kept for the duration of the job and are automatically purged when the job ends.
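For time-limited areas, expiry is based on the last-modification time. A sketch of how you might list your own likely purge candidates — the 30-day window here is illustrative, not Panthera's actual policy:

```shell
# List files not modified for more than 30 days - likely purge candidates.
# $WRKDIR is assumed to point at your working directory (falls back to
# the current directory if unset); the 30-day window is illustrative.
find "${WRKDIR:-.}" -type f -mtime +30
```

Reviewing this list periodically and deleting what you no longer need keeps you clear of both the purge policy and your quotas.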
Checking quotas#
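The exact quota-reporting command on Panthera depends on the BeeGFS setup (BeeGFS ships a `beegfs-ctl --getquota` subcommand, but its availability varies between installations — check with the HPC team). As a generic, assumption-laden starting point, standard tools report usage on the filesystem backing a directory:

```shell
# Volume usage of the filesystem backing a directory ($HOME here;
# the same commands apply to $WRKDIR and $BLKDIR on the cluster)
df -h "$HOME"
# Inode usage of the same filesystem
df -i "$HOME"
```

Note that df reports usage for the whole filesystem, not your personal quota; treat it as a rough indicator only.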
Where should I store my files?#
Tip
Choosing the appropriate storage location for your files is an essential step towards using the cluster as efficiently as possible. It will make your own experience much smoother, yield better performance for your jobs and simulations, and help keep Panthera a useful and well-functioning resource for everyone.
Here is where we recommend storing different types of files and data on Panthera:
- personal scripts, configuration files and software installations (like using pip) → $HOME
- group-shared scripts, software installations and medium-sized datasets → $PROJECT
- temporary output of jobs, large checkpoint files → $WRKDIR
- post-processing jobs → $BLKDIR
Accessing filesystems#
Each of the filesystems above is exposed through an environment variable ($HOME, $WRKDIR, $BLKDIR, $PROJECT, $TMP). We strongly recommend using these variables in your scripts rather than explicit paths, to facilitate transitions to new systems, for instance. By using the environment variables, your scripts will continue to work even if the underlying filesystem paths change.
To see the contents of these variables, you can use the echo command. For instance, to see the absolute path of your $WRKDIR directory:
$ echo $WRKDIR
/home/u111111/wrkdir
Or, for instance, to move to your group's shared project directory:
$ cd $PROJECT
Using $TMP#
There is temporary space available on the nodes that can be used when you submit a job to the cluster.
As this storage is physically located on the nodes, it is not shared between nodes, but it typically provides better performance for read/write (I/O) intensive tasks on a single node than networked storage. However, to use the local scratch space you first need to copy files from the networked storage to it, and if a job fails, any intermediate files created there may be lost.
If your job does a lot of I/O operations to large files, it may therefore improve performance to:
- copy files from your home directory into the temporary folder
- run your job in the temporary folder
- copy files back from the temporary folder to your home directory if needed
- delete them from the temporary folder as soon as they're no longer needed
Basic example#
The following job runs a shell-script ./runcode.sh in a data folder beneath a user's wrkdir directory. The data is held on the networked storage at this point.
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=2G
#SBATCH --time=1:00:00
cd $HOME/wrkdir/project
./runcode.sh
The following job:
- copies data.file from the project directory to the temporary area
- sets the current working directory to the temporary area
- runs the appropriate code
- copies the output file results.data back to the project directory
This is the equivalent of the previous example, but using the temporary storage
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=2G
#SBATCH --time=1:00:00
# Copy data.file from the project directory to the local scratch space
cp $HOME/wrkdir/project/data.file $TMP
# Move into the local scratch space where your data now is
cd $TMP
# Do processing - as this is a small shell script, it is run from the network storage
$HOME/wrkdir/project/runcode.sh
# Copy results.data back to the project directory from the temporary scratch space
cp results.data $HOME/wrkdir/project/
If you do not know, or cannot list, all the possible output files that you would like to copy back to networked storage, you can use rsync to copy only changed and new files back at the end of the job. This saves time and avoids unnecessary copying.
The following job:
- copies files to the temporary scratch area
- runs the shell-script ./runcode.sh on the local copy
- copies the results back to networked storage
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=2G
#SBATCH --time=1:00:00
# Source folder for data
DATADIR=$HOME/wrkdir/project
# Copy data (incl. subfolders) to local scratch
rsync -rltv $DATADIR/ $TMP/
# Run job from the local scratch folder
cd $TMP
./runcode.sh
# Copy changed files back
rsync -rltv $TMP/ $DATADIR/
Viewing temporary files#
To view temporary files while the job is running (to ensure the job is correct), you can ssh to the node.
SSH Connections
As per the Usage Policy SSH sessions on nodes should be limited to monitoring jobs.
Advanced example#
This advanced example uses rsync for speed and ensures that cleanup happens at the end of the job, or shortly before the job hits its time limit.
#!/bin/bash
#SBATCH -n 1
#SBATCH --mem=2G
#SBATCH --time=1:00:00 # Request 1 hour runtime
#SBATCH --signal=B:SIGUSR1@300 # Send SIGUSR1 to the batch shell 300 s (5 min) before the time limit
function Cleanup ()
{
trap "" SIGUSR1 EXIT # Disable trap now we're in it
# Clean up task
rsync -rltv $TMP/ $DATADIR/
exit 0
}
DATADIR=$(pwd)
trap Cleanup SIGUSR1 EXIT # Enable trap
cd $TMP
rsync -rltv $DATADIR/ $TMP/
# Job - run in the background and 'wait', so that bash can handle the
# SIGUSR1 trap while runcode.sh is still running
./runcode.sh &
wait
From other systems#
External filesystems cannot be mounted on Panthera
For a variety of security, manageability and technical reasons, we cannot mount external filesystems or data storage systems on Panthera. The recommended approach is instead to make Panthera's data available on external systems.
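For example, to pull results from Panthera onto an external machine, you could run rsync over SSH from the external side. The hostname and paths below are placeholders, not Panthera's real login address:

```shell
# Run this on the EXTERNAL system, not on Panthera.
# 'login.panthera.example' is a placeholder hostname; replace it and
# the username/paths with your own.
rsync -avz u111111@login.panthera.example:wrkdir/project/results/ ./results/
```

Since rsync only transfers changed files, re-running the same command later synchronizes new results without copying everything again.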