Module system#

At NHPCC, a module system is used to provide access to the installed software and libraries, including multiple versions of each. We use the Lmod utility on both login and compute nodes to manage modules. With modules, users can automatically load/unload PATH and the other environment variables required by a software package or library and its dependencies.

Why do you need a module environment system?

Whenever users log in to a GNU/Linux OS, they get a login shell, and the shell uses environment variables to execute commands and run software. The most common are:

  • PATH: list of directories in which the OS looks for executable files;
  • MANPATH: list of directories in which man searches for the man pages;
  • LD_LIBRARY_PATH: list of directories in which the OS looks for *.so shared libraries needed by software at runtime.

In addition, there are application-specific environment variables such as CPATH, LIBRARY_PATH, SYSTEM, etc.
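You can inspect what these variables currently contain in your shell simply by printing them, for example:

# Print the directories searched for executables
echo "$PATH"

# Print the runtime shared-library search path, one directory per line
echo "$LD_LIBRARY_PATH" | tr ':' '\n'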

A typical way to set up these environment variables is to customize the shell initialization files such as /etc/profile, .bash_profile, and .bashrc. But this quickly becomes unmanageable on an HPC facility, which is a multi-user system with many software packages installed in multiple versions.
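For example, setting up a single package by hand might mean adding lines like the following to .bashrc (the install path here is hypothetical); multiplied by dozens of packages and versions, this quickly becomes unmaintainable:

# Hypothetical manual setup for one FFTW installation in ~/.bashrc
export PATH="/opt/fftw/3.3.10/bin:$PATH"
export MANPATH="/opt/fftw/3.3.10/share/man:$MANPATH"
export LD_LIBRARY_PATH="/opt/fftw/3.3.10/lib:$LD_LIBRARY_PATH"
# Switching to another version means editing every one of these lines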

Since it is very hard, if not impossible, to set and change these environment variables manually, many HPC centers use Lmod as their environment modules system. Lmod is a tool that simplifies shell initialization and lets users easily modify their environment during a session using modulefiles.

Each modulefile contains the information needed to configure the shell for running a software package or using a library. Typically, modulefiles tell the module command how to set or change shell environment variables such as PATH, MANPATH, etc.
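You can see exactly which environment changes a modulefile would make, without loading it, via the show sub-command (the module name here is only an example):

# Display the PATH, MANPATH, etc. changes the GCC modulefile performs
module show GCC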

Modules can be loaded and unloaded dynamically and cleanly. All popular shells are supported, including bash, ksh, zsh, sh, csh and tcsh, as well as some scripting languages such as python, ruby, tcl, cmake and R. Modules can also be bundled into metamodules that load an entire suite of different modules. This is how we manage the NHPCC Software Set. 😉

How to use the module system#

The module system supports several sub-commands for working with modulefiles. For convenience, the shorter ml alias can be used, with slightly different syntax.

Module name auto-completion

The module command supports auto-completion, so you can just start typing the name of a module, and press Tab to let the shell automatically complete the module name and/or version.

| Module command          | Short version       | Description                                                                           |
|-------------------------|---------------------|---------------------------------------------------------------------------------------|
| module avail            | ml av               | List available software                                                                |
| module spider fftw      | ml spider fftw      | Search for a particular library/software (here FFTW); the search is case-insensitive   |
| module keyword lapack   | ml key lapack       | Search for lapack in module names and descriptions                                     |
| module whatis ScaLAPACK | ml whatis ScaLAPACK | Display information about the ScaLAPACK module                                         |
| module help ScaLAPACK   | ml help ScaLAPACK   | Display module-specific help                                                           |
| module load ScaLAPACK   | ml ScaLAPACK        | Load a module to use the associated software                                           |
| module load CUDA/11.4   | ml CUDA/11.4        | Load a specific version of a module                                                    |
| module unload CUDA      | ml -CUDA            | Unload a module                                                                        |
| module swap gcc icc     | ml -gcc icc         | Swap modules (unload gcc and replace it with icc)                                      |
| module purge            | ml purge            | Remove all loaded modules                                                              |
| module save foo         | ml save foo         | Save the state of all loaded modules in a collection named foo                         |
| module restore foo      | ml restore foo      | Restore the state of saved modules from the foo collection                             |
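A typical session might look like this (the module names are only illustrative):

# List what is available, search for a library, then load a compiler
ml av
ml spider fftw
ml GCC

# With no arguments, ml lists the currently loaded modules
ml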

Additional module sub-commands are documented in module help. For a complete reference, please refer to the official Lmod documentation.

Architecture-dependent software/libraries

Currently, NHPCC has two CPU architectures: the newer AMD EPYC 7542 and the older AMD Opteron 6174. We have compiled most, if not all, of the software/libraries for each node architecture, in two separate MODULEPATHs:

  • /share/apps/modules/all for the AMD EPYC
  • /opt/share/modules/all for the AMD Opteron
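You can check which module tree your shell is currently using, and add one manually if needed, for example:

# Show the module search path currently in effect
echo "$MODULEPATH"

# Manually add the AMD EPYC module tree to the search path
module use /share/apps/modules/all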

The correct path is applied automatically when users submit their jobs. But in the earlier steps (e.g. writing, compiling and/or testing code on the login node) the architecture matters, so we strongly recommend requesting an appropriate node via an interactive job for compiling and testing your software.
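A minimal sketch of such a request follows; the partition name epyc is an assumption for illustration, so check the NHPCC partition list for the real names:

# Request an interactive shell on a compute node (partition name is hypothetical)
srun --partition=epyc --ntasks=1 --pty bash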

Loading modules in slurm batch scripts#

If your code needs modules at runtime, you must also load them inside your Slurm batch scripts. You can run any GNU/Linux command after the last #SBATCH directive, and this certainly includes the module command. As an example, let's write a simple Fortran code, compile it, and then write a batch script to run it.

  • write a simple code, for example:
hello_world.f90
program HelloWorld
  ! Declare variables
  character(len=20) :: message

  ! Initialize variables
  message = "Hello, World!"

  ! Print the message
  print *, message

end program HelloWorld
  • on the login node, load the GCC module by running ml GCC. You can check which modules are loaded by running ml with no arguments.
  • compile your code: gfortran hello_world.f90 -o hello_world.exe
  • write a batch script for submitting your job. A skeleton could be as follows (a filled-in sketch appears after it):
#!/bin/bash
#SBATCH ...
#SBATCH ...
#SBATCH ...

ml purge
ml load GCC

srun ./hello_world.exe ...
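As a concrete example, a complete version of this script might look as follows; the job name, partition, and time limit are assumptions for illustration and should be adapted to the actual NHPCC partitions and your job's needs:

#!/bin/bash
#SBATCH --job-name=hello_world   # job name (illustrative)
#SBATCH --partition=epyc         # partition name is hypothetical
#SBATCH --ntasks=1               # a single task suffices for this example
#SBATCH --time=00:05:00          # short time limit for a quick test

# Start from a clean environment, then load the compiler used to build the code
ml purge
ml load GCC

srun ./hello_world.exe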