Python#

Python is a dynamic, open-source, and very popular programming language. It is widely used in HPC software and related workflows, and it is well known for its broad collection of libraries and its large worldwide community.

Python at NHPCC#

Interactive mode#

To use Python in interactive mode, first request an interactive job:

u111111@login1:~> srun -n 1 --mem=4G -p amd128 -t 10 --pty /bin/bash
srun: job 60242 queued and waiting for resources
srun: job 60242 has been allocated resources
u111111@en-1-3:~>

Then load the Python module and start the interpreter:

u111111@en-1-3:~> ml Python
u111111@en-1-3:~> python
Python 3.11.5 (main, Feb 17 2024, 15:35:29) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

To see which Python versions are available, type ml Python and press Tab twice. For example:

u111111@en-1-3:~> ml Python
Python                                     Python/3.10.4-GCC-11                       Python/3.10.4-GCCcore-12.2.0               Python/3.7.4-GCCcore-8.3.0
Python/2.7.15                              Python/3.10.4-GCC-11-bare                  Python/3.10.4-GCCcore-12.2.0-bare          Python/3.8.6-GCCcore-10.2.0
Python/2.7.18-GCC-11-bare                  Python/3.10.4-GCCcore-11                   Python/3.10.8-GCCcore-12.2.0               Python/3.9.6-GCCcore-11.2.0
Python/2.7.18-GCCcore-10.2.0               Python/3.10.4-GCCcore-11.3.0               Python/3.10.8-GCCcore-12.2.0-bare          Python/3.9.6-GCCcore-11.2.0-bare
Python/2.7.18-GCCcore-11.2.0-bare          Python/3.10.4-GCCcore-11.3.0-bare          Python/3.11.5-GCCcore-13.2.0               Python-bundle-PyPI
Python/2.7.18-GCCcore-12.2.0-bare          Python/3.10.4-GCCcore-11-bare              Python/3.7.4                               Python-bundle-PyPI/2023.06-GCCcore-12.2.0

See the Module section for more information on the module command and its abbreviation ml. The software list section always contains an up-to-date list of the available Python versions.

Using Python in batch scripts#

You can also load the Python module in a job script. For example:

#!/bin/bash
#SBATCH ...
#SBATCH ...
#SBATCH ...

ml purge
ml load Python

./hello_world.py
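The job script above calls ./hello_world.py directly, so the file needs a shebang line and the executable bit (chmod +x hello_world.py). The original page does not show the script itself. The following is a minimal sketch of what it could contain, and the function name build_greeting is only illustrative. It relies on the SLURM_JOB_ID variable that Slurm sets inside a job and falls back to a placeholder elsewhere.

```python
#!/usr/bin/env python3
"""hello_world.py: a minimal script for a batch job (illustrative sketch)."""
import os
import platform
import socket


def build_greeting():
    """Return a one-line greeting describing where the script is running."""
    # SLURM_JOB_ID is set by Slurm inside a job; use a placeholder otherwise.
    job_id = os.environ.get("SLURM_JOB_ID", "no Slurm job")
    return (
        f"Hello, world! Python {platform.python_version()} "
        f"on {socket.gethostname()} ({job_id})"
    )


if __name__ == "__main__":
    print(build_greeting())
```

The printed line ends up in the job's output file, which makes it easy to confirm which node and which Python version the job actually used.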

Using Anaconda distribution#

You can use the Anaconda Python distribution instead of the Python modules. Anaconda may be more suitable for scientific and data science workflows. To load the default version, run:

ml purge
ml Anaconda3

Again, you can use tab completion to see which versions are available. To check which python executable is on your path, run:

which python
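The same check can be made from inside Python, which helps when a script might be started by a different interpreter than the one you expect. This is a small sketch using only the standard library; the helper name interpreter_info is hypothetical.

```python
import sys


def interpreter_info():
    """Return the path, version, and prefix of the running interpreter."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {
        "executable": sys.executable,  # full path of the python binary
        "version": version,            # e.g. major.minor.patch
        "prefix": sys.prefix,          # installation or environment root
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key:>10}: {value}")
```

With an Anaconda module loaded, the executable path is expected to point inside the Anaconda installation rather than at the system Python.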

To see all the pre-installed Anaconda packages and their versions use the conda list command:

u111111@en-7-5:~> conda list
# packages in environment at /share/apps/eb/Anaconda3/2024.02-1:
#
# Name                    Version                   Build  Channel
_anaconda_depends         2024.02             py311_mkl_1  
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
abseil-cpp                20211102.0           hd4dd3e8_0  
aiobotocore               2.7.0           py311h06a4308_0  
aiohttp                   3.9.3           py311h5eee18b_0  
aioitertools              0.7.1              pyhd3eb1b0_0  
aiosignal                 1.2.0              pyhd3eb1b0_0  
alabaster                 0.7.12             pyhd3eb1b0_0    
...

The Anaconda Python distribution is system software. This means that you can use any of its packages, but you cannot modify them (for example, by upgrading them) and you cannot install new ones in their location. You can, however, install whatever packages you want in your home directory in custom environments. The two most popular package managers for installing Python packages are conda and pip. These tools automate the installation process, including resolving dependencies, compiling (pip only, and only for packages that are not pure Python), and copying the files to the correct path.

Conda enables you to easily install complex packages and software. Creating multiple environments lets you keep different versions of the same software, or otherwise incompatible software collections, side by side. You can easily share a list of the installed packages with collaborators or colleagues, so they can set up the same environment in a matter of minutes.

Unlike pip, conda serves as both a package and environment manager. It is not limited to a single programming language, supporting packages for Python, R, Fortran, and more. Conda primarily uses the main channel of Anaconda Cloud for installations, but it can also access other channels like bioconda, intel, r, and conda-forge. It always installs pre-compiled binary files, which often offer better performance by utilizing Intel MKL. Below is an example of creating an environment and installing packages in it:

ml purge
ml Anaconda3/2024.02-1
conda create --name myenv <package-1> <package-2> ... <package-N>
conda activate myenv

Installing packages#

Internet access

To install packages, you need to log in to a node with internet access, so you must submit your job to the short partition. Please note that jobs in the short partition are limited to 30 minutes.

pip#

To install Python packages using pip, use the --user option. This ensures that the packages are installed in a user-writable location, which is typically your home directory. As your home directory is shared across the nodes of the cluster, you only need to install Python packages once, and they will be available on every node. Let's install a very small test package called "pip-install-test":

ml Python/3.10.8-GCCcore-12.2.0
pip install --user pip-install-test

This package will be installed under $HOME/.local/lib/python<version>/site-packages/ and can be imported as:

import pip_install_test
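To confirm where --user installations go and whether a given package is importable, you can ask the interpreter directly. The sketch below uses only the standard library; the helper names user_site_status and is_importable are hypothetical.

```python
import importlib.util
import site
import sys


def user_site_status():
    """Return the per-user site-packages path and whether it is on sys.path."""
    user_site = site.getusersitepackages()
    # The directory is only added to sys.path if it exists and the
    # user site is enabled for this interpreter.
    return user_site, user_site in sys.path


def is_importable(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    path, on_path = user_site_status()
    print(f"user site-packages: {path} (on sys.path: {on_path})")
    print(f"pip_install_test importable: {is_importable('pip_install_test')}")
```

If the package was installed with one Python module but the script runs under another Python version, the import check fails, because each minor version has its own python<version> directory.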

To list the installed packages:

pip list -v

You can use a requirements.txt file to install a list of packages:

pip install --user -r requirements.txt
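A requirements.txt file is usually produced with pip freeze > requirements.txt. As an illustration of what such a file contains, the following sketch builds the same kind of name==version lines from Python with importlib.metadata. The function name pinned_requirements is hypothetical, and the output is not guaranteed to match pip freeze line for line.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installations that lack a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Redirect the output to a file to use it with: pip install -r <file>
    print("\n".join(pinned_requirements()))
```

Pinning exact versions this way makes it more likely that the same package set can be reinstalled later.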

It's possible to upgrade a package:

pip install --user --upgrade <package_name>

or upgrade all the packages listed in the requirements file:

pip install --user --upgrade -r requirements.txt

To uninstall a package:

pip uninstall <package_name>
pip uninstall -r requirements.txt

Using virtual environments#

If you want to install many packages, build a sophisticated project, or, more importantly (believe me!), reproduce your work on any other computer at any time (even after all the oceans have evaporated because the sun has turned into a red giant!), consider using virtual environments. After loading the module for your desired Python version, run:

mkdir projectA
cd projectA
python -m venv env

If you look inside the new projectA folder, you will see that a folder called env has been created. env is the name of our virtual environment; you can choose any other name when you create it.

To activate it:

source env/bin/activate

To check that the environment is working:

(env) u111111@en-7-5:~/projectA> pip list
Package    Version
---------- -------
pip        23.2.1
setuptools 65.5.0

[notice] A new release of pip is available: 23.2.1 -> 24.2
[notice] To update, run: pip install --upgrade pip

Now you can install packages into this environment with pip as before, but omit the --user option, which pip rejects inside a standard virtual environment. Finally, to deactivate the environment, run:

deactivate
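Inside a script, you can check whether a virtual environment is active by comparing sys.prefix with sys.base_prefix. This is a minimal sketch, and the function name in_virtualenv is hypothetical. It detects environments created with python -m venv; conda environments are not detected this way.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the Python installation it was made from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"virtual environment {state}: {sys.prefix}")
```

Such a check at the top of a job's Python script can catch the common mistake of forgetting to activate the environment in the batch script.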

Conda#

If you have loaded any of the Anaconda3 modules, we recommend using the conda package manager to install any other packages you need.

As an example, let us install the yt package. After loading Anaconda3/2024.02-1, run:

conda create --name myproject yt

After a few seconds, conda reports that it wants to download and install many packages (~77 MB), including python-3.11.9. Since the default Python version of the loaded Anaconda module is 3.11.7, and the exact patch version does not matter for yt, we pin the Python version explicitly to reduce the download size:

conda create --name myproject yt python=3.11.7

The download size now drops to ~44 MB. To activate and test the environment:

conda activate myproject
yt --help
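To verify from Python which conda environment is active, you can read the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that conda activate sets. The sketch below is illustrative only; the helper name active_conda_env and the example path in the comment are hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or None."""
    environ = os.environ if environ is None else environ
    prefix = environ.get("CONDA_PREFIX")
    if not prefix:
        return None  # no conda environment has been activated
    # For an environment created with --name, this is its name,
    # e.g. ("myproject", "<home>/.conda/envs/myproject").
    return environ.get("CONDA_DEFAULT_ENV"), prefix


if __name__ == "__main__":
    env = active_conda_env()
    print("no active conda environment" if env is None else f"active: {env}")
```

This can be useful in batch jobs, where the environment must be activated in the job script itself rather than in your login shell.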

Install the environment in a specific path

If you want to install the conda environment in a directory other than your home, add --prefix PATH. This also lets multiple users of a project share one conda environment by installing it into the project folder instead of a user's home directory. For example:

mkdir -p your-project-dir/envs/myproject
conda create --prefix your-project-dir/envs/myproject yt python=3.11.7

Mamba#

Mamba is a Python-based CLI conceived as a drop-in replacement for conda, offering higher speed and more reliable environment solutions. As an example, we are going to install the astropy package. We first load the Mamba module and then create an environment with an arbitrary name:

ml Mamba
mamba create -n ENV_NAME

Now, we should activate it and then install the desired package(s):

mamba activate ENV_NAME
mamba install -c conda-forge astropy

Once the installation has finished, you can list all the environments:

mamba env list

To remove an environment with all its packages, do:

mamba env remove -n ENV_NAME

You can export an environment, i.e. make a file that contains all the installed packages with their versions:

mamba env export -n ENV_NAME > ENV_NAME.yaml

This file can be used to recreate the environment later, which is very helpful for the reproducibility of your work:

mamba env create --file ENV_NAME.yaml

Finally, you can deactivate your environment:

mamba deactivate

Working with jupyter lab#

Using the OnDemand interface#

At NHPCC, the most straightforward way to use JupyterLab is through our web interface. Please see the OnDemand Interactive Sessions page and the short movie below for more information.

Another harder way#

Submit an interactive job as described above (e.g. srun -n 1 --mem=4G -p short -t 30 --pty /bin/bash) and set up your environment. Then, on the node assigned to you (e.g. en-7-5), run:

jupyter lab --no-browser --port=8888

If you get an error saying that this port is already in use, try a nearby port number (e.g. 8889). The command starts Jupyter and prints a few lines, including the address where Jupyter is running.
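Instead of guessing, you can look for a free port before starting Jupyter. The following is a minimal sketch, assuming Jupyter listens on localhost; the helper names is_port_free and find_free_port are hypothetical. Another process can still take the port between the check and the moment Jupyter starts, so treat the result as a hint.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True


def find_free_port(start=8888, attempts=50, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts)."""
    stop = min(start + attempts, 65536)  # valid TCP ports end at 65535
    for port in range(start, stop):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in range {start}-{stop - 1}")


if __name__ == "__main__":
    print(find_free_port())
```

Whichever port you end up using on the compute node must also be used in the ssh forwarding commands and in the browser address.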

Then, on the login node, run:

ssh -NL 8888:localhost:8888 en-7-5

and on your local computer (the one you used to connect to the login node), run:

ssh -NL 8888:localhost:8888 your_username@login.hpc.iut.ac.ir

Finally, open your web browser and go to the address where Jupyter is running, e.g.

http://localhost:8888/?token=276ba92d6b9834c3d748b03e31542f988ee3d10b147b7rdqs

This should open the JupyterLab interface.