About the HPC

What is an HPC Cluster

High Performance Computing (HPC) is the practice of joining a collection of powerful processing nodes into a cluster (or groups of clusters) to process massive datasets or solve complex multi-dimensional mathematical problems in parallel, as fast as possible. In short, HPC is all about computation.
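To make "in parallel" concrete, here is a minimal sketch of the idea on a single machine: the data is split into chunks, each chunk is processed independently, and the partial results are combined. The function names (`split_into_chunks`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical and chosen only for illustration, and worker processes stand in for compute nodes; a real cluster distributes work across machines rather than across processes on one machine.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work done independently on one chunk (one "node" in the analogy)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers):
    """Process each chunk in its own worker process, then combine the partial sums."""
    chunks = split_into_chunks(data, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))
```

When calling `parallel_sum_of_squares` from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform.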

You might also hear the term supercomputer, which is often used as a synonym for an HPC cluster (or groups of them). The name refers more to the hardware and its power, which is usually measured in floating-point operations per second, or FLOPS (teraflops, petaflops, ...).
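As a rough illustration of how such a figure comes about, a cluster's theoretical peak is commonly estimated as nodes × cores per node × clock rate × floating-point operations per cycle. The sketch below applies that estimate; the hardware numbers are made up for the example and do not describe any particular cluster, and the function names are hypothetical.

```python
def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak: every core issues flops_per_cycle operations each clock tick."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


def format_flops(flops):
    """Render a FLOPS figure with the largest fitting decimal prefix."""
    units = ["FLOPS", "kiloFLOPS", "megaFLOPS", "gigaFLOPS",
             "teraFLOPS", "petaFLOPS", "exaFLOPS"]
    value = float(flops)
    index = 0
    while value >= 1000 and index < len(units) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f} {units[index]}"


# Assumed example hardware: 100 nodes, 64 cores each, 2.5 GHz, 16 operations per cycle.
print(format_flops(peak_flops(100, 64, 2.5e9, 16)))  # 256.00 teraFLOPS
```

This is an upper bound; the performance an application actually sustains is normally lower.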

Tip

As an example, think of a student with a large program (thousands of lines of code). If they run it on their personal laptop, it might take years to finish processing and show the final result. If they submit the same code to a supercomputer instead, they can get the results in a much shorter time.

The Architecture

Typically a basic cluster is made up of these core components:

A login node (head node)
A variety of compute nodes
A distributed shared storage system
GPU servers (to accelerate massively parallel calculations)
An InfiniBand switch for interconnecting nodes
A job scheduler and a queueing system
A transfer node
A post-processing node

How does it work?

Users log in to the head node and upload their input files. They then submit their jobs to the scheduler, where the jobs wait in the queue. The scheduler dispatches each job to the compute nodes that the user is allowed to access.
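The flow described here can be sketched as a toy simulation. This is not how any particular scheduler is implemented: it assumes a strict first-in, first-out queue, one node per job, and known runtimes, and the job, user, and node names are invented for the example.

```python
from collections import deque


def simulate_fifo_scheduler(jobs, allowed_nodes):
    """Dispatch queued jobs in submission order.

    jobs: list of (job_id, user, runtime) tuples, in submission order.
    allowed_nodes: dict mapping each user to the node names they may use.
    Returns a list of (job_id, node, start_time) tuples.
    """
    queue = deque(jobs)
    # Time at which each node becomes idle; every node starts out idle.
    free_at = {node: 0 for nodes in allowed_nodes.values() for node in nodes}
    schedule = []
    while queue:
        job_id, user, runtime = queue.popleft()
        # Among the nodes this user may access, pick the one that frees up first.
        node = min(sorted(allowed_nodes[user]), key=lambda n: free_at[n])
        start = free_at[node]
        free_at[node] = start + runtime
        schedule.append((job_id, node, start))
    return schedule


allowed = {"alice": ["node01", "node02"], "bob": ["node02"]}
jobs = [("job1", "alice", 5), ("job2", "bob", 3),
        ("job3", "alice", 2), ("job4", "bob", 1)]
for entry in simulate_fifo_scheduler(jobs, allowed):
    print(entry)
```

Production schedulers also weigh factors such as priorities, requested resources, and time limits, which this sketch leaves out.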