ZIMT operates a system for High-Throughput Computing (HTC), which is mostly integrated into the OMNI cluster and appears as the HTC partition there. The partition consists of 4 login nodes and 41 compute nodes, which are physically located in special compact blades in an HPE Moonshot 1500 chassis inside the NDC. This page describes how to use them.
The term High-Throughput Computing (HTC) means the computation of a large number of small compute jobs which are usually independent of each other (trivially/embarrassingly parallel). This is in contrast to High-Performance Computing (HPC), which usually means larger jobs with highly interconnected subtasks.
Access and Login
If you have OMNI access, you can also log into the Moonshot nodes.
There are four login nodes with the designations htc002, respectively. Just like on OMNI, there is an alias htc which will bring you onto one of the four login nodes. You should use this alias whenever possible, since the load balancer will always direct you to the least busy login node. Connections are made via ssh just like on OMNI, by appending .zimt.uni-siegen.de to the node name or the alias.
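For example, a connection via the alias can be sketched as follows (the username is a placeholder; the alias and domain suffix are taken from the text above):

```shell
# Build the full hostname from the alias and the ZIMT domain suffix.
HOST="htc.zimt.uni-siegen.de"

# Connect (replace "your_username" with your own ZIMT account name):
#   ssh your_username@htc.zimt.uni-siegen.de
# A specific login node is reached the same way, e.g.:
#   ssh your_username@htc002.zimt.uni-siegen.de
echo "ssh your_username@${HOST}"
```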
Caution: logging in via password is only possible from within the University network or the Uni VPN. If you would like to log in from the outside, you need to set up a password-less access via a public/private key pair first. Since your home directory is the same on OMNI and the Moonshot system, you only need to do this once.
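Setting up key-based access can be sketched like this (the key file name and the empty passphrase are example choices; in practice you may want to set a passphrase):

```shell
# Create ~/.ssh if it does not exist yet, then generate an ed25519 key pair.
mkdir -p "$HOME/.ssh"
ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519_htc" -N "" -q

# Install the public key on the cluster (placeholder username); because the
# home directory is shared between OMNI and the Moonshot system, this only
# needs to be done once:
#   ssh-copy-id -i "$HOME/.ssh/id_ed25519_htc" your_username@htc.zimt.uni-siegen.de
```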
The remaining nodes htc007 are compute nodes and are not directly accessible from the outside.
In principle, all modules that are installed on OMNI are also available on the HTC nodes. However, due to the different CPU architectures it is not guaranteed that a module works just because it is available.
Caution: ZIMT has not tested all OMNI modules on the HTC nodes and you should always conduct your own tests with a given module before you use it productively.
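A quick sanity check before using a module productively might look like the following sketch (MATLAB is only an example module name, and the smoke test depends on the software in question; these commands require the module system on the cluster itself):

```shell
module avail                    # list modules visible on the HTC login node
module load MATLAB              # example; pick the module you intend to use
matlab -batch "disp(version)"   # minimal smoke test of the loaded software
```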
You can run compute jobs on the nodes htc-node041 in the same way as on OMNI: by queuing SLURM jobs in the htc queue. Job and node status in the htc queue can be monitored as usual with sinfo, both from OMNI and from the HTC nodes. The individual SLURM commands are described here. The default walltime in the HTC queue is 12 hours; the maximum walltime is 24 hours.
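For instance, partition and job status can be checked as follows (squeue is a standard SLURM command alongside the sinfo mentioned above; both require a node where SLURM is available):

```shell
sinfo -p htc               # node states in the htc partition
squeue -p htc -u "$USER"   # your own pending and running jobs in the htc queue
```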
Caution: if you do not specify a queue (queue = partition in SLURM terminology), the job will be put into the default queue (short) and will therefore run on OMNI, not on the HTC nodes. If you want your job to run on the HTC nodes, you have to include the following line in your job script (or specify the htc partition when calling sbatch):

#SBATCH --partition=htc
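Putting this together, a minimal HTC job script might look like the sketch below (the resource values and names are illustrative; only the partition name and walltime limits come from the text above):

```shell
#!/bin/bash
#SBATCH --partition=htc         # run in the htc queue, not the default (short)
#SBATCH --time=12:00:00         # default walltime; the maximum is 24:00:00
#SBATCH --ntasks=1              # HTC jobs are typically small and independent
#SBATCH --job-name=htc-example  # hypothetical job name

# Your actual workload goes here; placeholder executable:
srun ./my_program
```

Such a script is submitted with sbatch, e.g. `sbatch jobscript.sh`, from one of the HTC login nodes.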
Can HTC jobs be queued from OMNI and vice versa?
Partially, yes. You can queue jobs in either direction as long as the difference in CPU architecture does not matter. In particular, ZIMT does not currently support cross-compiling.
For example, it should be straightforward to queue a MATLAB job from OMNI into the HTC queue, because MATLAB is installed on both systems. However, if you want to compile a C or Fortran program, this has to happen on the HTC front end (