
MATLAB is a software package for mathematical calculations and their visualization. The name MATLAB derives from "matrix laboratory" and reflects the origin of the software in matrix operations. MATLAB is developed by the company MathWorks.

On the cluster, MATLAB is installed in the versions R2019a and R2018a. The default version is R2019a, which is available without loading any modules. To use R2018a, enter module load matlab/2018a.

On this page you can find information on how to start MATLAB, both the graphical user interface and the batch (command line) mode. There are also instructions on queuing MATLAB jobs and using the parallel computing features of MATLAB, both for single-node and multi-node jobs.

Literature

Documentation for all features with a lot of examples can be found in the MATLAB documentation center.

Tutorials in PDF form (in German) are available online from RWTH Aachen, as are lecture notes on numerical computations with MATLAB from the University of Siegen.

Licensing

The base licensing agreement between MathWorks and the University of Siegen only allows using MATLAB in the same way as on a desktop PC. This means that a MATLAB program may not run on multiple nodes at the same time; it is limited to a single node. The installed Parallel Computing Toolbox allows distributing a MATLAB computation between multiple MATLAB processes (so-called workers) on one node.

However, ZIMT has additionally licensed the MATLAB Distributed Computing Server, which allows running a computation on multiple nodes. This license covers 16 workers.

Starting MATLAB

The MATLAB graphical user interface (GUI) is started by simply entering the command

$ matlab

in the console. MATLAB can also be run in batch mode, which means that a sequence of commands (for example a script) is executed one after the other. For batch mode, add the option -nodisplay. Individual MATLAB scripts can be run by specifying the option -r followed by the script name without the suffix .m.

$ matlab -nodisplay -r calculate

In this example, a script file named calculate.m is executed. It needs to be available in the current folder.
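With -r, MATLAB stays at its prompt after the script has finished unless the script itself calls exit, and an error in the script also leaves MATLAB running. One possible way around this is a small wrapper function. The following is only a sketch: the function name run_guarded and the file run_guarded.m are hypothetical and not part of the cluster installation.

```matlab
function status = run_guarded(scriptName)
% RUN_GUARDED  Hypothetical helper: run a script and report success or failure.
%   status is 0 if the script ran without error, 1 otherwise.
    status = 0;
    try
        % Evaluate the script name in the base workspace, as -r would do
        evalin('base', scriptName);
    catch err
        % Print the full error report so that it ends up in the job output
        disp(getReport(err));
        status = 1;
    end
end
```

Saved as run_guarded.m in the current folder, it could be called as matlab -nodisplay -r "exit(run_guarded('calculate'))", so that MATLAB always terminates and passes the status on as its exit code.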

Tip: ZIMT has set up MATLAB R2019a such that the folder ~/.matlab is in its search path. This means that if you place a file named startup.m in that folder, the settings it contains will be applied whenever MATLAB is started.
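A startup file might look like the following minimal sketch. The settings and the folder ~/MATLAB/lib are purely illustrative assumptions; use whatever settings you actually need.

```matlab
% ~/.matlab/startup.m (illustrative example, executed at every MATLAB start)

% Show more digits in numeric output
format long

% Add a personal function folder to the search path if it exists.
% The folder name ~/MATLAB/lib is a hypothetical example.
libDir = fullfile(getenv('HOME'), 'MATLAB', 'lib');
if exist(libDir, 'dir')
    addpath(libDir);
end
```

Keep this file short, because it also runs at the start of every batch job.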

Batch jobs with MATLAB

The following example shows how to run MATLAB programs via the scheduler SLURM. In order to run the example yourself you need to create a subfolder MATLAB in your home directory and put the example MATLAB script and job script in that folder.

The MATLAB script sin_plot.m, which you can download here, calculates a sine curve, creates a graph with the result and saves the graph as a PNG image file:

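% Compute one period of a sine curve at 1024 sample points
A=sin([1:1024]*2*pi/1024)
% Plot the curve into a new figure
h=figure; plot(A)
% Save the figure as a PNG image
print(h, '-dpng', '~/MATLAB/sin.png')
% Quit MATLAB so that the batch job terminates
exit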

The file sin_batch.sh, downloadable here, is the corresponding job script which will run MATLAB in batch mode:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --partition=short
#SBATCH --ntasks-per-node=1
matlab -singleCompThread -nodisplay -r sin_plot

More information on job scripts can be found here.

The job is queued like any other job with the command:

$ sbatch sin_batch.sh

When the job has completed successfully, the files ~/MATLAB/my.output and ~/MATLAB/sin.png exist.

Caution: MATLAB computations on the cluster are to be run on the compute nodes only, via SLURM. Do not run CPU-intensive computations on the login nodes as you will slow them down for everyone! We reserve the right to kill such processes without prior warning.

Parallel Computations with the Parallel Computing Toolbox

The Parallel Computing Toolbox allows running a MATLAB computation on multiple processes, so-called workers, concurrently. All workers must run on a single node. On the HoRUS cluster, dividing a computation between multiple nodes is also possible but requires additional configuration, see below.
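A single-node parallel computation could be sketched as follows. This assumes that the job was submitted with the SLURM option --cpus-per-task, so that the environment variable SLURM_CPUS_PER_TASK is set, and it uses the built-in profile name 'local' of R2018a/R2019a. The fallback of 2 workers and the computation itself are arbitrary choices for illustration.

```matlab
% Read the number of cores SLURM assigned to this job (assumption:
% the job was submitted with --cpus-per-task, otherwise this is empty)
nWorkers = str2double(getenv('SLURM_CPUS_PER_TASK'));
if isnan(nWorkers)
    nWorkers = 2;   % arbitrary fallback when the variable is not set
end

% Start the workers on the current node
pool = parpool('local', nWorkers);

% Distribute the loop iterations over the workers
n = 1000;
result = zeros(1, n);
parfor k = 1:n
    result(k) = sin(2*pi*k/n)^2;
end
total = sum(result);
fprintf('Sum over %d terms: %g\n', n, total);

% Shut the workers down again
delete(pool);
```

When using several workers in a batch job, the option -singleCompThread from the example job script is typically kept so that each worker uses one core.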

Using the MATLAB Distributed Computing Server

The MATLAB Distributed Computing Server (MDCS) enables MATLAB computations on multiple nodes. Caution: the MDCS is currently only available in MATLAB R2018a.

ZIMT has configured the MDCS such that it uses SLURM for scheduling. That way, a job can be queued from within MATLAB and will run like any other job on the cluster. However, you first need to import the cluster profile for HoRUS into MATLAB.

Importing the cluster profile

The MDCS uses so-called cluster profiles in which the necessary information about the cluster is preconfigured. For that purpose, ZIMT provides the profile horus_m2019.settings which you can download here and then need to import into MATLAB as follows:

  1. In the MATLAB GUI, click on Parallel in the menu Home and select the item Manage Cluster Profiles.
  2. The Cluster Profile Manager will open. To import the downloaded settings file, click on Add -> Import.

Alternatively, you can import a cluster profile using only the MATLAB command line. The import is done with:

profile_master = parallel.importProfile('horus_m2019.settings');

And you can set the profile as default with:

parallel.defaultClusterProfile(profile_master);
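To check that the import worked, you can list the profiles known to MATLAB and inspect the default one. This is only a sketch; the exact output depends on your installation and is therefore not shown.

```matlab
% List all cluster profiles and the current default profile
[allProfiles, defaultProfile] = parallel.clusterProfiles;
disp(allProfiles);
fprintf('Default profile: %s\n', defaultProfile);

% Create a cluster object from the default profile and show its settings
c = parcluster(defaultProfile);
disp(c);
```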

Creating MATLAB scripts for MDCS computations

When creating MDCS scripts, two things need to be considered:

  1. At the beginning of the MATLAB script to be run, a so-called parallel pool, meaning a group of workers, needs to be started with the correct cluster profile. This can be done in two ways: either the pool is created explicitly as described here, or it is created automatically by MATLAB (using the default settings) when a parallel operation is called, see below. The command parpool starts a pool and provides the parallel functionality for the commands parfor and spmd. Use the imported profile (its name inside MATLAB is HorUS) to use the scheduler SLURM. You can also use parpool to process jobs and MATLAB commands via workers interactively. Here are two examples for creating a parallel pool:
    % Example 1
    poolobj = parpool('HorUS', 2, 'AttachedFiles',{'mod1.m', 'mod2.m'}) 
    Starting parallel pool (parpool) using the 'HorUS' profile ... 
    
    % Example 2
    poolobj = parpool('HorUS', 3) 
    Starting parallel pool (parpool) using the 'HorUS' profile ...

    End parallel processing (deleting the pool):

    delete(poolobj)
  2. In the actual MATLAB computation, parallel operations need to actually be called. This means at least one instance of at least one of the already mentioned commands parfor and spmd.
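Putting both points together, a complete MDCS script might look like the following sketch. It assumes that the HorUS profile has been imported as described above. The worker count of 4 and the computations are arbitrary illustrations, and the function names labindex and numlabs are those of the MATLAB releases installed on the cluster.

```matlab
% Start a pool of 4 workers through the imported cluster profile
poolobj = parpool('HorUS', 4);

% parfor: loop iterations are distributed over the workers
n = 200;
rowSums = zeros(1, n);
parfor k = 1:n
    rowSums(k) = sum((1:k).^2);
end

% spmd: every worker executes the same block on its own data
spmd
    localPart = labindex * ones(1, 3);
    fprintf('Worker %d of %d\n', labindex, numlabs);
end

% After spmd, localPart is a Composite; index it to fetch one worker's value
firstPart = localPart{1};

% End parallel processing
delete(poolobj);
```

Keep the requested number of workers within the 16 workers covered by the license.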

In order to run such a MATLAB script you need to execute it inside MATLAB like any other script. Either you start it with Editor -> Run from the MATLAB GUI or you run MATLAB in batch mode with your script. Note: you do not need to create a separate job script because the MDCS will queue the job for you.

Jobs queued in such a way are treated identically to all others. That means that you can monitor them using squeue, that you may need to wait until the job starts, and that you can stop it with scancel.

Updated at 17:54 on 12 August 2018 by Jan Philipp Stephan