HPC
This page is obsolete! Go to http://www.hpc.dtu.dk/
Compiling
MPI programs
How to compile MPI programs on the Sun machines?
To compile your MPI programs, you will have to do the following:
Include the MPI header file at the top of every subroutine that uses MPI, i.e. include 'mpif.h' for Fortran and 'mpi.h' for C/C++. The header defines the variables and constants used by the MPI system.
Compile and link with the following flags:
-I/opt/SUNWhpc/include -L/opt/SUNWhpc/lib -R/opt/SUNWhpc/lib -lmpi
These tell the compiler, linker and runtime environment where to look for include files, static libraries and runtime dynamic libraries. The option -lmpi links your program against the MPI library.
There is a set of commands that call the compilers with the necessary options. These are the compiler front-ends mpf90, mpcc, etc. (see the man pages for more information, e.g. man mpf90). Note that even if you use them, you still have to supply the linker with the proper -L, -R, and -l options. We therefore recommend adding the options "by hand" (put them in your Makefile).
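As a minimal sketch of keeping the options in one place, the fragment below collects the flags listed above in a variable and prints the resulting compile command. The file names hello_mpi.c and hello_mpi are hypothetical, and cc stands for whichever compiler you actually use; once the printed line looks right, run it instead of echoing it.

```shell
#!/bin/sh
# Sketch only: hello_mpi.c and hello_mpi are hypothetical file names.
MPI_HOME=/opt/SUNWhpc

# Include path, link path, runtime library path and the MPI library itself.
MPI_FLAGS="-I$MPI_HOME/include -L$MPI_HOME/lib -R$MPI_HOME/lib -lmpi"

# Build the command line; cc is a placeholder for your compiler.
COMPILE_CMD="cc -o hello_mpi hello_mpi.c $MPI_FLAGS"
echo "$COMPILE_CMD"
```

The same variable can be reused unchanged for the link step of a Makefile.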
Which compilers should I use?
To get the best results, you should use the Sun compilers installed on the system. All the tools for program development are installed under /opt/SSNN/SUNWspro (where NN reflects the version number of Sun Studio), i.e. compilers, libraries, header files, debuggers, etc. The path /opt/SSNN/SUNWspro/bin and the relevant environment variables are usually set in the system profiles.
If you want to switch to another version of Sun Studio, you can do this using Modules. Please do not change your PATH settings by hand; use the module command instead!
What about the GNU compilers?
There are several versions of the GNU C compiler accessible on the HPC machines. To activate a certain version of the GNU compilers, use the Modules package.
The GNU Fortran compiler is also available.
Is there a development GUI?
Yes, there is a GUI for program development: Sun ONE Studio 4. It has a lot of facilities, e.g. project management, debugger, performance analysis, etc. You can start it from the command prompt with runide.sh.
Note: The old workshop IDE is no longer available!
Runtime
MPI programs
Where is mpirun?
Unlike other MPI implementations, Sun has chosen to call the command to run MPI programs mprun instead of mpirun.
OpenMP programs
My OpenMP program fails for more than N threads!
If your program runs with a small number of threads, e.g. N threads, but suddenly fails for N+1 threads, try the following:
1. change the stacksize limit
- ksh/sh:
ulimit -s unlimited
- csh/tcsh:
limit stacksize unlimited
This removes the overall stacksize limit for the program.
2. set the STACKSIZE environment variable
- ksh/sh:
STACKSIZE=8192; export STACKSIZE
- csh/tcsh:
setenv STACKSIZE 8192
This assigns a stacksize of 8192 kbytes to every single thread.
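The two steps can be combined in a small wrapper script for sh/ksh. This is only a sketch: the program name my_openmp_program is hypothetical, and raising the limit to unlimited may be refused if the hard limit on your account is lower, which is why the script prints a warning instead of stopping.

```shell
#!/bin/sh
# Step 1: remove the overall stack size limit for this shell and its children.
# The shell may refuse this if the hard limit is lower; warn and continue.
ulimit -s unlimited 2>/dev/null || echo "warning: could not raise the stack limit" >&2

# Step 2: give every single thread a stack of 8192 kbytes.
STACKSIZE=8192
export STACKSIZE

# Start the program only if it exists (hypothetical name).
PROGRAM=./my_openmp_program
if [ -x "$PROGRAM" ]; then
    "$PROGRAM"
fi
```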
Where should I store temporary files?
Each host has a local scratch filesystem, /space, where you can store temporary files while your program executes. It is possible to access this filesystem from the other hosts as well. Example:
/space on host_a is /gbar/host_a/home1/space from all hosts (incl. host_a)
Please
- create a directory with your username under /space, to avoid mixing up your files with other users;
- clean up after your job has finished.
Note:
- No backup is taken of /space and the system administrators will remove old files from time to time to clean up.
- Do not use /tmp for temporary file storage. /tmp is a filesystem that is cleaned up at every reboot, so your files will not survive a system crash/reboot.
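The naming rules above can be sketched as two small sh functions. The function names are made up for this illustration, and the optional argument of make_scratch_dir (a scratch root other than /space) exists only so the function can be tried on a machine without /space.

```shell
#!/bin/sh
# Print the path under which /space of the given host is visible from all hosts.
remote_space_path() {
    echo "/gbar/$1/home1/space"
}

# Create (if needed) a directory named after the user below the scratch root
# and print its path. $1 defaults to /space.
make_scratch_dir() {
    root=${1:-/space}
    dir="$root/${USER:-$(id -un)}"
    mkdir -p "$dir" && echo "$dir"
}

remote_space_path host_a    # prints /gbar/host_a/home1/space
```

On a host with /space, calling make_scratch_dir without an argument creates /space/$USER.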
When should I use the /space filesystem?
If your program has a lot of input/output operations to files on disk, you should definitely use the /space filesystem. Unlike your home directory, /space is always local to the machine your program executes on, i.e. you avoid access to your files via NFS (Network File System = slower I/O performance).
A typical job script in such a case could look like:
[...standard GridEngine options...]
STARTDIR=$PWD
MYJOBDIR=/space/$USER/$JOB_NAME.$JOB_ID
mkdir -p $MYJOBDIR
cp program inputfiles $MYJOBDIR
cd $MYJOBDIR
./program options
mv outputfiles $STARTDIR && cd $STARTDIR && rm -rf $MYJOBDIR
The script above
- creates a directory MYJOBDIR under /space with a unique name, controlled by the jobname and the job id (requires that the script runs under GridEngine!),
- copies your executable program and the needed inputfiles to this directory,
- runs the program in that directory,
- moves your output back to the start directory and cleans up afterwards (omit the last rm command if you are not sure that the script really works).
GridEngine
Why does my job not start?
There can be several reasons why a job submitted to GridEngine does not start (status qw in the qstat output). You can get more diagnostic output from qstat by using the -j option:
qstat -j job_id
You will get a list of all job options plus a list of reasons why your job cannot be dispatched to the different queues.
How can I specify on which host/queue my job will run?
Use the -q option when submitting your job, e.g. qsub -q host.queue job_script or put the option in the job_script:
#$ -q host.queue
You can specify several queues as well:
#$ -q host_a.queue,host_b.queue
Warning: Specifying queues in this way is a hard request, so it reduces the chances that your job gets started. To weaken the condition, e.g. if you would prefer your job to run on host A but mainly want it to finish as soon as possible, use the -soft option:
#$ -soft -q host.queue
will dispatch the job to host.queue if possible; otherwise the job is dispatched to the next free job slot available.
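A sketch of a job script header that combines both kinds of request: the job may only run in one of two queues (hard), and among those host_a.queue is preferred (soft). The queue names are the placeholders used above; the job name queue_demo and the options -N and -cwd are additions for this illustration and not part of the text above.

```shell
#!/bin/sh
# GridEngine reads the lines starting with "#$"; the shell treats them as comments.
#$ -N queue_demo
#$ -cwd
# Hard request: the job must run in one of these queues.
#$ -hard -q host_a.queue,host_b.queue
# Soft request: prefer host_a.queue if a slot is free there.
#$ -soft -q host_a.queue

# Record where the job actually ended up.
JOB_HOST=$(uname -n)
echo "job running on $JOB_HOST"
```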
Why should I use $SGE_ROOT/bin/mprun when submitting MPI jobs?
To make it possible to control MPI jobs from GridEngine, you have to use a different mprun command to start your jobs. The command $SGE_ROOT/bin/mprun is a wrapper around the standard mprun command that sets up the environment needed to run MPI jobs successfully under GridEngine.
If you use the standard setup provided by the ~/.grouprc file, your $PATH is set up in the right way, i.e. $SGE_ROOT/bin/mprun is found before /opt/SUNWhpc/bin/mprun.
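If you have changed your own setup, a quick check like the following sketch shows which mprun your shell finds first. The function name is_sge_mprun is made up for this illustration; the check assumes that SGE_ROOT is set in your environment.

```shell
#!/bin/sh
# Succeeds if the given path is the GridEngine wrapper $SGE_ROOT/bin/mprun.
is_sge_mprun() {
    [ -n "$SGE_ROOT" ] && [ "$1" = "$SGE_ROOT/bin/mprun" ]
}

# First mprun found in PATH (empty if there is none).
found=$(command -v mprun 2>/dev/null || true)

if is_sge_mprun "$found"; then
    echo "OK: using the GridEngine wrapper $found"
else
    echo "WARNING: first mprun in PATH is '${found:-none}'"
fi
```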
Why does qmon not work for me (Bus error)?
On some X servers (especially XFree86), the 64-bit executable of qmon does not work. If you get a 'Bus error' message when starting qmon, try qmon32 instead (the 32-bit version of qmon).

