MPI across nodes#

When the number of cores on a single node is not enough, you can use MPI to harness cores on multiple nodes.

I recommend you load:
- /compiler/openmpi-4.1.3

The hostfile provided by the scheduler is exposed through the $PBS_NODEFILE environment variable; see "The host file" below for how it is turned into the hostfile that mpirun uses.

The rough structure of your job script should then look as follows:

#! /bin/bash
#PBS -N yourMPIJob
#PBS -o out.log
#PBS -e err.log
#PBS -l nodes=2:ppn=104
#PBS -q cpu

cd $PBS_O_WORKDIR
module load compiler/openmpi-4.1.3
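# $PBS_NODEFILE lists one hostname per allocated core; collapse it to one entry per node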
cat $PBS_NODEFILE | uniq > host

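# Launch 100 ranks in total, 50 per node, communicating over TCP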
mpirun --mca btl tcp,self -npernode 50 -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH --hostfile ./host -np 100 {YOUR_PROGRAM}
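
Assuming you save the script as, say, mpi_job.sh (the filename here is only an example), submit it to the scheduler with qsub:

qsub mpi_job.sh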

The host file#

This is arguably the single most important part.
When your job starts, $PBS_NODEFILE is set automatically by PBS to the path of a file listing the hostnames of the nodes allocated to you, with one line per allocated core.
The script reads this file, keeps only the unique hostnames, and writes a host file in your working directory for mpirun to use.
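
For illustration, assuming two hypothetical nodes named node01 and node02 allocated with ppn=104, the two files would look roughly like this:

$ cat $PBS_NODEFILE    # one line per allocated core, so each hostname repeats 104 times
node01
node01
...
node02
$ cat host             # after uniq: one line per node
node01
node02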

Setting MCA Parameters#

For additional details, see the Open MPI documentation on MCA parameters.
If you don't understand what this is doing, leave it at --mca btl tcp,self.

For example:

mpirun --mca btl tcp,self -np 1 foo : tells Open MPI to use the "tcp" and "self" BTLs, and to run a single copy of "foo" on an allocated node.
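
MCA parameters can also be set through environment variables of the form OMPI_MCA_<parameter_name>, which is useful if you would rather not edit the mpirun line. A minimal sketch, equivalent to the flag above:

export OMPI_MCA_btl=tcp,self
mpirun -np 1 foo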

The npernode Parameter#

-npernode, --npernode <#pernode> : On each node, launch this many processes. (deprecated in favor of --map-by ppr:n:node)
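
On recent Open MPI releases the same layout can be expressed with --map-by, where ppr stands for "processes per resource". A sketch of the launch line from the job script above, rewritten that way:

mpirun --mca btl tcp,self --map-by ppr:50:node -x LD_LIBRARY_PATH=$LD_LIBRARY_PATH --hostfile ./host -np 100 {YOUR_PROGRAM}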

Caveat#

Please do not request more resources from the scheduler than your program needs.