HOWTO use AmpTools on the JLab farm with MPI
Load MPI module
On the ifarm you can load the MPI module with
module load mpi/openmpi-4.0.1
This provides the binaries used to compile (mpicxx) and run (mpirun) MPI programs. You can check that both are found with
which mpicxx
which mpirun
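The same check can be scripted so that a setup script fails early when the module has not been loaded. This is only a minimal sketch: the function name check_tools is made up for this example and is not part of AmpTools or the farm software.

```shell
# Report which of the given commands are on PATH.
# Returns non-zero if at least one of them is missing.
# (check_tools is a hypothetical helper name.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if loc=$(command -v "$tool"); then
            echo "found $tool: $loc"
        else
            echo "missing $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# The two MPI wrappers this HOWTO relies on.
check_tools mpicxx mpirun || echo "MPI tools not found: load the MPI module first"
```

Because the function returns a non-zero status when a tool is missing, it can also guard a build step, e.g. `check_tools mpicxx && make MPI=1`.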
AmpTools Compilation with MPI
This example was done in csh on ifarm1901
1) Download latest AmpTools release
git clone email@example.com:mashephe/AmpTools.git
2) Set AMPTOOLS directory
setenv AMPTOOLS_HOME $PWD/AmpTools/
setenv AMPTOOLS $AMPTOOLS_HOME/AmpTools/
3) Put root-config in your path (assumes ROOTSYS set by some other setup script)
setenv PATH $ROOTSYS/bin:$PATH
4) Build main AmpTools library with MPI support (temporary branch to support openmpi version 4 on ifarm)
cd $AMPTOOLS/AmpTools
make MPI=1
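Steps 2 and 3 use csh syntax (setenv). For bash or another POSIX shell, the equivalent settings might look like the sketch below. It assumes you are still in the directory where the clone was made and, as in step 3, that ROOTSYS may or may not have been set by another setup script.

```shell
# bash/sh equivalents of the csh "setenv" lines in steps 2 and 3.
# Assumes the current directory is the one containing the AmpTools clone.
export AMPTOOLS_HOME="$PWD/AmpTools"
export AMPTOOLS="$AMPTOOLS_HOME/AmpTools"

# Put root-config in the path only if ROOTSYS is actually set.
if [ -n "${ROOTSYS:-}" ]; then
    export PATH="$ROOTSYS/bin:$PATH"
else
    echo "ROOTSYS is not set: source your ROOT setup script first" >&2
fi
```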
Fitter Compilation with MPI
The only MPI-dependent part of halld_sim is fitMPI.cc, an optional build for MPI fits that is analogous to the usual fit.cc without MPI. You can build the fitMPI executable with the following commands (requires a git pull of the halld_sim master branch after 1/13/22)
cd $HALLD_SIM_HOME/src/programs/AmplitudeAnalysis/fitMPI/
scons -u install
Performing Fits Interactively
The fitMPI executable is run with mpirun
mpirun -np N fitMPI -c YOURCONFIG.cfg
where N is the number of parallel processes to use in the fit and YOURCONFIG.cfg is your usual config file. Note: additional command line parameters can be used as well, as needed.
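A typo in N or in the config file name is easy to catch before any processes are launched. The sketch below only prints the command line after two basic checks and does not start the fit. The helper name fitmpi_cmd is hypothetical, and -np is Open MPI's standard option for the number of processes.

```shell
# Print (do not run) the mpirun command line after basic sanity checks.
# $1 = number of processes, $2 = config file
# (fitmpi_cmd is a hypothetical helper name.)
fitmpi_cmd() {
    case "$1" in
        ''|*[!0-9]*)
            echo "N must be a positive integer, got: '$1'" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1 ]; then
        echo "N must be at least 1" >&2
        return 1
    fi
    if [ ! -r "$2" ]; then
        echo "cannot read config file: $2" >&2
        return 1
    fi
    echo "mpirun -np $1 fitMPI -c $2"
}

# Example call with the placeholder file name used in this HOWTO.
fitmpi_cmd 4 YOURCONFIG.cfg || echo "not launching"
```

Once the printed line looks right, it can be pasted into the terminal or into a submit script.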
Submitting Batch Jobs
To submit an MPI-enabled job to the JLab farm, we can use the slurm scheduler directly. One way to do this is with a script that contains slurm directives telling the scheduler how to run your job.
Example for a slurm submit script:
#!/bin/bash -l
#SBATCH -A halld
#SBATCH -p ifarm
#SBATCH -t 1-00:00:00
#SBATCH -J FITNAME
#SBATCH --mail-type=begin        # send email when job begins
#SBATCH --mail-type=end          # send email when job ends
#SBATCH --mail-type=fail         # send email if job fails
#SBATCH --mail-user=USER@jlab.org
#SBATCH --ntasks=100
/usr/local/openmpi/openmpi-4.0.1/bin/mpirun --mca btl_openib_allow_ib 1 fitMPI -c YOURCONFIG.cfg -s YOURCONFIG_params.dat -m 50000 >& YOURCONFIG.log
You can copy and paste the above lines into a text file called, for example, submit.sh. Remember to replace FITNAME, USER and YOURCONFIG with your own values.
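If you submit many fits, the three placeholders can be filled in with sed instead of by hand. The following is only a sketch: the function name fill_placeholders and the example values (pipi_fit, jdoe, fit_pipi) are hypothetical, and the values must not contain the characters / or &, which are special to sed.

```shell
# Replace the FITNAME, USER and YOURCONFIG placeholders in a submit script.
# Reads the template on stdin and writes the filled-in script to stdout.
# $1 = job name, $2 = JLab user name, $3 = config base name (without .cfg)
# (fill_placeholders is a hypothetical helper name.)
fill_placeholders() {
    # "USER@" is matched rather than "USER" so that only the mail
    # address is touched.
    sed -e "s/FITNAME/$1/g" \
        -e "s/USER@/$2@/g" \
        -e "s/YOURCONFIG/$3/g"
}

# Demonstration on three lines taken from the example script.
printf '%s\n' \
    '#SBATCH -J FITNAME' \
    '#SBATCH --mail-user=USER@jlab.org' \
    'fitMPI -c YOURCONFIG.cfg' |
    fill_placeholders pipi_fit jdoe fit_pipi
```

With the template saved under a name of your choice, for example submit.template, a call such as `fill_placeholders pipi_fit jdoe fit_pipi < submit.template > submit.sh` would write the ready-to-submit script.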
This script is able to send an email to your jlab email address when the job starts, fails and ends. If you do not want to make use of this option, remove the corresponding lines from the submit script.
In this particular example, 100 processes are started. These will be spawned by the slurm scheduler on arbitrary nodes as they are available on the farm.
Furthermore, it submits the job to the slurm partition called ifarm. If you want to submit to a different partition, modify the corresponding option in the script.
The last line of the script is the actual command that is executed. It requires that the fitMPI executable has been compiled and can be run in your environment. Some options are passed to the fitMPI program: -s saves the final parameters in a text file, and -m explicitly sets the maximum number of Minuit calls (50000 here). Remove or modify these options as necessary.
When you have adjusted the options to your needs, submit the job to the batch system using the command
sbatch submit.sh
After submission, you can use standard slurm commands to check and control your fit. To check the status of all your jobs:
squeue -u USER
In case something goes wrong, you can terminate all your jobs with
scancel -u USER
or a specific job with