Running Jobs
Access to the compute nodes is managed by LSF.
All tasks for the compute nodes should be
submitted to LSF.
The following steps are used to submit jobs
to LSF:
For parallel jobs, LSF must interface with the mpirun command to
pass host and process information. To enable the LSF/MPI interface,
a script, mpirun.lsf, is provided in the LSF bin directory. The
following batch script runs a parallel job. Note that the number of
MPI tasks will match the number of processors requested from LSF.
#!/bin/csh
#BSUB -n 4
#BSUB -W 60
#BSUB -J job
#BSUB -o standard_output.%J
#BSUB -e standard_error.%J
#BSUB -a mpichp4
mpiexec ./parjob.exe
Alternatively, replace the "mpiexec" line with the following, giving
the full path to the executable:
mpirun.lsf /whateverthefullpathis/parjob.exe
The #BSUB lines in the script pass options to the LSF bsub
command. The -n option specifies the number of processors,
-W specifies the run limit in minutes, -J provides
a meaningful name for the job, -o specifies a file to hold
standard output, -e specifies a file to hold standard error
output, and -a mpichp4 tells LSF that the job uses MPI and
which type of MPI is being used.
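When several jobs differ only in task count, run limit, or name, it can be convenient to generate the batch script rather than edit it by hand. The following is a minimal sketch, not part of LSF: the function name make_lsf_script and its argument order are hypothetical, and it is written for sh rather than csh. It only prints a script using the same #BSUB options described above.

```shell
# Hypothetical helper (not an LSF command): print a batch script that uses
# the #BSUB options described above.
# Arguments: MPI task count, run limit in minutes, job name, executable.
make_lsf_script() {
    ntasks=$1
    minutes=$2
    jobname=$3
    exe=$4
    cat <<EOF
#BSUB -n $ntasks
#BSUB -W $minutes
#BSUB -J $jobname
#BSUB -o standard_output.%J
#BSUB -e standard_error.%J
#BSUB -a mpichp4
mpiexec $exe
EOF
}

# Example use (writes the script to a file for later submission):
#   make_lsf_script 8 120 myjob ./parjob.exe > script.csh
```

The %J in the output and error file names is left untouched by the helper, so LSF still substitutes the job ID when the job runs.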
The script can be submitted to LSF for execution using the command:
bsub < script.csh
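On submission, bsub normally prints a confirmation line of the form "Job <1234> is submitted to queue <normal>." The sketch below shows one way to capture the job ID from that line so it can be passed to bjobs to check the job's status. The function name parse_job_id is hypothetical, and the exact wording of the confirmation line is assumed to match the usual LSF format; adjust the pattern if your installation prints something different.

```shell
# Hypothetical helper: read bsub's confirmation line on standard input and
# print only the numeric job ID. Prints nothing if no such line is found.
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Example use on a system where LSF is installed:
#   jobid=$(bsub < script.csh | parse_job_id)
#   bjobs "$jobid"
```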
LSF writes some intermediate files in the user's home
directory while the job is running. If the disk quota
has been exceeded, then the batch job will fail, often
without any meaningful error message.
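Because such failures are hard to diagnose after the fact, a quick check before submitting may help. The sketch below simply tries to write a small file in a directory and reports whether that worked. The function name can_write_dir and the probe file name are hypothetical, and a successful write does not guarantee that enough quota remains for the whole job; on some network file systems a quota error may also be reported late.

```shell
# Hypothetical helper: return 0 if a small file can be created and written
# in the given directory, 1 otherwise. The probe file is always removed.
can_write_dir() {
    dir=$1
    probe="$dir/.lsf_write_probe.$$"
    if { echo "probe" > "$probe"; } 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    rm -f "$probe" 2>/dev/null
    return 1
}

# Example use before submitting a job:
#   can_write_dir "$HOME" || echo "Cannot write to $HOME; check disk quota"
```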