Getting Started with NC State's Intel/IBM Linux Cluster at MCNC
- Sam System Configuration
There are approximately 1000 dual-Xeon nodes. Each node has two Intel Irwindale Xeon processors, 4 GB of memory, and a 40 GB disk. Some nodes are set aside for interactive code development. The compute nodes are managed by the LSF queuing system and must not be accessed except through LSF (accounts that access a compute node directly are subject to immediate termination). Compute nodes are interconnected with gigabit Ethernet.
Logins for the sam cluster are handled by the interactive nodes.
- Logging onto the cluster
SSH access is supported to the interactive nodes, which are reached using the hostname loginhpc.dcs.mcnc.org. Authentication uses Unity IDs and passwords.
Free SSH clients are available from various sources for both Windows and Unix/Linux systems.
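For example, from a Unix or Linux machine with the OpenSSH client, a login session can be opened as shown below (myuserid is a placeholder for your own Unity ID):

ssh myuserid@loginhpc.dcs.mcnc.org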
- File Systems
AFS files are not available from the cluster. Users have a home directory that is shared by all the sam cluster nodes. This is not the same home directory as is used on the BladeCenter Linux cluster (henry2). The /usr/local file system is also shared by all nodes. Each node currently has its own local /scratch file system that is available to all users, and there is a shared parallel scratch file system, /gpfs_share (this is also not the same file system available on the henry2 cluster). Currently no backups are being performed on any file system mounted on the sam cluster!
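Because nothing is backed up, results should be copied out of the scratch areas promptly. A minimal sketch, assuming a job wrote its output under /gpfs_share (the myuserid directory and file names are placeholders):

# copy results from the shared scratch area back to the home directory
cp /gpfs_share/myuserid/output /home/myuserid/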
- Compiling
There is currently one supported compiler suite, the Portland Group compilers, available for use on the sam cluster. While GNU compilers are installed, their use is strongly discouraged and is not supported.
- Portland Group Compilers
To use the 64-bit Portland Group compilers it is necessary to properly configure some environment variables and paths. For csh/tcsh shell users a shortcut is available through an alias which has been created; running

add pgi

will configure the environment to use the Portland Group compilers. Once these have been set, the Portland Group compilers may be invoked with the pgcc, pgcpp, pgf77, pgf90, and pghpf commands for the C, C++, Fortran 77, Fortran 90, and High Performance Fortran compilers respectively. Parallel programs compiled with the Portland Group compilers should be linked with the Portland Group MPICH libraries.
Having added the pgi environment with

add pgi

the following command line would compile an MPI Fortran 90 code with a high level of optimization:

pgf90 -o exec -fastsse -Mmpi exec.f
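For a serial code the MPI option is simply omitted; for example, a sketch compiling a C source file (myprog.c and myprog are placeholder names):

pgcc -o myprog -fastsse myprog.c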
- Running Jobs
Access to the compute nodes is managed by LSF.
All tasks for the compute nodes should be
submitted to LSF.
The following steps are used to submit jobs to LSF:
- Create a script file containing the commands to be executed for your job:
#BSUB -o standard_output
#BSUB -e standard_error
cp input /share/myuserid/input
cd /share/myuserid
./job.exe < input
cp output /home/myuserid
- Use the bsub command to submit the script to the batch system. In the following example two hours of run time are requested:

bsub -W 2:00 < script.csh
- The bjobs command can be used to monitor the progress of a job.
- The -e and -o options specify the files for standard error and standard output respectively. If these are not specified, the standard output and standard error will be sent by email to the account submitting the job.
- The bpeek command can be used to view standard output and standard error for a running job.
- The bkill command can be used to remove a job from LSF (regardless of current job status).
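For example, after submission these commands might be used as follows (the job ID 12345 is a placeholder; bjobs reports the actual ID):

bjobs
bpeek 12345
bkill 12345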
For parallel jobs it is necessary for LSF to interface with the mpirun command to pass host and process information. To enable the LSF/MPI interface a script mpirun.lsf has been provided in the LSF bin directory. The following batch script will run a parallel job; note that the number of MPI tasks will match the number of processors requested from LSF.

#BSUB -n 4
#BSUB -W 60
#BSUB -J job
#BSUB -o standard_output.%J
#BSUB -e standard_error.%J
#BSUB -a mpichp4
mpiexec ./parjob.exe
Alternatively, replace the "mpiexec" line with
mpirun.lsf /whateverthefullpathis/parjob.exe
The #BSUB lines in the script pass options to the LSF bsub command. The -n option specifies the number of processors, -W specifies the run limit in minutes, -J provides a meaningful name for the job, -o specifies a file to hold standard output, -e specifies a file to hold standard error output, and -a mpichp4 identifies to LSF that the job will use MPI and which type of MPI is being used. The script can be submitted to LSF for execution using the command:
bsub < script.csh
LSF writes some intermediate files in the user's home directory while the job is running. If the disk quota has been exceeded, then the batch job will fail, often without any meaningful error message.
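Before submitting, it can help to confirm that the home directory is under quota; a minimal check, assuming the standard Linux quota and du utilities are available on the interactive nodes (myuserid is a placeholder):

quota -s
du -sh /home/myuserid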
Last modified: April 19 2013 11:37:35.