High Performance and Grid Computing
   

 Getting Started with the IBM Blade Center Linux Cluster at NC State ...
  • Login nodes are mainly for compiling code, copying and editing files, and submitting jobs to LSF to run as batch.

    The login nodes should not be used for interactive jobs that take any significant fraction of system resources. The usual way to run CPU-intensive codes is to submit them as batch jobs to LSF, which schedules them for execution on computational nodes. Example LSF job submission files can be found in the Intel Compilers section below.

    Nevertheless, it is often necessary to use interactive, GUI-based serial pre- and post-processors for data resident in the HPC environment.

    Interactive computing in the HPC environment should be performed by requesting a VCL HPC service. To request a VCL node with the HPC environment, go to the web page http://vcl.ncsu.edu.

    Click on "Make a VCL Reservation"

    From the list of environments, select "HPC(Redhat Linux)"

    When a node is available, you will receive a message detailing how to log in. You can have exclusive use of the node for four hours; the reservation can be extended by a few hours if the system is not busy. If you have problems getting an HPC VCL node, e-mail gary_howell@ncsu.edu and ask to be added to the list of eligible users.

    • Henry2 System Configuration

      There are currently 131 2-way nodes running at 2.8 GHz or 3.0 GHz. Each node has two Xeon processors, four GB of memory, and a 40 GB disk. There are an additional two nodes available for code development and debugging. The blade center nodes are managed by the LSF queuing system and can be accessed only through LSF. Logins for the cluster are handled by the login nodes (login.hpc.ncsu.edu).
      Additional information on the university Linux cluster configuration is available in http://hpc.ncsu.edu/Documents/hpc_cluster_config.pdf

    • Logging onto the cluster

      SSH access is supported to the login nodes (login.hpc.ncsu.edu). Logins are enabled using Unity IDs and passwords.
        Free SSH clients are available from various sources. Links to some commonly used versions are included here:
      • Windows
      • Unix, Linux
    • File Systems

      AFS files are not available from the cluster. Users have a home directory that is shared by all the cluster nodes. The /usr/local file system is also shared by all nodes. Each node currently has its own /scratch file system that is available to all users. Two shared scratch file systems, /share and /share3, are also available on each node. Additionally, the HPC storage system (/ncsu/volume1 and /ncsu/volume2) is available from the login nodes for storage in excess of what can be accommodated in /home. These file systems are also available from the IBM POWER5 system.

      User files in /home, /ncsu/volume1, and /ncsu/volume2 are backed up daily. A single backup version is maintained for each file. User files in all other file systems are not backed up.

      Important files should never be placed on storage that is not backed up unless another copy of the file exists in another location.

      HPC projects are allocated 100GB of storage in one of the HPC storage systems (volume1 or volume2). Additional backed up space in these file systems can be purchased or leased. Additional information about storage on HPC resources is available from http://hpc.ncsu.edu/Documents/GettingStartedstorage.php
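
The backup policy described above can be summarized as a small lookup. The following is a minimal sketch in POSIX sh (not the csh used for the job scripts on this page). The function name is_backed_up is hypothetical, and the path prefixes are simply the ones listed in the text; it inspects only the path string and does not query the file system.

```shell
#!/bin/sh
# is_backed_up PATH
# Prints "yes" if PATH lies under a file system that this page lists as
# backed up daily (/home, /ncsu/volume1, /ncsu/volume2), otherwise "no".
# Hypothetical helper for illustration; it only looks at the path prefix.
is_backed_up() {
    case "$1" in
        /home|/home/*|/ncsu/volume1|/ncsu/volume1/*|/ncsu/volume2|/ncsu/volume2/*)
            echo yes ;;
        *)
            echo no ;;
    esac
}

is_backed_up /home/foouser/results.dat
is_backed_up /share/foouser/ring.out.123456
```

A check like this is a reminder that anything written under /scratch, /share, or /share3 needs a second copy elsewhere if it matters.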

    • Compiling

      There are three compiler flavors available on the cluster: 1) the standard GNU compilers supplied with Linux, 2) the Intel compilers, and 3) the Portland Group compilers.

      The default GNU compilers are good for compiling utility programs, but are not as appropriate for computationally intensive applications. Overall the best performance has been observed using the Intel compilers. Moreover, good debuggers and profilers are available with the Intel compilers.

      See A note on compiling executables with large (> ~1 GB) memory requirements

      See Serial Compilers on the Blade Center for some information on how to compile serial codes on the blade center. This can be useful if you want code to run on some other Linux box. Long serial jobs to run on the Blade Center should be submitted to the LSF queue. (Running computationally intensive jobs on the head node can lock it up, forcing a reboot and costing other users their current work, so such jobs are killed when found.)

      • GNU Compilers
        The GNU compilers are available in the default path and are invoked with the cc and f77 commands for the C/C++ and Fortran77 compilers respectively. For parallel codes the MPICH library compiled with the GNU compilers is available in /usr/local/gnu/mpich-rhel3/mpich-1.2.6-3.2.3/lib. To set environment variables to use the GNU compilers, type
        add gnu
        

        The following commands compile and link a simple parallel program

        mpif77 -c   ring.f
        g77 -o  rring ring.o -L/usr/local/gnu/mpich-rhel3/mpich-1.2.6-3.2.3/32/lib \
         -lfmpich  -lmpich -L/usr/local/gnu/gcc-lib/i386-redhat-linux/3.2.3 -lg2c
        

        If the file named brring contains

        #! /bin/csh
        #BSUB -W 10
        #BSUB -n 4
        #BSUB -o /share/foouser/ring.out.%J
        #BSUB -e /share/foouser/ring.err.%J
        #BSUB -J ring
        mpiexec  ./rring
        
        Then executing the command (from the same window from which /usr/local/gnu/mpich-rhel3/gnu-rhel3.csh was sourced)
        bsub < brring
        
        (having changed foouser to your own user name) submits the code for execution. The -W 10 line sets a job time limit of ten minutes, and -n 4 asks for 4 processors. Since only a few processors and a short time are requested, the job will be submitted to the debug queue and hence return quickly. The standard output goes to the file /share/foouser/ring.out.xxxxxx, where xxxxxx is the LSF job ID. Similarly, because of the -e flag, the standard error goes to /share/foouser/ring.err.xxxxxx.
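
To make the %J substitution concrete, here is a minimal sketch in POSIX sh that mimics it. LSF performs this substitution itself; the helper name expand_job_id and the job ID 123456 are made up for illustration.

```shell
#!/bin/sh
# expand_job_id PATTERN JOBID
# Mimics how the job ID replaces %J in the file names given to the
# -o and -e options. Illustration only: LSF does this on its own.
expand_job_id() {
    printf '%s\n' "$1" | sed "s/%J/$2/g"
}

# 123456 is a made-up job ID.
expand_job_id /share/foouser/ring.out.%J 123456
expand_job_id /share/foouser/ring.err.%J 123456
```

Because the job ID is part of the file name, repeated runs of the same script do not overwrite each other's output.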

        After job submission, a user can track the job progress by entering

        bhist
        
        or
        bhist -l
        
        and kill the job by entering
        bkill xxxxxx
        
        where xxxxxx is the LSF job ID reported by bhist. If the job has started running, standard output and error can be accessed by
        bpeek
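
Since bkill takes a job ID, it can be convenient to capture the ID at submission time. The following is a minimal sketch in POSIX sh, assuming that bsub prints a confirmation of the form "Job <123456> is submitted to queue <debug>." The helper name job_id_from_bsub and the sample message are illustrative; check the actual message on your system before relying on it.

```shell
#!/bin/sh
# job_id_from_bsub: read a bsub confirmation message on stdin and print
# the numeric job ID found between the first pair of angle brackets.
# Prints nothing if the line does not match the assumed format.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample message (assumed format), so the sketch runs without LSF.
echo 'Job <123456> is submitted to queue <debug>.' | job_id_from_bsub
```

On the cluster the same function could be fed from a real submission, for example by piping the output of bsub < brring into job_id_from_bsub and passing the result to bkill.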
        

        Parallel programmers are strongly encouraged to use the Intel or Portland Group compilers to generate more efficient code. Those constructing code for others to use should consider that code compiled with the Intel compilers is likely to be portable to other platforms.

      • Intel Compilers
        To use the Intel compilers it is necessary to properly configure some environment variables and paths. This is easily accomplished by sourcing /usr/local/apps/env/intel.csh.

        Once this file has been sourced, the Intel compilers with links to the MPICH libraries may be invoked with the mpif77, mpif90, mpicc and mpiCC commands for the Fortran77/90 and C/C++ compilers respectively.

        As a convenience an alias - add - has been created for csh/tcsh users to set up the environment for various software packages. To use the Intel compilers the command

        add intel
        
        will set the necessary environment variables.

        Parallel programs compiled with the Intel compilers should be linked with the MPICH libraries located under /usr/local/intel/mpich.

        The following command line would compile a Fortran MPI code with a high level of optimization:

        mpif90 -o rring -O3 -tpp7 -xW -static ring2.f 
        
        At this time (May 2005), Intel compiled codes require the -static flag (which specifies use of static .a libraries) for successful execution. Similar scripts are available for C (mpicc), C++ (mpiCC), and Fortran77 (mpif77).

        If the file named brring contains

        #! /bin/csh
        #BSUB -W 10
        #BSUB -n 4
        #BSUB -o /share/foouser/ring.out.%J
        #BSUB -e /share/foouser/ring.err.%J
        #BSUB -J ring
        mpiexec  ./rring
        
        Then executing the command (from the same window from which /usr/local/apps/env/intel.csh was sourced)
        bsub < brring
        
        (having changed foouser to your own user name) submits the code for execution. The discussion above under the GNU compilers explains what the #BSUB flags mean.
      • Portland Group Compilers
        To use the Portland Group compilers it is necessary to properly configure some environment variables and paths.

        As a shortcut, csh/tcsh users can use the add alias. The command

        add pgi
        
        configures the environment to use the Portland Group compilers. The same job submission script as in the GNU and Intel examples also works for code compiled with the Portland Group compilers.

        It is not recommended to use the Intel and Portland Group compilers during the same login session.

        Once these have been set the Portland Group compilers may be invoked with the pgcc, pgCC, pgf77, pgf90, and pghpf commands for the C, C++, Fortran77, Fortran90, and High Performance Fortran compilers respectively.

        Parallel programs compiled with the Portland Group compilers should be linked with the MPICH libraries.

        Having added the pgi environment by
        add pgi
        
        the following command line would compile an MPI Fortran 90 code with a high level of optimization:
        mpif90 -o rring -fastsse ring2.f 
        

    • Running Jobs

      The login nodes are shared by all users. Running computationally intensive jobs on the login nodes can cause them to stall and need to be rebooted. Moral: don't hog the login nodes. If you do need to use GUI-based applications extensively, for example to set up your batch jobs or analyze data resulting from runs of batch jobs, one good way is to use the VCL facility, selecting an HPC image. Users should also refrain from running more than one sftp or scp session at a time.

      Running computationally intensive jobs on the blade center (anything other than a compilation that requires more than a minute or so to run) is accomplished by using LSF to submit batch tasks to the compute nodes.

      All tasks for the compute nodes should be submitted to LSF. The following steps are used to submit jobs to LSF:

      • Create a script file containing the commands to be executed for your job:
        #BSUB -o standard_output
        #BSUB -e standard_error
        cp input /share/myuserid/input
        cd /share/myuserid
        ./job.exe < input
        cp output /home/myuserid
        
        
        Another example script was given above.

      • Use the bsub command to submit the script to the batch system. In the following example two hours of run time are requested:
        bsub -W 2:00 < script.csh
        
      • The bjobs command can be used to monitor the progress of a job
      • The -e and -o options specify the files for standard error and standard output respectively. If these are not specified the standard output and standard error will be sent by email to the account submitting the job.
      • The bpeek command can be used to view standard output and standard error for a running job.
      • The bkill command can be used to remove a job from LSF (regardless of current job status).
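
The script-creation step above can also be automated. The following is a minimal sketch in POSIX sh that prints the example batch script with a user ID filled in. The function name write_job_script is hypothetical; job.exe, input, and output are the placeholder names from the example; and the "#! /bin/csh" first line is added here to match the other scripts on this page.

```shell
#!/bin/sh
# write_job_script USERID
# Prints a csh batch script like the example above, with USERID in place
# of myuserid. job.exe, input and output are placeholders from the
# example; replace them with your own executable and files.
write_job_script() {
    cat <<EOF
#! /bin/csh
#BSUB -o standard_output
#BSUB -e standard_error
cp input /share/$1/input
cd /share/$1
./job.exe < input
cp output /home/$1
EOF
}

# foouser is a stand-in user name.
write_job_script foouser
```

Redirecting the output to a file such as script.csh gives a script that can then be submitted with bsub as described in the steps above.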

      For parallel jobs it is necessary for LSF to interface with the mpirun command to pass host information. To simplify this process an interface script, mpiexec, has been provided in the LSF bin directory. The following batch script will run a parallel job. Note that the number of tasks will match the number of processors requested from LSF. The path set when bsub is invoked must include the appropriate mpirun command.

      #! /bin/csh
      #BSUB -o standard_output
      #BSUB -e standard_error
      mpiexec ./parjob.exe
      

      To submit a parallel job use the -n option to the bsub command to specify the number of processors to be used.

      There are a number of queues currently configured. In general the best queue will be selected automatically without the user specifying a queue to the bsub command. In some cases LSF may override user queue choices and assign jobs to a more appropriate queue.

      There is a queue that will schedule jobs on any of the blades and accepts jobs using up to 64 processors. The serial job queue will schedule jobs only on selected blades. The single_chassis queue will schedule jobs only on blades that are located within the same chassis. Each chassis holds 14 blades so jobs accepted by the single_chassis queue are limited to a maximum of 28 processors.
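
The single_chassis limit follows from the hardware figures: 14 blades per chassis times 2 processors per blade gives 28 processors. The following is a minimal sketch of that arithmetic in POSIX sh. The function names are hypothetical, and it assumes that a job uses both processors on each blade it occupies, which the text does not state.

```shell
#!/bin/sh
# Figures taken from the text: 2 processors per blade, 14 blades per chassis.
PROCS_PER_BLADE=2
BLADES_PER_CHASSIS=14

# blades_needed NPROCS: number of blades required, rounding up
# (assumes both processors on each blade are used).
blades_needed() {
    echo $(( ($1 + PROCS_PER_BLADE - 1) / PROCS_PER_BLADE ))
}

# fits_single_chassis NPROCS: prints "yes" if the request is within the
# per-chassis processor limit (14 * 2 = 28), otherwise "no".
fits_single_chassis() {
    if [ "$1" -le $(( PROCS_PER_BLADE * BLADES_PER_CHASSIS )) ]; then
        echo yes
    else
        echo no
    fi
}

blades_needed 7
fits_single_chassis 28
fits_single_chassis 32
```

A request for 32 processors therefore cannot fit in one chassis and would need a queue that schedules across blades in different chassis.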

      A note on LSF job scheduling

      LSF writes some intermediate files in the user's home directory while the job is running. If the disk quota has been exceeded, then the batch job will fail, often without any meaningful error message.

Last modified: August 01 2006 15:40:18.
Copyright © 2003-2007 by NC State University and others, All Rights Reserved.

Site contact: Eric Sills, E-mail: eric_sills at ncsu dot edu , Tel: 919-513-0324, Fax: 919-513-1893, HPC and Grid Operations, Information Technology Division, Box 7109, North Carolina State University, Raleigh, NC27695-7914, USA