High Performance Computing

Introduction to Parallel Computing and the NCSU Linux Cluster

  • To be on the HPC mailing list

    Send mail to
    with the one line
    subscribe hpc
    in the body.
  • Spring 2016

    This free short course meets for 4 hours in total, in two 2-hour sessions on the Thursday and Friday of spring break: Thursday, March 10, and Friday, March 11, from 9:30 to 11:30 AM each day. Both sessions will be held in ITTC C in the D.H. Hill Library. Parking permits are not required during break. ITTC C is a bit tricky to find, so ask at the front desk.

    If you e-mail me (Gary Howell, gary_howell@ncsu.edu) in advance I can be sure there's enough space for you.

    Class notes can be downloaded at Intro to MPI. Sample codes can be downloaded from sample.tar.gz. See also Scientific Computing.

    Graduate students, postdocs, faculty and staff who are likely to use parallel computation in research projects or theses are particularly invited. Students who do not already have a Blade Center account are encouraged to ask their advisors to request one before class starts, so that they have a permanent account. Faculty can request accounts for themselves and for their students online at http://www.ncsu.edu/itd/hpc/About/Contact.php

    The NC State Linux cluster is an IBM Blade Center with around ten thousand cores available for high performance computing. This short course introduces the use of the machines, starting with how to log on and submit jobs.

    One focus is how to compile programs that call MPI (Message Passing Interface), the standard library for message-passing parallel computation, and how to link them against it. Calls to MPI are embedded in Fortran, C, or C++ codes, enabling many processors to work together.
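
    As a rough illustration of what "calls to MPI embedded in C" look like, here is a minimal sketch and not one of the course's sample codes. The file name hello.c is hypothetical. With many MPI installations it can be built with a wrapper compiler such as mpicc (for example, mpicc hello.c -o hello) and started with a launcher such as mpirun, but the wrapper names, the launcher, and the batch submission procedure on the cluster are assumptions here; the class notes are the authority for the local setup.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* id of this process: 0..size-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */

    /* Every process runs the same program; the rank tells them apart. */
    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                       /* shut the MPI runtime down */
    return 0;
}
```

    Each process prints one line, and the order of the lines is not guaranteed, since the processes run independently.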

    Session 1. How to log into the HPC machines and submit jobs. Why to use parallel computation. Some simple MPI commands and example programs. The second half of the session will be spent getting an example code to run. A version of the lab is available as Lab 1.

    Session 2. MPI collective communications. These can be simple and efficient. Considerations in efficient parallel computation. Running some more codes. The lab is Lab 2.

    Some additional materials online show how to use OpenMP to speed up computations on multi-core computers. OpenMP parallelization is often fairly straightforward. See OpenMP, OpenMP2, and OpenMP3.
    On the Blade Center, most blades have two motherboards. Each part of the RAM is more easily accessible from one motherboard than from the other (NUMA, or Non-Uniform Memory Access). For OpenMP to scale to use both motherboards effectively, some more advanced tricks are needed. See for example the tutorial from HPC2012 by Georg Hager, Gerhard Wellein, and Jan Treibig, University of Erlangen-Nuremberg, Germany.
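
    A minimal sketch of both points, assuming a compiler with OpenMP support (for example, gcc with the -fopenmp flag): the dot product shows how little code a simple OpenMP parallelization needs, and the allocation routine shows one commonly used NUMA technique, "first touch" initialization. It relies on the operating system placing each memory page near the thread that first writes to it, which is the usual default on Linux, and on threads staying put on their cores. The function names alloc_first_touch and dot are hypothetical, and this is not necessarily the approach taken in the linked materials.

```c
#include <stdlib.h>

/* Allocate n doubles and fill them with `value`. The fill loop is
   parallel with a static schedule so that, under a first-touch page
   placement policy, each thread's share of the array is placed in
   memory close to that thread. Returns NULL if allocation fails. */
double *alloc_first_touch(size_t n, double value)
{
    double *x = malloc(n * sizeof *x);
    if (x == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        x[i] = value;
    return x;
}

/* Dot product using the same static schedule, so each thread mostly
   reads the pages it initialized. The reduction clause gives every
   thread a private partial sum and adds them together at the end. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

    Without OpenMP enabled at compile time, the pragmas are ignored and the same code runs serially, which is convenient for checking results. The design point is that initialization and computation use the same loop schedule; initializing the arrays in a serial loop would tend to place all of the data in memory near a single thread.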

  • Spring 2015 CSC302 -- Numerical Analysis


    Some previous courses introduce parallel debugging, profiling, and OpenMP (shared memory programming). See Previous Courses for links to class notes.

Last modified: January 28 2016 16:31:46.