Portland Group Compilers
To use the Portland Group (PGI) compilers it is necessary
to properly configure some environment
variables and paths. The Portland Group compilers are
"user friendly" and produce efficient executables.
There is a run-time check for a valid license, so PGI
compiled codes run only on machines with access to
PGI licenses. Some useful 64 bit numeric libraries for the PGI
compilers can be found in /usr/local/apps/acml/acml4.3.0.
As a convenience an alias, add, has been created
for tcsh users to set up the environment for various
software packages. To use the PGI compilers the command
add pgi
will set the necessary environment variables on either 32 bit or
64 bit login nodes. From 64 bit login nodes, the command
add pgi64_hydra
enables use of mpich2 hydra, which avoids some run time error
messages (net_send ..) associated with the mpich1 libraries invoked
by "add pgi".
Currently, either the command 'add pgi' or 'add pgi64_hydra' will set up the
environment for version 10.5 of the PGI compilers. 'add pgi' is an
alias for 'source /usr/local/apps/env/pgi.csh'. Other possible choices of PGI
compilers can be found by listing the files in that same directory,
i.e., 'ls /usr/local/apps/env' and then trying
the equivalent 'add' command.
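The listing step can be scripted. The following is a minimal sketch in
Bourne shell (run it with sh; it does not depend on the tcsh login shell).
It assumes that every environment file is named NAME.csh, as pgi.csh is,
and that the matching command is 'add NAME'. The function name
list_add_names is made up for this example.

```shell
#!/bin/sh
# Print candidate names for the 'add' alias by stripping the .csh suffix
# from the environment files in a directory.
# Assumption: files are named NAME.csh and 'add NAME' sources NAME.csh.
list_add_names() {
    dir=${1:-/usr/local/apps/env}   # directory holding the environment files
    pattern=${2:-pgi}               # substring the file name must contain
    for f in "$dir"/*"$pattern"*.csh; do
        [ -e "$f" ] || continue     # skip the unexpanded glob when nothing matches
        basename "$f" .csh
    done
}

# Show the PGI-related choices on this system (prints nothing if there are none).
list_add_names /usr/local/apps/env pgi
```

Each name printed is a candidate argument for the add alias, for
example 'add pgi64_hydra'.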
Once one of these files has been sourced, the PGI
compilers may be invoked with the pgcc, pgCC,
pgf77, and pgf90 commands for the C, C++,
Fortran77 and Fortran90 compilers respectively.
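A quick way to confirm that the environment was set up is to check that
the four compiler commands are on the search path. The sketch below is in
Bourne shell and should be run (with sh) from the same session in which
the add command was issued. The helper name check_commands is
hypothetical.

```shell
#!/bin/sh
# Report whether each named command can be found on the current PATH.
check_commands() {
    for c in "$@"; do
        if command -v "$c" >/dev/null 2>&1; then
            echo "$c: found at $(command -v "$c")"
        else
            echo "$c: not found"
        fi
    done
}

# The four PGI compiler commands named in the text.
check_commands pgcc pgCC pgf77 pgf90
```

If any of the commands is reported as not found, the environment file was
probably not sourced in the current session.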
Compiling a Serial Program
The following command would generate an executable named
'exec' from the Fortran source code file named 'code.f'
with a moderately high level of optimization.
pgf90 -fastsse -o exec code.f
Compiling a Multi-Core Program
PGI compilers can generate shared memory
parallel executables. Older henry2 Linux cluster nodes
have only two processors, so shared
memory parallelization provides limited benefits there.
Newer nodes have 4 and 8 cores. Recently we've
added some AMD blades with 16 cores. Moreover,
most desktops and even laptops have multiple cores,
so program execution can be sped up by adapting
codes to run using shared memory.
OpenMP directives in C and Fortran PGI compiled codes can be
enabled with the -mp flag. The following command line
would compile an OpenMP code with a high level of
optimization:
pgf90 -fastsse -tp x64 -mp=numa -o exec code.f
The -tp x64 flag optimises for either x86_64 Intel processors
or for AMD Opteron processors. The command assumes compilation
from login64 (not login01 or login02).
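At run time, the number of threads used by an OpenMP executable is
normally controlled with the standard OMP_NUM_THREADS environment
variable. A minimal Bourne shell sketch follows. The thread count of 4
and the helper name run_with_threads are illustrative choices, not site
recommendations. In tcsh the equivalent setting is
'setenv OMP_NUM_THREADS 4'.

```shell
#!/bin/sh
# Run a command with OMP_NUM_THREADS set for that command only.
# Usage: run_with_threads N command [args...]
run_with_threads() {
    nthreads=$1
    shift
    env OMP_NUM_THREADS="$nthreads" "$@"
}

# Run the OpenMP executable built above with 4 threads, if it exists here.
if [ -x ./exec ]; then
    run_with_threads 4 ./exec
fi
```

A thread count no larger than the number of cores on the node is the
usual starting point.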
The numeric libraries in /usr/local/apps/acml/acml4.3.0/pgi64_mp
and /usr/local/apps/acml/acml4.3.0/pgi64_mp_int64 are multi-threaded,
so they can take advantage of multi-core architectures. The "int64" libraries
require the use of 64 bit integers, which is useful in avoiding segmentation
faults with very large matrices.
Compiling a Parallel Program
Distributed memory parallelism using MPI
function calls is currently the most
appropriate programming model to achieve good
performance on commodity clusters such as
henry2. The mpich1 libraries we have used for several
years are no longer maintained by the developers.
The mpich2 library accessed by 'add pgi_mpich2_hydra-105'
works with longer messages but entails a slightly different
job submission script using mpiexec_hydra instead of mpiexec.
MPI parallel programs compiled with the PGI compilers
should use the mpif77, mpif90, mpicc, or mpiCC
commands to link with the MPICH libraries.
The following command line would compile an MPI
code with a high level of optimization:
mpif90 -o exec -fastsse code.f
The MPICH MPI library is used when the mpi* commands
are invoked with the PGI compiler environments.
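To confirm which underlying compiler and libraries an MPI wrapper will
use, the MPICH wrapper scripts accept a -show option that prints the full
command line without running it. The sketch below is in Bourne shell and
uses a hypothetical helper name, show_wrapper. Its output depends on
which environment was set up in the current session.

```shell
#!/bin/sh
# Print the underlying compile command of an MPI wrapper without running it.
show_wrapper() {
    wrapper=$1
    if command -v "$wrapper" >/dev/null 2>&1; then
        "$wrapper" -show
    else
        echo "$wrapper: not found; set up the environment first (e.g. add pgi)"
    fi
}

# Inspect the Fortran 90 wrapper used in the example above.
show_wrapper mpif90
```

With a PGI environment active, the printed command would be expected to
begin with a PGI compiler such as pgf90.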