- A sample make file for the Intel ifc compiler.
Add the Intel environment with the command
>add intel
The following make file links to the BLAS and LAPACK routines in the
Intel Math Kernel Library.
- LIBDIR = /usr/local/intel/mkl721/lib/32
- LIBS = -L${LIBDIR} -lmkl_ia32 -lmkl_lapack -lguide -lpthread
- FC = /usr/local/intel/compiler70/ia32/bin/ifc
- FFLAGS = -O3 -tpp7 -static
- foo:
- $(FC) $(FFLAGS) -c foo.f
- $(FC) $(FFLAGS) -o foo foo.o ${LIBS}
The -static flag
allows matrices using up to 2^31 - 1 bytes (one byte less than 2 Gigabytes).
Using the -static flag takes the memory space that would otherwise
be used for .so shared libraries.
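To get a feel for the limit of one byte less than 2 Gigabytes, the short calculation below finds the largest square double precision matrix that fits under it. Python is used here only as a calculator; the function name is illustrative, and the 8 bytes per element is the usual size of a double precision number, stated as an assumption.

```python
import math

BYTES_PER_DOUBLE = 8        # assumed size of one double precision element
LIMIT_BYTES = 2**31 - 1     # one byte less than 2 Gigabytes

def largest_square_order(limit_bytes, bytes_per_element=BYTES_PER_DOUBLE):
    """Largest n such that an n by n matrix fits in limit_bytes."""
    # n*n*bytes_per_element <= limit_bytes  <=>  n*n <= limit_bytes // bytes_per_element
    return math.isqrt(limit_bytes // bytes_per_element)

n = largest_square_order(LIMIT_BYTES)
print("largest order:", n)
print("bytes used:   ", n * n * BYTES_PER_DOUBLE)
```

A single matrix of this order uses nearly all of the available space, so a code that holds several matrices at once has to use a correspondingly smaller order.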
If the above makefile is named foomake, you can run it with the command
(making sure the $(FC) lines start with tabs)
>make -f foomake
The Intel support web page has more information on linking to the Intel Math Kernel Library.
- Sample make files for the Portland Group pgf77 compiler.
Similarly, to link the BLAS and LAPACK libraries when using the PGI Fortran compiler, first add the pgi environment by
>add pgi
(if you've been using the Intel compiler environment, then log out and back in before the "add pgi"). Then the following make file should work.
- LIBDIR = /usr/local/gnu/blas/ATLAS_rhel3-32/lib/rhel3-32
- LIBS = -L${LIBDIR} -lcblas -lf77blas -latlas
- FC = /usr/local/pgi/linux86/6.0/bin/pgf77
- foo:
- $(FC) -c foo.f
- $(FC) -o foo foo.o ${LIBS}
The link here is to the ATLAS BLAS library, which is not as fast as the
Intel library, but does appear to be compatible with the PGI compiler.
To allow larger matrices, use the -Bstatic flag. As with
the Intel compiler, the -Bstatic flag cannot be used with an executable that
links to shared .so libraries.
- FFLAGS = -c -Bstatic -fastsse
- LFLAGS =
- LIBDIR2 = /usr/local/gnu/blas/ATLAS_rhel3-32/lib/rhel3-32
- LIBS = -L${LIBDIR2} -lcblas -lf77blas -latlas -lg2c /usr/local/pgi/linux86/6.0/lib/liblapack.a
- foo:
- $(FC) $(FFLAGS) foo.f
- $(FC) $(LFLAGS) -o foo foo.o ${LIBS}
- A sample make file for the GNU g77 compiler.
The GNU environment is the default.
The following make file links to the
ATLAS BLAS library.
- FC = /usr/bin/g77
- CC = /usr/bin/gcc
- LIBDIR = /usr/local/gnu/blas/ATLAS_rhel3-32/lib/rhel3-32
- LIBS = -L${LIBDIR} -llapack -lcblas -lf77blas -latlas -lg2c
- FFLAGS = -static
- foo:
- $(FC) $(FFLAGS) -c foo.f
- $(FC) $(FFLAGS) -o foo foo.o ${LIBS}
The -static flag
allows matrices larger than the stack size. For example, if the
-static flag is not used, then dimensioning three 1K by 1K double
precision matrices, requiring a total of 24 MBytes of storage, would
result in a segmentation fault at run time.
Using the -static flag takes the memory space that would otherwise
be used for .so shared libraries, so it is not compatible with using
shared libraries.
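The 24 MBytes figure above follows from simple arithmetic, sketched here in Python. The 8 MByte stack limit in the sketch is a hypothetical value chosen for illustration, not a measured setting on the cluster; on most Linux shells the actual limit can be checked with `ulimit -s`.

```python
BYTES_PER_DOUBLE = 8                 # assumed size of one double precision element
ASSUMED_STACK_LIMIT = 8 * 2**20      # hypothetical 8 MByte stack limit

def matrix_bytes(rows, cols, count=1, bytes_per_element=BYTES_PER_DOUBLE):
    """Storage in bytes for `count` dense matrices of size rows by cols."""
    return count * rows * cols * bytes_per_element

total = matrix_bytes(1024, 1024, count=3)   # three 1K by 1K matrices
total_mbytes = total / 2**20

print("storage in MBytes:", total_mbytes)
print("exceeds assumed stack limit:", total > ASSUMED_STACK_LIMIT)
```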
If the above makefile is named foomake, you can run it with the command
(making sure the $(FC) lines start with tabs)
>make -f foomake
- Who should use the BLAS and LAPACK libraries?
The BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra
Package) are basic building blocks for many codes. The BLAS perform
such basic operations as inner products, matrix-vector products, and matrix-matrix
products. The LAPACK routines use the BLAS routines to perform
dense matrix operations such as LU decomposition to solve linear equations,
QR decomposition to solve least squares problems, and also singular value
and eigenvalue problems.
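As a rough illustration of the three kinds of operation named above, here is a plain Python sketch of an inner product, a matrix-vector product, and a matrix-matrix product. These functions are illustrative stand-ins only, not the BLAS interface; the corresponding double precision BLAS routines are ddot, dgemv, and dgemm, which take more arguments and are far faster.

```python
def dot(x, y):
    """Inner product of two vectors (the kind of work done by ddot)."""
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    """Matrix-vector product: one inner product per row of A (compare dgemv)."""
    return [dot(row, x) for row in A]

def matmul(A, B):
    """Matrix-matrix product: one inner product per entry of the result (compare dgemm)."""
    columns = list(zip(*B))          # columns of B
    return [[dot(row, col) for col in columns] for row in A]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
x = [1.0, 1.0]

print(dot(x, x))
print(matvec(A, x))
print(matmul(A, B))
```

The matrix-matrix product does on the order of n^3 arithmetic operations on n^2 data, which is why a tuned BLAS can keep the work in cache and why LAPACK is organized around such operations.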
The advantage of using the LAPACK and BLAS libraries is portable,
fast code. Fortran codes (C is also possible with a bit more fiddling)
calling LAPACK and BLAS can be ported easily to a variety of architectures.
The code is high quality, giving not only good performance, but also
handling exceptional cases and avoiding numeric underflow and overflow.
For problems too large to fit in cache, LAPACK codes often run in one third
or less of the time required by the predecessor packages EISPACK and LINPACK.
(For small matrices, of order a few hundred or less, EISPACK and LINPACK
may sometimes be faster, having fewer levels of subroutine calls.)
For solving a 12K by 12K linear system on a single processor, a user found that the Numerical
Recipes solver timed out after ten hours. The LAPACK solver dgesv required
fifteen minutes with the PGI compiler and PGI-supplied BLAS, and about ten minutes
with the Intel compiler and BLAS.
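These timings can be turned into rough sustained speeds. The sketch below assumes that 12K means n = 12000 and uses the standard leading-order operation count of (2/3) n^3 floating point operations for an LU factorization; since the timings are quoted only approximately, the resulting rates are estimates, not measurements.

```python
def lu_flops(n):
    """Leading-order floating point operation count for LU of an n by n matrix."""
    return 2.0 * n**3 / 3.0

def sustained_gflops(n, seconds):
    """Approximate sustained rate in GFlop/s for an LU solve taking `seconds`."""
    return lu_flops(n) / seconds / 1e9

n = 12000   # assumption: "12K" taken as 12000
for label, minutes in (("PGI compiler and BLAS", 15), ("Intel compiler and BLAS", 10)):
    rate = sustained_gflops(n, minutes * 60)
    print(label, round(rate, 2), "GFlop/s")
```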
The efficiency of the implementation depends mainly on the quality of the
underlying BLAS library. Good BLAS implementations allow matrix-matrix
multiplications and such operations as LU and QR decomposition to run
at near the peak floating point rate of the CPU.
Speeds obtained by downloading the standard BLAS source code and compiling
it are slower than for a tuned library. Several tuned BLAS and
LAPACK libraries are available on the blade cluster. Instructions on linking
to the Intel, PGI, and ATLAS versions of the library are included above.
- Who should not use the BLAS and LAPACK libraries?
Many scientific computations solve large sparse linear systems. "Sparse" means
that most elements of the matrix are zero so that only the nonzero elements are
stored, enabling larger matrices to fit in the available memory. LAPACK
is designed for dense and banded matrices, but not for sparse matrices. For a large
sparse matrix for which nonzero elements are not confined to a small
number of subdiagonals and superdiagonals, it makes more sense to use a solver
explicitly designed for sparse matrix equations.
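To make the storage argument concrete, the following sketch compares dense storage with compressed sparse row (CSR) storage, one common sparse format. CSR is used here purely as an example; the function names, the 4-byte index size, and the choice of five nonzeros per row are assumptions for illustration and do not describe how any particular package stores its matrices.

```python
def dense_bytes(n, bytes_per_value=8):
    """Storage for a dense n by n double precision matrix."""
    return n * n * bytes_per_value

def csr_bytes(n, nnz, bytes_per_value=8, bytes_per_index=4):
    """CSR storage: nnz values, nnz column indices, and n + 1 row pointers."""
    return nnz * (bytes_per_value + bytes_per_index) + (n + 1) * bytes_per_index

def to_csr(A):
    """Convert a dense matrix (list of rows) to CSR arrays, keeping only nonzeros."""
    values, col_index, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if a != 0.0:
                values.append(a)
                col_index.append(j)
        row_ptr.append(len(values))   # where the next row starts
    return values, col_index, row_ptr

A = [[4.0, 0.0, 0.0, 1.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 0.0],
     [1.0, 0.0, 0.0, 5.0]]
values, col_index, row_ptr = to_csr(A)
print(values, col_index, row_ptr)

n = 100000                     # hypothetical problem size
nnz = 5 * n                    # hypothetical: five nonzeros per row
print(dense_bytes(n), csr_bytes(n, nnz))
```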
Two of the
best known publicly available sparse solver packages are SuperLU (using sparse Gaussian
elimination) and PETSc (incorporating iterative solvers, but also assuming
that LAPACK has been installed). For advice on choosing
an appropriate package, or other questions, you can e-mail Gary Howell, gary_howell@ncsu.edu.
- For more information.
The CSC 783 lecture notes on BLAS
are a more complete introduction to the BLAS and include
links to the standards document and to the reference BLAS.
The file
samplelapack.tar
has example files showing how to use LAPACK to solve linear equations and
linear least squares problems. The tar file can be unpacked by typing
>tar xvf samplelapack.tar
at the command line.
The LAPACK User's Guide can be purchased from SIAM and is also
available online.