How do I link to a BLAS or LAPACK library on the Linux cluster? ...
- A sample makefile for the Intel ifort compiler
- A sample makefile for the Portland Group pgf90 compiler
- A sample makefile for the GNU gfortran compiler
- Who should use the BLAS and LAPACK libraries?
- Who should not use the BLAS and LAPACK libraries?
- For more information
- A sample makefile for the Intel ifort compiler.
Add the Intel environment with the command
>add intel64_hydra
The following makefile links to the Intel MKL BLAS and LAPACK libraries.
- LIBDIR = /usr/local/intel/mkl91023/lib/em64t
- LIBS = -L${LIBDIR} -lmkl_em64t -lmkl_lapack -lguide -lpthread
- FC = ifort
- FFLAGS = -O3 -tpp7 -static
- foo:
- $(FC) $(FFLAGS) -c foo.f
- $(FC) $(FFLAGS) -o foo foo.o ${LIBS}
The -static flag
allows matrices of up to 2^31 - 1 bytes (one byte less than 2 Gigabytes);
static linking uses the memory space that would otherwise
be occupied by .so shared libraries. For larger matrices, compile with the -i8 flag (so that 8-byte integers are used) and, for LIBDIR above, substitute
LIBDIR = /usr/local/intel/mkl91023/lib_ilp64/em64t.
If the above makefile is named foomake, you can run it from the command line
(making sure the $(FC) lines start with tabs) with
>make -f foomake
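The makefile above assumes a source file foo.f. As a small sketch (the program below is invented for illustration and is not part of the cluster software), a foo.f such as the following calls the BLAS routine ddot and gives a quick check that the link succeeded (remember that fixed-form Fortran statements start in column 7):
-       program foo
- c     Invented example: an inner product computed with the BLAS ddot
-       implicit none
-       integer n, i
-       parameter (n = 1000)
-       double precision x(n), y(n), ddot
-       external ddot
-       do i = 1, n
-          x(i) = 1.0d0
-          y(i) = 2.0d0
-       end do
-       write(*,*) 'ddot result (expect 2000):', ddot(n, x, 1, y, 1)
-       end
If the link to the MKL BLAS worked, running ./foo should print 2000.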
- A sample makefile for the Portland Group pgf90 compiler.
Similarly, to link the BLAS and LAPACK libraries when using the PGI Fortran compiler, first add the PGI environment with the command
>add pgi64_hydra
(if you have been using the Intel compiler environment, log out and back in before the "add pgi64_hydra"). Then the following makefile should work.
- LIBDIR = /usr/local/apps/acml/acml4.3.0/pgi64/lib
- LIBS = -L${LIBDIR} -lacml -lacml_mv
- FC = pgf90
- foo:
- $(FC) -c foo.f
- $(FC) -o foo foo.o ${LIBS}
The link here is to the ACML library, which is designed for AMD processors
but appears to work acceptably well on the Intel blades.
To allow larger matrices, use the -Bstatic flag. As with
the Intel compiler's -static flag, -Bstatic cannot be used with an executable that
links to shared .so libraries. The PGI subdirectories of /usr/local/apps/acml/acml4.3.0 are pgi64, pgi64_int64, pgi64_mp, and pgi64_mp_int64. The int64 directories use 8-byte integers, allowing larger matrices. The mp directories use multiple threads, enabling
shared memory parallelism (to use these, do "man pgf90"
and find the compiler flags that enable OpenMP).
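As an illustration of the kind of code that benefits from the threaded mp libraries (the program and matrix size below are invented for the example), the sketch calls the BLAS matrix-matrix multiply dgemm; with a threaded BLAS, the number of threads is typically set with the OMP_NUM_THREADS environment variable.
-       program matmuldemo
- c     Invented example: a matrix-matrix product via the BLAS dgemm.
- c     Linked against a threaded (mp) BLAS, the dgemm call can use
- c     several cores; OMP_NUM_THREADS typically sets the thread count.
-       implicit none
-       integer n, i, j
-       parameter (n = 400)
-       double precision a(n,n), b(n,n), c(n,n)
-       do j = 1, n
-          do i = 1, n
-             a(i,j) = 1.0d0
-             b(i,j) = 1.0d0
-             c(i,j) = 0.0d0
-          end do
-       end do
-       call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
-       write(*,*) 'c(1,1) (expect 400):', c(1,1)
-       end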
- A sample makefile for the GNU gfortran compiler.
The following makefile links to the gfortran build of the ACML BLAS.
- FC = gfortran
- LIBDIR = /usr/local/apps/acml/acml4.3.0/gfortran64/lib
- LIBS = -L${LIBDIR} -lacml -lacml_mv -lblas
- FFLAGS = -static
- foo:
- $(FC) $(FFLAGS) -c foo.f
- $(FC) $(FFLAGS) -o foo foo.o ${LIBS}
If the above makefile is named foomake, you can run it from the command line
(making sure the $(FC) lines start with tabs) with
>make -f foomake
- Who should use the BLAS and LAPACK libraries?
The BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra
Package) are basic building blocks for many codes. The BLAS perform
such basic operations as inner products, matrix-vector products, and matrix-matrix
products. The LAPACK routines use the BLAS routines to perform
dense matrix operations such as LU decomposition to solve linear equations,
QR decomposition to solve least squares problems, and also singular value
and eigenvalue problems.
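As a small illustration of the least squares case (the data values below are invented for the example), the LAPACK driver dgels fits a straight line to four points via a QR decomposition:
-       program lsqdemo
- c     Invented example: fit y = c0 + c1*x to four points with the
- c     LAPACK least squares driver dgels (QR decomposition underneath).
-       implicit none
-       integer m, n, lwork, info, i
-       parameter (m = 4, n = 2, lwork = 100)
-       double precision a(m,n), b(m), x(m), y(m), work(lwork)
-       data x / 0.0d0, 1.0d0, 2.0d0, 3.0d0 /
-       data y / 1.0d0, 3.0d0, 5.0d0, 7.0d0 /
-       do i = 1, m
-          a(i,1) = 1.0d0
-          a(i,2) = x(i)
-          b(i)   = y(i)
-       end do
-       call dgels('N', m, n, 1, a, m, b, m, work, lwork, info)
-       write(*,*) 'intercept, slope (expect 1 and 2):', b(1), b(2)
-       end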
The advantages of using the LAPACK and BLAS libraries are portable,
fast code. Fortran codes (C is also possible with a bit more fiddling)
calling LAPACK and BLAS can be ported easily to a variety of architectures.
The code is high quality, giving not only good performance but also
handling exceptional cases and avoiding numeric underflow and overflow.
For problems too large to fit in cache, LAPACK codes often run in one third
or less of the time required by the predecessor packages EISPACK and LINPACK.
(For small matrices, of size a few hundred or less, EISPACK and LINPACK
may sometimes be faster, having fewer levels of subroutine calls.)
For solving a 12K by 12K linear system on a single processor, a user found that the Numerical
Recipes solver timed out after ten hours. The LAPACK solver dgesv required
fifteen minutes with the PGI compiler and PGI-supplied BLAS, and about ten minutes
with the Intel compiler and BLAS.
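For reference, a dgesv call looks like the sketch below (a tiny 3 by 3 system with invented values, rather than the 12K by 12K system above):
-       program solvedemo
- c     Invented example: solve a small dense system A x = b with the
- c     LAPACK driver dgesv (LU decomposition with partial pivoting).
-       implicit none
-       integer n, nrhs, info
-       parameter (n = 3, nrhs = 1)
-       integer ipiv(n)
-       double precision a(n,n), b(n)
- c     Values chosen so that the exact solution is x = (1, 1, 1).
-       data a / 4.0d0, 1.0d0, 0.0d0,
-      &         1.0d0, 4.0d0, 1.0d0,
-      &         0.0d0, 1.0d0, 4.0d0 /
-       data b / 5.0d0, 6.0d0, 5.0d0 /
-       call dgesv(n, nrhs, a, n, ipiv, b, n, info)
-       write(*,*) 'info =', info
-       write(*,*) 'solution (expect 1 1 1):', b
-       end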
The efficiency of the implementation depends mainly on the quality of the
underlying BLAS library. Good BLAS implementations allow matrix-matrix
multiplications and such operations as LU and QR decomposition to run
near the peak floating point rate of the CPU.
This exposition is out of date in not going into more detail on how to use
the shared memory parallel mp libraries (which allow use of more than one core).
Also, the distributed memory ScaLAPACK libraries
have been incorporated into the newer MKL libraries.
Speeds obtained by downloading the standard BLAS source code and compiling
it yourself are slower than those of a tuned library. Several tuned BLAS and
LAPACK libraries are available on the blade cluster. Instructions on linking
to the Intel MKL and AMD ACML versions of the libraries are included above.
- Who should not use the BLAS and LAPACK libraries?
Many scientific computations solve large sparse linear systems. "Sparse" means
that most elements of the matrix are zero, so only the nonzero elements are
stored, enabling larger matrices to fit in the available memory. LAPACK
is designed for dense and banded matrices, not sparse ones. For a large
sparse matrix whose nonzero elements are not confined to a small
number of sub- and superdiagonals, it makes more sense to use a solver
explicitly designed for sparse matrix equations.
Two of the
best known publicly available sparse solver packages are SuperLU (using sparse Gaussian
elimination) and PETSc (incorporating iterative solvers, but also assuming
that LAPACK has been installed). For advice on choosing
an appropriate package, or other questions, you can e-mail Gary Howell, gary_howell@ncsu.edu.
- For more information
The CSC 783 lecture notes on the BLAS
are a more complete introduction to the BLAS and include
links to the standards document and to the reference BLAS.
The file
samplelapack.tar
has example files showing how to use LAPACK to solve linear equations and
linear least squares problems. The tar file can be unpacked by typing
>tar xvf samplelapack.tar
at the command line.
The LAPACK Users' Guide can be purchased from SIAM Press, and is also
available online.