SUNCAT

(NOTE: With MKL 10.3 we have seen hangs in the early MPI reduce calls for a small number of calculations. Until this is understood, we are backing out to MKL 10.2.)

At SLAC we compiled GPAW for RHEL5 x86_64, on Intel Xeon 5650 processors, with the Intel compilers and MKL. This improved the 8-core performance benchmark by 13% compared to the opencc/ACML approach.

Package          Version
python           2.4
gpaw             0.8.0.7419
ase              3.5.0.1919
numpy            1.4.1
openmpi          1.4.3
mkl              10.3
intel compilers  11.1 (includes mkl 10.2 by default)

openmpi

openmpi was built with the intel compilers as follows:

$ ./configure --prefix=/nfs/slac/g/suncatfs/sw/gpawv15/install CC=icc CXX=icpc F77=ifort FC=ifort
$ make
$ make install
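To check that the resulting wrappers really sit on top of the Intel compilers rather than gcc, Open MPI's -showme option can be queried after installation (a quick sanity check, not part of the build itself):

```shell
# Show the underlying compiler and flags the wrapper will invoke;
# the first token of the output should be icc, not gcc.
$ mpicc -showme

# Confirm the wrapper found in PATH is the one just installed:
$ which mpicc
```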

numpy

Build in the usual fashion. At the moment we use the default GNU compilers for numpy, since the GPAW performance benchmark drops by 3% when numpy is built with icc/mkl/dotblas, for reasons that are not understood. Some GPAW self-tests also start to fail in that configuration.
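For reference, the "usual fashion" is the standard distutils build; the prefix below matches the numpy install path used elsewhere in this document, but adjust it for your site:

```shell
$ cd numpy-1.4.1
$ python setup.py build
$ python setup.py install --prefix=/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install
```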

gpaw

For this we use customize_mkl10.3.py:

scalapack = False

compiler = 'icc'
libraries = ['mkl_rt', 'pthread', 'm']

library_dirs = ['/nfs/slac/g/suncatfs/sw/external/intel11.1/openmpi/1.4.3/install/lib','/afs/slac/package/intel_tools/2011u8/mkl/lib/intel64/']

include_dirs += ['/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install/lib64/python2.4/site-packages/numpy/core/include']

extra_link_args += ['-fPIC']

extra_compile_args = ['-I/afs/slac/package/intel_tools/2011u8/mkl/include','-xHOST','-O1','-ipo','-no-prec-div','-static','-std=c99','-fPIC']

mpicompiler = 'mpicc'
mpilinker = mpicompiler

Note that this customize.py works only with MKL version 10.3, which has simplified linking via the single mkl_rt runtime library.
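With that file in place, GPAW is built and installed in the usual distutils way. The --customize option selects the file above (if your GPAW version does not support this option, copy the file to customize.py in the source directory first); the prefix shown matches the INSTALLDIR used below:

```shell
$ python setup.py build_ext --customize=customize_mkl10.3.py
$ python setup.py install --customize=customize_mkl10.3.py --prefix=${GPAW_HOME}/install
```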

The environment settings (valid at SUNCAT) needed to link and run:

#!/bin/bash
export EXTERNALDIR=/nfs/slac/g/suncatfs/sw/external
export NUMPYDIR=${EXTERNALDIR}/numpy/1.4.1/install/lib64/python2.4/site-packages
export SCIPYDIR=${EXTERNALDIR}/scipy/0.7.0/install/lib64/python2.4/site-packages
export ASEBASE=${EXTERNALDIR}/ase/3.5.0.1919/install
export ASEDIR=${ASEBASE}/lib/python2.4/site-packages
export INTELDIR=/afs/slac/package/intel_tools/2011u8
export MKLDIR=${INTELDIR}/mkl/lib/intel64
export OPENMPIDIR=${EXTERNALDIR}/intel11.1/openmpi/1.4.3/install

export MKL_THREADING_LAYER=MKL_THREADING_SEQUENTIAL

export OMP_NUM_THREADS=1
export INSTALLDIR=${GPAW_HOME}/install
export PYTHONPATH=${ASEDIR}:${SCIPYDIR}:${NUMPYDIR}:${INSTALLDIR}/lib64/python
export PATH=/bin:/usr/bin:${OPENMPIDIR}/bin:${INTELDIR}/bin:${INSTALLDIR}/bin:${ASEBASE}/bin
export LD_LIBRARY_PATH=${INSTALLDIR}/lib:${MKLDIR}:${INTELDIR}/lib/intel64:${OPENMPIDIR}/lib:${MKLDIR}/../32
export GPAW_SETUP_PATH=${EXTERNALDIR}/gpaw-setups-0.6.6300
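Assuming the script above is saved as, say, gpaw-env.sh (the file name is only an example), a typical parallel run then looks like this, using the gpaw-python interpreter that the GPAW build installs into ${INSTALLDIR}/bin:

```shell
$ source gpaw-env.sh
$ mpirun -np 8 gpaw-python myscript.py
```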

MKL 10.2 Notes

For historical reasons, we also include the customize.py for MKL 10.2:

scalapack = False

compiler = 'icc'
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_cdft_core', 'mkl_core', 'pthread', 'm']

library_dirs = ['/nfs/slac/g/suncatfs/sw/external/intel11.1/openmpi/1.4.3/install/lib','/afs/slac/package/intel_tools/compiler11.1/mkl/lib/em64t/']

include_dirs += ['/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install/lib64/python2.4/site-packages/numpy/core/include']

extra_link_args += ['-fPIC']

extra_compile_args = ['-I/afs/slac/package/intel_tools/compiler11.1/mkl/include','-xHOST','-O1','-ipo','-no-prec-div','-static','-std=c99','-fPIC']

define_macros =[('GPAW_NO_UNDERSCORE_CBLACS', '1'), ('GPAW_NO_UNDERSCORE_CSCALAPACK', '1')]

mpicompiler = 'mpicc'
mpilinker = mpicompiler

This older version requires a fairly bad hack to make it work in all cases (bash syntax shown; use setenv in csh):

$ export LD_PRELOAD=libmkl_core.so:libmkl_sequential.so

We believe this is because Python loads shared libraries with dlopen(), which has trouble with the circular dependencies present in MKL 10.2.

This hack can cause (ignorable) errors from unrelated setuid commands such as ping, which disallow LD_PRELOAD for security reasons.
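Since the root cause is dlopen() symbol scoping, an alternative sometimes used instead of LD_PRELOAD is to switch Python to RTLD_GLOBAL before any MKL-backed extension module is imported, so that MKL's mutually dependent libraries can resolve each other's symbols. A minimal sketch (not part of our installation; the os.RTLD_* constants are from the standard library):

```python
import os
import sys

# Make symbols from every subsequently dlopen()ed shared object
# globally visible, which lets the circularly dependent MKL 10.2
# libraries find each other's symbols at load time.
sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)

# ...an MKL-backed extension (e.g. GPAW's C extension) would be
# imported after this point...
```

Note that on the Python 2.4 used here, the RTLD_* constants live in the DLFCN module rather than os, but the sys.setdlopenflags() call is the same.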