Cluster software

Cloning of nodes

The NIFLHEIM cluster uses the SystemImager toolkit on a central server to create an image of a Golden Client node, which has been installed in the usual way from a distribution on CD-ROM (CentOS Linux in our case). SystemImager is then used to install identical copies of the Golden Client image on all of the nodes, with the hostname and network parameters changed for each node.

Pathscale compilers

The PathScale compiler suite is installed on the frontend computer slid. It includes:

  • C (pathcc) and C++ (pathCC)
  • Fortran (pathf90 and pathf77)

Libraries and include files for the compilers are installed on the compute nodes in:

/opt/pathscale

AMD Core Math Library (ACML)

ACML contains Basic Linear Algebra Subroutines (BLAS), a full suite of Linear Algebra routines (LAPACK), FFTs, and more. Details are available at http://developer.amd.com/acml.aspx.

On the cluster the following ACML libraries are currently installed:

/opt/acml3.5.0/gnu64
/opt/acml3.5.0/pathscale64
/opt/acml3.5.0/pathscale64_mp
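
As a rough sketch of how one of these builds might be used when linking, the snippet below assembles linker flags for the PathScale build. It assumes that each ACML directory contains a lib/ subdirectory holding a library named libacml, which is not stated on this page; prog.f90 is a hypothetical source file.

```shell
# Assumption: the ACML build directory contains lib/ with libacml in it.
ACML_DIR=/opt/acml3.5.0/pathscale64
ACML_LDFLAGS="-L${ACML_DIR}/lib -lacml"

# Hypothetical source file prog.f90; run this on a machine with the PathScale compilers:
# pathf90 prog.f90 -o prog ${ACML_LDFLAGS}

# Print the flags so they can be pasted into a Makefile.
echo "${ACML_LDFLAGS}"
```

If the directory layout on the cluster differs, only ACML_DIR and the lib/ suffix need to change.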

NetCDF (network Common Data Form)

NetCDF (network Common Data Form) is an interface for array-oriented data access and a library that provides an implementation of the interface. The netCDF library also defines a machine-independent format for representing scientific data.

NetCDF is installed on the cluster via the rpm netcdf-3.6.1-1.2.el4.fys. This build includes bindings to the PathScale Fortran compiler. The spec file for the rpm is in:

~rpmbild/SPECS/netcdf.spec

OpenMPI (A High Performance Message Passing Library)

OpenMPI is an open source implementation of the MPI message passing standard.

The version now installed is 1.1.1, built with support for:

* Torque (installed in /usr/local)
* gcc C compiler
* PathScale Fortran compiler

The version on the cluster is installed using the rpm openmpi-1.1.1-2. This rpm was built with the buildrpms.sh script from the page.

The build options are set by editing the buildrpms.sh script. Change the following lines:

prefix="/usr/local"
configure_options="--with-tm=/usr/local FC=pathf90 F77=pathf90"

Set buildrpms.sh to build a single rpm that includes the man pages:

build_srpm=no
build_single=yes
build_multiple=no

The build of multiple rpms did not seem to work in version 1.1.1.

buildrpms.sh needs the rpm spec file, which can be copied from the unpacked openmpi tar file:

cp openmpi-1.1.1/contrib/dist/linux/openmpi.spec .

Now run:

./buildrpms.sh openmpi-<version>.tar.gz
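
The edits to buildrpms.sh described above could also be applied with a small helper rather than by hand. The following is only a sketch: patch_buildrpms is a hypothetical function name, and it assumes that each variable is assigned at the start of a line in the form name=value, which may not match every version of the script.

```shell
# Hypothetical helper: rewrite the settings in a copy of buildrpms.sh.
# Assumes each variable is assigned at the start of a line as name=value.
# The unmodified script is kept next to it with a .orig suffix.
patch_buildrpms() {
    script="$1"
    sed -i.orig \
        -e 's|^prefix=.*|prefix="/usr/local"|' \
        -e 's|^configure_options=.*|configure_options="--with-tm=/usr/local FC=pathf90 F77=pathf90"|' \
        -e 's|^build_srpm=.*|build_srpm=no|' \
        -e 's|^build_single=.*|build_single=yes|' \
        -e 's|^build_multiple=.*|build_multiple=no|' \
        "$script"
}

# Usage: patch_buildrpms ./buildrpms.sh
```

Comparing buildrpms.sh with buildrpms.sh.orig afterwards shows whether all five lines were actually found and changed.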

Processor affinity

For some parallel libraries that use OpenMPI as the communication layer, such as BLACS and ScaLAPACK, setting processor affinity seems to be important. Processor affinity means that a process is bound to a specific processor; see this OpenMPI FAQ for more details.

To enable processor affinity use:

mpirun --mca mpi_paffinity_alone 1 -np 4 a.out

Without this option, all copies of the parallel program (siesta) would land on one CPU.
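
As an alternative to passing the option on every command line, OpenMPI also reads MCA parameters from environment variables prefixed with OMPI_MCA_. The sketch below sets the same parameter that way; it has not been tested on this cluster, and a.out is just the example binary used above.

```shell
# Set the MCA parameter through the environment instead of the mpirun command line.
export OMPI_MCA_mpi_paffinity_alone=1

# Then launch as usual, without the --mca option:
# mpirun -np 4 a.out
```

This can be convenient in batch job scripts, where the export line is written once at the top.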

Niflheim: Cluster_software (last edited 2010-11-04 12:56:52 by OleHolmNielsen)