EasyBuild software for environment modules on CentOS

On the Niflheim7 cluster (CentOS 7 based) we use the modern Lmod and EasyBuild environment modules for software packages.

The old Niflheim setup (CentOS 6 based) uses the old/classic environment modules as documented in https://wiki.fysik.dtu.dk/niflheim/Installed_software.

Lmod

EasyBuild works with older Tcl/C based module tools, but Lmod is recommended. From the Lmod homepage:

  • Lmod is a Lua based module system that easily handles the MODULEPATH Hierarchical problem. Environment Modules provide a convenient way to dynamically change the users' environment through modulefiles. This includes easily adding or removing directories to the PATH environment variable. Modulefiles for Library packages provide environment variables that specify where the library and header files can be found.

Lmod documentation:

Install Lmod

You must install Lmod on every node in your cluster. It is most convenient to install an Lmod RPM package on all nodes. If you don't have root permissions on the system, you can install Lmod as described in Installing Lmod without root permissions.

Install Lmod from the EPEL repository. First install the newest version of the epel-release RPM for EL7, for example:

CentOS: yum install epel-release
RHEL7:  yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Then install Lmod and prerequisite Lua packages:

yum install Lmod

To download the Lmod and prerequisite Lua packages directly (for compute nodes) get them from https://dl.fedoraproject.org/pub/epel/7/x86_64/l/. The Lua packages required are:

  • lua-bitop lua-filesystem lua-json lua-lpeg lua-posix lua-term

Using Lmod

The Lmod RPM package installs several shell initialization scripts in /etc/profile.d/. For bash the shell initialization process involves some steps:

  1. /etc/profile.d/z00_lmod.sh is called when the shell is started.
  2. This initializes module support by calling the script /usr/share/lmod/lmod/init/sh.
  3. This defines a shell function module().
  4. The shell function module() calls the main Lua program for Lmod: /usr/share/lmod/lmod/libexec/lmod.

Now the module "commands" (functions) can be used:

module list
ml

To view the module command:

type module

To list all defined shell functions:

compgen -A function
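
Non-interactive scripts (cron jobs, for example) do not always read /etc/profile.d/, so the module() function may be undefined there. The following is a minimal sketch of a guard for such scripts. It assumes the init script path from step 1 above, and the helper name ensure_module_cmd is our own invention, not part of Lmod:

```shell
# Hypothetical helper (not part of Lmod): make sure the module() shell
# function exists, sourcing the Lmod init script if it is missing.
# Assumption: the init script path defaults to the one installed by the RPM.
ensure_module_cmd() {
    lmod_init=${1:-/etc/profile.d/z00_lmod.sh}
    # Nothing to do if 'module' is already known to this shell.
    if command -v module >/dev/null 2>&1; then
        return 0
    fi
    # Give up if the init script is not readable on this host.
    [ -r "$lmod_init" ] || return 1
    . "$lmod_init"
    # Report whether sourcing the script actually defined 'module'.
    command -v module >/dev/null 2>&1
}
```

A script could then start with: ensure_module_cmd && module list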

Installing EasyBuild

EasyBuild itself should be used only by a dedicated account for building software modules.

We have created a user and group named modules with a home directory on a shared filesystem to be mounted by NFS on the compute nodes: /home/opt/modules. For example:

root# groupadd -g 983 modules
root# useradd -m -c "Modules user" -d /home/opt/modules -u 983 -g modules -s /bin/bash modules

Now you should log in as, or su - modules to, the non-root user.

Bootstrapping

The steps required for a normal (non-root) user are:

  • Read the Installation page, especially the Bootstrapping procedure section.
  • If multiple module tools are available on the system, it may be necessary to configure the use of Lmod (see the Configuration page):

    export EASYBUILD_MODULES_TOOL=Lmod
  • Define the top-level directory for your modules, for example:

    export EASYBUILD_PREFIX=/home/opt/modules

    If your environment is inhomogeneous with different OS versions and/or CPU architectures, you could create separate subdirectories for each. For example, you may have both CentOS 7 (el7) and CentOS 6 (el6):

    export EASYBUILD_PREFIX=/home/opt/modules/el7/x86_64
    export EASYBUILD_PREFIX=/home/opt/modules/el6/i686

    Obviously, you would need some way to select the appropriate top-level directory on each computer.

  • If you work on a PC, it is recommended to use a $EASYBUILD_PREFIX directory on the PC's local hard disk for performance reasons. An SSD disk will obviously speed up the tasks.

  • Download the bootstrap script:

    curl -O https://raw.githubusercontent.com/hpcugent/easybuild-framework/develop/easybuild/scripts/bootstrap_eb.py
  • Execute it specifying an installation prefix as an argument:

    python bootstrap_eb.py $EASYBUILD_PREFIX
  • Update $MODULEPATH, load the EasyBuild module, and check the basic functionality:

    module use $EASYBUILD_PREFIX/modules/all
    module load EasyBuild
    module list
    eb --version
  • You may run some tests (which take a long time):

    export TEST_EASYBUILD_MODULES_TOOL=Lmod
    python -m test.framework.suite

Some additional packages from EPEL may be needed, see Dependencies:

root# yum install GitPython pysvn graphviz
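
For the inhomogeneous environment mentioned in the bullet list above, the per-host prefix could be derived automatically. This is only a sketch under assumptions: the function names are hypothetical, the el<N>/<arch> subdirectory layout follows the example above, and the OS version is parsed from /etc/redhat-release (which exists on both CentOS 6 and CentOS 7):

```shell
# Hypothetical helper: read the contents of /etc/redhat-release on stdin
# and print a tag such as "el7" or "el6".
el_tag_from_release() {
    sed -n 's/.*release \([0-9][0-9]*\).*/el\1/p' | head -n 1
}

# Hypothetical helper: print <base>/<el-tag>/<arch> for the current host.
# The second argument (release file) exists only to make testing easy.
eb_prefix_for_host() {
    eb_base=$1
    eb_release_file=${2:-/etc/redhat-release}
    [ -r "$eb_release_file" ] || return 1
    eb_tag=$(el_tag_from_release < "$eb_release_file")
    [ -n "$eb_tag" ] || return 1
    # uname -m reports the machine architecture, e.g. x86_64 or i686.
    printf '%s/%s/%s\n' "$eb_base" "$eb_tag" "$(uname -m)"
}
```

Usage, assuming the shared base directory from the example above: export EASYBUILD_PREFIX=$(eb_prefix_for_host /home/opt/modules)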

Updating EasyBuild

If a new version of EasyBuild should be installed, consult the Updating page.

The simplest way may be the new command in version 2.9.0 and later:

eb --install-latest-eb-release

The standard upgrading method is to download the bootstrap script and execute it as in the normal installation explained above. Then reload the EasyBuild module as shown above.

Using EasyBuild for module building

The following is only for module builders!

Add the following to the normal user's .bashrc file:

# EasyBuild setup
export EASYBUILD_MODULES_TOOL=Lmod
export EASYBUILD_PREFIX=/home/opt/modules   # Example directory
module use $EASYBUILD_PREFIX/modules/all
module load EasyBuild

Notice: Except for the last line, the modules environment can be set up for all users using /etc/profile.d/ files as shown below.

Read the Concepts_and_Terminology and command_line pages. See also the command help:

eb --help

Of particular interest is:

Global setup of modules for all users

Notice: Normal users of the modules do not need to load the EasyBuild module - this is only for module builders.

If desired the system administrator can set up shell initialization scripts so that all users automatically have the EasyBuild modules set up, see:

On CentOS systems the shell initialization scripts are in /etc/profile.d/. The Lmod RPM has installed several scripts here. See also the Lmod_User_Guide.

To set up the EasyBuild environment create in /etc/profile.d/ the file z01_EasyBuild.sh:

if [ -z "$__Init_Default_Modules" ]; then
 export __Init_Default_Modules=1
 export EASYBUILD_MODULES_TOOL=Lmod
 export EASYBUILD_PREFIX=/home/modules
 module use $EASYBUILD_PREFIX/modules/all
else
 module refresh
fi

and for tcsh z01_EasyBuild.csh:

if ( ! $?__Init_Default_Modules )  then
  setenv __Init_Default_Modules 1
  setenv EASYBUILD_MODULES_TOOL Lmod
  setenv EASYBUILD_PREFIX /home/modules
  module use $EASYBUILD_PREFIX/modules/all
else
  module refresh
endif

Obviously, the EASYBUILD_PREFIX location of modules is just an example - every site will use a different location, so configure this variable accordingly.

Setting the CPU hardware architecture

Some compilers generate code optimized for the CPU on which they are executed, and this code may not run on older CPUs. This leaves sysadmins and users with the following choices:

  1. Build modules on the oldest available CPU. This should run on newer CPUs, but performance will suffer because newer hardware isn't utilized well.
  2. Build separate module trees for each generation of CPUs, assuring that optimized code is generated. Centrally built modules can be NFS mounted so that only the CPU-specific module tree is made available.
  3. More complicated setups are suggested in the mailing list thread https://lists.ugent.be/wws/arc/easybuild/2016-09/msg00052.html

Determining the current CPU architecture

It is surprisingly difficult to determine the CPU hardware architecture of any given system for selecting hardware-optimized modules. A useful list of CPU-architectures is in the Safe_CFLAGS page.

We have found the following solutions:

  • Recommended: Ask the GCC compiler for the native architecture, for example:

    # module load GCC
    # gcc -march=native -Q --help=target | grep march | awk '{print $2}'
    haswell

    GCC version 4.9 or newer should be used in order to reveal processor codenames, since older GCC versions will output less informative names such as core2. Intel's Skylake processor is only recognized by GCC version 6 or newer.

    The output may be the Intel CPU codenames such as broadwell, haswell etc. See the CPU-specific Safe_CFLAGS.

  • Use the command lscpu to display the Model name (or look into /proc/cpuinfo).
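
The GCC method above can be wrapped in a small function. This is a sketch, not a definitive implementation: the function names are hypothetical, and matching the first field exactly is our own precaution, because some GCC versions may print additional lines that contain the word march:

```shell
# Hypothetical helper: read 'gcc -Q --help=target' output on stdin and print
# only the value of the -march= line. Matching the first field exactly avoids
# picking up other lines that merely mention "march".
parse_march() {
    awk '$1 == "-march=" { print $2; exit }'
}

# Print the native architecture name, or fail if gcc is not available.
detect_cpu_arch() {
    command -v gcc >/dev/null 2>&1 || return 1
    gcc -march=native -Q --help=target 2>/dev/null | parse_march
}
```

After loading a sufficiently new GCC module, detect_cpu_arch prints a codename such as haswell on a Haswell node.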

As a convenience to normal users, the sysadmin may provide in /etc/profile.d/ the scripts cpu_arch.sh:

export CPU_ARCH="broadwell"

and cpu_arch.csh:

setenv CPU_ARCH "broadwell"

(for the example of broadwell CPUs) where the current host CPU-architecture has been determined by any of the above methods. Obviously, this may have to be set differently for different types of compute nodes.

Using the $CPU_ARCH variable users can easily select the correct CPU-architecture. For example, users may choose to select CPU-specific module trees:

export EASYBUILD_PREFIX=$HOME/$CPU_ARCH
module use $EASYBUILD_PREFIX/modules/all
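
If not every CPU type has its own module tree yet, a fallback may be useful. The sketch below assumes a tree named generic (for example, built on the oldest available CPU); that directory name and the function name are hypothetical:

```shell
# Hypothetical helper: print the module directory for the given CPU
# architecture, falling back to a generic tree when no CPU-specific tree
# has been built. The "generic" directory name is an assumption.
select_module_tree() {
    tree_base=$1
    tree_arch=$2
    if [ -n "$tree_arch" ] && [ -d "$tree_base/$tree_arch/modules/all" ]; then
        printf '%s\n' "$tree_base/$tree_arch/modules/all"
    elif [ -d "$tree_base/generic/modules/all" ]; then
        printf '%s\n' "$tree_base/generic/modules/all"
    else
        # Neither tree exists: print nothing and signal failure.
        return 1
    fi
}
```

Usage: module use "$(select_module_tree "$HOME" "$CPU_ARCH")"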

Install common packages

See the List of supported software.

Some examples:

  • Atomic Simulation Environment (ASE):

    eb -S '^ASE'

You can do a dry-run overview (typically combined with --robot, in the form of -Dr) using one of these flags:

  • eb --dry-run: Print build overview incl. dependencies (full paths) (def False)
  • eb -D, --dry-run-short: Print build overview incl. dependencies (short paths) (def False)
  • eb -x, --extended-dry-run: Print build environment and (expected) build procedure that will be performed (def False)
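
As an illustration of the -Dr combination, the following sketch wraps the long-form flags in a function. The function name eb_preview is hypothetical:

```shell
# Hypothetical wrapper: preview what building an easyconfig would do,
# resolving dependencies (--robot) without building anything.
# Equivalent to the short form: eb <file> -Dr
eb_preview() {
    eb "$1" --dry-run-short --robot
}
```

For example: eb_preview foss-2016a.eb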

Notes:

  • The ASE module requires the openssl-devel and libibverbs-devel (Infiniband) RPMs (to be installed by the root user):

    root# yum install openssl-devel libibverbs-devel libX11-devel
  • If you build the Tk package, there is a TK_bug requiring you to preinstall the libX11-devel library:

    root# yum install libX11-devel

Uninstall a module

There is no automatic way to uninstall a module. Please see the discussion of Uninstall software. The reason is that if you remove a module, there is (currently) no way to find out whether other modules depend upon it.

The unsafe way to remove a module may be to locate the module file in your $MODULEPATH. Examine the module's root directory and remove the files belonging to the module. Finally remove the module file itself.
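
Before removing anything by hand it may help to list the candidate paths first. The sketch below only prints paths and deletes nothing. It assumes EasyBuild's default directory layout (software/<name>/<version> and modules/.../<name>/<version>) under the prefix; the function name is hypothetical:

```shell
# Hypothetical helper: list (do not delete) the paths that belong to one
# installed module, assuming EasyBuild's default directory layout.
# Arguments: prefix, software name, full version string.
list_module_paths() {
    lm_prefix=$1
    lm_name=$2
    lm_version=$3
    # Installation directory of the software itself.
    [ -d "$lm_prefix/software/$lm_name/$lm_version" ] &&
        printf '%s\n' "$lm_prefix/software/$lm_name/$lm_version"
    # Module files (with or without a .lua suffix), including symlinks
    # in the per-category module trees.
    find "$lm_prefix/modules" \( -type f -o -type l \) \
        \( -path "*/$lm_name/$lm_version" -o -path "*/$lm_name/$lm_version.lua" \) \
        2>/dev/null
    return 0
}
```

Review the printed list carefully, and check that no other module depends on this one, before removing any of the paths.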

Using toolchains

A specific package may (should) be based upon one of the standard toolchains. Here we discuss the ones of interest to us.

foss toolchain

The foss toolchain provides GCC, OpenMPI, OpenBLAS/LAPACK, ScaLAPACK(/BLACS), FFTW.

The foss toolchain was introduced in an effort to promote some toolchains as common toolchains, where the hope was that several sites would pick up these toolchains so we could benefit from each others efforts even more (the same was done with the intel toolchain which was a renaming of 'ictce'). We revisit these toolchains under the <year>(a|b) versioning scheme every 6 months. (Quote).

Search for available foss toolchains:

eb -S ^foss

To build one of the foss toolchains:

eb foss-2016a.eb -r

Intel compiler toolchains

Read Using Environment Modules with Intel Development Tools (refers to old Tcl modules).

Search for the Intel compiler suite toolchains:

eb -S '^intel'

The Intel® Parallel_Studio XE 2016 compiler (see Intel_Release_notes) installation tar-ball files must first be downloaded under your license with Intel. Download separately the tar-balls for ICC, IFORT and MKL.

As the modules user create source directories where EasyBuild is going to look:

mkdir -p $HOME/sources/icc $HOME/sources/ifort $HOME/sources/imkl

Then move the tar-balls into place:

mv parallel_studio_xe_2016_composer_edition_for_cpp_update3.tgz $HOME/sources/icc/
mv parallel_studio_xe_2016_composer_edition_for_fortran_update3.tgz $HOME/sources/ifort/
mv l_mkl_11.3.3.210.tgz $HOME/sources/imkl/

or alternatively make soft-links in these 3 directories pointing to where you put the tar-ball files, for example:

cd /home/modules/sources/i/icc
ln -s $HOME/Intel/parallel_studio_xe_2017_update4_composer_edition_for_cpp.tgz

The Intel compiler easyconfig files specify the location of the Intel license file as an absolute path (current directory or relative paths will not work):

license_file = HOME + '/licenses/intel/license.lic'

so it is recommended to use this convention and copy your site's private license file (for example, license.lic):

mkdir -p $HOME/licenses/intel
cp license.lic $HOME/licenses/intel/license.lic
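
Since relative paths will not work, a quick check before starting a long build may save time. This is a sketch only; the function name is hypothetical and the default path simply follows the convention above:

```shell
# Hypothetical pre-flight check: succeed only if the license file is given
# as an absolute path and is readable.
check_intel_license() {
    lic=${1:-$HOME/licenses/intel/license.lic}
    case $lic in
        /*) ;;    # absolute path: continue with the readability check
        *)  echo "not an absolute path: $lic" >&2; return 1 ;;
    esac
    [ -r "$lic" ] || { echo "cannot read: $lic" >&2; return 1; }
}
```

Usage: check_intel_license && eb ifort-xxx.eb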

If you use a FlexLM license manager server it is possible to use another approach (see https://lists.ugent.be/wws/arc/easybuild/2016-11/msg00008.html), for example:

env INTEL_LICENSE_FILE=28518@<license-server> eb ifort-xxx.eb

It is recommended that <license-server> is a CNAME address pointing in DNS to the real license server.

In summary:

  • Intel license_file must be specified. Select a long-term valid name.
  • Can be overridden by setting the environment variable INTEL_LICENSE_FILE before running eb.
  • Absolute path example: license_file = HOME + '/licenses/intel/license.lic'
  • License server example: license_file = port-number@license-server

If you have the full Intel® Parallel Studio XE 2016 Cluster Edition you can build the full Intel compiler toolchain (icc, ifort, impi, imkl):

eb intel-2016b.eb -r

To try out the latest easyconfig files, see Development_EB_files.

The current recommendation for Intel compilers may not be the latest 2017 version, but instead version 2016 update 1:

parallel_studio_xe_2016_composer_edition_for_cpp_update1.tgz
parallel_studio_xe_2016_composer_edition_for_fortran_update1.tgz

It turns out that 2016 updates 2 and 3, and perhaps also the 2017 versions (?) may cause code crashes, see https://lists.ugent.be/wws/arc/easybuild/2016-11/msg00004.html

Intel iomkl toolchain

We have built a version of the iomkl toolchain using modified EB files with these steps:

eb icc-2016.3.210-GCC-5.4.0-2.26.eb iccifort-2016.3.210-GCC-5.4.0-2.26.eb ifort-2016.3.210-GCC-5.4.0-2.26.eb -r   # Build compiler modules
eb OpenMPI-1.10.3-iccifort-2016.3.210-GCC-5.4.0-2.26.eb                                                           # Only for Slurm support
eb iompi-2016.09-GCC-5.4.0-2.26.eb imkl-11.3.3.210-iompi-2016.09-GCC-5.4.0-2.26.eb -r                             # Build OpenMPI and MKL modules
eb iomkl-2016.09-GCC-5.4.0-2.26.eb -r                                                                             # Build iomkl toolchain

Here we have configured our pre-existing GCC-5.4.0 compiler together with the Intel compilers, which requires tweaking some Development_EB_files.

SLURM support in OpenMPI requires adding this line to the EB file:

configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr '  # Support of Slurm
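
After the build, one way to check that Slurm support was compiled in is to look for Slurm components in the ompi_info output. This is a sketch under the assumption that the OpenMPI module is loaded and that Slurm support shows up as MCA components named slurm; the function name is hypothetical:

```shell
# Hypothetical check: succeed if ompi_info lists at least one MCA component
# named "slurm". Matching only MCA lines avoids a false positive from the
# configure command line that ompi_info may also print.
ompi_has_slurm() {
    ompi_info 2>/dev/null | grep -q 'MCA [a-z]*: slurm'
}
```

Usage: ompi_has_slurm && echo "OpenMPI has Slurm support"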

Toolchain step by step guide

The step_by_step guide will guide you through putting together a self-contained compiler toolchain, and using that toolchain to build a software package.

EasyBuild repositories

Third-party EasyBuild repositories:

Contributing back

If you develop easyconfig files you can contribute them back to the community, see https://github.com/hpcugent/easybuild/wiki/Contributing-back.

Submitting pull requests (--new-pr)

See http://easybuild.readthedocs.io/en/latest/Integration_with_GitHub.html#github-new-pr

In its simplest form, you just provide the location of the file(s) that you want to include in the pull request:

$ eb --new-pr test.eb

But first you need to set up GitHub integration!

Setting up GitHub integration

To use eb --new-pr you need to link EasyBuild with your GitHub account. You only need to do this once.

  1. Make an account on GitHub.
  2. Set the environment variable EASYBUILD_GITHUB_USER to your GitHub user name.
  3. On GitHub, go to Settings and select Personal access tokens. Press the Generate new token button. Give the token a name (e.g. EasyBuild). Select access to repo and gist. Then press the green Generate token button.
  4. Run the command:

    $ eb --install-github-token

    and paste in the token at the prompt (it is treated as a password, and not displayed).

  5. Check that everything works with:

    $ eb --check-github
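
The check in step 5 depends on the variable from step 2, so a small guard can catch a forgotten setting early. A sketch only; the function name is hypothetical:

```shell
# Hypothetical wrapper: refuse to run the GitHub check until the user name
# variable from step 2 is set.
eb_github_ready() {
    if [ -z "${EASYBUILD_GITHUB_USER:-}" ]; then
        echo "EASYBUILD_GITHUB_USER is not set" >&2
        return 1
    fi
    eb --check-github
}
```

The variable itself would typically be exported in the module builder's .bashrc file.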