Asap on Niflheim
Asap version 3.2.2 is installed on the "new" Niflheim nodes, i.e. the ones accessed from fjorm or thul.
To use Asap, you need to load the relevant modules in your .cshrc or .bashrc file:
module load NUMPY
module load ASAP
For more information about modules on Niflheim, read the description on the Niflheim wiki.
Please check that you can load Asap, and that you get the version you expect:
[thul] ~$python
Python 2.4.3 (#1, Jul 27 2009, 17:56:30)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import asap3
>>> asap3.print_version()
ASAP version 3.2.2 serial, compiled 2009-08-31 15:10 on thul.fysik.dtu.dk using 'icpc -O3 -g -xS -fPIC -vec_report0 -DSTACKTRACE=linux'
>>>
>>>
>>> asap3.print_version(1)
ASAP version 3.2.2 serial, compiled 2009-08-31 15:10 on thul.fysik.dtu.dk using 'icpc -O3 -g -xS -fPIC -vec_report0 -DSTACKTRACE=linux'
  Python module: /opt/campos-asap3/3.2.2/1.el5.fys.ifort.11.0.python2.4.openmpi.1.3.3/lib64/python2.4/site-packages/asap3/__init__.pyc
  C++ module: /opt/campos-asap3/3.2.2/1.el5.fys.ifort.11.0.python2.4.openmpi.1.3.3/lib64/python2.4/site-packages/asapserial3.so
  ase module: /opt/campos-ase3/126.96.36.1996/1.el5.fys.python2.4/lib64/python2.4/site-packages/ase/__init__.pyc
>>>
The second form also shows where Asap is installed.
Since the developers use their own installation, there is a risk that the default installation is outdated. If you discover that the installed version is ancient, complain to Jakob Schiøtz.
From ASAP 3.2.5 onwards, a special program, asap-qsub, is used to submit ASAP jobs to Niflheim. You specify PBS options on the command line or in comments beginning with #PBS, just as when submitting with the ordinary qsub command. The asap-qsub command detects whether you are submitting a serial or a parallel job, and submits a helper script that starts your simulation in the right way.
[thul] ~...Asap/Test$asap-qsub TestAll.py
JOB ID: 69751.ymer.fysik.dtu.dk
job 69751 requires 1 proc for 2:15:00
Earliest start in 00:00:00 on Sun Dec 6 19:49:27
Earliest completion in 2:15:00 on Sun Dec 6 22:04:27
Best Partition: DEFAULT
Note how asap-qsub prints an estimate of when the job will start (in this case it started immediately). In the rare situations where the queue is suspended, a cryptic error message will be printed instead, but the job will usually have been submitted to the queue.
The TestAll.py script begins like this:
"""
A simple test script.

Runs and times all scripts named ``*.py``. The tests that execute
the fastest will be run first.
"""

#PBS -m ae
#PBS -q medium
Note that after the (optional) docstring, two comments specify the mail options (ae = send mail if the job is Aborted and when it Ends) and the PBS queue. See the Batch page on the Niflheim wiki for more information.
Serial simulations are submitted as described above. Please submit serial scripts to the opteron nodes, as the 8-CPU Xeon nodes should preferably be left for jobs that use all eight CPUs.
Parallel simulations are submitted in exactly the same way. Your script will be started on all the requested CPUs, using the same version of MPI as the one used when compiling Asap. Just be sure to request the right number of processors using the -l nodes qsub option, either on the asap-qsub command line or as a #PBS comment in the Python script.
If you submit to the Intel Xeon processors, you should aim to use a number of processors divisible by eight, and specify ppn=8 to force the queueing system to allocate CPUs on the same physical machines. Similarly, if submitting to the AMD processors, aim for a number of CPUs divisible by four, and specify ppn=4.
The two node lines below both give you 12 processors, but the second version guarantees that they are located on only three physical machines. This gives better performance, not only because it minimizes network communication, but also because Asap runs faster if it is not competing for memory bandwidth with a GPAW or dacapo process. In addition, you block fewer nodes for other users.
#PBS -l nodes=12:opteron
#PBS -l nodes=3:opteron:ppn=4
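The arithmetic behind the second form can be sketched in a few lines of Python. This is only an illustration: the helper name node_request is hypothetical and is not part of Asap or asap-qsub. It divides the requested CPU count by the cores per machine and refuses counts that would leave a machine partly used.

```python
def node_request(total_cpus, cores_per_node, node_type):
    """Build a '#PBS -l nodes=...' line that fills whole machines.

    Hypothetical helper for illustration; not part of Asap or asap-qsub.
    """
    if total_cpus % cores_per_node != 0:
        raise ValueError("%d CPUs do not fill whole %d-core machines"
                         % (total_cpus, cores_per_node))
    machines = total_cpus // cores_per_node
    return "#PBS -l nodes=%d:%s:ppn=%d" % (machines, node_type, cores_per_node)


# 12 CPUs on 4-core opteron machines -> three full machines.
print(node_request(12, 4, "opteron"))  # #PBS -l nodes=3:opteron:ppn=4
```

Asking for, say, 10 CPUs with four cores per machine raises a ValueError, which mirrors the advice to pick a CPU count divisible by the number of cores per machine.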
If your script contains the AsapThreads() command, it will run multithreaded on the multi-CPU machines. Be sure to reserve all CPUs on the node, for example by specifying the nodes like this:
#PBS -l nodes=1:opteron:ppn=4
Important: Be sure to check that you get a performance increase that matches the increased use of computer resources! With Asap 3.2 you are unlikely to get acceptable performance using four or eight processors; the script may run better on only two. To do that, replace ppn=4 with ppn=2 and replace AsapThreads() with AsapThreads(2). You should still check that it runs close to twice as fast as on a single CPU!
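One way to make that check concrete is to compare wall-clock times for the same script run serially and multithreaded. The sketch below is an assumption-laden illustration: the function name speedup_report is hypothetical, and the timings are invented numbers, not measurements from Niflheim.

```python
def speedup_report(serial_seconds, threaded_seconds, n_threads):
    """Return (speedup, efficiency) for a multithreaded run.

    speedup    = serial time / threaded time
    efficiency = speedup / number of threads (1.0 is ideal scaling)
    """
    speedup = serial_seconds / float(threaded_seconds)
    efficiency = speedup / n_threads
    return speedup, efficiency


# Invented example timings: 1000 s serial, 540 s with AsapThreads(2).
speedup, efficiency = speedup_report(1000.0, 540.0, 2)
print("speedup %.2f, efficiency %.0f%%" % (speedup, 100 * efficiency))
# speedup 1.85, efficiency 93%
```

An efficiency well below 100% means the extra CPUs are mostly idle, and the job is probably better run with fewer threads.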
asap-qsub automatically detects the presence of the AsapThreads() command, and will only start a single Python process on each node (instead of one on each processor).
Combining multithreading and MPI parallelization is not supported by the current version of asap-niflheim (3.2.2), as the performance is significantly worse than for a normal parallel simulation. Do not use AsapThreads() in parallel simulations.
This will hopefully change in a future version of Asap.
asap-qsub examines the processor specification to detect the number of nodes and cores, and examines the script itself to detect whether AsapThreads is called. If it guesses wrongly, it can be overridden. The only known cases where asap-qsub is confused are when AsapThreads() is called in an imported module (it will not be detected), and when an AsapThreads() call is present in the script but never executed, perhaps because it is inside an if statement.
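Both failure modes follow naturally from inspecting the script text instead of running it. The sketch below is not the actual asap-qsub code; it is a minimal text scan, with the hypothetical names THREAD_CALL and looks_multithreaded, written only to show why a purely textual check behaves this way.

```python
import re

# Hypothetical pattern: a call to AsapThreads somewhere on a line of code.
THREAD_CALL = re.compile(r"\bAsapThreads\s*\(")


def looks_multithreaded(script_text):
    """Return True if the script text contains an AsapThreads(...) call.

    Naive on purpose: everything after '#' is treated as a comment,
    even if the '#' happens to sit inside a string literal.
    """
    for line in script_text.splitlines():
        code = line.split("#", 1)[0]
        if THREAD_CALL.search(code):
            return True
    return False


direct = "AsapThreads()\n"
guarded = "if False:\n    AsapThreads()\n"          # present, never executed
imported = "import mysetup  # calls AsapThreads()\n"  # call hidden in a module

print(looks_multithreaded(direct))    # True
print(looks_multithreaded(guarded))   # True  (false positive)
print(looks_multithreaded(imported))  # False (call not visible in this file)
```

The module name mysetup is made up for the example. A scan like this cannot see into imported modules and cannot tell whether a branch will run, which is why an explicit override is useful.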
The default asap-qsub behaviour can be changed by the option --ASAP=X where X is one or more of the following letters.
- Multi-threading forced off.
- Multi-threading forced on.
- Serial simulation (will be executed with python script.py)
- Parallel simulation (will be executed with mpirun asap-python script.py)
Instead of specifying these on the asap-qsub command line, they can also be specified in the script with an #ASAP command:
#PBS -m ae
#PBS -q medium

## Tell asap-qsub that there is per default no threading.
#ASAP N
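To see how such a comment differs from an ordinary one, the following sketch collects the letters from #ASAP lines in a script header. It is an illustration under assumptions, not asap-qsub's real parser; the function name asap_flags is hypothetical.

```python
def asap_flags(script_text):
    """Collect the option letters given on '#ASAP' comment lines.

    Hypothetical helper for illustration; not asap-qsub's real parser.
    """
    flags = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ASAP"):
            # Everything after the '#ASAP' keyword is taken as option letters.
            flags.extend(stripped[len("#ASAP"):].split())
    return "".join(flags)


header = """#PBS -m ae
#PBS -q medium

## Tell asap-qsub that there is per default no threading.
#ASAP N
"""
print(asap_flags(header))  # N
```

The #PBS lines and the ordinary ## comment are ignored; only the line starting with #ASAP contributes a letter.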