Submitting jobs on the DTU computers¶

Smaller calculations can be run in a Jupyter Notebook, but larger calculations require running on multiple CPU cores for an extended time. Such jobs should be submitted with the MyQueue tool. MyQueue is a unified frontend for a number of different queuing systems available on HPC installations. It supports submitting individual jobs as well as complete workflows.

Using MyQueue¶

The command mq acts as a front-end to the queue system. Usage:

mq submit -R CORES:TIME script

Submit a GPAW Python script via the configured queueing system.

positional arguments:

script:: Python script
argument:: Command-line argument for Python script.

selected optional arguments:

-h, --help: show help message and exit
-n NAME, --name NAME: Name used for task.
-R RESOURCES, --resources RESOURCES: Examples: “8:1h”, 8 cores for 1 hour. Use “m” for minutes, “h” for hours and “d” for days. “16:1:30m”: 16 cores, 1 process, half an hour.
-z, --dry-run: Show what will happen without doing anything.
-v, --verbose: More output.
-q, --quiet: Less output.

$ mq submit -R 8:4h script.py  # 8 cores, 4 hours
$ mq list
$ qstat hpc
...

The last command shows the user’s jobs in the hpc queue, which is the queue we use for the summer school. mq list and qstat hpc give some of the same information.

Choosing the number of processes¶

GPAW parallelizes most efficiently over k-points, so it is a good idea to make the number of processes a divisor of the number of irreducible k-points. If you have 12 irreducible k-points, the calculation parallelizes well on 2, 3, 4, 6 or 12 processes.

If you have very few irreducible k-points you may need to have more processes than k-points; in these cases GPAW choose other parallelization strategies. In this case, it is an advantage to make the number of processes a multiple of the number of irreducible k-points.

Dry run: Let GPAW help you choosing¶

If you run your script with the command:

$ gpaw python --dry-run=1 myscript.py

then your script will execute until the first GPAW calculation. That calculation will print information into the .txt file, and then stop. In the file, you can see the number of irreducible k-points and use it to select your parallelization strategy.

Once you have decided how many processes you want, run another dry-run to check how GPAW will parallelize:

$ gpaw python --dry-run=PROCESSES myscript.py

where PROCESSES is the number of processes you want to use. In this case, gpaw will print how it will parallelize the calculation when running for real.