Home storage vs Scratch storage

Niflheim provides two main types of user storage: home storage and scratch storage. These file systems are configured the same way in terms of access and day-to-day use. The key difference is backup.

Home storage (backed up)

Use your home area (for example /home/niflheim or /home/energy) for data that should be preserved and can be difficult or time-consuming to recreate.

Typical examples include:

  • Source code and scripts

  • Job submission files

  • Configuration files

  • Notes, documentation, and small analysis results

  • Curated datasets you intend to keep

  • Final figures and processed output

Because home storage is backed up, it is the right place for important files and long-term project data.

Scratch storage (not backed up)

Use scratch areas for temporary or high-volume data generated during active work, or for multi-node applications that require scratch files visible to all nodes. We provide scratch file spaces at:

/home/scratch3/$USER/         # CAMD, CatTheory, Energy groups
/home/scratch11/$USER/        # Construct/MEK group
/home/scratch12/$USER/        # Energy group

Typical uses of scratch include:

  • Running calculations and simulations

  • Intermediate files

  • Large temporary datasets

  • Output that still needs analysis

REMEMBER: Scratch storage is not backed up! Lost or deleted files cannot be recovered by any means!

Please remember to clean up scratch files regularly when they are no longer needed.
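As a sketch of such a cleanup (the helper name, scratch path, and 90-day cutoff are illustrative assumptions, not site policy), a dry run with find could look like:

```shell
#!/bin/bash
# Hypothetical cleanup helper: print regular files under a folder that
# have not been modified within the given number of days (dry run only).
list_stale() {
    find "$1" -type f -mtime +"$2" -print
}

# Review the candidates first, e.g.:
#   list_stale "/home/scratch3/$USER" 90
# and only delete once you are sure, e.g. by replacing -print with -delete.
```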

Local disk on compute nodes (/tmp)

Each compute node also provides local temporary disk space mounted at /tmp. Please configure job scripts to use the compute node's /tmp folder for any temporary files created during the job.

This storage is intended for temporary files used during a running job, for example:

  • Fast temporary read/write data

  • Application scratch files

  • Temporary intermediate files needed only while the job runs

Files in /tmp are local to the compute node, are not shared across nodes, and should be considered temporary. Important data should be copied elsewhere before the job finishes. For multi-node jobs, /tmp is therefore only usable if the software can handle node-local scratch folders.

Many codes use the local disk when the TMPDIR environment variable is set, for example via a job script command:

export TMPDIR=/tmp

This $TMPDIR setting is the default in many computer codes and may not need to be set explicitly; check the software documentation. See Using /tmp as calculation directory for examples of how to use /tmp actively.

Notes:

  • On the login nodes you must not use /tmp for large files! Please use the local /scratch/$USER folder instead.

Technical details:

  • Each Slurm job automatically allocates a temporary /tmp disk space which is private to the job in question.

  • This temporary disk space lives only for the duration of the Slurm job, and is automatically deleted when the job terminates.

  • This temporary disk space is actually allocated on the compute node’s local /scratch disk, the size of which is specified under the Compute node partitions section.

  • The local node scratch disk space is shared between all Slurm jobs currently running on the node.

  • The xeon32_4096 nodes are equipped with a very large (14 TB) and very fast scratch file system. Large scratch spaces are typically required by big-memory jobs.
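To see how much node-local scratch space is actually available on the node a job landed on, a quick df check inside the job script is enough (a sketch; the exact figures depend on the partition):

```shell
#!/bin/bash
# Show size, usage, and free space of the file system backing /tmp
df -h /tmp
```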

Examples

Use a folder under /home/scratchX for calculation output

Use a mirrored folder under /home/scratchX/username for calculation outputs with the same relative path as under /home/energy/username, e.g. /home/energy/username/calc/folder turns into /home/scratchX/username/calc/folder.

#!/bin/bash

# Slurm instructions
#SBATCH ...

# The scratch base folder
SCRATCHDIR='/home/scratch12/<USER>'

current="$PWD"
home="$HOME"

# Remove $HOME prefix
relative="${current#$home/}"

export CALCDIR="$SCRATCHDIR/$relative"
mkdir -p "$CALCDIR"

# Copy input files to calculation folder
cp "${SLURM_SUBMIT_DIR}"/{INCAR,POSCAR,POTCAR,KPOINTS} "$CALCDIR/"
cd "$CALCDIR"

# Load modules
module purge
module load VASP/6.6.0-intel-2025b

# Run the calculation
srun vasp_std > vasp.out
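The relative="${current#$home/}" line above relies on bash prefix removal. A standalone demo of just that expansion (the example paths are made up):

```shell
#!/bin/bash
# "${var#pattern}" strips the shortest matching prefix from $var
current="/home/energy/alice/calc/folder"
home="/home/energy/alice"
relative="${current#$home/}"
echo "$relative"   # prints: calc/folder
```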

Using /tmp as calculation directory

The compute nodes’ local disk is by far the fastest option for temporary or output files.

#!/bin/bash

# Slurm instructions
#SBATCH ...

# Set calc dir
export TMPDIR='/tmp'

cp ${SLURM_SUBMIT_DIR}/{INCAR,POSCAR,POTCAR,KPOINTS} $TMPDIR/
cd $TMPDIR

module ...

srun vasp_std > vasp.out

Using ASE

With ASE the submit script differs slightly.

#!/bin/bash

# Slurm instructions
#SBATCH ...

module ...
module load ASE/3.28.0-iimkl-2025b

python3 <script.py>

In the Python script you need to tell the calculator where to put temporary files or save calculation results; the exact method may differ for other calculators.

from pathlib import Path

# Using a folder on a scratch server
SCRATCH_FOLDER = Path('/home/scratchX/<USER>/')
relative = Path.cwd().relative_to(Path.home())
calc_dir = SCRATCH_FOLDER / relative

# Alternatively, use node-local /tmp (keep it a Path so it can be joined below)
# calc_dir = Path('/tmp')

# Set output dir in VASP calculators
calc = Vasp(atoms=...,
            directory=calc_dir,
            ...)

# Saving output with GPAW
calc.attach(calc.write, 5, calc_dir / 'xyz.gpw')

Copying results from /tmp to home folder

Since the contents of /tmp are deleted when the job ends, relevant output should be copied back to a user folder before the job terminates.

#!/bin/bash

# Slurm instructions
#SBATCH ...

# Set calc dir
export TMPDIR='/tmp'

# Copy input files to calculation folder
cp ${SLURM_SUBMIT_DIR}/{INCAR,POSCAR,POTCAR,KPOINTS} $TMPDIR/
cd $TMPDIR

module ...

function term_handler() {
    for f in OUTCAR vasp.out <other-important-files>; do
        cp "$TMPDIR/$f" "$SLURM_SUBMIT_DIR/"
    done
}

trap term_handler TERM

srun vasp_std > vasp.out
# or for a python script
# python3 <script.py>

Copying files back at specific time

Note that Slurm only gives the TERM handler 30 seconds to finish its tasks, e.g. copying back files. If this is too little time for the file transfer to finish, you can request a signal at a specific time before the walltime ends and copy back then.

#!/bin/bash

# Slurm instructions
#SBATCH ...
#SBATCH --signal=B:USR1@125  # Signalling USR1 125 seconds before the walltime ends

function term_handler() ...

trap term_handler USR1

# We need to do the calculation in a background process and wait
# while it runs
srun vasp_std > vasp.out &
# or for a python script
# python3 <script.py> &
PID=$!

while kill -0 "$PID" 2>/dev/null; do
    wait "$PID"
done

If we only had a bare wait and nothing else, the wait command could be interrupted by the signal, after which the script would move on to the next line. The while loop actively checks whether the process is still running and keeps waiting for it. From the bash manual (https://www.gnu.org/software/bash/manual/html_node/Signals.html):

If Bash is waiting for a command to complete and receives a signal for which a trap has been set, it will not execute the trap until the command completes. If Bash is waiting for an asynchronous command via the wait builtin, and it receives a signal for which a trap has been set, the wait builtin will return immediately with an exit status greater than 128, immediately after which the shell executes the trap.
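The interaction described above can be demonstrated outside Slurm. In this self-contained sketch, sleep stands in for srun and a background kill mimics the --signal option; the 1- and 2-second timings are arbitrary:

```shell
#!/bin/bash
# Demonstrate the trap + background process + wait loop pattern
flag="not signalled"
term_handler() { flag="signalled"; }
trap term_handler USR1

sleep 2 &        # the "calculation"
PID=$!

# Mimic Slurm's --signal option: send USR1 to this shell after 1 second
( sleep 1; kill -USR1 $$ ) &

while kill -0 "$PID" 2>/dev/null; do
    wait "$PID"
done

echo "$flag"     # the handler ran, yet we still waited for the full job
```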