=================================
Home storage vs Scratch storage
=================================

Niflheim provides two main types of user storage: **home storage** and
**scratch storage**.  These file systems are configured the same way in terms
of access and day-to-day use.  The key difference is **backup**.

Home storage (backed up)
========================

Use your home area (for example ``/home/niflheim`` or ``/home/energy``) for
data that should be preserved and can be difficult or time-consuming to
recreate.  Typical examples include:

* Source code and scripts
* Job submission files
* Configuration files
* Notes, documentation, and small analysis results
* Curated datasets you intend to keep
* Final figures and processed output

Because home storage is backed up, it is the right place for important files
and long-term project data.

Scratch storage (not backed up)
===============================

Use scratch areas for temporary or high-volume data generated during active
work, or for multi-node applications which require the use of scratch files.
We provide scratch file spaces at::

   /home/scratch3/$USER/    # CAMD, CatTheory, Energy groups
   /home/scratch11/$USER/   # Construct/MEK group
   /home/scratch12/$USER/   # Energy group

Typical uses of scratch include:

* Running calculations and simulations
* Intermediate files
* Large temporary datasets
* Output that still needs analysis

**REMEMBER:** There is **no backup** of scratch files!  Lost files cannot be
recovered by any means!  Please remember to clean up scratch files regularly
when they are no longer needed.

.. _compute_node_tmp:

Local disk on compute nodes (``/tmp``)
======================================

Each compute node also provides local temporary disk space mounted at
``/tmp``, and users are kindly requested to configure job scripts to use the
compute node's ``/tmp`` folder for any temporary files in the job.

This storage is intended for temporary files used during a running job, for
example:

* Fast temporary read/write data
* Application scratch files
* Temporary intermediate files needed only while the job runs

Files in ``/tmp`` are local to the compute node, are not shared across nodes,
and should be considered temporary.  Important data should be copied
elsewhere before the job finishes.  The use of ``/tmp`` is thus only
applicable to multi-node jobs if the software is able to handle node-local
scratch folders.

Many codes use the local disk when an environment variable is set via a job
script command::

   export TMPDIR=/tmp

This ``$TMPDIR`` setting is the default value in many computer codes and may
not need to be set explicitly; check the software documentation.

See :ref:`tmp_example` for examples of how to use ``/tmp`` actively.

Notes:

* On the **login nodes** you **must not** use ``/tmp`` for large files!
  Please use the local ``/scratch/$USER`` folder instead.

Technical details:

* Each Slurm job automatically allocates a **temporary** ``/tmp`` disk space
  which is private to the job in question.
* This temporary disk space lives only for the duration of the Slurm job and
  is automatically deleted when the job terminates.
* This temporary disk space is actually allocated on the compute node's local
  ``/scratch`` disk, the size of which is specified in the
  :ref:`compute_node_partitions` section.
* The **local node scratch disk space** is **shared** between all Slurm jobs
  currently running on the node.
* The ``xeon32_4096`` nodes are equipped with a very large (14 TB) and very
  fast scratch file system.  Large scratch spaces are typically required by
  big-memory jobs.
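
To see where a job's private ``/tmp`` lives and how much space is available,
a short sanity check can be added to a job script.  This is a minimal sketch;
the reported size depends on the node's local ``/scratch`` disk and on the
other jobs sharing it:

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...

   # Report the node and the free space behind the job-private /tmp
   echo "Job $SLURM_JOB_ID running on $(hostname)"
   df -h /tmp
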
Recommended workflow
====================

#. Run jobs and store temporary output in scratch storage.
#. Use ``/tmp`` for node-local temporary files during execution when
   appropriate.
#. Review and analyze the results.
#. Keep only the valuable or final data.
#. Move curated results to your home or project storage for long-term
   retention.

Examples
========

Use a folder under ``/home/scratchX`` for calculation output
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Use a mirrored folder under ``/home/scratchX/username`` for calculation
outputs, with the same relative path as under ``/home/energy/username``;
e.g. ``/home/energy/username/calc/folder`` turns into
``/home/scratchX/username/calc/folder``.

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...

   # The scratch base folder
   SCRATCHDIR='/home/scratch12/'

   current="$PWD"
   home="$HOME"
   # Remove the $HOME prefix to get the relative path
   relative="${current#$home/}"
   export CALCDIR="$SCRATCHDIR/$relative"
   mkdir -p "$CALCDIR"

   # Copy input files to the calculation folder
   cp ${SLURM_SUBMIT_DIR}/{INCAR,POSCAR,POTCAR,KPOINTS} "$CALCDIR"/
   cd "$CALCDIR"

   # Load modules
   module purge
   module load VASP/6.6.0-intel-2025b

   # Run the calculation
   srun vasp_std > vasp.out

.. _tmp_example:

Using /tmp as calculation directory
+++++++++++++++++++++++++++++++++++

The compute nodes' local disk is by far the fastest option for temporary or
output files.

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...

   # Set the calculation directory
   export TMPDIR='/tmp'

   # Copy input files to the calculation folder
   cp ${SLURM_SUBMIT_DIR}/{INCAR,POSCAR,POTCAR,KPOINTS} $TMPDIR/
   cd $TMPDIR

   module ...

   srun vasp_std > vasp.out

Using ASE
+++++++++

With ASE the submit script differs slightly.

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...

   module ...
   module load ASE/3.28.0-iimkl-2025b

   python3

In the Python script you need to tell the calculator where to put temporary
files or save calculation results; the exact method may differ for other
calculators.

.. code-block:: python

   from pathlib import Path

   from ase.calculators.vasp import Vasp

   # Using a folder on a scratch server
   SCRATCH_FOLDER = Path('/home/scratchX/')
   relative = Path.cwd().relative_to(Path.home())
   calc_dir = SCRATCH_FOLDER / relative

   # Using /tmp
   calc_dir = Path('/tmp')

   # Set the output directory in VASP calculators
   calc = Vasp(atoms=..., directory=calc_dir, ...)

   # Saving output with GPAW every 5 iterations
   calc.attach(calc.write, 5, calc_dir / 'xyz.gpw')

Copying results from /tmp to home folder
++++++++++++++++++++++++++++++++++++++++

Since the contents of ``/tmp`` are deleted after the job ends, relevant
output should be copied to a user folder when the job ends.

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...

   # Set the calculation directory
   export TMPDIR='/tmp'

   # Copy input files to the calculation folder
   cp ${SLURM_SUBMIT_DIR}/{INCAR,POSCAR,POTCAR,KPOINTS} $TMPDIR/
   cd $TMPDIR

   module ...

   # Copy relevant output back when the job is terminated
   function term_handler() {
       for f in OUTCAR vasp.out ; do
           cp $TMPDIR/$f $SLURM_SUBMIT_DIR/
       done
   }
   trap term_handler TERM

   srun vasp_std > vasp.out
   # or for a python script
   # python3
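
If the job produces many output files, listing them one by one in the handler
becomes error-prone.  A variant of the handler can copy the whole calculation
directory back instead; this is a minimal sketch, assuming ``rsync`` is
available on the compute nodes, and the ``--exclude`` pattern is only an
illustration:

.. code-block:: bash

   # Copy everything in $TMPDIR back, skipping bulky inputs such as POTCAR
   function term_handler() {
       rsync -a --exclude='POTCAR' "$TMPDIR"/ "$SLURM_SUBMIT_DIR"/
   }
   trap term_handler TERM
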
Copying files back at a specific time
+++++++++++++++++++++++++++++++++++++

Note that Slurm only gives the TERM handler 30 seconds to finish its tasks,
e.g. copying back files.  If this is too little for the file transfer to
finish, you can ask Slurm to send a signal at a specific time before the
walltime ends and copy back then.

.. code-block:: bash

   #!/bin/bash
   # Slurm instructions
   #SBATCH ...
   #SBATCH --signal=B:USR1@125   # Send USR1 125 seconds before the walltime ends

   function term_handler() ...
   trap term_handler USR1

   # We need to do the calculation in a background process and wait
   # while it runs
   srun vasp_std > vasp.out &
   # or for a python script
   # python3 &
   PID=$!
   while kill -0 "$PID" 2>/dev/null; do
       wait "$PID"
   done

If we only had ``wait`` and nothing else, the bash ``wait`` command could get
interrupted by the signal, and the script would then move on to the next
line.  The ``while`` loop actively checks whether the process is still
running and keeps waiting for it.

From the bash manual
(https://www.gnu.org/software/bash/manual/html_node/Signals.html):

   If Bash is waiting for a command to complete and receives a signal for
   which a trap has been set, it will not execute the trap until the command
   completes.  If Bash is waiting for an asynchronous command via the wait
   builtin, and it receives a signal for which a trap has been set, the wait
   builtin will return immediately with an exit status greater than 128,
   immediately after which the shell executes the trap.
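
The interplay between ``trap``, background processes, and ``wait`` can be
tested without Slurm.  The following stand-alone sketch uses ``sleep`` in
place of ``srun``; run it locally and send it ``USR1`` from another shell to
see the handler fire while the loop keeps waiting:

.. code-block:: bash

   #!/bin/bash
   # Stand-alone demo of the trap-plus-wait pattern (no Slurm required).
   # Run it, then from another shell:  kill -USR1 <pid-of-this-script>

   function usr1_handler() {
       echo "USR1 received: files would be copied back here"
   }
   trap usr1_handler USR1

   sleep 60 &      # stands in for 'srun vasp_std > vasp.out &'
   PID=$!

   # wait returns with status > 128 when interrupted by a trapped signal,
   # so loop until the background process has really finished
   while kill -0 "$PID" 2>/dev/null; do
       wait "$PID"
   done
   echo "Background process finished"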