.. _tutorial:

===================
Basic Task Creation
===================

Initializing the repository
---------------------------

Make a new directory and initialize the workflow repository there::

    > mkdir htw-workflow-bs
    > cd htw-workflow-bs
    > htw-util init
    Created repository at /home/user/htw-workflow-bs

This creates a directory called `tree` and a file called `tasks.py`.
To get more information about the repository layout, use `htw-util info`::

    user@computer:~/htw-workflow-bs$ htw-util info
    Root: /home/user/htw-workflow-bs
    Tree: /home/user/htw-workflow-bs/tree
    Registry: /home/user/htw-workflow-bs/registry.dat with 0 entries
    Tasks: /home/user/htw-workflow-bs/tasks.py

We will create a workflow that first relaxes the unit cell and geometry to
obtain a relaxed ground state, and then performs a band structure calculation
for the system. Create a workflow file called `workflow.py`::

    from ase.build import bulk


    def workflow(rn):
        atoms = bulk('Si')
        calculator = {'kpts': {'density': 1.0},
                      'mode': 'pw',
                      'txt': 'gpaw.txt'}
        rn.task('relax', atoms=atoms, calculator=calculator)

This defines a task called `relax` with `atoms` and `calculator` given as
parameters. The name `relax` is looked up among the tasks defined in the file
`tasks.py`. The string naming the task also accepts import paths, for
example::

    rn.task('asr.c2db.relax', atoms=atoms, calculator=calculator)

Without a module path, tasks are looked up in the `tasks.py` file listed by
the `htw-util info` command. This allows users to define their own tasks
without making them part of the asr source code. [We envision that this will
be possible via a project-wide lib package in the future.]

The next step is to write an actual task.
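Before writing the task, the lookup rule described above can be illustrated
with a small sketch. This is not htw-util's actual implementation: the
function `resolve_task` and the stand-in `tasks` namespace are hypothetical
names introduced here. The only assumption is the rule stated above, namely
that a plain name is looked up in `tasks.py` while a dotted name is treated as
an import path.

```python
import importlib
import os.path
from types import SimpleNamespace


def resolve_task(name, tasks_module):
    """Return the callable for a task name (hypothetical helper)."""
    if '.' in name:
        # Dotted name: split into module path and attribute, then import.
        module_name, _, attr = name.rpartition('.')
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    # Plain name: look it up among the user's own tasks.
    return getattr(tasks_module, name)


# Stand-in for the user's tasks.py module.
def relax(atoms, calculator):
    return atoms


tasks = SimpleNamespace(relax=relax)

plain = resolve_task('relax', tasks)          # found in the stand-in tasks
dotted = resolve_task('os.path.join', tasks)  # found via import path
```

Here `os.path.join` merely plays the role of an importable function such as
`asr.c2db.relax`, so that the example runs without extra packages.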
Edit the file `tasks.py` to have the following content, which relaxes the cell
and geometry in a GPAW calculation::

    from gpaw import GPAW
    from ase.optimize import BFGS
    from ase.constraints import ExpCellFilter


    def relax(atoms, calculator):
        calc = GPAW(**calculator)
        atoms.calc = calc
        with BFGS(ExpCellFilter(atoms),
                  logfile='opt.log',
                  trajectory='opt.traj') as opt:
            opt.run(fmax=0.01)
        # Return a copy without the calculator attached.
        return atoms.copy()

Although this shows a typical relaxation recipe written with GPAW, the task
can in principle use any code. Note that a copy of the atoms must be returned,
because atoms with an attached calculator cannot be returned (the calculator
cannot be serialized).

Each subcommand of `htw-util` has a `--help` option. To get a list of
commands, write `htw-util --help`. For example, to get information about the
command `ls`, write::

    user@computer:~/htw-workflow-bs$ htw-util ls --help
    Usage: htw-util ls [OPTIONS] [TREE]...

      List tasks under directory TREEs.

      Find tasks inside specified TREEs and collect their dependencies whether
      inside TREE or not.  Then perform the specified actions on those tasks
      and their dependencies.

    Options:
      --parents  list also ancestors of selected tasks outside selection;
                 output will be in topological order.
      --help     Show this message and exit.

Now it is time to submit a workflow. To do that, we write::

    user@computer:~/htw-workflow-bs$ htw-util workflow workflow.py
    Add: relax ready 7fabeab7 tree/relax-tf7xlviq

    user@computer:~/htw-workflow-bs$ htw-util ls
    relax ready 7fabeab7 tree/relax-tf7xlviq

Note that the task is shown as *ready*, which signifies that it is ready to be
run, i.e. it has not yet been submitted.

The commands have created the following file structure. A useful Linux command
for viewing the structure of the calculations in the physical filesystem is
`tree`::

    user@computer:~/htw-workflow-bs$ tree tree
    tree
    └── relax-tf7xlviq
        └── input.json

    1 directory, 1 file

To run the workflow, i.e. to relax the geometry, it is suggested to first do a
dry run, which shows what would be submitted. To that end, there is the `-z`
option. Run::

    user@computer:~/htw-workflow-bs$ htw-util run tree/relax-tf7xlviq/ -z
    would run

To actually submit the task, execute the following command::

    user@computer:~/htw-workflow-bs$ htw-util run tree/relax-tf7xlviq/

We can now observe that the calculation is running (XXX)::

    user@computer:~/htw-workflow-bs$ htw-util ls
    relax xxx 7fabeab7 tree/relax-tf7xlviq

After the task has finished, we can observe that it is set to state *done*::

    user@computer:~/htw-workflow-bs$ htw-util ls
    relax done 7fabeab7 tree/relax-tf7xlviq

The task folder now contains a number of additional files related to the
execution of the task. Some of them are internal to the task and not directly
related to htw-util: in this case the GPAW output file, the optimizer log, and
the optimization trajectory. The files related to htw-util are `input.json`
and `out.json`, which are the formal input and return value of the task that
was executed::

    user@computer:~/htw-workflow-bs$ tree tree/
    tree/
    └── relax-tf7xlviq
        ├── gpaw.txt
        ├── input.json
        ├── opt.log
        ├── opt.traj
        └── out.json

    1 directory, 5 files

The next step is to do a ground state calculation based on the relaxed
geometry. To do this, add the following line to `workflow.py`::

    rn.task('gs', atoms=relax.output, calculator=calculator)

Here `relax` is the object returned by `rn.task('relax', ...)`, so the result
of that earlier call must be assigned to a variable named `relax`. The new
line tells the runner to define a task that receives the output of the `relax`
task under the name `atoms`. We also need to define the corresponding task in
`tasks.py`::

    from pathlib import Path


    def gs(atoms, calculator):
        calc = GPAW(**calculator)
        atoms.calc = calc
        atoms.get_potential_energy()
        gpw = Path('gs.gpw')
        calc.write(gpw)
        return gpw

The task returns a `Path` object pointing to a file inside the directory where
the task runs. When other tasks run, they see such paths relative to the
directory of the task that produced them. This makes it possible to pass paths
from one task to another.
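The idea of handing a relative path from one task to another can be sketched
in a few lines. The helper `resolve_output` below is hypothetical and only
illustrates the principle; it is not taken from htw-util, and the directory
name is just an example of a task directory under `tree`.

```python
from pathlib import Path


def resolve_output(task_dir, output):
    """Hypothetical helper: locate a path returned by another task."""
    if isinstance(output, Path) and not output.is_absolute():
        # A relative path is interpreted inside the producing task's folder.
        return Path(task_dir) / output
    # Anything else (numbers, absolute paths, ...) is passed through as is.
    return output


gpw = resolve_output('tree/gs-gebijtk9', Path('gs.gpw'))
print(gpw.as_posix())  # tree/gs-gebijtk9/gs.gpw
```

A consuming task can then open the file without knowing where the producing
task was executed.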
If we run the workflow again, we get the following tree::

    user@computer:~/htw-workflow-bs$ tree tree
    tree
    ├── gs-gebijtk9
    │   └── input.json
    └── relax-tf7xlviq
        ├── gpaw.txt
        ├── input.json
        ├── opt.log
        ├── opt.traj
        └── out.json

    2 directories, 6 files

Finally, let us add a band structure task and run it::

    def bs(gpw):
        gscalc = GPAW(gpw)
        atoms = gscalc.get_atoms()
        path = atoms.cell.bandpath(density=5)
        # Non-self-consistent calculation along the band path.
        calc = gscalc.fixed_density(kpts=path.kpts,
                                    symmetry='off',
                                    txt='gpaw.txt')
        atoms.calc = calc
        bs = calc.band_structure()
        bs.write('bs.json')
        return bs

This requires the following line in the workflow::

    rn.task('bs', gpw=gs.output)

where `gs` is the object returned by `rn.task('gs', ...)`. Now execute the
workflow and run the tasks. The band structure can be found in `bs.json`
afterwards.
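To see how the three `rn.task` calls chain together, the sketch below replays
them against a stand-in runner. `FakeRunner` and `FakeTask` are hypothetical
classes written for this illustration and do not reflect how htw-util is
implemented. The atoms argument is replaced by a placeholder string so that
the example runs without ASE or GPAW.

```python
class FakeTask:
    """Hypothetical stand-in for the object returned by rn.task()."""

    def __init__(self, name, inputs):
        self.name = name
        self.inputs = inputs
        # Placeholder token representing the task's future return value.
        self.output = f'<output of {name}>'


class FakeRunner:
    """Hypothetical stand-in for the runner passed to workflow()."""

    def __init__(self):
        self.tasks = []

    def task(self, name, **inputs):
        task = FakeTask(name, inputs)
        self.tasks.append(task)
        return task


def workflow(rn, atoms='Si (placeholder for the Atoms object)'):
    calculator = {'kpts': {'density': 1.0}, 'mode': 'pw', 'txt': 'gpaw.txt'}
    relax = rn.task('relax', atoms=atoms, calculator=calculator)
    gs = rn.task('gs', atoms=relax.output, calculator=calculator)
    rn.task('bs', gpw=gs.output)


rn = FakeRunner()
workflow(rn)
names = [task.name for task in rn.tasks]
for task in rn.tasks:
    # Show which inputs each task receives.
    print(task.name, sorted(task.inputs))
```

Each task receives the `output` of the previous one, which is what lets the
runner work out that `relax` must finish before `gs`, and `gs` before `bs`.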