=================================================
Practical Tutorial: Building a Database of Solids
=================================================

In this tutorial, you are a developer of ASR and wish to produce
production-scale recipes to run on a database.

Creating the input structures
-----------------------------

Create a file called makedb.py with the following contents:

.. literalinclude:: ../../bs-workflow-big/makedb.py

The first step is to create the input database::

    python makedb.py

Building the repository
-----------------------

The second step is to set up a repository from the input database, i.e.,
to import these structures into the workflow. The first line below sets up
a repository, and the second uses an ASR tool called totree::

    htw-util init
    asr database totree materials.db --run

The totree command organizes the database rows into a directory tree and
saves structure files. Each structure becomes a task, and we next want to
apply the previous workflow to each of those tasks. Thus, whereas the
workflow previously did not take an input, it will now take a structure
task as an input.

`htw-util ls` will show that many tasks called `structure` now exist.
They are not so much tasks as pieces of data.

Setting up tasks
----------------

We use the same tasks as in the previous example:

.. literalinclude:: ../../bs-workflow-big/tasks.py

Creating a workflow
-------------------

We adapt the workflow from the previous example, except that it now
depends on a parameter with the same name as our “root” node, i.e.,
`structure`. When the workflow runs, the variable `structure` will be a
future corresponding to the output of the `structure` tasks.
The workflow will look like this:

.. literalinclude:: ../../bs-workflow-big/workflow.py

Since the workflow expects an input, we specify the folders in the tree
on which to run it::

    htw-util workflow workflow.py tree/A/Au

This applies the workflow to all tasks matching the name `structure`
under the specified path or paths.
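
The path-based selection described above can be pictured with a small,
self-contained sketch. It is not how `htw-util` is implemented; the
directory layout (one `structure` directory somewhere below each material
folder) and the helper name ``find_structure_dirs`` are assumptions made
purely for illustration.

.. code-block:: python

    import tempfile
    from pathlib import Path


    def find_structure_dirs(*paths):
        """Return sorted directories named 'structure' at or below the paths.

        Hypothetical helper: it only imitates the idea of selecting every
        `structure` node under the paths given on the command line.
        """
        found = set()
        for path in map(Path, paths):
            if path.name == 'structure' and path.is_dir():
                found.add(path)
            found.update(p for p in path.rglob('structure') if p.is_dir())
        return sorted(found)


    # Build a toy tree; this layout is an assumption, not the real one.
    root = Path(tempfile.mkdtemp()) / 'tree'
    for sub in ['A/Au', 'A/Ag', 'AB/NaCl']:
        (root / sub / 'structure').mkdir(parents=True)

    # Selecting one material folder versus the whole tree.
    one_material = find_structure_dirs(root / 'A' / 'Au')
    everything = find_structure_dirs(root)

    print([p.relative_to(root).as_posix() for p in one_material])
    print(len(everything))

In this toy tree, a narrow path selects a single `structure` node, while
passing the root selects all of them, mirroring the difference between
`tree/A/Au` and `tree/`.
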
The generated tasks can now be viewed and submitted as normal.
To run the workflow on *all* materials, pass the whole tree
(e.g. `tree/`) to the workflow command.

Submit the workflow
-------------------

Set up myqueue and use `htw-util submit` to submit the tasks, or use
`htw-util` to run them on the local machine.

Something to think about
========================

* If a workflow script is updated but the workflow is not run again,
  the tasks won't be updated, and there could be errors.

* If a workflow is rerun many times with different parameters, for
  example while developing the workflow, the tree directory gets one
  folder for each iteration. For production use this is not an issue,
  because the workflows are then fixed. During development, however,
  the tree directory becomes cluttered with cached results from old,
  invalidated runs.
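
One way to guard against the first point is to keep your own record of
which version of the workflow script was last applied. The sketch below is
not part of `htw-util` or ASR; it assumes you store a hash of workflow.py
in a stamp file yourself, and the names ``workflow_is_stale``,
``record_workflow``, and ``workflow.sha256`` are hypothetical.

.. code-block:: python

    import hashlib
    import tempfile
    from pathlib import Path


    def file_digest(path):
        """Return the SHA-256 hex digest of a file's contents."""
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()


    def workflow_is_stale(script, stamp):
        """True if `script` changed since its digest was written to `stamp`."""
        stamp = Path(stamp)
        return not stamp.exists() or stamp.read_text() != file_digest(script)


    def record_workflow(script, stamp):
        """Store the current digest; call right after applying the workflow."""
        Path(stamp).write_text(file_digest(script))


    # Demonstration in a scratch directory with a stand-in workflow script.
    workdir = Path(tempfile.mkdtemp())
    script = workdir / 'workflow.py'
    stamp = workdir / 'workflow.sha256'

    script.write_text('# version 1\n')
    record_workflow(script, stamp)
    fresh = workflow_is_stale(script, stamp)   # script unchanged since record

    script.write_text('# version 2\n')
    stale = workflow_is_stale(script, stamp)   # script edited after record

    print(fresh, stale)

A check like this could be run before submitting, as a reminder to apply
the workflow again whenever the script has changed.
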