Genetic algorithm search for bulk crystal structures

Here we will use the GA to predict a crystal structure of a given chemical composition.

The implementation is based on:

M. Van den Bossche, H. Grönbeck, and B. Hammer
J. Chem. Theory Comput. 2018, 14, 2797−2807

and has much the same functionality as e.g. the USPEX program from the Oganov group and the XtalOpt code by the Zurek group.

What sets crystal structure searches apart from other global optimization problems, is that typically the cell vectors need to be treated as additional degrees of freedom (in addition to the atomic coordinates). This results in the following ‘bulk GA’-specific functionality:

  • Generating random structures now also involves randomized unit cell vectors.

  • Cut-and-splice pairing needs to be done using scaled coordinates and includes a random combination of the two parent unit cells.

  • A ‘strain’ mutation applies a random deformation of the parent unit cell, similar to how the ‘rattle’ mutation acts on the atomic coordinates.

  • In a ‘permustrain’ mutation, atoms from different elements are swapped in addition to the ‘strain’ mutation.

  • A ‘soft’ mutation displaces the atoms along a low-frequency vibrational mode obtained via e.g. a simple harmonic bond model.

As an example, we will search for the most energetically stable Ag24 polymorphs using an EMT potential. Note that, in general, the number of atoms per unit cell should be chosen carefully, as one will only find crystal structures where the stoichiometry of the primitive cell is a divisor of the chosen stoichiometry.

Initial population

The following script ( creates a gadb.db database containing 20 randomly generated initial structures.

from ase import Atoms
from import atomic_numbers
from import PrepareDB
from import StartGenerator
from import CellBounds, closest_distances_generator

# Number of random initial structures to generate
N = 20

# Target cell volume for the initial structures, in angstrom^3
volume = 240.

# Specify the 'building blocks' from which the initial structures
# will be constructed. Here we take single Ag atoms as building
# blocks, 24 in total.
blocks = [('Ag', 24)]
# We may also write:
blocks = ['Ag'] * 24

# Generate a dictionary with the closest allowed interatomic distances
Z = atomic_numbers['Ag']
blmin = closest_distances_generator(atom_numbers=[Z],

# Specify reasonable bounds on the minimal and maximal
# cell vector lengths (in angstrom) and angles (in degrees)
cellbounds = CellBounds(bounds={'phi': [35, 145], 'chi': [35, 145],
                                'psi': [35, 145], 'a': [3, 50],
                                'b': [3, 50], 'c': [3, 50]})

# Choose an (optional) 'cell splitting' scheme which basically
# controls the level of translational symmetry (within the unit
# cell) of the randomly generated structures. Here a 1:1 ratio
# of splitting factors 2 and 1 is used:
splits = {(2,): 1, (1,): 1}
# There will hence be a 50% probability that a candidate
# is constructed by repeating an randomly generated Ag12
# structure along a randomly chosen axis. In the other 50%
# of cases, no cell cell splitting will be applied.

# The 'slab' object in the GA serves as a template
# in the creation of new structures, which inherit
# the slab's atomic positions (if any), cell vectors
# (if specified), and periodic boundary conditions.
# Here only the last property is relevant:
slab = Atoms('', pbc=True)

# Initialize the random structure generator
sg = StartGenerator(slab, blocks, blmin, box_volume=volume,
                    cellbounds=cellbounds, splits=splits)

# Create the database
da = PrepareDB(db_file_name='gadb.db',
               stoichiometry=[Z] * 24)

# Generate N random structures
# and add them to the database
for i in range(N):
    a = sg.get_new_candidate()