Genetic algorithm search for molecular crystal structures

Similar to the Genetic algorithm search for bulk crystal structures tutorial, this one illustrates how to search for stable crystal structures using the GA, but this time for a molecular crystal (solid dinitrogen, N2). For such materials, the global optimization problem can be strongly simplified by preserving molecular identity. By limiting the search to crystals consisting of N2 molecules, compared to individual N atoms (which may or may not form N2 molecules), the number of degrees of freedom is considerably reduced.


This assumption, of course, also needs to be checked – at high pressures, for example, nitrogen is known to polymerize and so will no longer consist of individual N2 molecules!

The same approach can also be used e.g. for compounds with polyatomic ions or for adsorbates on surfaces. The main differences with the ‘atomic’ GA therefore lie in the identity preservation during the various stages of the GA:

  • The random initial structures need to be constructed from the appropriate building blocks (here: N2 molecules with a N-N bond length as in the gas phase).

  • The genetic operators must leave the internal geometry of the polyatomic building blocks intact. In the present implementation, this is achieved by looking at the tags attribute of the parent Atoms objects: atoms belong to the same molecule if they have the same tag. These tags are either initialized by the initial structure generator, or can be added manually using the Atoms.set_tags() method.

  • The local optimization procedure may, if desired, change these internal geometries (i.e. the N-N bond lengths in our example). However, care may need to be taken that molecular identity is preserved also here (e.g. by applying certain constraints, see ase.constraints).

See also

The Constrained minima hopping (global optimization) tutorial addresses molecular identity in a different global optimization approach.

Initial population

As usual, a GA run starts with generating a series of random initial structures. The following script ( creates a gadb.db database containing 10 structures, each consisting of 8 N2 molecules.

from ase import Atoms
from import atomic_numbers
from import PrepareDB
from import StartGenerator
from import CellBounds, closest_distances_generator

# Number of randomly generated structures
N = 10

# The building blocks
blocks = [('N2', 8)]
# By writing 'N2', the generator will automatically
# get the N2 geometry using

# A guess for the cell volume in Angstrom^3
box_volume = 30. * 8

# The cell splitting scheme:
splits = {(2,): 1, (1,): 1}

# The minimal interatomic distances which the
# initial structures must satisfy. We can take these
# a bit larger than usual because these minimal
# distances will only be applied intermolecularly
# (and not intramolecularly):
Z = atomic_numbers['N']
blmin = closest_distances_generator(atom_numbers=[Z],

# The bounds for the randomly generated unit cells:
cellbounds = CellBounds(bounds={'phi': [30, 150], 'chi': [30, 150],
                                'psi': [30, 150], 'a': [3, 50],
                                'b': [3, 50], 'c': [3, 50]})

# The familiar 'slab' object, here only providing
# the PBC as there are no atoms or cell vectors
# that need to be applied.
slab = Atoms('', pbc=True)

# create the starting population
sg = StartGenerator(slab, blocks, blmin, box_volume=box_volume,
                    cellbounds=cellbounds, splits=splits,

# Initialize the database
da = PrepareDB(db_file_name='gadb.db', simulation_cell=slab,
               stoichiometry=[Z] * 16)

# Generate the new structures
# and add them to the database
for i in range(N):
    a = sg.get_new_candidate()