# Orthogonalizing the wave functions¶

Let \(\tilde{\Psi}_{nG}\) be an element of a wave function matrix holding the value of \(\tilde{\psi}_{n}(\mathbf{r}_G)\) (state number \(n\) and grid point number \(G\)). Then we can write the orthogonality requirement like this:

where \(\Delta v\) is the volume per grid point and

is the matrix form of the overlap operator. This matrix is very sparse because the projector functions \(\tilde{P}^a_{iG} = \tilde{p}^a_i(\mathbf{r}_G - \mathbf{R}^a)\) are localized inside the augmentation spheres. The \(\Delta O^a_{i_1i_2}\) atomic PAW overlap corrections are small \(N_p^a \times N_p^a\) matrices (\(N_p^a \sim 10\)) defined here.

## Gram-Schmidt procedure¶

The traditional sequential Gram-Schmidt orthogonalization procedure is not very efficient, so we do some linear algebra to allow us to use efficient matrix-matrix products.

Let \(\tilde{\mathbf{\Psi}}_0\) be the non-orthogonal wave functions. We calculate the overlap matrix:

from the raw overlap \(\tilde{\mathbf{\Psi}}_0^T \tilde{\mathbf{\Psi}}_0\) and the projections \((\tilde{\mathbf{P}}^a)^T \tilde{\mathbf{\Psi}}_0\).

This can be Cholesky factored into \(\mathbf{S} = \mathbf{L}^T \mathbf{L}\) and we can get the orthogonalized wave functions as:

## Parallelization¶

The orthogonalization can be paralleized over **k**-points, spins,
domains, and bands.

**k**-points and spins¶

Each **k**-point and each spin can be treated separately.

### Domains¶

Each domain will have its contribution to the overlap matrix, and these will have to be summed up using the domain communicator. The dense linear algebra can be performed in a replication fashion on all MPI tasks using LAPACK or in parallel on a subset of MPI tasks using ScaLAPACK.

### Bands¶

Band parallelization is described at Band parallelization.