# Orthogonalizing the wave functions¶

Let $$\tilde{\Psi}_{nG}$$ be an element of a wave function matrix holding the value of $$\tilde{\psi}_{n}(\mathbf{r}_G)$$ (state number $$n$$ and grid point number $$G$$). Then we can write the orthogonality requirement like this:

$\Delta v \tilde{\mathbf{\Psi}}^T \hat{\mathbf{O}} \tilde{\mathbf{\Psi}} = \mathbf{1},$

where $$\Delta v$$ is the volume per grid point and

$\hat{\mathbf{O}} = \mathbf{1} + \sum_a \tilde{\mathbf{P}}^a \mathbf{\Delta O}^a (\tilde{\mathbf{P}}^a)^T \Delta v$

is the matrix form of the overlap operator. This matrix is very sparse because the projector functions $$\tilde{P}^a_{iG} = \tilde{p}^a_i(\mathbf{r}_G - \mathbf{R}^a)$$ are localized inside the augmentation spheres. The $$\Delta O^a_{i_1i_2}$$ atomic PAW overlap corrections are small $$N_p^a \times N_p^a$$ matrices ($$N_p^a \sim 10$$) defined here.

## Gram-Schmidt procedure¶

The traditional sequential Gram-Schmidt orthogonalization procedure is not very efficient, so we do some linear algebra to allow us to use efficient matrix-matrix products.

Let $$\tilde{\mathbf{\Psi}}_0$$ be the non-orthogonal wave functions. We calculate the overlap matrix:

$\mathbf{S} = \Delta v \tilde{\mathbf{\Psi}}_0^T \hat{\mathbf{O}} \tilde{\mathbf{\Psi}}_0,$

from the raw overlap $$\tilde{\mathbf{\Psi}}_0^T \tilde{\mathbf{\Psi}}_0$$ and the projections $$(\tilde{\mathbf{P}}^a)^T \tilde{\mathbf{\Psi}}_0$$.

This can be Cholesky factored into $$\mathbf{S} = \mathbf{L}^T \mathbf{L}$$ and we can get the orthogonalized wave functions as:

$\tilde{\mathbf{\Psi}} = \tilde{\mathbf{\Psi}}_0 \mathbf{L}^{-1}.$

## Parallelization¶

The orthogonalization can be paralleized over k-points, spins, domains, and bands.

### k-points and spins¶

Each k-point and each spin can be treated separately.

### Domains¶

Each domain will have its contribution to the overlap matrix, and these will have to be summed up using the domain communicator. The dense linear algebra can be performed in a replication fashion on all MPI tasks using LAPACK or in parallel on a subset of MPI tasks using ScaLAPACK.

### Bands¶

Band parallelization is described at Band parallelization.