Calculations in parallel

Overview

Teaching: 25 min
Exercises: 20 min
Questions
  • Is it possible to run calculations in parallel using ASE?

  • How can I store useful results from a script?

  • How can I measure the speed-up achieved?

Objectives
  • Use the GPAW Python interpreter to run a DFT calculation across multiple cores and measure the speed-up achieved

  • Save the relevant data in a JSON file for further analysis

  • Use the Quantum Espresso code to run a DFT calculation across multiple cores

Code connection

In this episode we perform parallel calculations with GPAW and Quantum Espresso.

To accelerate our calculation we can parallelise the code over several cores

Parallel programming systems

There are several schemes for parallelising code. The two most common are MPI and OpenMP, with the optimum choice dependent on both the code and the hardware being used (for example, memory or CPU architecture, number of cores per node, network speed). Increasingly, electronic structure codes enable a hybrid of both approaches.
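
With MPI, each process (often called a rank) works on its own share of the problem. The sketch below shows one way a set of k-points could be dealt out to ranks, and the best speed-up that could follow from it. The function names `split_kpoints` and `ideal_speedup` are hypothetical, the round-robin scheme is an assumption chosen for illustration, and this is not how GPAW or Quantum Espresso schedule their work internally.

```python
import math


def split_kpoints(n_kpts, n_ranks):
    """Deal k-point indices out to ranks in round-robin order (illustrative only)."""
    return [list(range(rank, n_kpts, n_ranks)) for rank in range(n_ranks)]


def ideal_speedup(n_kpts, n_ranks):
    """Upper bound on speed-up, assuming every k-point costs the same
    and nothing else in the calculation is parallelised."""
    busiest = math.ceil(n_kpts / n_ranks)  # k-points on the most heavily loaded rank
    return n_kpts / busiest


shares = split_kpoints(10, 4)
for rank, share in enumerate(shares):
    print(f"rank {rank}: k-points {share}")
print(f"ideal speed-up: {ideal_speedup(10, 4):.2f}")
```

When the number of k-points is not a multiple of the number of ranks, some ranks sit idle while the busiest one finishes, so the achievable speed-up falls below the core count.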

parprint and paropen are provided in ASE as an alternative to print and open
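
When a script runs on several processes, every process executes every line, so a plain print would repeat each message once per process. The sketch below imitates the idea behind parprint with a hypothetical helper, `rank_print`, that only prints on rank 0. It is a stand-in for illustration, not ASE's implementation, and the loop merely simulates four processes inside one interpreter.

```python
import io


def rank_print(rank, *args, **kwargs):
    """Print only on rank 0, so a message appears once rather than once per process."""
    if rank == 0:
        print(*args, **kwargs)


# Simulate four processes all reaching the same line of a script.
buffer = io.StringIO()
for rank in range(4):
    rank_print(rank, "energy converged", file=buffer)

output = buffer.getvalue()
print(output, end="")
```

In a real parallel script the rank comes from the MPI communicator, and `from ase.parallel import parprint, paropen` provides the equivalent behaviour for printing and for opening files.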

Do not run this code in your notebook!

The next code block should go into a new script file named “kpts_parallel.py”

from gpaw import GPAW, PW

import ase.build
from ase.parallel import parprint, world, paropen
from ase.utils.timing import Timer
import json

atoms = ase.build.bulk('Cu')

timer = Timer()
energies, times, nkpts = [], [], []

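# Step the k-point mesh from 3x3x3 up to 8x8x8
for k in range(3, 9):

    atoms.calc = GPAW(mode=PW(400), xc='PBE',
                      kpts={'size': [k, k, k],
                            'gamma': False},
                      txt='kpts_parallel.txt',
                      parallel={'kpt': True})
    # Time each total-energy calculation under a label named after k
    timer.start(str(k))
    energy = atoms.get_potential_energy()
    timer.stop(str(k))
    energies.append(energy)
    times.append(timer.get_time(str(k)))
    # Record the number of irreducible k-points actually calculated
    nkpts.append(len(atoms.calc.get_ibz_k_points()))

# paropen opens the file on the master process only, so it is written once
with paropen('parallel_results.json', 'w') as file: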
    json.dump({'energies': energies,
               'times': times,
               'nkpts': nkpts},
              file)
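
Run the script from a terminal, using four cores:

gpaw -P 4 python kpts_parallel.py

Back in the notebook, load the saved results and compare them with the serial run. This assumes that nkpts, energies and times from the serial calculation are still defined in the notebook session.

import json

import matplotlib.pyplot as plt
import numpy as np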

with open('parallel_results.json', 'r') as file:
    parallel_data = json.load(file)

fig, axes = plt.subplots(nrows=3, sharex=True)

axes[0].plot(nkpts, energies, 'o-', label='serial')
axes[0].plot(parallel_data['nkpts'], parallel_data['energies'], 'o', label='parallel')
axes[0].set_ylabel('energy / eV')
axes[0].legend()

axes[1].plot(nkpts, times, 'o-', label='serial')
axes[1].plot(parallel_data['nkpts'], parallel_data['times'], 'o-', label='parallel')
axes[1].set_ylabel('Calculation time / s')
axes[1].legend()
axes[1].set_ylim([0, None])

axes[2].plot(nkpts, np.asarray(times) / parallel_data['times'], label='4 cores')
axes[2].set_ylabel('Speed-up factor')
axes[2].set_xlabel('number of k-points')

[Figure: plot of energy convergence with respect to number of k-points, with parallelisation]
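
The speed-up shown in the plot is the serial time divided by the parallel time. Dividing that by the number of cores gives the parallel efficiency. The sketch below round-trips a results dictionary through a JSON file and then computes both quantities. The timings are placeholder values chosen to exercise the code, not measurements, and `N_CORES` is an assumed constant matching the four cores used above.

```python
import json
import tempfile
from pathlib import Path

N_CORES = 4

# Placeholder timings in seconds: NOT measured results.
serial = {"nkpts": [4, 10, 20], "times": [2.0, 5.0, 10.0]}
parallel = {"nkpts": [4, 10, 20], "times": [1.0, 2.0, 3.0]}

# Save and reload the parallel results, as the script and the notebook do.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "parallel_results.json"
    with open(path, "w") as file:
        json.dump(parallel, file)
    with open(path) as file:
        parallel_data = json.load(file)

# Speed-up is serial time over parallel time; efficiency is speed-up per core.
speedup = [t_s / t_p for t_s, t_p in zip(serial["times"], parallel_data["times"])]
efficiency = [s / N_CORES for s in speedup]

for n, s, e in zip(parallel_data["nkpts"], speedup, efficiency):
    print(f"{n:3d} k-points: speed-up {s:.2f}, efficiency {e:.0%}")
```

JSON stores plain lists, numbers and strings, so NumPy arrays need converting (for example with `.tolist()`) before they are written.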

Quantum Espresso calculations can also be run in parallel with MPI

Getting the data

We will use the SSSP-efficiency pseudopotential set. To download these from a Jupyter Notebook run the following in a cell:

%%bash

mkdir SSSP_1.2.1_PBE_efficiency

wget -q https://archive.materialscloud.org/record/file?record_id=1680\&filename=SSSP_1.2.1_PBE_efficiency.tar.gz -O SSSP-efficiency.tar.gz

wget -q https://archive.materialscloud.org/record/file?filename=SSSP_1.2.1_PBE_efficiency.json\&record_id=1732 -O SSSP_1.2.1_PBE_efficiency.json

tar -zxvf SSSP-efficiency.tar.gz -C ./SSSP_1.2.1_PBE_efficiency

mv SSSP_1.2.1_PBE_efficiency.json ./SSSP_1.2.1_PBE_efficiency/

Note

Profiles are a fairly new ASE feature and are not yet used by all Calculators. An alternative way to manage these commands is to set environment variables, e.g. ASE_ESPRESSO_COMMAND. Check the docs for each calculator to see what is currently implemented.
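
As a rough sketch of the environment-variable route, the variable can be set from Python before the calculator is created. The command template below is an assumption based on the form documented for ASE releases that predate profiles, where PREFIX stands for the calculation label. The exact template, and whether the variable is read at all, depends on the installed ASE version.

```python
import os

# Assumed command template: check the Espresso calculator docs for your ASE version.
# "-np 4" requests four MPI processes; PREFIX is a placeholder filled in by ASE.
os.environ["ASE_ESPRESSO_COMMAND"] = "mpirun -np 4 pw.x -in PREFIX.pwi > PREFIX.pwo"

print(os.environ["ASE_ESPRESSO_COMMAND"])
```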

from ase.calculators.espresso import Espresso, EspressoProfile

profile = EspressoProfile(['mpirun', 'pw.x'])

Each Calculator has its own keywords to match the input syntax of the corresponding software code

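from pathlib import Path

# Assumes the pseudopotentials were downloaded into the home directory;
# adjust the path if the download cell was run somewhere else
pseudo_dir = Path.home() / 'SSSP_1.2.1_PBE_efficiency'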

calc = Espresso(profile=profile,
                pseudo_dir=pseudo_dir,
                kpts=(3, 3, 3),
                input_data={'control':  {'tprnfor': True,
                                         'tstress': True},
                            'system': {'ecutwfc': 50.}},
                pseudopotentials={'Si': 'Si.pbe-n-rrkjus_psl.1.0.0.UPF'})
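
The nested input_data dictionary mirrors the namelist sections of a pw.x input file. The sketch below renders such a dictionary as namelist text to make the correspondence visible. The helpers `fortran_value` and `render_namelists` are hypothetical and heavily simplified. ASE's own writer handles many more cases (cards, k-points, atomic species), so this is an illustration rather than the code ASE uses.

```python
def fortran_value(value):
    """Format a Python value in Fortran namelist style."""
    if isinstance(value, bool):  # check bool before other types
        return ".true." if value else ".false."
    if isinstance(value, str):
        return f"'{value}'"
    return str(value)


def render_namelists(input_data):
    """Render {'section': {'key': value}} as '&SECTION ... /' blocks."""
    lines = []
    for section, settings in input_data.items():
        lines.append(f"&{section.upper()}")
        for key, value in settings.items():
            lines.append(f"   {key} = {fortran_value(value)}")
        lines.append("/")
    return "\n".join(lines)


input_data = {'control': {'tprnfor': True, 'tstress': True},
              'system': {'ecutwfc': 50.}}
text = render_namelists(input_data)
print(text)
```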

Once we have set up the calculator we use the same three-step process to retrieve a property

atoms = ase.build.bulk('Si')
atoms.calc = calc
atoms.get_potential_energy()

-310.1328387367529

The generated input file can be inspected from a terminal:

cat espresso.in

Exercise: Basis set convergence

As well as k-point sampling, basis-set convergence should be checked with respect to meaningful properties. Check the convergence of the atomisation energy of Si with respect to the Espresso parameter ecutwfc, the basis-set cutoff energy in Ry. What cutoff energy is needed to converge the atomisation energy to within 1 meV?

Hint: to calculate the atomisation energy, you will need to compare the energy of the solid to a single atom in a large cell.

Hint: you will need to calculate this property at several cutoff energies. Use Python functions and iteration constructs to avoid too much repetition.
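
One possible shape for the bookkeeping is sketched below, leaving the DFT calculations themselves to the exercise. The helper names `atomisation_energy` and `first_converged` are hypothetical, and the numbers are placeholder values, not Espresso results. Convergence is judged against the highest cutoff in the list, so that reference must itself be well converged for the answer to mean anything.

```python
def atomisation_energy(e_bulk, n_atoms, e_atom):
    """Energy per atom needed to separate the solid into isolated atoms."""
    return e_atom - e_bulk / n_atoms


def first_converged(cutoffs, values, tol):
    """Return the lowest cutoff from which all values stay within tol
    of the value at the highest cutoff."""
    reference = values[-1]
    for i, cutoff in enumerate(cutoffs):
        if all(abs(value - reference) < tol for value in values[i:]):
            return cutoff
    return None


# Placeholder data: cutoffs in Ry, energies in eV. NOT real results.
cutoffs = [20, 30, 40, 50, 60]
e_atomisation = [5.10, 5.02, 5.0005, 5.0002, 5.0000]

converged_at = first_converged(cutoffs, e_atomisation, tol=0.001)
print(f"Converged to 1 meV at {converged_at} Ry")
```

In the exercise, the list of atomisation energies would be built by looping over cutoffs, running the bulk and isolated-atom calculations at each one, and passing the two energies to a function like `atomisation_energy`.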

Key Points

  • To accelerate our calculation we can parallelise the code over several cores

  • parprint and paropen are provided in ASE as an alternative to print and open

  • Quantum Espresso calculations can also be run in parallel with MPI

  • Each Calculator has its own keywords to match the input syntax of the corresponding software code

  • Once we have set up the calculator we use the same three-step process to retrieve a property