# Machine-learning methods for structure prediction of multi-component perovskites

### Subproject P09

The connection between the composition and function of complex multi-component oxides is intricate, and our knowledge about it is extremely limited. Current models can at most predict the stability of a stoichiometric composition, a very general structural feature. P09 will develop accelerated machine-learning (ML) models to predict the structural details that determine the functionality of perovskites. We will implement two approaches:

First, evolutionary algorithms (EAs) will be combined with a neural-network (NN) potential trained on the fly to quickly explore the energy landscape of perovskite surfaces and predict their detailed structures. In collaboration with experimental partners (P02 Diebold, P04 Parkinson), those structures will be falsified by direct comparison with diffraction data on existing surfaces. Additionally, the implementation, inputs, and results of the machine-learned force fields (MLFFs) will be shared with the theoretical partners for cross-validation.

Second, generative adversarial networks (GANs) will be trained on known compositions to identify the key features of real perovskite structures and propose new stable ones.
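To make the first approach more concrete, the following is a minimal sketch of an evolutionary search over a cheap energy function. It is not the project's code: `toy_energy` (a Rastrigin function) is only a hypothetical stand-in for a trained NN potential, and `evolve` is a plain (μ, λ) evolution strategy with a fixed step-size decay. The CMA-ES used by the group additionally adapts a full covariance matrix and the step size, which this sketch omits.

```python
import math
import random


def toy_energy(x):
    """Toy stand-in for a surrogate energy model (Rastrigin function).

    Rugged landscape with many local minima; global minimum 0 at the origin.
    """
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def evolve(energy, x0, sigma=1.0, n_offspring=20, n_parents=5,
           n_generations=200, decay=0.98, seed=0):
    """Minimal (mu, lambda) evolution strategy with a decaying step size."""
    rng = random.Random(seed)
    mean = list(x0)
    best_x, best_e = list(x0), energy(x0)
    for _ in range(n_generations):
        # Sample candidate configurations around the current mean.
        offspring = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                     for _ in range(n_offspring)]
        # Rank candidates by energy and keep the lowest-energy ones.
        offspring.sort(key=energy)
        parents = offspring[:n_parents]
        # Remember the best configuration seen so far.
        if energy(parents[0]) < best_e:
            best_x, best_e = parents[0], energy(parents[0])
        # Recombine: the new mean is the centroid of the parents.
        mean = [sum(p[k] for p in parents) / n_parents
                for k in range(len(mean))]
        # Shrink the search radius to move from exploration to refinement.
        sigma *= decay
    return best_x, best_e


start = [3.3, -2.6]
best_x, best_e = evolve(toy_energy, start)
print("start energy:", toy_energy(start))
print("best energy: ", best_e)
```

Because each energy evaluation is cheap, many independent runs with different seeds can be launched; that is the practical benefit of replacing a first-principles energy with a surrogate in this kind of search.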

## Expertise

We develop and apply atomistic models for theoretical chemistry and materials science. Our expertise covers both classical and quantum methods, as well as multiscale calculations and machine-learning techniques. The group has taken part in the development and public release of a range of packages for atomistic calculations, including:

- WIEN2k, a popular all-electron density functional theory implementation;
- BoltzTraP and BoltzTraP2, two packages used to interpolate electronic band structures and calculate transport coefficients;
- ShengBTE, the first open-source solver of the Boltzmann transport equation for phonons, which enables predictive calculations of the thermal conductivity of nanostructures;
- almaBTE, a software package for multiscale thermal transport simulation based on first principles;
- Clinamen, an implementation of the covariance matrix adaptation evolutionary algorithm that helps explore complex energy landscapes.

These are some of the methods we have used to study solids, liquids, surfaces, and nanostructures:

- Density functional theory (DFT);
- Classical and ab-initio molecular dynamics (MD);
- Self-consistent anharmonic free energy calculations;
- The Boltzmann transport equation (BTE);
- Traditional and particle-filter Monte Carlo (MC);
- Covariance matrix adaptation evolutionary algorithm (CMA-ES);
- Classification and regression random forests based on phenomenological information;
- Algorithmically differentiable machine-learning (ML) force fields based on JAX;
- High-throughput (HT) materials screening.
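The idea behind algorithmically differentiable force fields is that forces follow from the energy model as exact derivatives, F = -∂E/∂x, without hand-written gradient code. The sketch below illustrates this with forward-mode differentiation on dual numbers, using only the Python standard library. It does not use JAX or any of the group's code: `Dual`, `lennard_jones`, and `forces` are hypothetical names, and the Lennard-Jones pair energy (reduced units) merely stands in for an ML energy model.

```python
class Dual:
    """Number a + b*eps with eps**2 == 0; b carries the derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Only constant (non-dual) exponents are supported in this sketch.
        return Dual(self.value ** n, n * self.value ** (n - 1) * self.deriv)


def lennard_jones(positions):
    """Total Lennard-Jones energy in reduced units (epsilon = sigma = 1)."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            energy = energy + 4.0 * (r2 ** -6 - r2 ** -3)
    return energy


def forces(energy_fn, positions):
    """Forces F = -dE/dx, one forward-mode pass per coordinate."""
    result = []
    for i, atom in enumerate(positions):
        row = []
        for k in range(len(atom)):
            # Seed a unit derivative on coordinate k of atom i only.
            seeded = [list(p) for p in positions]
            seeded[i][k] = Dual(positions[i][k], 1.0)
            row.append(-Dual.lift(energy_fn(seeded)).deriv)
        result.append(row)
    return result


positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
print(forces(lennard_jones, positions))
```

Forward mode needs one pass per coordinate, so it scales poorly with system size; reverse-mode differentiation, as provided by frameworks such as JAX, obtains all force components from a single backward pass. The same energy function is reused unchanged in either case, which is the point of the design.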

## Team

#### Associates

## Publications

### 2024

Wanzenböck, Ralf; Buchner, Florian; Kovács, Péter; Madsen, Georg K. H.; Carrete, Jesús

Clinamen2: Functional-style evolutionary optimization in Python for atomistic structure searches

Journal Article, Open Access. In: Computer Physics Communications, vol. 297, no. 109065, 2024 (forthcoming).

Abstract: Clinamen2 is a versatile functional-style Python implementation of the covariance matrix adaptation evolution strategy (CMA-ES) utilizing Cholesky decomposition. On top of a problem-agnostic core algorithm, the software package offers a suite of utilities and library code enabling applications to important atomistic structure searches. Features include massively distributed computation and the BI-Population restart scheme. This article details the general code structure and introduces examples that illustrate some relevant applications for the materials science and chemistry worlds, including interfacing to density-functional-theory codes and machine-learned surrogate models. The functional design renders the code modular and adaptable, and makes the creation of interfaces to other atomistic software straightforward.

DOI: 10.1016/j.cpc.2023.109065

Tags: P09

### 2023

Carrete, Jesús; Montes-Campos, Hadrián; Wanzenböck, Ralf; Heid, Esther; Madsen, Georg K. H.

Deep ensembles vs committees for uncertainty estimation in neural-network force fields: Comparison and application to active learning

Journal Article, Open Access. In: The Journal of Chemical Physics, vol. 158, no. 20, pp. 204801-1–204801-18, 2023.

Abstract: A reliable uncertainty estimator is a key ingredient in the successful use of machine-learning force fields for predictive calculations. Important considerations are correlation with error, overhead during training and inference, and efficient workflows to systematically improve the force field. However, in the case of neural-network force fields, simple committees are often the only option considered due to their easy implementation. Here, we present a generalization of the deep-ensemble design based on multiheaded neural networks and a heteroscedastic loss. It can efficiently deal with uncertainties in both energy and forces and take sources of aleatoric uncertainty affecting the training data into account. We compare uncertainty metrics based on deep ensembles, committees, and bootstrap-aggregation ensembles using data for an ionic liquid and a perovskite surface. We demonstrate an adversarial approach to active learning to efficiently and progressively refine the force fields. That active learning workflow is realistically possible thanks to exceptionally fast training enabled by residual learning and a nonlinear learned optimizer.

DOI: 10.1063/5.0146905

Tags: P09

Bichelmaier, Sebastian; Carrete, Jesús; Wanzenböck, Ralf; Buchner, Florian; Madsen, Georg K. H.

Neural-network-backed effective harmonic potential study of the ambient pressure phases of hafnia

Journal Article. In: Physical Review B, vol. 107, no. 184111, 2023, ISSN: 2469-9969.

Abstract: Phonon-based approaches and molecular dynamics are widely established methods for gaining access to a temperature-dependent description of material properties. However, when a compound's phase space is vast, density-functional-theory-backed studies quickly reach prohibitive levels of computational expense. Here, we explore the complex phase structure of HfO$_2$ using effective harmonic potentials based on a neural-network force field (NNFF) as a surrogate model. We detail the data acquisition and training strategy that enable the NNFF to provide almost ab-initio accuracy at a significantly reduced cost and present a recipe for automation. We demonstrate how the NNFF can generalize beyond its training data and that it is transferable between several phases of hafnia. We find that the thermal expansion coefficient of the low-symmetry phases agrees well with experimental results, and we determine the $P\bar{4}3m$ phase to be the favorable (stoichiometric) cubic phase over the established $Fm\bar{3}m$. In contrast, the experimental lattice constants of the cubic phases are substantially larger than what is calculated for the corresponding stoichiometric phases. Furthermore, we show that the stoichiometric cubic phases are unlikely to be thermodynamically stable compared to the tetragonal and monoclinic phases and hypothesize that they exist only in defect-stabilized forms.

DOI: 10.1103/physrevb.107.184111

Tags: P09

### 2022

Wanzenböck, Ralf; Arrigoni, Marco; Bichelmaier, Sebastian; Buchner, Florian; Carrete, Jesús; Madsen, Georg K. H.

Neural-network-backed evolutionary search for SrTiO$_3$(110) surface reconstructions

Journal Article. In: Digital Discovery, vol. 1, no. 5, pp. 703–710, 2022.

Abstract: The determination of atomic structures in surface reconstructions has typically relied on structural models derived from intuition and domain knowledge. Evolutionary algorithms have emerged as powerful tools for such structure searches. However, when density functional theory is used to evaluate the energy, the computational cost of a thorough exploration of the potential energy landscape is prohibitive. Here, we drive the exploration of the rich phase diagram of TiO$_x$ overlayer structures on SrTiO$_3$(110) by combining the covariance matrix adaptation evolution strategy (CMA-ES) and a neural-network force field (NNFF) as a surrogate energy model. By training solely on SrTiO$_3$(110) 4×1 overlayer structures and performing CMA-ES runs on 3×1, 4×1 and 5×1 overlayers, we verify the transferability of the NNFF. The speedup due to the surrogate model allows taking advantage of the stochastic nature of the CMA-ES to perform exhaustive sets of explorations and identify both known and new low-energy reconstructions.

DOI: 10.1039/d2dd00072e

Tags: P09

Montes-Campos, Hadrián; Carrete, Jesús; Bichelmaier, Sebastian; Varela, Luis M; Madsen, Georg K. H.

A Differentiable Neural-Network Force Field for Ionic Liquids

Journal Article, Open Access. In: Journal of Chemical Information and Modeling, vol. 62, no. 1, pp. 88–101, 2022.

Abstract: We present NeuralIL, a model for the potential energy of an ionic liquid that accurately reproduces first-principles results with orders-of-magnitude savings in computational cost. Based on a multilayer perceptron and spherical Bessel descriptors of the atomic environments, NeuralIL is implemented in such a way as to be fully automatically differentiable. It can thus be trained on ab-initio forces instead of just energies, to make the most out of the available data, and can efficiently predict arbitrary derivatives of the potential energy. We parametrize the model for the case of ethylammonium nitrate. We discuss the best way to include chemical information in the atom-centered descriptors for a many-component system. Furthermore, we demonstrate an ensemble-learning approach to the detection of extrapolation. With out-of-sample accuracies better than 0.1 kcal/mol in the energies and 100 meV/Å in the forces, our potential model considerably outperforms molecular-mechanics force fields and opens the door to large-scale thermodynamical calculations with ab-initio-like accuracy for ionic liquids. Including the forces does away with the idea that vast amounts of atomic configurations are required to train a neural network force field based on atom-centered descriptors. We also find that a separate treatment of long-range interactions is not required to achieve a high-quality representation of the potential energy surface of these dense ionic systems.

DOI: 10.1021/acs.jcim.1c01380

Tags: P09

### 2021

Arrigoni, Marco; Madsen, Georg K. H.

Evolutionary computing and machine learning for discovering of low-energy defect configurations

Journal Article, Open Access. In: npj Computational Materials, vol. 7, no. 1, 2021.

Abstract: Density functional theory (DFT) has become a standard tool for the study of point defects in materials. However, finding the most stable defective structures remains a very challenging task as it involves the solution of a multimodal optimization problem with a high-dimensional objective function. Hitherto, the approaches most commonly used to tackle this problem have been mostly empirical, heuristic, and/or based on domain knowledge. In this contribution, we describe an approach for exploring the potential energy surface (PES) based on the covariance matrix adaptation evolution strategy (CMA-ES) and supervised and unsupervised machine learning models. The resulting algorithm depends only on a limited set of physically interpretable hyperparameters and the approach offers a systematic way for finding low-energy configurations of isolated point defects in solids. We demonstrate its applicability on different systems and show its ability to find known low-energy structures and discover additional ones as well.

DOI: 10.1038/s41524-021-00537-1

Tags: P09, pre-TACO

### 2020

van Roekeghem, Ambroise; Carrete, Jesús; Curtarolo, Stefano; Mingo, Natalio

High-throughput study of the static dielectric constant at high temperatures in oxide and fluoride cubic perovskites

Journal Article. In: Physical Review Materials, vol. 4, no. 11, pp. 113804, 2020.

Abstract: Using finite-temperature phonon calculations and the Lyddane-Sachs-Teller relations, we calculate ab initio the static dielectric constants of 78 semiconducting oxides and fluorides with cubic perovskite structures at 1000 K. We first compare our method with experimental measurements, and we find that it succeeds in describing the temperature dependence and the relative ordering of the static dielectric constant $\varepsilon_{\mathrm{DC}}$ in the series of oxides BaTiO$_3$, SrTiO$_3$, KTaO$_3$. We show that the effects of anharmonicity on the ion-clamped dielectric constant, on Born charges, and on phonon lifetimes, can be neglected in the framework of our high-throughput study. Based on the high-temperature phonon spectra, we find that the dispersion of $\varepsilon_{\mathrm{DC}}$ is one order of magnitude larger among oxides than fluorides at 1000 K. We display the correlograms of the dielectric constants with simple structural descriptors, and we point out that $\varepsilon_{\mathrm{DC}}$ is actually well correlated with the infinite-frequency dielectric constant $\varepsilon_{\infty}$, even in those materials with phase transitions in which $\varepsilon_{\mathrm{DC}}$ is strongly temperature dependent.

DOI: 10.1103/physrevmaterials.4.113804

Tags: P09, pre-TACO

### 2017

Legrain, Fleur; Carrete, Jesús; van Roekeghem, Ambroise; Curtarolo, Stefano; Mingo, Natalio

How Chemical Composition Alone Can Predict Vibrational Free Energies and Entropies of Solids

Journal Article. In: Chemistry of Materials, vol. 29, no. 15, pp. 6220–6227, 2017.

Abstract: Computing vibrational free energies ($F_{\mathrm{vib}}$) and entropies ($S_{\mathrm{vib}}$) has posed a long-standing challenge to the high-throughput ab initio investigation of finite temperature properties of solids. Here, we use machine-learning techniques to efficiently predict $F_{\mathrm{vib}}$ and $S_{\mathrm{vib}}$ of crystalline compounds in the Inorganic Crystal Structure Database. Using descriptors based simply on the chemical formula and using a training set of only 300 compounds, mean absolute errors of less than 0.04 meV/K/atom (15 meV/atom) are achieved for $S_{\mathrm{vib}}$ ($F_{\mathrm{vib}}$), whose values are distributed within a range of 0.9 meV/K/atom (300 meV/atom). In addition, for training sets containing fewer than 2000 compounds, the chemical formula alone is shown to perform as well as, if not better than, four other more complex descriptors previously used in the literature. The accuracy and simplicity of the approach means that it can be advantageously used for fast screening of chemical reactions at finite temperatures.

DOI: 10.1021/acs.chemmater.7b00789

Tags: P09, pre-TACO

### 2016

van Roekeghem, Ambroise; Carrete, Jesús; Oses, Corey; Curtarolo, Stefano; Mingo, Natalio

High-Throughput Computation of Thermal Conductivity of High-Temperature Solid Phases: The Case of Oxide and Fluoride Perovskites

Journal Article, Open Access. In: Physical Review X, vol. 6, no. 4, pp. 041061, 2016.

Abstract: Using finite-temperature phonon calculations and machine-learning methods, we assess the mechanical stability of about 400 semiconducting oxides and fluorides with cubic perovskite structures at 0, 300, and 1000 K. We find 92 mechanically stable compounds at high temperatures (including 36 not mentioned in the literature so far), for which we calculate the thermal conductivity. We show that the thermal conductivity is generally smaller in fluorides than in oxides, largely due to a lower ionic charge, and describe simple structural descriptors that are correlated with its magnitude. Furthermore, we show that the thermal conductivities of most cubic perovskites decrease more slowly than the usual $T^{-1}$ behavior. Within this set, we also screen for materials exhibiting negative thermal expansion. Finally, we describe a strategy to accelerate the discovery of mechanically stable compounds at high temperatures.

DOI: 10.1103/physrevx.6.041061

Tags: P09, pre-TACO