Defect

Defectstab

Summary

This benchmark evaluates formation energies across four distinct subsets covering different material systems: iron self-interstitial atoms (SIA), boron carbide stoichiometry, boron carbide point defects, and a divacancy in methylammonium lead iodide. Results can be viewed per subset in the app. The subsets are described in detail below.

fe_sia

Formation energies of 5 single self-interstitial atom configurations in a 128-atom BCC iron supercell.

The formation energy is calculated as:

\[E_f = E_{\mathrm{config}} - \frac{N_{\mathrm{config}}}{N_{\mathrm{bulk}}} E_{\mathrm{bulk}}\]

where \(E_{\mathrm{config}}\) is the total energy of the interstitial configuration containing \(N_{\mathrm{config}}\) atoms, and \(E_{\mathrm{bulk}}\) is the energy of the perfect bulk supercell consisting of \(N_{\mathrm{bulk}}\) atoms.
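As an illustration of the expression above, here is a minimal Python sketch. The function name and the energies are hypothetical, and the example assumes a 128-atom bulk cell with 129 atoms once one interstitial is added; in practice the benchmark takes its inputs from the model's single-point energies.

```python
def sia_formation_energy(e_config: float, n_config: int,
                         e_bulk: float, n_bulk: int) -> float:
    """Return E_f = E_config - (N_config / N_bulk) * E_bulk (energies in eV)."""
    if n_bulk <= 0:
        raise ValueError("n_bulk must be a positive atom count")
    # Scale the bulk energy to the number of atoms in the defective cell.
    return e_config - (n_config / n_bulk) * e_bulk


# Made-up total energies, for illustration only (not benchmark data).
e_f = sia_formation_energy(e_config=-1060.0, n_config=129,
                           e_bulk=-1056.0, n_bulk=128)
print(f"E_f = {e_f:.3f} eV")
```

Scaling the bulk energy by the atom-count ratio is what lets cells with different numbers of atoms be compared directly.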

  • DFT reference: PBE exchange-correlation functional.

boroncarbide_stoichiometry

Formation enthalpies of 6 boron carbide phases with varying stoichiometries (\(\mathrm{B_x C}\) with \(4 \leq x \leq 10.5\)).

The formation enthalpy is calculated as:

\[H_f = E_{\mathrm{phase}} - n_\mathrm{B} \, E_\mathrm{B} - n_\mathrm{C} \, E_\mathrm{C}\]

where \(n_\mathrm{B}\) and \(n_\mathrm{C}\) are the numbers of boron and carbon atoms in the structure, and \(E_\mathrm{B} = E(\alpha \text{-} \mathrm{boron})/12\) and \(E_\mathrm{C} = E(\mathrm{graphite})/4\) are the per-atom reference energies of the elements.
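The formation enthalpy defined above can be sketched in Python as follows. The function name and all numerical values are hypothetical placeholders; only the formula and the divisors 12 and 4 come from the text.

```python
def formation_enthalpy(e_phase: float, n_b: int, n_c: int,
                       e_alpha_boron: float, e_graphite: float) -> float:
    """Return H_f = E_phase - n_B * E_B - n_C * E_C (energies in eV)."""
    e_b = e_alpha_boron / 12.0  # per-atom boron reference energy
    e_c = e_graphite / 4.0      # per-atom carbon reference energy
    return e_phase - n_b * e_b - n_c * e_c


# Made-up energies for a 15-atom cell with 12 B and 3 C atoms.
h_f = formation_enthalpy(e_phase=-115.5, n_b=12, n_c=3,
                         e_alpha_boron=-84.0, e_graphite=-40.0)
print(f"H_f = {h_f:.3f} eV")
```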

  • DFT reference: LDA exchange-correlation functional.

boroncarbide_defects

Formation energies of 3 point defects in \(\mathrm{B_4C}\) boron carbide (boron-rich conditions): a bipolar defect (B/C exchange on a \(\mathrm{B_{11}C}\) icosahedron) and two variants of a chain boron vacancy (VB0 and VB0_CC).

The formation energies are:

  • Bipolar defect:

    \[E_f = E_{\mathrm{Defect}} - E_{\mathrm{NoDefects}}\]
  • Boron vacancy (B-rich):

    \[E_f = E_{\mathrm{Defect}} - E_{\mathrm{NoDefects}} + \mu_\mathrm{B}\]

    where \(\mu_\mathrm{B} = E_\mathrm{B}\) for boron-rich conditions, and \(\mu_\mathrm{B} = (E_{\mathrm{NoDefects}}/24 - E_\mathrm{C})/4\) for carbon-rich conditions. The current benchmark uses boron-rich conditions.

  • DFT reference: LDA exchange-correlation functional.

mapi_tetragonal

Formation energy of a methylammonium + iodine divacancy (\(\mathrm{VMAI}\)) in a 16-formula-unit supercell of tetragonal \(\mathrm{MAPbI_3}\).

The formation energy is calculated as:

\[E_f = E(\mathrm{VMAI}) + \frac{1}{2} E(\mathrm{MAI}) - E_{\mathrm{pristine}}\]

where \(E(\mathrm{VMAI})\) is the energy of the supercell with the divacancy, \(E(\mathrm{MAI})\) is the energy of the methylammonium iodide molecular crystal, and \(E_{\mathrm{pristine}}\) is the energy of the pristine supercell.

  • DFT reference: optB86b+vdW exchange-correlation functional.

Metrics

RMSD

Root mean square deviation (RMSD) of the predicted formation energies with respect to the DFT reference data, computed independently for each subset. Per-subset values are reported as separate columns in the app table.

For each subset, the bad threshold is set as:

\[\mathrm{bad} = 0.5 \times \mathrm{mean}\left(|E_\mathrm{ref}|\right)\]

where \(\mathrm{mean}\left(|E_\mathrm{ref}|\right)\) is the mean of the absolute DFT reference formation energies in that subset. The good threshold is 0 eV (perfect agreement).

Each per-subset RMSD is converted to a normalised score in \([0, 1]\) by linear interpolation between good (score = 1) and bad (score = 0). If the RMSD exceeds the bad threshold, the subset score is clamped to 0. The overall Score reported in the table is the unweighted average of all four per-subset scores.
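The scoring procedure described above can be sketched in Python as follows. Function names and the sample energies are hypothetical, and the handling of a zero bad threshold is an assumption of this sketch, not something the text specifies.

```python
import math


def rmsd(pred: list[float], ref: list[float]) -> float:
    """Root mean square deviation between predicted and reference energies."""
    if len(pred) != len(ref) or not ref:
        raise ValueError("pred and ref must be non-empty and of equal length")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def bad_threshold(ref: list[float]) -> float:
    """bad = 0.5 * mean(|E_ref|) for one subset."""
    return 0.5 * sum(abs(r) for r in ref) / len(ref)


def subset_score(pred: list[float], ref: list[float]) -> float:
    """Linear interpolation: RMSD = 0 gives 1, RMSD >= bad gives 0."""
    value, bad = rmsd(pred, ref), bad_threshold(ref)
    if bad == 0.0:
        # Assumption: degenerate case, not covered by the text.
        return 1.0 if value == 0.0 else 0.0
    return max(0.0, min(1.0, 1.0 - value / bad))


def overall_score(subset_scores: list[float]) -> float:
    """Unweighted average of the per-subset scores."""
    return sum(subset_scores) / len(subset_scores)


# Made-up formation energies (eV) for a two-configuration subset.
ref = [4.0, 4.0]
pred = [4.5, 3.5]
print(rmsd(pred, ref), bad_threshold(ref), subset_score(pred, ref))
```

Because the bad threshold scales with the mean reference magnitude, subsets with large formation energies tolerate a proportionally larger absolute RMSD before their score reaches 0.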

Computational cost

Low: The geometries are static, requiring only single-point energy calculations for the configurations and reference structures.

Data availability

Input structures:

  • Subset fe of the Defectstab dataset (\(\mathrm{Fe}\) SIA configurations, PBE functional).

    • A. Allera, T.D. Swinburne, A.M. Goryaeva, B. Bienvenu, F. Ribeiro, M. Perez, M.-C. Marinica, D. Rodney, Activation entropy of dislocation glide in body-centered cubic metals from atomistic simulations, Nat Commun 16, 8367 (2025).

    • A.M. Goryaeva, C. Lapointe, C. Dai, J. Dérès, J.-B. Maillet, M.-C. Marinica, Reinforcing materials modelling by encoding the structures of defects in crystalline solids into distortion scores, Nat Commun 11, 4691 (2020).

  • Subsets boroncarbide_stoichiometry and boroncarbide_defects of the Defectstab dataset (boron carbide structures, LDA functional).

    • G. Roma, K. Gillet, A. Jay, N. Vast, G. Gutierrez, Understanding first-order Raman spectra of boron carbides across the homogeneity range, Phys. Rev. Materials 5, 063601 (2021).

  • Subset mapi_tetragonal of the Defectstab dataset (MAPI tetragonal structures, optB86b+vdW functional).

    • K. Madaan, G. Roma, J. Gulomov, P. Pochet, C. Corbel, I. Makkonen, Challenges in predicting positron annihilation lifetimes in lead halide perovskites: correlation functionals and polymorphism, arXiv:2511.06926 (2025).

    • K. Madaan, Phases and vacancy defects in methylammonium lead iodide perovskite: an ab initio study, PhD thesis, Université Paris-Saclay (2023).

Reference data:

  • Computed from the DFT total energies provided with the input structures.

Relastab

Summary

This benchmark evaluates the ability of models to correctly identify the most stable interstitial configuration and to correctly rank the least stable ones. The evaluation is performed across multiple subsets representing different host systems (\(\mathrm{Fe}\) and \(\mathrm{CaWO_4}\)), and the final scores are averaged over these subsets.

  • DFT reference: PBE exchange-correlation functional.

Metrics

Two metrics are evaluated per subset; per-subset values are reported as separate columns in the app table.

GlobalMin

Binary score (1 if the predicted lowest-energy configuration matches the reference global minimum, 0 otherwise).

Top5 Spearman

Spearman rank correlation between predicted and reference rankings for the 5 highest-energy (least stable) configurations in the subset.

For both metrics the good threshold is 1.0 (perfect) and the bad threshold is 0.0. Each per-subset metric value is therefore used directly as its normalised score in \([0, 1]\). The overall Score reported in the table is the unweighted average of all per-subset metric scores across both metrics and both subsets.

Computational cost

Low: Requires single-point energy calculations for the configurations in each subset.

Data availability

Input structures:

  • Subset fe of the Relastab dataset (\(\mathrm{Fe}\) SIA configurations, PBE functional).

    • A. Allera, T.D. Swinburne, A.M. Goryaeva, B. Bienvenu, F. Ribeiro, M. Perez, M.-C. Marinica, D. Rodney, Activation entropy of dislocation glide in body-centered cubic metals from atomistic simulations, Nat Commun 16, 8367 (2025).

    • A.M. Goryaeva, C. Lapointe, C. Dai, J. Dérès, J.-B. Maillet, M.-C. Marinica, Reinforcing materials modelling by encoding the structures of defects in crystalline solids into distortion scores, Nat Commun 11, 4691 (2020).

  • Subset cawo4 of the Relastab dataset (\(\mathrm{CaWO_4}\) interstitial configurations, PBE functional).

    • G. Soum-Sidikov, A. Boisard, D. L010, M. Loidl, O. Stézowski, A. Music, G. Music, Y. Zeng, R. Cong, T. Yang, A. Echeverria, L. Music, D. Music, V. Music, Calculation of crystal defects induced in \(\mathrm{CaWO_4}\) cryogenic detector by 100 eV displacement cascades using a data driven force field, Phys. Rev. D 111, 085021 (2025).

Reference data:

  • Computed from the DFT total energies provided with the input structures.