Element filtering¶
ML-PEG provides the ability to filter each benchmark by elements. In its most basic form, this works by evaluating whether selected elements are required to carry out the entirity of a benchmark, and if they are, the benchmark is excluded from calculated scores.
In future, partial filtering will be supported, where individual data points can be excluded from calculation scores.
In either case, several checks and modifications should be made when adding a new benchmark, as described below.
Calculations¶
The key principle behind changes to calculations is to provide a more consistent output between models. Calculations should typically be designed to catch, rather than raise errors. For example:
for atoms in atoms_list:
try:
atoms.get_potential_energy()
except Exception as exc:
warn(f"Error calculating energy: {exc}", stacklevel=2)
atoms.info["energy"] = np.nan
Although this can also be handled during the analysis, catching errors in this way makes it simpler to compare outputs between models. For tests that investigate multiple structures and/or properties, this also ensures the calculation is not interupted, which will be important for future partial filtering.
This can be tested using the MockErrorCalculator:
ml_peg calc --test [test] --models-file ml_peg/models/mock_error.yml --no-run-mock
which will typically store results in
ml_peg/calcs/[category]/[test]/outputs/mock-error.
In order to provide a baseline, the MockCalculator can be used to return 0 for all
energies, forces, and stresses. This can be run similarly to above:
ml_peg calc --test [test] --models-file ml_peg/models/mock.yml
However, this calculator is also run by default when running a calculation, and can be
run without any other models by using the `--mock-only flag:
ml_peg calc --test [test] --mock-only
Running the MockCalculator can be skipped --no-run-mock when calling
ml_peg calc, as we shown above for the MockErrorCalculator.
In most cases, the results produced by MockCalculator will be to save elemental
information that the application needs for filtering, as well additional information,
such as labels for hoverdata.
Analysis¶
Analyses must be updated to ensure they save the elemental information required for
filtering. This typically involves using one of two utility functions. If only a single
list of elements needs to be extracted from the structure data, write_struct_info
can be used:
from ml_peg.app.utils.utils import write_struct_info
def test_benchmark(metrics):
write_struct_info(
data_path=CALC_PATH / "mock" / "struct.extxyz",
out_path=OUT_PATH,
index=0,
)
This function takes a structure file, or list of structure files, and saves the
combined set of elements found to OUT_PATH / "info.json".
While this all that is strictly required for basic filtering, in preparation for
partial filtering, it is encouraged to save finer-grained elemental information where
relevant. For example, if a benchmark iterates through many structures, a list of
elements should be saved for each. This can be done using get_struct_info:
from ml_peg.app.utils.utils import get_struct_info
SYSTEM_INFO = get_struct_info(
calc_path=CALC_PATH,
glob_pattern="*.xyz",
index="0",
include_filenames=True,
out_path=OUT_PATH,
info_keys=["sys_formula"],
)
Since tests of this form often require labels to be extracted for hoverdata, this
function also provides the option to save lists of keys saved to Atoms.info through
the
info_keysoption, as well as the filename stems usinginclude_filenames.
Instead of a single list of elements, this will save elements as a list of lists in
OUT_PATH / "info.json", as well as storing the other extracted informaiton in the
same file.
The main other change that may be required is to ensure analyses can handle missing files and invalid values when calculating metrics. In these cases, metric values should be set to NaN/None (in either case, they will be treated as a NaN when calculating scores).
Application¶
Changes to the application focus on loading the elemental information produced during analysis, and defining functions to use this information to filter the metrics.
Currently, relatively minimal changes are needed to enable filtering for each
application, as this is largely handled behind-the-scenes. When instaniating a test app,
info_path should be set to the path of the info.json that was produced during
analysis (typically, DATA_PATH / "info.json").
BaseApp, which each test app should inherit from, defines functions to load this
file. It then uses the "elements" key from the loaded info dictionary to determine
the full set of elements used by the test, and defines a function to filter the table
metrics based on a list of elements.
Partial filtering will require custom filter functions for each application, and is why we store elemental information per data point, rather than simply the combined set.