Metrics¶
- class deepdiagnostics.metrics.metric.Metric(model, data, run_id, out_dir=None, save=True, use_progress_bar=None, samples_per_inference=None, percentiles=None, number_simulations=None)¶
These parameters are used for every metric calculated, and for plots that require new inference to be run. Calculates a given metric and saves the output to a json if out_dir is supplied and save is True. A usage sketch follows the parameter list.
- Parameters:
model (deepdiagnostics.models.model) – Model to calculate the metric for. Required.
data (deepdiagnostics.data.data) – Data to test against. Required.
out_dir (Optional[str], optional) – Directory to save a json (results.json) to. Defaults to None.
save (bool, optional) – Save the output to json. Defaults to True.
use_progress_bar (Optional[bool], optional) – Show a progress bar when iteratively performing inference. Defaults to None.
samples_per_inference (Optional[int], optional) – Number of samples used in a single iteration of inference. Defaults to None.
percentiles (Optional[Sequence[int]], optional) – List of integer percentiles, for defining coverage regions. Defaults to None.
number_simulations (Optional[int], optional) – Number of different simulations to run. Often, this means that the number of inferences performed for a metric is samples_per_inference*number_simulations. Defaults to None.
run_id (str) – Identifier for the run, used to label saved output. Required.
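Every metric subclass shares this constructor signature, so a metric can be configured once and reused. A minimal sketch, assuming model and data are already-constructed deepdiagnostics model and data objects:

from deepdiagnostics.metrics import CoverageFraction

# model and data are assumed to be pre-built deepdiagnostics objects.
metric = CoverageFraction(
    model,
    data,
    run_id="example_run",        # labels the saved results
    out_dir="./diagnostics",     # results.json is written here when save=True
    save=True,
    samples_per_inference=1000,  # posterior samples drawn per simulation
    percentiles=[68, 95],        # confidence regions to evaluate
    number_simulations=50,       # inferences = samples_per_inference * number_simulations
)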
- class deepdiagnostics.metrics.AllSBC(model, data, run_id, out_dir=None, save=True, use_progress_bar=None, samples_per_inference=None, percentiles=None, number_simulations=None)¶
Calculate simulation-based calibration (SBC) diagnostic metrics and add them to the output. Adapted from the sbi package [TCBD+20], where more information about the individual metrics can be found.
from deepdiagnostics.metrics import AllSBC

metrics = AllSBC(model, data, save=False)()
metrics = metrics.output
- calculate()¶
Calculate all SBC diagnostic metrics
- Returns:
Dictionary with all calculations, labeled by their name.
- Return type:
dict[str, Sequence]
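Since calculate() returns a plain dictionary keyed by metric name, the individual diagnostics can be inspected directly. A short sketch (the specific key names depend on which SBC statistics are computed, and are not assumed here):

from deepdiagnostics.metrics import AllSBC

results = AllSBC(model, data, save=False).calculate()

# Each entry maps an SBC statistic's name to its computed values.
for name, values in results.items():
    print(name, values)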
- deepdiagnostics.metrics.LC2ST¶
alias of LocalTwoSampleTest
- class deepdiagnostics.metrics.local_two_sample.LocalTwoSampleTest(model, data, run_id, out_dir=None, save=True, use_progress_bar=None, samples_per_inference=None, percentiles=None, number_simulations=None)¶
Note
A simulator is required to run this metric.
Adapted from [LGR23]. Trains a classifier to verify the quality of the posterior via classification accuracy. Produces arrays of classification probabilities for the trained classifier and for classifiers trained under the null hypothesis (that the posterior output of the simulation is not significantly different from a given random sample).
Code referenced from: github.com/JuliaLinhart/lc2st/lc2st.py::train_lc2st.
from deepdiagnostics.metrics import LC2ST

true_probabilities, null_hypothesis_probabilities = LC2ST(model, data, save=False).calculate()
- calculate(linear_classifier='MLP', cross_evaluate=True, n_null_hypothesis_trials=100, classifier_kwargs=None)¶
Perform the calculation for the LC2ST. Adds the results to lc2st.output (dict) under the keys “lc2st_probabilities” and “lc2st_null_hypothesis_probabilities” as lists. A usage sketch follows the return description.
- Parameters:
linear_classifier (Union[str, list[str]], optional) – Linear classifier to use for the test. Only “MLP” is implemented. Defaults to “MLP”.
cross_evaluate (bool, optional) – Use k-fold cross-validation for evaluation. Defaults to True.
n_null_hypothesis_trials (int, optional) – Number of times to draw and test the null hypothesis. Defaults to 100.
classifier_kwargs (Union[dict, list[dict]], optional) – Additional kwargs for the linear classifier. Defaults to None.
- Returns:
Arrays storing the true and null-hypothesis probabilities given the linear classifier.
- Return type:
tuple[np.ndarray, np.ndarray]
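One way to read the result is to compare the classifier's probabilities against the spread of the null-hypothesis trials. A heuristic sketch, assuming the null trials come back as a list of per-trial arrays (the exact layout is an assumption, not documented API):

import numpy as np
from deepdiagnostics.metrics import LC2ST

probs, null_probs = LC2ST(model, data, save=False).calculate(n_null_hypothesis_trials=100)

# Heuristic reading (an assumption, not library API): if the mean predicted
# probability falls inside the central 95% of the null trials, the posterior
# is consistent with the reference distribution.
true_mean = np.mean(probs)
null_means = np.array([np.mean(trial) for trial in null_probs])
low, high = np.percentile(null_means, [2.5, 97.5])
print(f"mean: {true_mean:.3f}, null 95% interval: [{low:.3f}, {high:.3f}]")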
- class deepdiagnostics.metrics.CoverageFraction(model, data, run_id, out_dir=None, save=True, use_progress_bar=None, samples_per_inference=None, percentiles=None, number_simulations=None)¶
Calculate the coverage of a set number of inferences over different confidence regions.
from deepdiagnostics.metrics import CoverageFraction

samples, coverage = CoverageFraction(model, data, save=False).calculate()
- calculate()¶
Calculate the coverage fraction of the given model and data
- Returns:
A tuple of the samples tested, with shape (number of simulations, samples per inference, number of parameters), and the coverage over those samples for each requested percentile.
- Return type:
tuple[Sequence, Sequence]
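For a calibrated posterior, the empirical coverage should sit near the nominal level of each requested confidence region. A sketch of that check, assuming the coverage array's leading axis runs over simulations (an assumption about the layout, not documented API):

import numpy as np
from deepdiagnostics.metrics import CoverageFraction

samples, coverage = CoverageFraction(
    model, data, save=False, percentiles=[68, 95]
).calculate()

# A calibrated posterior covers the truth at roughly the nominal rate,
# e.g. ~0.68 of the time for the 68% region.
coverage = np.asarray(coverage)
print("mean empirical coverage per percentile:", coverage.mean(axis=0))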
[LCHP23] Pablo Lemos, Adam Coogan, Yashar Hezaveh, and Laurence Perreault-Levasseur. Sampling-based accuracy testing of posterior estimators for general inference. 2023. arXiv:2302.03026.
[LGR23] Julia Linhart, Alexandre Gramfort, and Pedro L. C. Rodrigues. L-C2ST: local diagnostics for posterior approximations in simulation-based inference. 2023. arXiv:2306.03580.
[TCBD+20] Alvaro Tejero-Cantero, Jan Boelts, Michael Deistler, Jan-Matthis Lueckmann, Conor Durkan, Pedro J. Gonçalves, David S. Greenberg, and Jakob H. Macke. sbi: a toolkit for simulation-based inference. Journal of Open Source Software, 5(52):2505, 2020. doi:10.21105/joss.02505.