
Experiment Aggregators

SingletonAggregator

Bases: ExperimentAggregator

An aggregation to apply to an ExperimentGroup that needs no aggregation.

For example, the ExperimentGroup only contains one Experiment.

Essentially just the identity function:

\[f(x)=x\]

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.

BetaAggregator

Bases: ExperimentAggregator

Samples from the beta-conflated distribution.

Specifically, the aggregate distribution \(\text{Beta}(\tilde{\alpha}, \tilde{\beta})\) is estimated as:

\[\begin{aligned} \tilde{\alpha}&=\left[\sum_{i=1}^{M}\alpha_{i}\right]-\left(M-1\right) \\ \tilde{\beta}&=\left[\sum_{i=1}^{M}\beta_{i}\right]-\left(M-1\right) \end{aligned}\]

where \(M\) is the total number of experiments.
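As a minimal sketch of the formula above (not the library's own implementation), the conflated parameters can be computed directly from the per-experiment parameters. The function name `conflate_beta` and the example values are hypothetical. The final check relies on the fact that the product of beta densities is proportional to the conflated beta density.

```python
import numpy as np
from scipy import stats


def conflate_beta(alphas, betas):
    """Return the parameters of the beta-conflated distribution (illustrative helper)."""
    alphas = np.asarray(alphas, dtype=float)
    betas = np.asarray(betas, dtype=float)
    m = alphas.size  # number of experiments, M
    return alphas.sum() - (m - 1), betas.sum() - (m - 1)


# Hypothetical per-experiment parameters
alphas = [8.0, 12.0, 10.0]
betas = [4.0, 5.0, 6.0]
agg_alpha, agg_beta = conflate_beta(alphas, betas)  # (28.0, 13.0)

# The pointwise product of the individual densities should differ from the
# conflated density only by a constant factor.
x = np.linspace(0.05, 0.95, 19)
product = np.prod([stats.beta.pdf(x, a, b) for a, b in zip(alphas, betas)], axis=0)
ratio = product / stats.beta.pdf(x, agg_alpha, agg_beta)

# Draw samples from the aggregate distribution
samples = stats.beta(agg_alpha, agg_beta).rvs(size=1000, random_state=0)
```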

Uses the scipy.stats.beta class to fit the individual beta distributions.

Assumptions:
  • the individual experiment distributions are beta distributed
  • the metrics are bounded, although the range need not be (0, 1)
Read more:
  1. Hill, T. P. (2008). Conflations Of Probability Distributions: An Optimal Method For Consolidating Data From Different Experiments.
  2. Hill, T. P., & Miller, J. (2011). How to combine independent data sets for the same quantity.
  3. 'Beta distribution' on Wikipedia

Parameters:

  • estimation_method (str, default: 'mle' ) –

    method for estimating the parameters of the individual experiment distributions. Options are 'mle' for maximum-likelihood estimation, or 'mome' for the method of moments estimator. MLE tends to be more efficient, but is more difficult to estimate.
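The two estimation options can be illustrated with a short sketch. It assumes samples that lie strictly inside (0, 1), so the location and scale are held fixed; the helper names `beta_method_of_moments` and `beta_mle` are hypothetical and are not part of the library. The method of moments uses the closed-form relation between the sample mean and variance and the beta parameters, while the MLE relies on `scipy.stats.beta.fit`.

```python
import numpy as np
from scipy import stats


def beta_method_of_moments(samples):
    """Closed-form estimate from the sample mean and variance (illustrative)."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    var = samples.var(ddof=1)
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def beta_mle(samples):
    """Numerical maximum-likelihood estimate with support fixed to (0, 1)."""
    alpha, beta, _loc, _scale = stats.beta.fit(samples, floc=0.0, fscale=1.0)
    return alpha, beta


rng = np.random.default_rng(0)
samples = rng.beta(8.0, 4.0, size=20_000)  # synthetic data, true parameters (8, 4)

mome_alpha, mome_beta = beta_method_of_moments(samples)
mle_alpha, mle_beta = beta_mle(samples)
```

With this much data both estimators should land close to the true parameters; the method of moments needs no iterative optimization, which is why it is the cheaper fallback.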

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.

GammaAggregator

Bases: ExperimentAggregator

Samples from the Gamma-conflated distribution.

Specifically, the aggregate distribution \(\text{Gamma}(\tilde{\alpha}, \tilde{\beta})\) (\(\alpha\) is the shape, \(\beta\) the scale parameter) is estimated as:

\[\begin{aligned} \tilde{\alpha}&=\left[\sum_{i}^{M}\alpha_{i}\right]-(M-1) \\ \tilde{\beta}&=\dfrac{1}{\sum_{i}^{M}\beta_{i}^{-1}} \end{aligned}\]

where \(M\) is the total number of experiments.
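The following sketch evaluates the formula above under the assumption that \(\beta\) is passed as a scale parameter (as in `scipy.stats.gamma`), which is the convention under which the harmonic-style sum is the normalized product of gamma densities. The helper `conflate_gamma` and the example values are hypothetical, and the optional support shift is not modelled.

```python
import numpy as np
from scipy import stats


def conflate_gamma(shapes, scales):
    """Return shape and scale of the gamma-conflated distribution (illustrative)."""
    shapes = np.asarray(shapes, dtype=float)
    scales = np.asarray(scales, dtype=float)
    m = shapes.size  # number of experiments, M
    agg_shape = shapes.sum() - (m - 1)
    agg_scale = 1.0 / np.sum(1.0 / scales)
    return agg_shape, agg_scale


# Hypothetical per-experiment parameters
shapes = [3.0, 5.0, 4.0]
scales = [2.0, 1.0, 2.0]
agg_shape, agg_scale = conflate_gamma(shapes, scales)  # (10.0, 0.5)

# The product of the individual densities should be proportional to the
# conflated density.
x = np.linspace(0.5, 10.0, 20)
product = np.prod(
    [stats.gamma.pdf(x, a, scale=s) for a, s in zip(shapes, scales)], axis=0
)
ratio = product / stats.gamma.pdf(x, agg_shape, scale=agg_scale)
```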

An optional shifted: bool argument enables dynamic estimation of the distribution's support. This can improve the fit to individual experiments, but likely has minimal impact on the aggregate distribution.

Assumptions:
  • the individual experiment distributions are gamma distributed
Read more:
  1. Hill, T. P. (2008). Conflations Of Probability Distributions: An Optimal Method For Consolidating Data From Different Experiments.
  2. Hill, T. P., & Miller, J. (2011). How to combine independent data sets for the same quantity.
  3. 'Gamma distribution' on Wikipedia

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.

FEGaussianAggregator

Bases: ExperimentAggregator

Samples from the Gaussian-conflated distribution.

This is equivalent to the fixed-effects meta-analytical estimator.

Uses the inverse variance weighted mean and standard errors. Specifically, the aggregate distribution \(\mathcal{N}(\tilde{\mu}, \tilde{\sigma})\) is estimated as:

\[\begin{aligned} w_{i}&=\dfrac{\sigma_{i}^{-2}}{\sum_{j}^{M}\sigma_{j}^{-2}} \\ \tilde{\mu}&=\sum_{i}^{M}w_{i}\mu_{i} \\ \tilde{\sigma}^2&=\dfrac{1}{\sum_{i}^{M}\sigma_{i}^{-2}} \end{aligned}\]

where \(M\) is the total number of experiments.
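A minimal sketch of the inverse-variance weighting above, assuming each experiment has already been summarized by a mean and a standard error. The function name `fixed_effects` and the example numbers are hypothetical.

```python
import numpy as np


def fixed_effects(means, variances):
    """Inverse-variance weighted mean and variance (illustrative helper)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances
    weights = precisions / precisions.sum()  # w_i, sums to 1
    agg_mean = np.sum(weights * means)
    agg_var = 1.0 / precisions.sum()
    return agg_mean, agg_var


# Hypothetical per-experiment means and standard errors
means = [0.70, 0.74, 0.72]
std_errs = np.array([0.02, 0.04, 0.02])
agg_mean, agg_var = fixed_effects(means, std_errs**2)

# Draw samples from the aggregate normal distribution
rng = np.random.default_rng(0)
samples = rng.normal(agg_mean, np.sqrt(agg_var), size=1000)
```

The aggregate variance is always smaller than the smallest individual variance, because precisions add.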

Assumptions:
  • the individual experiment distributions are normally (Gaussian) distributed
  • there is no inter-experiment heterogeneity present
Read more:
  1. Hill, T. P. (2008). Conflations Of Probability Distributions: An Optimal Method For Consolidating Data From Different Experiments.
  2. Hill, T. P., & Miller, J. (2011). How to combine independent data sets for the same quantity.
  3. Higgins, J., & Thomas, J. (Eds.). (2023). Cochrane handbook for systematic reviews of interventions.
  4. Borenstein et al. (2021). Introduction to meta-analysis.
  5. 'Meta-analysis' on Wikipedia

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.

REGaussianAggregator

Bases: ExperimentAggregator

Samples from the Random Effects Meta-Analytical Estimator.

First uses the standard inverse variance weighted mean and standard errors as model parameters, before debiasing the weights to incorporate inter-experiment heterogeneity. As a result, studies with larger standard errors will be upweighted relative to the fixed-effects model.

Specifically, starting with a fixed-effects model \(\mathcal{N}(\tilde{\mu}_{\text{FE}}, \tilde{\sigma}_{\text{FE}})\), the weights and aggregate parameters are re-estimated as:

\[\begin{aligned} w_{i}&=\dfrac{\left(\sigma_{i}^2+\tau^2\right)^{-1}}{\sum_{j}^{M}\left(\sigma_{j}^2+\tau^2\right)^{-1}} \\ \tilde{\mu}&=\sum_{i}^{M}w_{i}\mu_{i} \\ \tilde{\sigma}^2&=\dfrac{1}{\sum_{i}^{M}\left(\sigma_{i}^2+\tau^2\right)^{-1}} \end{aligned}\]

where \(\tau\) is the estimated inter-experiment heterogeneity, and \(M\) is the total number of experiments.
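To illustrate how the heterogeneity term changes the weighting, the sketch below compares fixed-effects weights (\(\tau^2=0\)) with random-effects weights for a hypothetical \(\tau^2\). The helper name `inverse_variance_weights` and all numbers are assumptions chosen for illustration.

```python
import numpy as np


def inverse_variance_weights(variances, tau2=0.0):
    """Normalized weights 1 / (sigma_i^2 + tau^2); tau2=0 gives fixed-effects weights."""
    precisions = 1.0 / (np.asarray(variances, dtype=float) + tau2)
    return precisions / precisions.sum()


# Hypothetical experiments: the second one is the noisiest
means = np.array([0.70, 0.80, 0.72])
variances = np.array([0.02, 0.04, 0.02]) ** 2
tau2 = 0.03**2  # assumed inter-experiment heterogeneity

fe_weights = inverse_variance_weights(variances)
re_weights = inverse_variance_weights(variances, tau2=tau2)

fe_mean = np.sum(fe_weights * means)
re_mean = np.sum(re_weights * means)
```

Adding the same \(\tau^2\) to every variance flattens the weights, so the noisiest experiment gains weight and the pooled mean moves toward it.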

Uses the Paule-Mandel iterative heterogeneity estimator, which does not make a parametric assumption. The more common (but biased) DerSimonian-Laird estimator can also be used by setting paule_mandel_heterogeneity: bool = False.
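Both heterogeneity estimators can be sketched in a few lines. This is an illustrative version based on their textbook definitions, not the library's code: DerSimonian-Laird is a closed-form moment estimator, while Paule-Mandel searches for the \(\tau^2\) at which the generalized Q statistic equals its expected value \(M-1\). The function names are hypothetical, and the root is found here with `scipy.optimize.brentq` rather than any particular iteration scheme.

```python
import numpy as np
from scipy import optimize


def q_statistic(means, variances, tau2):
    """Generalized Cochran's Q for a given heterogeneity tau2."""
    weights = 1.0 / (variances + tau2)
    pooled = np.sum(weights * means) / np.sum(weights)
    return np.sum(weights * (means - pooled) ** 2)


def dersimonian_laird_tau2(means, variances):
    """Closed-form moment estimator, truncated at zero."""
    weights = 1.0 / variances
    q = q_statistic(means, variances, 0.0)
    c = weights.sum() - np.sum(weights**2) / weights.sum()
    return max(0.0, (q - (means.size - 1)) / c)


def paule_mandel_tau2(means, variances):
    """Solve Q(tau2) = M - 1 for tau2 >= 0; Q decreases as tau2 grows."""
    dof = means.size - 1

    def excess(tau2):
        return q_statistic(means, variances, tau2) - dof

    if excess(0.0) <= 0.0:
        return 0.0  # no evidence of heterogeneity
    upper = max(np.var(means, ddof=1), 1e-12)
    while excess(upper) > 0.0:  # expand the bracket until the sign changes
        upper *= 2.0
    return optimize.brentq(excess, 0.0, upper)


# Hypothetical, visibly heterogeneous experiments
means = np.array([0.60, 0.75, 0.70, 0.82])
variances = np.array([0.02, 0.03, 0.02, 0.04]) ** 2

dl_tau2 = dersimonian_laird_tau2(means, variances)
pm_tau2 = paule_mandel_tau2(means, variances)
```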

If hksj_sampling_distribution: bool = True, the aggregated distribution is a more conservative \(t\)-distribution, with degrees of freedom equal to \(M-1\). The correction matters most when only a few experiments are available, where it can substantially increase the aggregated distribution's variance.
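The effect of the \(t\)-distribution can be sketched as follows. The variance scaling used here follows the Hartung-Knapp-Sidik-Jonkman form described by IntHout et al. (2014) and is an assumption about the details, not a transcription of the library's code; the helper name `hksj_parameters`, the data, and the value of \(\tau^2\) are hypothetical.

```python
import numpy as np
from scipy import stats


def hksj_parameters(means, variances, tau2):
    """Location, scale and degrees of freedom of the HKSJ t-distribution (sketch)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = means.size
    weights = 1.0 / (variances + tau2)
    pooled_mean = np.sum(weights * means) / weights.sum()
    # Weighted residual variance, scaled by the degrees of freedom
    hksj_var = np.sum(weights * (means - pooled_mean) ** 2) / ((m - 1) * weights.sum())
    return pooled_mean, np.sqrt(hksj_var), m - 1


means = [0.60, 0.75, 0.70, 0.82]
variances = np.array([0.02, 0.03, 0.02, 0.04]) ** 2
loc, scale, dof = hksj_parameters(means, variances, tau2=0.005)

# Compare 95% intervals: t-distribution versus a normal with the same scale
t_low, t_high = stats.t(df=dof, loc=loc, scale=scale).interval(0.95)
n_low, n_high = stats.norm(loc=loc, scale=scale).interval(0.95)

# Draw samples from the aggregate t-distribution
samples = stats.t(df=dof, loc=loc, scale=scale).rvs(size=1000, random_state=0)
```

With only four experiments the \(t\)-interval is noticeably wider than the normal one; the gap closes as \(M\) grows.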

Assumptions:
  • the individual experiment distributions are normally (Gaussian) distributed
  • there is inter-experiment heterogeneity present
Read more:
  1. Higgins, J., & Thomas, J. (Eds.). (2023). Cochrane handbook for systematic reviews of interventions.
  2. Borenstein et al. (2021). Introduction to meta-analysis.
  3. 'Meta-analysis' on Wikipedia
  4. IntHout, J., Ioannidis, J. P., & Borm, G. F. (2014). The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method.
  5. Langan et al. (2019). A comparison of heterogeneity variance estimators in simulated random‐effects meta‐analyses.

Parameters:

  • paule_mandel_heterogeneity (bool, default: True ) –

    whether to use the Paule-Mandel method for estimating inter-experiment heterogeneity, or fall back to the DerSimonian-Laird estimator. Defaults to True.

  • hksj_sampling_distribution (bool, default: False ) –

    whether to use the Hartung-Knapp-Sidik-Jonkman corrected \(t\)-distribution as the aggregate sampling distribution. Defaults to False.

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.

HistogramAggregator

Bases: ExperimentAggregator

Samples from a histogram approximate conflation distribution.

First bins all individual experiment groups, and then computes the product of the probability masses across individual experiments.

Unlike other methods, this does not make a parametric assumption. However, the resulting distribution can 'look' unnatural, and requires overlapping supports within the sample. If any experiment assigns 0 probability mass to any bin, the conflated bin will also contain 0 probability mass.

As such, inter-experiment heterogeneity can be a significant problem.

Uses numpy.histogram_bin_edges to estimate the number of bin edges needed per experiment, and takes the smallest across all experiments for the aggregate distribution.
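The procedure can be sketched as below. This is an illustrative reading of the description, not the library's implementation: the choice of shared bin edges spanning the pooled sample range, the `bins="auto"` rule, and sampling uniformly within a chosen bin are all assumptions, and the function name `histogram_conflation` is hypothetical.

```python
import numpy as np


def histogram_conflation(experiment_samples, rng, size=1000):
    """Sample from the normalized product of per-experiment histograms (sketch)."""
    # Number of bins: the smallest estimate across experiments
    n_bins = min(
        len(np.histogram_bin_edges(s, bins="auto")) - 1 for s in experiment_samples
    )
    # Shared edges so that bin masses can be multiplied across experiments
    low = min(np.min(s) for s in experiment_samples)
    high = max(np.max(s) for s in experiment_samples)
    edges = np.linspace(low, high, n_bins + 1)

    masses = np.ones(n_bins)
    for s in experiment_samples:
        counts, _ = np.histogram(s, bins=edges)
        masses *= counts / counts.sum()  # an empty bin zeroes the product
    if masses.sum() == 0.0:
        raise ValueError("Experiment supports do not overlap.")
    masses /= masses.sum()

    # Pick a bin by its mass, then a point uniformly inside that bin
    idx = rng.choice(n_bins, size=size, p=masses)
    return rng.uniform(edges[idx], edges[idx + 1]), edges, masses


rng = np.random.default_rng(0)
# Two hypothetical experiments with overlapping supports
experiments = [rng.normal(0.70, 0.03, 2000), rng.normal(0.74, 0.03, 2000)]
samples, edges, masses = histogram_conflation(experiments, rng)
```

Because the masses are multiplied, the aggregate concentrates where the experiments agree, and completely disjoint supports leave no mass at all, which is the failure case the sketch reports with a `ValueError`.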

Assumptions:
  • the individual experiment distributions' supports overlap
Read more:
  1. Hill, T. P. (2008). Conflations Of Probability Distributions: An Optimal Method For Consolidating Data From Different Experiments.
  2. Hill, T. P., & Miller, J. (2011). How to combine independent data sets for the same quantity.

Attributes

name property

The name of the experiment aggregation method.

Functions

__call__

Aggregates a series of experiment results from a specific ExperimentGroup and Metric.