Sensitivity Analysis

Neil D. Lawrence

Emukit Sensitivity Analysis

A possible definition of sensitivity analysis is the following: the study of how uncertainty in the output of a model (numerical or otherwise) can be apportioned to different sources of uncertainty in the model input (Saltelli et al., 2004). A related practice is ‘uncertainty analysis’, which focuses instead on quantifying the uncertainty in the model output. Ideally, uncertainty and sensitivity analyses should be run in tandem, with uncertainty analysis preceding sensitivity analysis in current practice.

A more detailed overview of these ideas is given in Chapter 1 of Saltelli et al. (2008).

What is Sensitivity Analysis?

  • Study of how output uncertainty relates to input uncertainty
  • Determines which inputs contribute most to output variations
  • Complements uncertainty analysis which quantifies output uncertainty
  • Best practice: Run uncertainty analysis first, then sensitivity analysis

Local Sensitivity Analysis

  • Examines function sensitivity around a specific point
  • Uses partial derivatives (Jacobian matrix)
  • Only valid near operating point
  • Limited view - doesn’t capture global behavior
  • Useful for small perturbations around operating point

Global Sensitivity Analysis

  • Examines sensitivity across entire input domain
  • Uses ANOVA/Hoeffding-Sobol decomposition
  • Based on total variance of function
  • Requires assumptions about input distributions
  • More comprehensive than local analysis

The Maths

  • Total variance of function: \[\text{var}\left(g(\mathbf{ x})\right) = \left\langle g(\mathbf{ x})^2 \right\rangle _{p(\mathbf{ x})} - \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x})}^2\]

  • Expectation defined as: \[\left\langle h(\mathbf{ x}) \right\rangle _{p(\mathbf{ x})} = \int_\mathbf{ x}h(\mathbf{ x}) p(\mathbf{ x}) \text{d}\mathbf{ x}\]

  • \(p(\mathbf{ x})\) represents probability distribution of inputs

Input Density

  • Assume inputs are independent
  • Each input uniformly distributed
  • Scale inputs to [0,1] interval
  • Simplifies analysis while maintaining generality

Input Density Mathematics

  • Independent inputs means: \[p(\mathbf{ x}) = \prod_{i=1}^pp(x_i)\]
  • Uniform distribution on [0,1]: \[x_i \sim \mathcal{U}\left(0,1\right)\]
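Under these assumptions the expectation and variance defined above can be estimated by straightforward Monte Carlo. A minimal numpy sketch (the function g here is just an illustrative stand-in, not one taken from these notes):

```python
import numpy as np

rng = np.random.default_rng(0)

def g(X):
    """Illustrative test function of two inputs; any deterministic g would do."""
    return np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 1] ** 2

p = 2                                       # number of inputs
X = rng.uniform(0, 1, size=(100_000, p))    # samples from p(x) = prod_i U(0, 1)
gX = g(X)

g_mean = gX.mean()                          # estimate of <g(x)>_{p(x)}
g_var = (gX ** 2).mean() - g_mean ** 2      # estimate of <g^2> - <g>^2
print(g_mean, g_var)
```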

Hoeffding-Sobol Decomposition

Decomposition

  • Function decomposed into sum of terms: \[g(\mathbf{ x}) = g_0 + \sum_{i=1}^pg_i(x_i) + \sum_{i<j}^{p} g_{ij}(x_i,x_j)\] \[+ \sum_{i<j<k}^{p} g_{ijk}(x_i,x_j,x_k) + \cdots + g_{1,2,\dots,p}(x_1,\dots,x_p)\]

Decomposition Terms

  • Terms represent:
    • Constant (\(g_0\))
    • Individual effects (\(g_i\))
    • Interaction effects (\(g_{ij}\))
    • Higher-order interactions

Base Terms

  • Constant term is overall expectation: \[g_0 = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x})}\]
  • Individual effects: \[g_i(x_i) = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x}_{\sim i})} - g_0\]
  • Where \(p(\mathbf{ x}_{\sim i})\) means marginalizing out \(i\)th variable: \[p(\mathbf{ x}_{\sim i}) = \int p(\mathbf{ x}) \text{d}x_i\]

Interaction Terms

  • Two-way interactions: \[g_{i,j}(x_i, x_j) = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x}_{\sim i,j})} - g_i(x_i) - g_j(x_j) - g_0\]
  • Higher order terms follow similar pattern
  • Each term requires computing lower-order terms first
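The low-order terms of the decomposition can be estimated numerically in the same spirit: fix the inputs of interest and average over the rest. A sketch (again using the illustrative g from the earlier snippet, and plain Monte Carlo for the marginalisation):

```python
import numpy as np

rng = np.random.default_rng(1)

def g(X):
    return np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 1] ** 2

p = 2
g0 = g(rng.uniform(0, 1, size=(100_000, p))).mean()   # g_0 = <g(x)>_{p(x)}

def g_main(i, xi, n=100_000):
    """Estimate g_i(x_i) = <g(x)>_{p(x_~i)} - g_0 by fixing x_i and averaging over the rest."""
    X = rng.uniform(0, 1, size=(n, p))
    X[:, i] = xi
    return g(X).mean() - g0

# g_{ij}(x_i, x_j) follows the same pattern with two inputs held fixed,
# subtracting g_i, g_j and g_0 as in the formula above.
print(g_main(0, 0.25), g_main(1, 0.25))
```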

The Hoeffding-Sobol, or ANOVA, decomposition of a function allows us to write it as, \[ \begin{align*} g(\mathbf{ x}) = & g_0 + \sum_{i=1}^pg_i(x_i) + \sum_{i<j}^{p} g_{ij}(x_i,x_j) + \cdots \\ & + g_{1,2,\dots,p}(x_1,x_2,\dots,x_p), \end{align*} \] where \[ g_0 = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x})} \] and \[ g_i(x_i) = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x}_{\sim i})} - g_0, \] where we’re using the notation \(p(\mathbf{ x}_{\sim i})\) to represent the input distribution with the \(i\)th variable marginalised, \[ p(\mathbf{ x}_{\sim i}) = \int p(\mathbf{ x}) \text{d}x_i. \] Higher order terms in the decomposition represent interactions between inputs, \[ g_{i,j}(x_i, x_j) = \left\langle g(\mathbf{ x}) \right\rangle _{p(\mathbf{ x}_{\sim i,j})} - g_i(x_i) - g_j(x_j) - g_0, \] and similar expressions can be written for higher order terms up to \(g_{1,2,\dots,p}(\mathbf{ x})\).

Variance Decomposition

  • Sobol decomposition components are orthogonal
  • Orthogonality means the total variance \[\text{var}(g) = \left\langle g(\mathbf{ x})^2 \right\rangle _{p(\mathbf{ x})} - g_0^2\] decomposes into a sum of component variances (next slide)

ANOVA Decomposition

  • Decomposes into sum of variance terms: \[\text{var}(g) = \sum_{i=1}^p\text{var}\left(g_i(x_i)\right) + \sum_{i<j}^{p} \text{var}\left(g_{ij}(x_i,x_j)\right) + \cdots\]
  • Known as ANOVA (Analysis of Variance) decomposition
  • Each term represents variance from different input interactions

Sobol Indices

  • Rescale variance components by total variance
  • Gives Sobol indices: \[S_\ell = \frac{\text{var}\left(g_\ell(\mathbf{ x}_\ell)\right)}{\text{var}\left(g(\mathbf{ x})\right)}\]
  • \(\ell\) represents different input combinations
  • See Durrande et al. (2013) for elegant approach using covariance structure

Sobol Indices: The Intuition

  • Sobol indices tell us “how much each input matters”
  • Values between 0 and 1:
    • 0 means input has no effect
    • 1 means output variance entirely due to this input
  • Can analyze:
    • Individual inputs (\(S_i\))
    • Pairs of inputs (\(S_{ij}\))
    • Higher-order interactions
  • Sum of all indices = 1
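A quick worked example (not from the slides above): for the purely additive function \(g(\mathbf{ x}) = x_1 + 2 x_2\) with \(x_1, x_2\) independent and uniform on \([0,1]\), each input has variance \(1/12\), so \(\text{var}(g) = 1/12 + 4/12 = 5/12\). The first order indices are therefore \(S_1 = 0.2\) and \(S_2 = 0.8\), and \(S_{12} = 0\) because there is no interaction term.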

Example: the Ishigami function

Ishigami Function Example: Overview

  • Will explore sensitivity analysis using Ishigami test function
  • Selected because it allows exact calculation of Sobol indices
  • In real applications, exact calculation rarely possible
  • Will demonstrate progression of methods:
    1. Exact calculation (possible for Ishigami)
    2. Monte Carlo estimation (general but expensive)
    3. GP emulation (efficient approximation)

Ishigami Function

\[ g(\textbf{x}) = \sin(x_1) + a \sin^2(x_2) + b x_3^4 \sin(x_1). \]
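A possible Python definition is sketched below. The constants \(a\) and \(b\), and the conventional input range of \([-\pi, \pi]^3\), are the usual choices from the literature and are assumptions here rather than values fixed by the slides:

```python
import numpy as np

def ishigami(X, a=7.0, b=0.1):
    """Ishigami function; X has shape (n, 3), inputs typically uniform on [-pi, pi]."""
    return (np.sin(X[:, 0])
            + a * np.sin(X[:, 1]) ** 2
            + b * X[:, 2] ** 4 * np.sin(X[:, 0]))
```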

Total Variance
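For independent inputs, each uniform on \([-\pi, \pi]\), the total variance of the Ishigami function has a well-known closed form (stated here for reference; it is not derived in these notes): \[\text{var}\left(g(\mathbf{ x})\right) = \frac{a^2}{8} + \frac{b \pi^4}{5} + \frac{b^2 \pi^8}{18} + \frac{1}{2}.\]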

First Order Sobol Indices using Monte Carlo

\[ S_i = \frac{\text{var}\left(g_i(x_i)\right)}{\text{var}\left(g(\mathbf{ x})\right)}. \]

  • First plot compares true Sobol indices with Monte Carlo estimates
  • Shows relative importance of each input variable
  • Demonstrates accuracy of Monte Carlo sampling approach
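One way to produce such Monte Carlo estimates is a pick-freeze (Saltelli-style) scheme: draw two independent sample matrices and, for each input, swap in one column from the second matrix. The sketch below assumes independent uniform inputs on \([-\pi, \pi]\) and reuses the ishigami definition from above:

```python
import numpy as np

rng = np.random.default_rng(2)

def ishigami(X, a=7.0, b=0.1):   # as sketched above
    return np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2 + b * X[:, 2] ** 4 * np.sin(X[:, 0])

def first_order_sobol(f, p, n=100_000, low=-np.pi, high=np.pi):
    """Pick-freeze Monte Carlo estimate of the first order Sobol indices S_i."""
    A = rng.uniform(low, high, size=(n, p))
    B = rng.uniform(low, high, size=(n, p))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))   # estimate of var(g)
    S = np.empty(p)
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                  # replace column i of A with column i of B
        S[i] = np.mean(fB * (f(ABi) - fA)) / var
    return S

print(first_order_sobol(ishigami, p=3))
```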

Total Effects Using Monte Carlo

  • Next plot shows total effects - including all variable interactions
  • Compares true values with Monte Carlo estimates
  • Demonstrates how variables influence output through interactions
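Total effects can be estimated from the same pair of sample matrices; a sketch using a Jansen-style estimator, with the same assumptions as the first order sketch above:

```python
# Reuses numpy, rng and ishigami from the first order sketch above.
def total_effects_sobol(f, p, n=100_000, low=-np.pi, high=np.pi):
    """Jansen-style Monte Carlo estimate of the total effect indices T_i."""
    A = rng.uniform(low, high, size=(n, p))
    B = rng.uniform(low, high, size=(n, p))
    fA = f(A)
    var = np.var(np.concatenate([fA, f(B)]))
    T = np.empty(p)
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]
        T[i] = 0.5 * np.mean((fA - f(ABi)) ** 2) / var   # includes all interactions involving x_i
    return T

print(total_effects_sobol(ishigami, p=3))
```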

Computing the Sensitivity Indices Using the Output of a Model

  • Final plots compare three approaches:
    • True Sobol indices
    • Direct Monte Carlo estimation
    • GP-based estimation
  • Shows how GP emulator can approximate sensitivity with fewer samples
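A sketch of the emulator route, assuming GPy's standard GPRegression interface and reusing the ishigami and first_order_sobol sketches above: fit a GP to a few hundred simulator evaluations and push its predictive mean through the same Monte Carlo estimator. (Emukit also packages this workflow through its Monte Carlo sensitivity tools; the plain version below just keeps the moving parts visible.)

```python
import numpy as np
import GPy

rng = np.random.default_rng(4)

# A few hundred evaluations of the "expensive" simulator (ishigami from above)
X_train = rng.uniform(-np.pi, np.pi, size=(300, 3))
y_train = ishigami(X_train)[:, None]

model = GPy.models.GPRegression(X_train, y_train)
model.optimize()

def emulator_mean(X):
    """Use the GP predictive mean as a cheap stand-in for the simulator."""
    mean, _ = model.predict(X)
    return mean[:, 0]

# Same Monte Carlo estimator as before, but applied to the emulator rather than g itself
print(first_order_sobol(emulator_mean, p=3, n=20_000))
```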

Conclusions

  • Sobol indices are a tool for explaining the variance of a model’s output in terms of components attributable to the input variables and their interactions.

Catapult Simulation

Nicolas Durrande

Catapult Parameters

  • Four key parameters control the catapult:
    • rotation_axis: Axis the arm rotates around
    • arm_stop: Position where arm stops
    • spring_binding_1: First spring attachment point
    • spring_binding_2: Second spring attachment point

Parameter Vector

  • Parameters combined into vector:

\[ \mathbf{ x}_i = \begin{bmatrix} \texttt{rotation_axis} \\ \texttt{arm_stop} \\ \texttt{spring_binding_1} \\ \texttt{spring_binding_2} \end{bmatrix} \]

Running Experiments

  • Fire catapult with chosen parameters
  • Record distance traveled
  • Helper function requests distance for each configuration

Parameter Space

  • All parameters scaled to [0,1] range
  • Define continuous parameter space for optimization
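A sketch of how this parameter space might be declared, assuming Emukit's ParameterSpace and ContinuousParameter classes (the parameter names follow the list above):

```python
from emukit.core import ContinuousParameter, ParameterSpace

# All four catapult parameters rescaled to [0, 1]
space = ParameterSpace([
    ContinuousParameter('rotation_axis', 0.0, 1.0),
    ContinuousParameter('arm_stop', 0.0, 1.0),
    ContinuousParameter('spring_binding_1', 0.0, 1.0),
    ContinuousParameter('spring_binding_2', 0.0, 1.0),
])
```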

Experimental Design for the Catapult

  • First build an experimental design loop
  • Start with model-free design (random, Latin hypercube, Sobol, orthogonal)
  • Initialize with 5 random samples
  • Build Gaussian process model
  • Use model to guide further sampling
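A sketch of that loop, assuming Emukit's RandomDesign, GPyModelWrapper and ExperimentalDesignLoop together with the space defined above; catapult_distance is a hypothetical stand-in for the helper function that fires the catapult and returns the distance:

```python
import GPy
from emukit.core.initial_designs import RandomDesign
from emukit.model_wrappers import GPyModelWrapper
from emukit.experimental_design import ExperimentalDesignLoop

# Model-free initial design: 5 random points in the parameter space
design = RandomDesign(space)
X_init = design.get_samples(5)
Y_init = catapult_distance(X_init)      # hypothetical helper that runs the simulator

# Gaussian process emulator of distance as a function of the four parameters
gpy_model = GPy.models.GPRegression(X_init, Y_init)
emukit_model = GPyModelWrapper(gpy_model)

# Let the model decide where to fire next
loop = ExperimentalDesignLoop(space=space, model=emukit_model)
loop.run_loop(catapult_distance, 30)    # 30 further catapult firings
```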

Model Based Design

  • Two sampling strategies:
  • Integrated variance reduction (minimize overall uncertainty)
  • Uncertainty sampling (sample highest variance points)
  • Trade-off between accuracy and computation speed
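Assuming Emukit's experimental design acquisitions, these two strategies correspond to IntegratedVarianceReduction and ModelVariance respectively; a sketch of swapping one into the loop above:

```python
from emukit.experimental_design.acquisitions import IntegratedVarianceReduction, ModelVariance

# Uncertainty sampling: pick the point where the emulator's variance is largest
us_acquisition = ModelVariance(emukit_model)

# Integrated variance reduction: pick the point that most reduces average variance
ivr_acquisition = IntegratedVarianceReduction(emukit_model, space)

loop = ExperimentalDesignLoop(space=space, model=emukit_model,
                              acquisition=ivr_acquisition)
```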

Sensitivity Analysis of a Catapult Simulation

  • We’ll use Monte Carlo sensitivity analysis to understand which parameters matter most
  • This helps us identify which parameters we should focus on when optimizing the catapult
  • We’ll use our GP emulator to efficiently estimate Sobol indices
  • Two types of indices:
    • First order - direct influence of each parameter
    • Total effects - includes parameter interactions
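A sketch of this step, assuming Emukit's MonteCarloSensitivity interface (a model plus an input domain, with a compute_effects method; check the Emukit documentation for the exact signature) and reusing the emulator and space from the sketches above:

```python
from emukit.sensitivity.monte_carlo import MonteCarloSensitivity

# Monte Carlo Sobol estimates computed on the cheap GP emulator
sensitivity = MonteCarloSensitivity(model=emukit_model, input_domain=space)
main_effects, total_effects, total_variance = sensitivity.compute_effects(
    num_monte_carlo_points=10000)

print(main_effects)    # first order indices for each catapult parameter
print(total_effects)   # total effects, including parameter interactions
```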

First Order Sobol Indices

  • First order Sobol indices show direct parameter effects
  • Higher percentage = parameter has stronger direct influence on catapult distance
  • Helps identify which parameters to prioritize for optimization

Total Effects Sobol Indices

  • Total effects include both direct influence and parameter interactions
  • Larger difference between total and first order effects indicates strong parameter interactions
  • Understanding interactions helps with joint parameter optimization

Thanks!

References

Durrande, N., Ginsbourger, D., Roustant, O., Carraro, L., 2013. ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis. Journal of Multivariate Analysis 115, 57–67. https://doi.org/10.1016/j.jmva.2012.08.016
Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., Tarantola, S., 2008. Global sensitivity analysis: The primer. Wiley.
Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M., 2004. Sensitivity analysis in practice: A guide to assessing scientific methods. Wiley.