‘I checked it very thoroughly,’ said the computer, ‘and that quite
definitely is the answer. I think the problem, to be quite honest with
you, is that you’ve never actually known what the question is.’
Douglas Adams, The Hitchhiker’s Guide to the Galaxy, 1979,
Chapter 28
When you are a Bear of Very Little Brain, and you Think of Things,
you find sometimes that a Thing which seemed very Thingish inside you is
quite different when it gets out into the open and has other people
looking at it.
A.A. Milne as Winnie-the-Pooh in The House at Pooh Corner,
1928
Surrogate Modelling in Practice
Emergent phenomena require computational power.
In surrogate modelling we use statistical/ML models to learn
regularities in those emergent phenomena.
Types of Simulations
Simulations can be differential equation models
Either abstracted (like epidemiological models)
Or fine-grained (like climate/weather models).
Or discrete event simulations
Either turn-based (e.g. Game of Life or F1 strategy
simulations)
Or event-based (e.g. the Gillespie algorithm for chemical models)
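As a concrete illustration of the event-based category, here is a minimal sketch of the Gillespie algorithm for a single reaction A → B; the reaction network, rate constant, and initial count are illustrative assumptions, not taken from the text.

```python
import random

def gillespie(a0=100, k=1.0, seed=0):
    """Gillespie stochastic simulation of the single reaction A -> B
    with rate constant k. A minimal sketch: the reaction and its
    parameters are chosen for illustration only."""
    rng = random.Random(seed)
    t, a, times = 0.0, a0, [0.0]
    while a > 0:
        propensity = k * a                # total event rate with a molecules of A
        t += rng.expovariate(propensity)  # exponential waiting time to next event
        a -= 1                            # fire A -> B once
        times.append(t)
    return times
```

The event-driven character is visible in the output: time advances by random exponential increments between reactions rather than in fixed turns.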
Fidelity of the Simulation
Simulations work at different fidelities.
e.g. the difference between a strategy simulation in F1 and an
aerodynamics simulation.
Epidemiology
Modelling Herd Immunity
\[
\frac{\text{d}{S}}{\text{d}t} = - \beta S (I_1 + I_2).
\]
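The text gives only the susceptible equation. A forward-Euler sketch that completes it with an assumed SIR-style recovery for the two infectious groups; the rates gamma1, gamma2 and the initial conditions are illustrative assumptions.

```python
def simulate(beta=0.5, gamma1=0.1, gamma2=0.2, dt=0.01, steps=5000):
    """Forward-Euler integration of dS/dt = -beta * S * (I1 + I2).

    Only the S equation appears in the text; the I1, I2, R dynamics
    below are an assumed SIR-style completion for illustration.
    """
    S, I1, I2, R = 0.98, 0.01, 0.01, 0.0
    for _ in range(steps):
        dS = -beta * S * (I1 + I2)
        dI1 = beta * S * I1 - gamma1 * I1
        dI2 = beta * S * I2 - gamma2 * I2
        dR = gamma1 * I1 + gamma2 * I2
        S, I1, I2, R = S + dt * dS, I1 + dt * dI1, I2 + dt * dI2, R + dt * dR
    return S, I1, I2, R
```

Because the four derivatives sum to zero, the total population is conserved by the integrator up to floating-point error.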
We don’t know what science we’ll want to do in five years’ time, but
we won’t want slower experiments, we won’t want more expensive
experiments and we won’t want a narrower selection of experiments.
What do we want?
Faster, cheaper and more diverse experiments.
Better ecosystems for experimentation.
Data oriented architectures.
Data maturity assessments.
Data readiness levels.
Packing Problems
Modelling with a Function
What if the question of interest is simple?
For example, in the packing problem: what is the minimum side length?
Alternatively, we can fit a GP model and compute the integral of the
best predictor by Monte Carlo sampling.
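A minimal numpy-only sketch of that alternative: fit a GP to a handful of simulator evaluations, then estimate the integral of the posterior mean by Monte Carlo. The squared-exponential kernel, its lengthscale, and the toy simulator sin(3x) are illustrative assumptions.

```python
import numpy as np

def rbf(X1, X2, lengthscale=0.2, variance=1.0):
    # Squared-exponential (RBF) kernel between two 1-D input sets.
    d = X1[:, None] - X2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)          # toy stand-in for the expensive simulator

# Condition the GP on n simulator evaluations.
X = np.linspace(0, 1, 20)
y = f(X)
K = rbf(X, X) + 1e-6 * np.eye(len(X))  # small jitter for numerical stability
alpha = np.linalg.solve(K, y)

def predict(Xs):
    # Posterior mean of the GP: the "best predictor".
    return rbf(Xs, X) @ alpha

# Monte Carlo estimate of the integral of the best predictor over [0, 1].
Xs = rng.uniform(0, 1, 5000)
estimate = predict(Xs).mean()
true = (1 - np.cos(3)) / 3           # exact integral of sin(3x) on [0, 1]
```

With the simulator well covered by training points, the Monte Carlo average of the posterior mean lands close to the true integral.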
Branin Function Fit
Uncertainty Quantification and Design of Experiments
There is a long history of interest in this problem, see e.g. McKay et al. (1979).
The review compares:
Random Sampling
Stratified Sampling
Latin Hypercube Sampling
As approaches for Monte Carlo estimates
Random Sampling
Random sampling is the default approach: across the
input domain of interest, we simply select samples at random
(perhaps uniformly, or from an underlying distribution if we believe
one exists).
Let the input values \(\mathbf{ x}_1,
\dots, \mathbf{ x}_n\) be a random sample from \(f(\mathbf{ x})\). This method of sampling
is perhaps the most obvious, and an entire body of statistical
literature may be used in making inferences regarding the distribution
of \(Y(t)\).
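A one-line sketch of the idea in numpy; the integrand x² is an illustrative stand-in for the simulator output.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 2                  # toy integrand; E[f] over U[0, 1] is 1/3

# Plain random (Monte Carlo) sampling: draw n inputs uniformly at random.
x = rng.uniform(0, 1, 1000)
estimate = f(x).mean()
```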
Stratified Sampling
Using stratified sampling, all areas of the sample space of \(\mathbf{ x}\) are represented by input
values. Let the sample space \(S\) of
\(\mathbf{ x}\) be partitioned into
\(I\) disjoint strata \(S_i\). Let \(\pi_i
= P(\mathbf{ x}\in S_i)\) represent the size of \(S_i\). Obtain a random sample \(\mathbf{ x}_{ij}\), \(j = 1, \dots, n_i\) from \(S_i\). Then of course the \(n_i\) sum to \(n\). If \(I =
1\), we have random sampling over the entire sample space.
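A sketch of the construction just described, with equal-probability strata \(\pi_i = 1/I\) on \([0, 1]\); the integrand x² is an illustrative stand-in for the simulator output.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: x ** 2                 # toy integrand; E[f] over U[0, 1] is 1/3

I, n_per = 10, 10                    # I strata, n_i = 10 samples each (n = 100)
edges = np.linspace(0, 1, I + 1)

# One uniform batch of n_i samples from each stratum S_i = [edges[i], edges[i+1]).
samples = np.concatenate([
    rng.uniform(edges[i], edges[i + 1], n_per) for i in range(I)
])

# With equal-probability strata, the plain average is an unbiased estimate.
estimate = f(samples).mean()
```

Because every stratum is guaranteed representation, the variance of the estimate is lower than plain random sampling with the same budget.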
Latin Hypercube Sampling
The same reasoning that led to stratified sampling, ensuring that all
portions of \(S\) were sampled, could
lead further. If we wish to ensure also that each of the input variables
\(\mathbf{ x}_k\) has all portions of
its distribution represented by input values, we can divide the range of
each \(\mathbf{ x}_k\) into \(n\) strata of equal marginal probability
\(1/n\), and sample once from each
stratum. Let this sample be \(\mathbf{
x}_{kj}\), \(j = 1, \dots, n\).
These form the \(\mathbf{ x}_k\)
component, \(k = 1, \dots , K\), in
\(\mathbf{ x}_i\), \(i = 1, \dots, n\). The components of the
various \(\mathbf{ x}_k\)’s are matched
at random. This method of selecting input values is an extension of
quota sampling (Steinberg 1963), and can be viewed as a \(K\)-dimensional extension of Latin square
sampling (Raj 1968).
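The construction just quoted can be sketched directly in numpy: divide each dimension into \(n\) equal-probability strata, sample once from each, and match components across dimensions with random permutations.

```python
import numpy as np

def latin_hypercube(n, K, rng):
    """n samples in [0, 1]^K: one draw from each of n equal-probability
    strata per dimension, strata matched at random across dimensions,
    as in the quoted description. A minimal sketch."""
    X = np.empty((n, K))
    for k in range(K):
        # A random permutation of the n strata for dimension k ...
        strata = rng.permutation(n)
        # ... then one uniform draw inside each assigned stratum.
        X[:, k] = (strata + rng.uniform(size=n)) / n
    return X

X = latin_hypercube(10, 3, np.random.default_rng(0))
```

Each marginal is guaranteed to cover all \(n\) strata exactly once, which is the property that distinguishes Latin hypercube sampling from plain stratification of the joint space.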