Bayesian Analysis of Multilevel Complex Models of Physical Systems

Supervisor: Ian Vernon | Research Area: Statistics

Background

Many major scientific disciplines now employ detailed mathematical models to describe complex physical systems of interest. For example, galaxy formation models are used to understand structure formation in our universe, climate models are used to study and predict global warming, UK energy distribution models are used to plan the provision of sufficient UK power supply, and epidemiology models are used to predict and control the development of epidemics. However, to use such models for understanding, prediction and subsequent decision making, a full (Bayesian) uncertainty analysis should be performed, a process now referred to as "Uncertainty Quantification".

Many of these scientific models are complex, take significant time to evaluate and have several unknown input parameters. The long evaluation time in particular precludes the use of standard Bayesian approaches for parameter inference, system prediction and decision support. A solution to this problem is to construct a Bayesian emulator: a powerful Bayesian statistical construct, now very popular across statistics and machine learning, that mimics the slow scientific model but is often several orders of magnitude faster to evaluate. Emulators can then be used to perform all the required Bayesian calculations.

Often, there exist two (or more) models of the physical system, one fast but inaccurate, and the other slow but more precise. Using the fast model to aid construction of the emulator of the slow model is a very powerful and widely applicable approach, which will form the basis of this project.
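As a rough illustration of the two-level idea, the sketch below regresses a handful of slow-model runs on the fast model's output, using a conjugate Bayesian linear regression with known noise variance. Everything in it is an assumption made for illustration: the toy functions `fast_model` and `slow_model`, the design points, and the values of `noise_var` and `prior_var` are hypothetical, and Python is used only for convenience (the project itself uses R). It is a minimal sketch of the regression part of an emulator, not the method used in the project.

```python
import math


def fast_model(x):
    # Hypothetical cheap but inaccurate simulator (toy stand-in).
    return math.sin(3.0 * x)


def slow_model(x):
    # Hypothetical expensive but more accurate simulator (toy stand-in).
    return 0.5 + 1.5 * math.sin(3.0 * x) + 0.1 * x ** 2


def fit_two_level_regression(xs, noise_var=0.01, prior_var=100.0):
    """Posterior mean and covariance of (b0, b1) in the assumed model
    slow(x) = b0 + b1 * fast(x) + error,
    with independent N(0, prior_var) priors on b0 and b1 and
    N(0, noise_var) errors, both variances treated as known."""
    f = [fast_model(x) for x in xs]
    y = [slow_model(x) for x in xs]  # the only slow-model runs we pay for
    n = len(xs)
    # Posterior precision matrix A = X^T X / noise_var + I / prior_var.
    a11 = n / noise_var + 1.0 / prior_var
    a12 = sum(f) / noise_var
    a22 = sum(v * v for v in f) / noise_var + 1.0 / prior_var
    # Right-hand side r = X^T y / noise_var.
    r1 = sum(y) / noise_var
    r2 = sum(fi * yi for fi, yi in zip(f, y)) / noise_var
    # Invert the 2x2 precision matrix by hand to get the posterior covariance.
    det = a11 * a22 - a12 * a12
    cov = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    mean = [cov[0][0] * r1 + cov[0][1] * r2,
            cov[1][0] * r1 + cov[1][1] * r2]
    return mean, cov


def emulate(x, mean, cov, noise_var=0.01):
    """Emulator prediction and predictive variance for the slow model at x."""
    h = [1.0, fast_model(x)]  # regressors: intercept and fast-model output
    pred = mean[0] * h[0] + mean[1] * h[1]
    var = sum(h[i] * cov[i][j] * h[j] for i in range(2) for j in range(2))
    return pred, var + noise_var


design = [0.0, 0.25, 0.5, 0.75, 1.0]  # five slow-model runs
post_mean, post_cov = fit_two_level_regression(design)
pred, var = emulate(0.6, post_mean, post_cov)
print("emulator:", pred, "+/-", 2 * math.sqrt(var))
print("slow model:", slow_model(0.6), "fast model:", fast_model(0.6))
```

In this toy setting, the emulator prediction at a new input needs only one fast-model run, yet it tracks the slow model much more closely than the fast model alone does, and it comes with a variance that quantifies the remaining uncertainty.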

An example of the output of a complex model of galaxy formation (known as the EAGLE model) used to understand the evolution of our universe. You will learn to construct emulators for models like this.

Group project

The group project will involve understanding core techniques in emulator construction, centred around Bayesian regression, and their application to a two-level complex model of a physical system (e.g. disease models or galaxy formation models).
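For orientation, one standard form of Bayesian regression is the conjugate Gaussian linear model with known noise variance, written out below. The notation is an assumption introduced here and is not taken from the project material: $y$ is the vector of slow-model outputs, $X$ is a design matrix whose columns might include the fast model's output, $\beta$ holds the regression coefficients, and $m_0$, $V_0$, $\sigma^2$ are a prior mean, prior covariance and noise variance chosen by the analyst. The version studied in the project may differ.

```latex
\begin{align*}
  y &= X\beta + \varepsilon, \qquad
      \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \qquad
      \beta \sim \mathcal{N}(m_0, V_0), \\
  \beta \mid y &\sim \mathcal{N}(m_n, V_n), \\
  V_n &= \left( V_0^{-1} + \sigma^{-2} X^{\top} X \right)^{-1}, \\
  m_n &= V_n \left( V_0^{-1} m_0 + \sigma^{-2} X^{\top} y \right).
\end{align*}
```

As the prior becomes vague ($V_0^{-1} \to 0$), $m_n$ tends to the ordinary least squares estimate $(X^{\top} X)^{-1} X^{\top} y$, provided $X^{\top} X$ is invertible.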

By the end of the group project, you will have knowledge about:

By the end of this group project, you will be able to:

Mode of operation and evidence of learning

The project will be based on reading suitable material, some statistical paper-and-pencil derivations, and some programming tasks using R (for both calculation and visualisation). The project will have a (hopefully) pleasant balance between learning new statistical techniques, coding them and using them to investigate the physical model of interest.

An example of an emulator used to perform optimisation. You will learn to build and employ emulators like this.

Individual project

The individual project will build on the knowledge we have gained in the group project and will explore additional advanced topics. Examples of directions we can investigate are:

Mode of operation and evidence of learning

Adding to the corresponding section from the Group project, the Individual project will involve at least one of

Prerequisites and Co-requisites

Additional information

If you would like more information about this project, please contact me at i.r.vernon@durham.ac.uk.

Resources

For an introduction to emulation as applied to a complex model of galaxy formation, see our paper entitled "Galaxy Formation: Bayesian History Matching for the Observable Universe", which can be found in Statistical Science.

For more of a tutorial on building emulators, see "Bayesian uncertainty analysis for complex systems biology models: emulation, global parameter searches and evaluation of gene functions". Note that in the group project we will focus on the regression part of the emulator, possibly extending to the full emulator in the individual project.

A tutorial on some of the concepts around using emulators for optimisation can be found in the first half of "A Tutorial on Bayesian Optimisation". Don't worry about some of the more technical aspects: we will do things in a more efficient (and simpler!) way.