Durham University Statistics and Probability Group

Stats4Grads

Welcome to the Stats4Grads website! Here you will find all the information about the seminar series.

Stats4Grads is a weekly seminar in statistics, organised by and for postgraduate students. We meet on Wednesdays from 1pm-2pm, usually in CM105, with tea, coffee and biscuits provided by the Department of Mathematical Sciences.

Stats4Grads is a great chance to see the other ways postgraduate students use statistics: statisticians can better understand the importance and applications of statistics, how the subject is being used in current "real-world" problems, and what techniques and approaches are needed in the future. Meanwhile, the non-statisticians can learn about new methods and techniques being developed, and get help and insight from students who may have a deeper understanding of the theory behind statistical methods.

Feel free to invite a friend or collaborator from another institution or department to give a talk if they're in town!

Organiser: Clare Wallace. For information or to give a talk, contact clare.wallace@durham.ac.uk.

Stats4Grads Timetable 2018/2019

For details of previous years' seminars, click here.

Finite-dimensional distributions of the height of a renewal model

Speaker: Clare Wallace, Department of Mathematical Sciences
Wednesday 20 March 2019: 1pm, CM105

Suppose we have a collection of blocks with (integer) heights and widths, and we use a random selection of them to build a stick whose total width is $n$.
Working from left to right, we track the cumulative total height at the endpoints of each block. We can linearly interpolate between these endpoints to create a piecewise linear height function for the whole stick.
Under a few assumptions about the distributions of heights and widths of the blocks in our collection, we can write a central limit theorem for the height function at any $k$ points along its width. In particular, we can (almost) prove that the height function, properly rescaled, converges to a Brownian motion.
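
The construction described above can be sketched in a few lines of Python. This is only an illustration: the block collection `BLOCKS`, the uniform choice of blocks, and the helper names are assumptions, not taken from the talk. Redrawing the whole stick whenever it overshoots is one simple way to make the total width exactly $n$.

```python
import random

# Hypothetical block collection: (width, height) pairs with integer entries.
BLOCKS = [(1, 0), (1, 1), (2, -1), (3, 2)]


def build_stick(blocks, n, rng):
    """Return the block endpoints (x, cumulative height) of a stick of width n.

    Blocks are drawn uniformly at random; a stick that overshoots n is
    discarded and redrawn from scratch, so the result has width exactly n.
    """
    while True:
        points = [(0, 0)]
        x = h = 0
        while x < n:
            width, height = rng.choice(blocks)
            x += width
            h += height
            points.append((x, h))
        if x == n:
            return points


def height_at(points, t):
    """Linearly interpolate the height function at position t in [0, n]."""
    for (x0, h0), (x1, h1) in zip(points, points[1:]):
        if x0 <= t <= x1:
            return h0 + (h1 - h0) * (t - x0) / (x1 - x0)
    raise ValueError("t is outside the stick")


rng = random.Random(0)
stick = build_stick(BLOCKS, 20, rng)
# Evaluate the height function at k = 3 points along the width.
heights = [height_at(stick, t) for t in (5, 10, 15)]
```

Repeating `build_stick` many times and recording `heights` gives an empirical picture of the joint distribution at the chosen points.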

Avoiding local trapping problems in ABC

Speaker: Kieran Richards, Department of Mathematical Sciences
Wednesday 13 March 2019: 1pm, CM105

Approximate Bayesian Computation (ABC) is a Bayesian technique that allows us to produce samples from a posterior distribution when the likelihood is intractable or computationally difficult to evaluate. Unfortunately, ABC can often suffer from extreme local trapping problems, where the sampler does not move for long periods of time and hence produces low-quality samples that are of little use. We'll apply some stochastic approximation techniques to attempt to avoid the local trapping problem by penalizing the MCMC sampler for remaining in the same place for long periods of time and instead encouraging it to move more evenly around the sample space. We'll then test our new algorithm with some simulated examples and compare the results with those from standard ABC sampling.
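
For readers new to ABC, the basic idea can be sketched with plain rejection sampling. This is not the MCMC variant or the stochastic approximation scheme from the talk. The toy model (a normal mean with known standard deviation), the uniform prior, the tolerance `eps`, and all function names are assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, eps, n_accept, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    target = summary(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sampler(rng)
        fake = simulator(theta, rng)
        # The likelihood is never evaluated: only simulated data are compared.
        if abs(summary(fake) - target) <= eps:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
# Hypothetical observed data: 50 draws from a normal with mean 2 and sd 1.
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    eps=0.1,
    n_accept=200,
    rng=rng,
)
```

The accepted values in `posterior` approximate the posterior for the mean. ABC-MCMC replaces the independent prior draws with a Markov chain, and a chain that rarely accepts a move is what produces the local trapping described above.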

Bayesian approaches to Well Test Analysis

Speaker: Themistoklis Botsas, Department of Mathematical Sciences
Wednesday 6 March 2019: 1pm, CM105

Deconvolution for Well Test Analysis is a methodology that solves an inverse problem associated with petroleum engineering and derives an impulse response function that contains important information about the system.
We use a response function form based on the multi-region radial composite model, known in petroleum literature for its flexibility and ability to resemble almost every plausible shape that can be encountered.
We use an errors-in-variables non-linear Bayesian regression model in order to make inferences about the response function. This allows us to include uncertainty for the independent variables, which is essential in our context due to the large observational error. We combine the likelihood with a set of flexible priors for our parameters and we use MCMC algorithms in order to approximate the posterior.
We illustrate the use of our algorithm by applying it to synthetic and field data sets. The results are comparable in quality to the state-of-the-art solution, which is based on the total least squares method, but our method has several advantages: we gain access to meaningful system parameters associated with the flow behaviour in the reservoir; we can incorporate prior knowledge; and we can quantify parameter uncertainty in a principled way by exploiting the advantages of the Bayesian approach.

Simplify Your Code with %>% (The pipe operator)

Speaker: Miguel Lopez-Cruz
Wednesday 20 February 2019: 1pm, CM105

Removing duplication is an important principle to keep in mind when you are writing code; equally important is keeping your code efficient and readable. Very often, efficiency is achieved by replacing long statements in existing code with shorter ones, making the code more readable, clear, and explicit. Even so, writing code that is at once simple, readable, and efficient can seem contradictory. In this talk I want to show how the magrittr R package can help you write code more efficiently when analyzing datasets for diverse statistical purposes.
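
In magrittr, `x %>% f(y)` is equivalent to `f(x, y)`, so a chain of steps reads from left to right instead of inside-out. The same idea can be sketched in Python. This is an analogy only: the `pipe` helper and the toy data below are hypothetical and are not part of magrittr.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each function in steps, from left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical measurements with missing values.
data = [3.2, None, 4.8, 5.0, None, 1.0]

# Nested calls must be read from the inside out.
nested = round(sum(filter(lambda x: x is not None, data)), 1)

# The piped form lists the same steps in the order they happen.
piped = pipe(
    data,
    lambda xs: [x for x in xs if x is not None],  # drop missing values
    sum,                                          # add up what is left
    lambda total: round(total, 1),                # round for reporting
)
```

Both expressions compute the same value; the piped version simply makes the order of operations explicit.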

Probabilistic Record Linkage

Speaker: Roger Cox, Department of Engineering
Wednesday 13 February 2019: 1pm, CM105

We read through a paper Roger's been looking at, and try to identify the Python code used to implement the calculations.

LASSO for dimensionality reduction in surrogate model-based Optimisation

Speaker: Lorenzo Gentile, TH Köln, Germany
Wednesday 6 February 2019: 1pm, CM105

Surrogate-Model-based optimization (SMBO) plays a prominent role in today’s modelling, simulation, and optimization processes. It can be considered the most efficient technique for solving expensive and time-demanding real-world optimization problems. In fact, in many engineering problems, a single evaluation is based on either experimental or numerical analysis, which incurs significant costs in time or resources. SMBO pursues the identification of global optima by taking advantage of a budget allocation process that maximizes the information gained in promising regions. In SMBO, a data-driven surrogate model is fitted to replace an expensive computer simulation.

However, high dimensionality leads to severe practical issues in the development of surrogate models. For example, depending on the distance measures employed, it is widely recognized that Kriging, one of the most popular techniques, may perform poorly for problems with more than approximately 20 variables. A promising way to overcome the limitations of SMBO in high-dimensional search spaces is feature selection. Among the available methods, a well-established strategy for selecting important variables is the least absolute shrinkage and selection operator (LASSO). For these reasons, a strategy for enhancing a Kriging-based SMBO algorithm with LASSO is currently under development.

In this presentation, the fundamentals of SMBO will be given. Moreover, preliminary results from applying the enhanced Kriging-based SMBO algorithm to both artificial test functions and a real-world application from the field of aerospace will be shown.
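
To illustrate how LASSO selects variables, here is a minimal coordinate-descent sketch. It is not the Kriging-based SMBO algorithm from the talk. The synthetic data (six inputs, of which only two affect the response), the penalty `lam`, and the function names are assumptions. The objective minimised is $\frac{1}{2n}\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam, returning exactly 0 inside [-lam, lam]."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes beta_j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


rng = random.Random(0)
n, p = 200, 6
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
# Hypothetical response: only inputs 0 and 3 matter.
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.1) for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

Inputs whose coefficient is shrunk to zero are candidates for removal before a surrogate model is fitted on the remaining variables.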

Improving and benchmarking of algorithms for decision making with lower previsions

Speaker: Nawaphon Nakharutai, Department of Mathematical Sciences, Durham University
Wednesday 30 January 2019: 1pm, CM105

Abstract:

Maximality, interval dominance, and E-admissibility are three well-known criteria for decision making under severe uncertainty using lower previsions. I will present a new fast algorithm for finding maximal gambles and compare its performance to existing algorithms, one proposed by Troffaes and Hable (2014), and one by Jansen, Augustin, and Schollmeyer (2017). To do so, I develop a new method for generating random decision problems with pre-specified ratios of maximal and interval dominant gambles.

To find all maximal gambles, Jansen et al. solve one large linear program for each gamble. In Troffaes and Hable's algorithm, and also in the new algorithm, this is done by solving a larger sequence of smaller linear programs. I find that the primal-dual interior point method works best for solving these linear programs. In this work, which builds on earlier work, I will present efficient ways to find a common feasible starting point for this sequence of linear programs. I exploit these feasible starting points to develop early stopping criteria for the primal-dual interior point method, further improving efficiency.

I also investigate the use of interval dominance to eliminate non-maximal gambles. This can make the problem smaller, and I observe that this benefits Jansen et al.'s algorithm, but perhaps surprisingly, not the other two algorithms. I find that the new algorithm, without using interval dominance, outperforms all other algorithms in all scenarios in a simulation.
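
The two criteria can be illustrated on a toy problem. The sketch below is not one of the algorithms compared in the talk, and it needs no linear programming because it assumes the credal set is given directly as a finite list of extreme-point probability mass functions; the lower prevision is then a minimum of expectations. The states, gambles, and numbers are hypothetical.

```python
def expectation(pmf, gamble):
    """Expected value of a gamble under one probability mass function."""
    return sum(p * x for p, x in zip(pmf, gamble))


def lower_prevision(credal_set, gamble):
    return min(expectation(p, gamble) for p in credal_set)


def upper_prevision(credal_set, gamble):
    return max(expectation(p, gamble) for p in credal_set)


def interval_dominant(credal_set, gambles):
    """Gambles whose upper prevision is at least every other lower prevision."""
    threshold = max(lower_prevision(credal_set, g) for g in gambles.values())
    return {
        name for name, f in gambles.items()
        if upper_prevision(credal_set, f) >= threshold
    }


def maximal(credal_set, gambles):
    """Gambles f for which no other gamble g has lower prevision of g - f above 0."""
    result = set()
    for name, f in gambles.items():
        dominated = any(
            lower_prevision(credal_set, [gi - fi for gi, fi in zip(g, f)]) > 0
            for other, g in gambles.items()
            if other != name
        )
        if not dominated:
            result.add(name)
    return result


# Hypothetical problem: two states, P(state 1) known only to lie in [0.3, 0.6].
credal_set = [(0.3, 0.7), (0.6, 0.4)]
gambles = {
    "f1": (4.0, 0.0),
    "f2": (0.0, 3.0),
    "f3": (1.0, 1.0),
    "f4": (3.5, -0.5),  # f1 minus 0.5 in every state
}

maximal_set = maximal(credal_set, gambles)
interval_set = interval_dominant(credal_set, gambles)
```

Here `f4` survives interval dominance but is not maximal, because `f1` pays 0.5 more in every state. Every maximal gamble is interval dominant, which is why interval dominance can serve as a cheap pre-filter.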

Drunken Heroine Quest 2: Deep Fantasy World Application of the Theory of Random Walks

Speaker: Hugo Lo, Department of Mathematical Sciences, Durham University
Wednesday 12th December 2018: 1pm, CM105

Abstract:

Our research on random walk problems has a lot of useful applications in ecology, psychology, computer science, physics, chemistry, biology as well as economics. However, most of them are too serious for this presentation. Instead, we will guide you through some basics of random walk theory, in the format of a fantasy story... Once upon a time, the brave Edward went on a fearful quest to defeat a dragon and win the heart of the beautiful Dorothy. After falling foul of a curse, Edward is trapped in a skyscraping tower of unknown location in the boundless land of Promenatoria. It is now up to Dorothy to break the curse and free her inamorato. With Edward nowhere to be found, alcohol seems to be the only way for Dorothy to pass the days. With neither a particular direction nor a systematic search, a random walk journey begins. Are you ready for this exhilarating and unforgettable adventure? In this second episode of the series we will dive into the feelings deep in our heroine's heart. All are welcome.
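
A journey of this kind is easy to simulate. The sketch below assumes the simplest possible model, a simple symmetric random walk on the square lattice, which may differ from the walks discussed in the talk; the function names are hypothetical. For this model the expected squared distance from the start after $n$ steps is exactly $n$.

```python
import random

# The four unit steps of a simple random walk on the square lattice.
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_walk(n_steps, rng):
    """Return the visited positions, starting at the origin."""
    x = y = 0
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = rng.choice(STEPS)
        x += dx
        y += dy
        path.append((x, y))
    return path


def mean_squared_distance(n_steps, n_walks, rng):
    """Average squared distance from the origin after n_steps, over n_walks walks."""
    total = 0
    for _ in range(n_walks):
        x, y = random_walk(n_steps, rng)[-1]
        total += x * x + y * y
    return total / n_walks


rng = random.Random(42)
path = random_walk(100, rng)
msd = mean_squared_distance(100, 2000, rng)  # should be close to 100
```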

Bayes linear emulation, decision support and applications to petroleum reservoir models

Speaker: Jonathan Owen, Department of Mathematical Sciences, Durham University
Wednesday 5th December 2018: 1pm, CM105

Abstract:

Complex mathematical computer models are used across many scientific disciplines to improve understanding of the behaviour of physical systems and to provide decision support. These often require the specification of a large number of unknown model parameters; involve a choice of decision parameters; and take a long time to evaluate. Decision support, commonly misrepresented as an optimisation task, often requires a large number of model evaluations, rendering traditional methods intractable due to their slow speed. Bayes linear emulators as surrogates provide fast, statistical approximations for computer models, yielding predictions for as yet unevaluated parameter settings, along with a corresponding quantification of uncertainty.

The Integrated Systems Approach for Petroleum Production (ISAPP) is a research programme that aims to increase hydrocarbon recovery. Together with TNO (Netherlands Organisation for Applied Scientific Research), it has designed a Field Development Optimisation Challenge centred on a fictitious oil reservoir known as OLYMPUS. This challenge exhibits many of the common issues associated with computer experimentation, with further complications arising from geological uncertainty expressed through an ensemble of 50 models. In this presentation, I will discuss Bayes linear emulators and their use in decision support, before describing some of the difficulties encountered in my work to date on the ISAPP Field Development Optimisation Challenge.
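
As background, a Bayes linear emulator updates prior beliefs about the model output $f(x)$ at a new input $x$ using the standard Bayes linear adjustment. The notation here is an assumption, not taken from the talk: $D$ denotes the vector of model runs already evaluated.

$$
\mathrm{E}_D[f(x)] = \mathrm{E}[f(x)] + \mathrm{Cov}[f(x), D]\,\mathrm{Var}[D]^{-1}\,(D - \mathrm{E}[D]),
$$

$$
\mathrm{Var}_D[f(x)] = \mathrm{Var}[f(x)] - \mathrm{Cov}[f(x), D]\,\mathrm{Var}[D]^{-1}\,\mathrm{Cov}[D, f(x)].
$$

The adjusted expectation serves as the fast prediction, and the adjusted variance quantifies its uncertainty. Only means, variances and covariances need to be specified, rather than full probability distributions.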

Cost effective component swapping to increase system reliability

Speaker: Aesha Najem, Department of Mathematical Sciences, Durham University
Wednesday 21st November 2018: 1pm, CM105

Abstract:

One of the strategies that might be considered to enhance the reliability and resilience of a system is component swapping: when a component fails, it is replaced by another component from the system which is still functioning. This presentation considers cost-effective component swapping to increase system reliability. The cost is discussed in two scenarios, namely fixed cost and time-dependent cost for system failure.

Random set theory for frequentist inferences

Speaker: Daniel Krpelik, Department of Mathematical Sciences, Durham University
Wednesday 14th November 2018: 1pm, CM105

Abstract:

Recently, several inferential methods based on random set theory have been proposed in the literature. Among those, we would like to focus on Confidence Structures. These can be seen as a generalisation of the inferential approach based on Confidence Distributions. In that approach, the result of the inference is a probability distribution over the range of the parameter of interest, which can be used to construct confidence intervals and test hypotheses at any level of significance. Using random set models allows us to seamlessly derive approaches for analysing censored observations without any assumptions about the underlying censoring model, whilst retaining the coverage properties of confidence distributions. We will show the basic ideas behind the concept of confidence structures and demonstrate its use in the reliability analysis of a simple system, based on a set of censored observations of the lifetimes of its components.
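
As a reminder of what a confidence distribution looks like in the simplest setting, the sketch below treats a normal mean with known standard deviation, where the confidence distribution is $N(\bar{x}, \sigma^2/n)$. It does not cover confidence structures or censoring, and the data and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def confidence_distribution(sample, sigma):
    """Confidence distribution for a normal mean when sigma is known."""
    return NormalDist(mu=fmean(sample), sigma=sigma / sqrt(len(sample)))


def confidence_interval(cd, level):
    """Equal-tailed interval read off the quantiles of the confidence distribution."""
    alpha = 1.0 - level
    return cd.inv_cdf(alpha / 2), cd.inv_cdf(1 - alpha / 2)


# Hypothetical measurements, assumed normal with known sigma = 1.
sample = [9.1, 10.4, 11.2, 9.8, 10.9, 10.0, 9.5, 10.7]
cd = confidence_distribution(sample, sigma=1.0)

low, high = confidence_interval(cd, 0.95)
# One-sided p-value for H0: mu <= 9.0, read directly from the same distribution.
p_value = cd.cdf(9.0)
```

The same object `cd` yields intervals at any confidence level and p-values for any one-sided hypothesis about the mean.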

History Matching techniques applied to petroleum reservoir: discussing MCMC as sampling technique

Speaker: Helena Nandi Formentin, Department of Mathematical Sciences, Durham University
Wednesday 7th November 2018: 1pm, CM107

Abstract:

In petroleum engineering, reservoir simulation models are representations of real petroleum fields used in production forecasting and decision-making. Observed dynamic data (e.g. bottom-hole pressure and oil production) support the calibration of reservoir models. We use History Matching (HM) techniques to reduce our high-dimensional input space, which contains parameters such as porosity, permeability and fluid properties, through the assimilation of measured data. We use emulation techniques to explore a simplified simulator of a reservoir model, and HM processes to reduce the simulator’s input space. In this session, we will discuss MCMC techniques applied to sampling in a reduced and complex space.
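
The abstract does not say how inputs are ruled out. A common choice in the history matching literature, given here as an assumption, is the implausibility measure

$$
I(x) = \frac{\lvert z - \mathrm{E}[f(x)] \rvert}{\sqrt{\mathrm{Var}[f(x)] + \mathrm{Var}[e] + \mathrm{Var}[\epsilon]}},
$$

where $z$ is the observed value, $\mathrm{E}[f(x)]$ and $\mathrm{Var}[f(x)]$ are the emulator's prediction and its variance at input $x$, $e$ is the observation error and $\epsilon$ is the model discrepancy. Inputs with $I(x)$ above a cutoff (often 3) are discarded. The remaining non-implausible region is typically a tiny, irregularly shaped part of the original space, which is what makes sampling it difficult.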

Maintenance Record Labelling of Wind Turbine Data for Fault Prognosis

Speaker: Roger Cox, Department of Engineering, Durham University
Wednesday 31st October 2018: 1pm, CM105

Abstract:

A set of methods is being developed for determining the health history of mechanical plant. These are to be applied both in offshore wind turbine maintenance troubleshooting (fault diagnosis) and in condition-based maintenance (fault prognosis). One of the methods used is Bernoulli Naive Bayes classification.
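
A Bernoulli Naive Bayes classifier is small enough to write out in full. The sketch below is a generic version with Laplace smoothing, not the speaker's implementation. The binary keyword features and the labelled maintenance records are made up for illustration.

```python
from math import log


def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and smoothed per-feature probabilities."""
    model = {}
    n_features = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Smoothed probability that each binary feature equals 1 in class c.
        theta = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, theta)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for binary vector x."""
    def log_posterior(c):
        prior, theta = model[c]
        return log(prior) + sum(
            log(t) if xj else log(1 - t) for xj, t in zip(x, theta)
        )
    return max(model, key=log_posterior)


# Hypothetical features: does the record mention
# "gearbox", "oil", "vibration", "inspection"?
X = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
y = ["fault", "fault", "fault", "routine", "routine", "routine"]

model = fit_bernoulli_nb(X, y)
label_a = predict(model, [1, 0, 1, 0])
label_b = predict(model, [0, 0, 0, 1])
```

Unlike the multinomial variant, the Bernoulli model also uses the absence of a keyword as evidence, through the `log(1 - t)` term.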

Introduction to Stats4Grads

Speaker: everyone!
Wednesday 24th October 2018: 1pm, CM105

Abstract:

Come along to CM105 on Wednesday 24th October at 1pm to get to know your fellow statisticians! We'll introduce ourselves and our research area (briefly!), and then just have an informal chat. Of course, the introductory meeting wouldn't be complete without free pizza ;)
