Rachel Oughton
15/02/2023
This lecture will think about linking the SIR model we developed in Lectures 1 and 2 to the real world.
These could easily each be a lecture (or more!) on their own, so please ask questions at any time!
Our model structure (SIR, SEIRS etc) should relate
to the real world
(Think of all the questions we
considered)
But the compartment populations vary hugely with \(\beta\) and \(\gamma\):
The quantities we’ve discussed have simple definitions but are very difficult to measure directly:
Can we use our model and real world data to understand
- What’s happening?
- What’s likely to happen next?
In this section:
Using early stage data, we can estimate \(R_0\).
This relies on the assumption
that almost all the population is susceptible.
This helps
us to understand:
To estimate \(R_0\), we first need to calculate the growth rate.
Key idea:
During the first stages of an epidemic, the rate of increase in infectious individuals should be approximately constant.
In our SIR model: \[
\frac{dI}{dt} = \frac{\beta I S}{N} - \gamma I.
\] At the start of the epidemic, we have \(S\approx{N}\), and so approximately \[
\frac{dI\left(t\right)}{dt} = \beta I\left(t\right) - \gamma
I\left(t\right) = \left(\beta - \gamma\right) I\left(t\right),
\]
Therefore an approximate solution is \[ I\left(t\right) \approx I\left(0\right)e^{\left(\beta-\gamma\right)t} \]
More generally (not assuming a simple SIR model) we write
\[ I\left(t\right) \approx I\left(0\right)e^{\Lambda t}, \] where \(\Lambda\) is the growth rate (in the SIR case, \(\Lambda = \beta - \gamma\)).
Taking logs, we therefore have \[
\ln\left(I\left(t\right)\right) \approx \ln\left(I\left(0\right)\right)
+ \Lambda t.
\] This is a straight line, with gradient \(\Lambda\).
Therefore, if we plot \(\ln{\left[I\left(t\right)\right]}\), we should see a straight line with gradient \(\Lambda\).
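This suggests a simple recipe: take logs of the counts and fit a straight line. A minimal Python sketch (the lecture's own computations are in R; the counts here are synthetic, with a made-up growth rate of 0.25 per day):

```python
import numpy as np

# Synthetic early-epidemic counts following I(t) = I(0) * exp(Lambda * t)
t = np.arange(10)                 # days
I = 5.0 * np.exp(0.25 * t)        # true growth rate Lambda = 0.25 per day

# Fit a straight line to ln I(t); the gradient estimates Lambda
Lambda_hat, intercept = np.polyfit(t, np.log(I), deg=1)
print(Lambda_hat)  # ~0.25
```

With real data the counts are noisy, so the fitted gradient comes with a standard error.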
But
What if we don’t know the number of infectious individuals at each time step? (normally we won’t)
On a log scale, the cumulative case count increases at a similar rate early on:
Natural logs of the I population, and the cumulative sum of I.
Another problem
Cases of an infectious disease are often under-reported (especially early on).
Say, \[I_{reported} = k_r I_{true} \text{ where } 0<k_r<1.\] Our estimate of the growth rate \(\Lambda\) is unaffected: \[ \begin{align*} I_{true}\left(t\right) & \approx I_{true}\left(0\right)e^{\Lambda t}\\ & = \frac{1}{k_r}I_{reported}\left(0\right) e^{\Lambda t}, \end{align*} \] and therefore \[\begin{align*} \ln \left[I_{true}\left(t\right)\right] & \approx \ln{\left[\frac{1}{k_r}I_{reported}\left(0\right) e^{\Lambda t}\right]}\\ &= \ln\left(\frac{1}{k_r}I_{reported}\left(0\right)\right) + \Lambda t. \end{align*}\]
The intercept of the straight line changes, but not the gradient \(\Lambda\).
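We can check this numerically: scaling the counts by a constant reporting fraction shifts the fitted intercept but leaves the gradient alone. A small Python sketch (the reporting fraction \(k_r = 0.4\) is made up for illustration):

```python
import numpy as np

t = np.arange(15)
I_true = 3.0 * np.exp(0.2 * t)     # true counts, growth rate 0.2
k_r = 0.4                          # hypothetical reporting fraction
I_reported = k_r * I_true

slope_true, int_true = np.polyfit(t, np.log(I_true), 1)
slope_rep, int_rep = np.polyfit(t, np.log(I_reported), 1)

print(slope_true - slope_rep)      # ~0: the gradient is unchanged
print(int_rep - int_true)          # ~ln(k_r): only the intercept moves
```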
From the UK government website you can download daily case data for Covid-19 in the UK.
New daily cases and cumulative counts, for each of England, Scotland, Wales and Northern Ireland.
Plotting the time series for the whole of the pandemic up to January 2022 we see
Daily new Covid-19 cases in the UK from early 2020 to present.
Cumulative Covid-19 cases in the UK from early 2020 to present.
Plotting the natural log, for just the first month (March 2020):
Natural logarithms of cumulative Covid-19 cases in the UK in early 2020.
Fit straight lines to each of these using linear regression:
Natural logarithms of cumulative Covid-19 cases in the UK in early 2020.
The gradients and their standard errors are:
| Region | Coefficients | SE |
|---|---|---|
| England | 0.224 | 0.00461 |
| Northern Ireland | 0.194 | 0.00823 |
| Scotland | 0.229 | 0.00940 |
| Wales | 0.277 | 0.00992 |
The ‘Coefficients’ column gives the estimate of the growth rate \(\Lambda\).
There are many ways to estimate \(R_0\) from \(\Lambda\)
For example many assume that the pre-infectious and/or infectious periods follow an exponential distribution (which we also do when we use rate parameters).
One such estimate is
\[ \begin{equation} R_0 \approx 1 + \Lambda T_S \end{equation} \]
(\(T_S\) is the serial interval: the time between the onsets of successive cases).
Another is
\[\begin{equation} R_0\approx \left(1+\Lambda D\right)\left(1+\Lambda D'\right), \end{equation}\] where \(D\) and \(D'\) are the durations of the infectious and pre-infectious (latent) periods, respectively.
Nishiura, Linton, and Akhmetzhanov (2020) estimate the serial interval \(T_S\) of Covid-19 to be 4.6 days, with a 95% credible interval of \(\left(3.5, 5.9\right)\).
We can use these with our estimates of \(\Lambda\) to estimate \(R_0\):
| Region | Coefficients | SE | R0_lower95 | R0_median | R0_upper95 |
|---|---|---|---|---|---|
| England | 0.224 | 0.00461 | 1.7840 | 2.0304 | 2.3216 |
| Northern Ireland | 0.194 | 0.00823 | 1.6790 | 1.8924 | 2.1446 |
| Scotland | 0.229 | 0.00940 | 1.8015 | 2.0534 | 2.3511 |
| Wales | 0.277 | 0.00992 | 1.9695 | 2.2742 | 2.6343 |
Note: this ignores uncertainty in the estimates of \(\Lambda\).
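As a check, the England row of the table can be reproduced directly from \(R_0 \approx 1 + \Lambda T_S\), plugging in the lower, median and upper values of \(T_S\):

```python
Lambda_england = 0.224  # gradient estimated from the regression above

# R0 ~ 1 + Lambda * T_S for the lower, median and upper serial interval
estimates = [round(1 + Lambda_england * T_S, 4) for T_S in (3.5, 4.6, 5.9)]
print(estimates)  # [1.784, 2.0304, 2.3216] -- the England row
```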
In this section:
We can also use outbreak data to find appropriate values for the parameters of an SIR-type model.
We call this calibration (a general model-fitting term)
Parameter space
If a model has \(n\) parameters, we can think of them as living in an \(n\)-dimensional space: the parameter space (or sometimes input space or model space).
Each point in parameter space will correspond to a particular set of parameter values.
Difficult to visualise for large \(n\).
Think of the SIR model with two parameters (so \(n=2\)):
We can see the point \(\left(\beta=1.25,\,\gamma=0.5\right)\) in parameter space, and the corresponding output:
The parameter values from our example, plotted in 2D parameter space.
The model output corresponding to the parameter values from this example.
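For readers who want to reproduce a run like this, here is a minimal Python sketch of the SIR model using `scipy` (the lecture's own code is in R; the population size \(N = 1000\) and the initial condition are illustrative choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma, N):
    """Right-hand side of the SIR equations."""
    S, I, R = y
    dS = -beta * I * S / N
    dI = beta * I * S / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

N = 1000
beta, gamma = 1.25, 0.5   # the example point in parameter space
sol = solve_ivp(sir_rhs, (0, 50), [N - 1, 1, 0], args=(beta, gamma, N))
S, I, R = sol.y           # compartment trajectories over time
# S + I + R stays equal to N throughout
```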
How does the output change as we move around parameter space?
To answer this question, we need to choose some points.
But we have to be careful how we choose them…
Problem: We won’t learn about what happens when \(\beta\) and \(\gamma\) are varied independently.
Problem: If one parameter turns out to have little/no effect, we effectively have only 3 points.
To build a \(p\)-point Latin hypercube design (LHD), we divide each parameter’s range into \(p\) equal intervals, and choose the points so that each interval of each parameter contains exactly one point.
This means the diagonal design is in fact an LHD, so we add another constraint.
A maximin LHD is an LHD where the minimum distance between points has been maximised.
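A simple way to build one is by random search: generate many random LHDs and keep the one with the largest minimum pairwise distance. A Python sketch (more efficient constructions exist; the scaling onto \(\beta,\,\gamma\) ranges at the end is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def latin_hypercube(p, n, rng):
    """A random LHD on [0,1]^n: each of the p intervals per dimension gets one point."""
    design = np.empty((p, n))
    for j in range(n):
        # one point in each of the p equal-width strata, in random order
        design[:, j] = rng.permutation((np.arange(p) + rng.random(p)) / p)
    return design

def maximin_lhd(p, n, rng, n_candidates=200):
    """Crude maximin search: keep the candidate with the largest minimum distance."""
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        d = latin_hypercube(p, n, rng)
        diff = d[:, None, :] - d[None, :, :]
        dists = np.sqrt((diff ** 2).sum(axis=-1))
        score = dists[np.triu_indices(p, k=1)].min()
        if score > best_score:
            best, best_score = d, score
    return best

design = maximin_lhd(10, 2, rng)
# scale [0,1]^2 onto e.g. beta in [0, 2], gamma in [0, 1]
beta_gamma = design * np.array([2.0, 1.0])
```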
Now we have a design that covers parameter space much more evenly:
We can run the SIR model for each \(\left(\beta,\,\gamma\right)\) pair in our 10-point maximin Latin hypercube:
A maximin LHS of size 10.
Infectious populations corresponding to the 10 points.
This is a Monte Carlo approach, where we use random sampling to generate results.
Philip, Reinhard, and Lackman (1959) reported on two outbreaks of mumps in 1957 on St. Lawrence Island.
St. Lawrence Island
Since mumps is immunizing and the populations started as susceptible, this dataset should work well with an SIR model.
Let’s try to find appropriate values of \(\beta\) and \(\gamma\).
The data are weekly counts of new cases - this is not the same as \(I\left(t\right)\) (or any of our other compartments).
We need to create an analogous quantity that we can calculate using the SIR model.
## week cases cum_cases Village time_days
## 141 1957-03-30 1 1 Savoonga 1
## 151 1957-04-06 0 1 Savoonga 8
## 161 1957-04-13 6 7 Savoonga 15
## 171 1957-04-20 9 16 Savoonga 22
## 181 1957-04-27 3 19 Savoonga 29
## 191 1957-05-04 20 39 Savoonga 36
## 201 1957-05-11 26 65 Savoonga 43
## 211 1957-05-18 31 96 Savoonga 50
## 221 1957-05-25 28 124 Savoonga 57
## 231 1957-06-01 21 145 Savoonga 64
## 241 1957-06-08 8 153 Savoonga 71
## 251 1957-06-15 10 163 Savoonga 78
## 261 1957-06-22 1 164 Savoonga 85
## 271 1957-06-29 0 164 Savoonga 92
## 281 1957-07-06 0 164 Savoonga 99
## 291 1957-07-13 0 164 Savoonga 106
## 301 1957-07-20 0 164 Savoonga 113
## 311 1957-07-27 0 164 Savoonga 120
## 321 1957-08-03 0 164 Savoonga 127
## 331 1957-08-10 0 164 Savoonga 134
## 341 1957-08-17 0 164 Savoonga 141
## 351 1957-08-24 0 164 Savoonga 148
Weekly counts of new incidences - \(I\left(t\right)\) unknown.
We can generate a cumulative sum of new infections
Cumulative case count for Savoonga, 1957.
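In Python this is a single call (the weekly counts are copied from the Savoonga table above):

```python
import numpy as np

# weekly new mumps cases for Savoonga (from the table above)
new_cases = np.array([1, 0, 6, 9, 3, 20, 26, 31, 28, 21, 8, 10, 1])
cum_cases = np.cumsum(new_cases)
print(cum_cases.tolist())
# [1, 1, 7, 16, 19, 39, 65, 96, 124, 145, 153, 163, 164] -- matches cum_cases
```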
Our goal is to find values for \(\beta\) and \(\gamma\) that produce a cumulative case curve as close as possible to the observed data.
To do this, we need some measure of the goodness-of-fit of
model output to the data.
We use the measure to assess the fit across the parameter
space, and thus find the best point.
The measure we will use is the root mean squared error, RMSE.
There are many, many other ways to assess model fit!
The root mean squared error is defined as \[ RMSE = \sqrt{\frac{1}{n}\sum\limits_{i=1}^{n}{\left(\hat{y}_i - y_i\right)^2}}. \]
- \(n\) is the number of observed values
- \(y_i\), for \(1\leq{i}\leq{n}\), are the observed values
- \(\hat{y}_i\) are the corresponding model predictions
The RMSE gives the root of the mean squared distance between model and data, and is on the same scale as the data.
The smaller the RMSE, the better the fit of the model to the data.
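As a function, the RMSE is only a few lines; a Python sketch:

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean squared error between model predictions and observations."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

print(rmse([1, 2, 3], [1, 2, 3]))   # 0.0 -- a perfect fit
print(rmse([0, 0], [3, 4]))         # sqrt((9 + 16) / 2) ~ 3.54
```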
Let’s set \(\beta=1.1,\,\gamma=0.75\), and compare the resulting model curve to the Savoonga data.
Cumulative case count for Savoonga, 1957 (blue), with the model produced by beta=1.1, gamma=0.75 (black), and the distance between each pair of observed and predicted values marked in red.
\[ RMSE = \sqrt{\frac{1}{n}\sum\limits_{i=1}^{n}{\left(\hat{y}_i - y_i\right)^2}}. \]
Sampling approach
## [1] "Smallest RMSE is 9.33929 at beta=0.216393, gamma=0.127752"
1000 points in SIR parameter space, coloured by the RMSE as a measure of fit to the Savoonga mumps data.
Notice that the best pairs of values for \(\beta\) and \(\gamma\) lie on a straight line, representing the same \(R_0\) value (remember that for this model \(R_0= \frac{\beta}{\gamma}\))
Cumulative case count for Savoonga, 1957 (blue), with the model produced by best pair of parameter values found.
This is still not the best fit - it is the best from the 1,000 candidate models we generated.
Based on the best parameter values, we will refocus on the region \(\beta \in [0.23,\,0.3],\;\gamma \in [0.125,\,0.2]\).
## [1] "Smallest RMSE is 5.0837 at beta=0.247978, gamma=0.154183"
We can plot the \(I^{cum}_t\) output for the SIR model with these parameters, to see the fit.
Cumulative case count for Savoonga, 1957 (blue), with the model produced by best pair of parameter values found in the refocussed region.
If we wanted to, we could repeat this process to gain an even better fit.
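The sample-and-refocus loop can be sketched generically. Here a toy function with its minimum placed near the fitted \(\left(\beta,\,\gamma\right)\) stands in for "run the SIR model and compute the RMSE"; the shrink factor and point counts are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for "run the SIR model and compute the RMSE": a toy function
# with its minimum at (0.25, 0.15), echoing the fitted (beta, gamma)
def fit_measure(beta, gamma):
    return np.hypot(beta - 0.25, gamma - 0.15)

def refocus(lo, hi, n_points=1000, rounds=3, shrink=0.25):
    lo, hi = np.array(lo, float), np.array(hi, float)
    for _ in range(rounds):
        pts = rng.uniform(lo, hi, size=(n_points, 2))
        scores = fit_measure(pts[:, 0], pts[:, 1])
        best = pts[scores.argmin()]
        # refocus the search box around the current best point
        half = (hi - lo) * shrink / 2
        lo, hi = best - half, best + half
    return best

best = refocus([0.0, 0.0], [2.0, 1.0])
print(best)  # close to (0.25, 0.15)
```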
This process is subject to many uncertainties, errors and problems. Here are a few:
In this section:
Rather than evaluating the SIR model and RMSE hundreds or thousands of times, we can use a step-by-step approach:
There are many optimisation methods - we will look at Nelder-Mead.
Optimisation methods usually require a function
\[ f: \mathbb{R}^n \rightarrow{\mathbb{R}}.\]
In our case, \(f\) maps a parameter pair \(\left(\beta,\,\gamma\right)\) to the RMSE between the model output and the data, so \(n=2\).
Nelder-Mead starts with an initial (often randomly generated) simplex in \(\mathbb{R}^n\).
A simplex is the simplest possible shape spanning all \(n\) dimensions, and has \(n+1\) vertices. A 2-simplex is a triangle.
The function is evaluated at each vertex, and the worst \(\left(x_h\right)\), second worst \(\left(x_s\right)\) and best \(\left(x_l\right)\) are labelled.
Then several new points are proposed.
Reflection
Expansion
Contraction
Shrinking
With the four instructions above, the algorithm would continue indefinitely; it needs to know when to stop and return the ‘optimal’ point.
In general the Nelder-Mead algorithm terminates when either the simplex (or the spread of function values at its vertices) has become sufficiently small, or a maximum number of iterations has been reached.
Starting point \(\beta = 0.6,\,\gamma=0.6\).
Slightly better than before
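In practice we rarely implement Nelder-Mead ourselves; `scipy` provides it. A Python sketch, again with a toy quadratic standing in for the RMSE surface (minimum placed near the fitted values for illustration), started from the same point \(\left(0.6,\,0.6\right)\):

```python
import numpy as np
from scipy.optimize import minimize

# toy stand-in for RMSE(beta, gamma), with its minimum near the fitted values
def objective(x):
    beta, gamma = x
    return (beta - 0.25) ** 2 + (gamma - 0.15) ** 2

result = minimize(objective, x0=[0.6, 0.6], method="Nelder-Mead")
print(result.x)   # close to [0.25, 0.15]
```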
In this lecture we have thought about using observed data with SIR-type models.
In the workshop we will develop these ideas more.