In this workshop, we will
heemod
and learn how to build a Markov modelYou can do much of this fairly simply using matrix multiplication in R, but there are packages available to format things nicely and to produce plots. We will use the package heemod
, but before looking into the details of that we will see how Markov models work.
The code for sections 3 and 4 is included in the file ‘MMHDS_workshop4.R’, which you can find from the module webpage or from the Resources section in Ultra.
Markov models are a type of cohort simulation, meaning they treat everyone in a particular cohort as equivalent. They are often used for long-term modelling of chronic conditions.
In a Markov model, there are a finite number of health states, which are exclusive and exhaustive; everyone must belong to one and only of the states at any time. Each health state is associated with
We will focus on the probabilities of transitioning to each other state (or remaining in the same state), which we store in a transition matrix.
For example, we can build a Markov model with three health states: Healthy, ill and dead. A possible transition matrix is shown below. Note that it is an upper triangular matrix - all entries below the diagonal are zero, and therefore not shown. This reflects the idea that health states can remain the same, or deteriorate, but not improve. This does not have to be the case mathematically, but it usually is in Markov models for health economics.
## healthy ill dead
## healthy 0.97 0.02 0.01
## ill 0.00 0.96 0.04
## dead 0.00 0.00 1.00
Each row shows the transition probabilities for that state - the probabilities of being in each of the different states at the following timestep, or cycle (we will use ‘timestep’ and ‘cycle’ interchangeably). For example, somebody who is in the ‘healthy’ state has a probability of 0.97 of remaining healthy in the next cycle, a probability of 0.02 of becoming ill at the next cycle, and a probability of 0.01 of dying at the next cycle. If they move to a different state, their transition probabilities will change.
The states are usually also such that you can only go in one direction - in this model, everyone will eventually end up in the ‘dead’ state, if we run the model for long enough. This is known as an absorbing state.
We can represent the model using the information in the transition matrix as in Figure 2.1.
plot(mm_simp_heemod)
Figure 2.1: Our simple model.
The transition probabilities do not reflect how long someone has been in their state, but only which state they are in. This ‘memoryless’ behaviour is known as the Markov property.
Let’s say our model begins with a cohort of 1000 healthy people. Therefore at time \(t=1\) we have
## healthy ill dead
## 1 1000 0 0
To calculate the populations in each state at time 2, we multiply the populations at time 1 by the transition matrix:
\[ \begin{pmatrix} 1000 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0.97 & 0.02 & 0.01 \\ 0 & 0.96 & 0.04 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 970 & 20 & 10 \end{pmatrix} \]
That is, at time 2 we expect to have 970 of the 1000 people still in the healthy state, 20 in the ill state and 10 in the dead state, and therefore we have
## healthy ill dead
## 1 1000 0 0
## 2 970 20 10
We can carry this on for many cycles, each time multiplying the state population vector from the previous time by the transition matrix. In Markov models, a cycle is usually a fairly long period of time, commonly a year.
## healthy ill dead
## 1 1000.0000 0.00000 0.00000
## 2 970.0000 20.00000 10.00000
## 3 940.9000 38.60000 20.50000
## 4 912.6730 55.87400 31.45300
## 5 885.2928 71.89250 42.81469
## 6 858.7340 86.72266 54.54332
## 7 832.9720 100.42843 66.59956
## 8 807.9828 113.07073 78.94642
## 9 783.7434 124.70756 91.54908
## 10 760.2311 135.39413 104.37482
In this way, we can model the number of people from the cohort who will be in each state as time progresses.
Because there is a non-zero probability of remaining in each of the states, the counts will never absolutely arrive at (0, 0, 1000), where everyone is in the ‘dead’ state. We therefore need to employ a stopping criterion to determine the number of cycles for which we will run the model. In this workshop, we will stop when less than 0.5 people are not in the ‘dead’ state. That is, when the count in the ‘dead’ state rounds up to the full total. Other stopping criteria are described in Sonnenberg and Beck (1993).
We can use the cohort simulation we developed, along with information about costs and QALYs, to generate total cost and QALYs for the cohort.
For our simple example, let’s say our costs and QALYs are
= data.frame(
costs_qalys_df State = c("healthy", "ill", "dead"),
Cost = c(40, 1000, 0),
QALY = c(0.95, 0.6, 0)
) costs_qalys_df
## State Cost QALY
## 1 healthy 40 0.95
## 2 ill 1000 0.60
## 3 dead 0 0.00
At the first timestep we have 1000 people all in the ‘healthy’ state. They are therefore incurring a cost (over that timestep) of \(1000 \times{40} = 40000\) (in whatever units our costs are) and gaining \(1000\times{0.95}=950\) QALYs. Therefore the total cost and QALYs at our first time point are
## costs QALYs
## 1 40000 950
In general we can use matrix multiplication again to calculate the costs and QALYs at each time-step using the values we defined. For example, for timestep 2:
\[ \begin{pmatrix} 970 & 20 & 10 \end{pmatrix} \begin{pmatrix} 40 & 0.95 \\ 1000 & 0.6 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 58800 & 933.5 \end{pmatrix} \] We can work through our cohort simulation outcome to find the costs and QALYs for each cycle.
## costs QALYs
## 1 40000.00 950.0000
## 2 58800.00 933.5000
## 3 76236.00 917.0150
## 4 92380.92 900.5638
## 5 107304.21 884.1637
## 6 121072.02 867.8309
## 7 133747.31 851.5805
## 8 145390.05 835.4261
## 9 156057.30 819.3807
## 10 165803.37 803.4560
By summing each column and diving by the number of people in the cohort (1000 in this example) we can find the average cost per person and average QALYs accrued per person over the cycle.
= sum(cohort_outcome$costs) / 1000
cost_per_person = sum(cohort_outcome$QALY) / 1000
QALYs_per_person sprintf("Average cost per person is %g", cost_per_person)
## [1] "Average cost per person is 1096.79"
sprintf("Average QALYs accrued per person is %g", QALYs_per_person)
## [1] "Average QALYs accrued per person is 8.76292"
heemod
In this workshop we will use the package heemod
to specify and run Markov models.
heemod
First of all we will define each state, in terms of its cost and effect. These values are the costs and QALYs per time-step. We can define these in R in the heemod
package via
# define costs and outcomes associated with each state
= define_state(
healthy_state cost = 40,
QALYs = 0.95
)= define_state(
ill_state cost = 1000,
QALYs = 0.6
)= define_state(
dead_state cost = 0,
QALYs = 0
)
To run this in heemod
, we first collect the information into a strategy (this name makes more sense when comparing strategies) and then run the model. The default will be to have 1000 people starting in the first state (‘healthy’ in our case):
## We've already defined the transition matrix mm_simp_heemod,
## but we'll put it here again:
<- define_transition(
mm_simp_heemod state_names = c("healthy", "ill", "dead"),
90, .08, 0.02,
.0, 0.88, 0.12,
0, 0, 1
)
## the strategy combines the transition matrix with the information about each state
= define_strategy(
strat_simp transition = mm_simp_heemod,
healthy = healthy_state, #[<name in transition matrix> = <defined state>]
ill=ill_state,
dead=dead_state
)
# We can now run this model with the information in the strategy
# for a number of cycles
# In our model cost is one number, but it can be split into
# `cost_drugs` and `cost_health` to separate the cost of treatment
# from the general healthcare cost
# we also define the effect (in our case in QALYs, which is what we've used
# when defining the states)
= run_model(
res_simp sim = strat_simp,
cycles = 87,
cost = cost,
effect= QALYs
)## See the first few rows of the counts in each state
head(get_counts(res_simp))
## .strategy_names markov_cycle state_names count
## 1 sim 1 healthy 950.0000
## 2 sim 2 healthy 855.0000
## 3 sim 3 healthy 769.5000
## 4 sim 4 healthy 692.5500
## 5 sim 5 healthy 623.2950
## 6 sim 6 healthy 560.9655
## See the counts for all cycles:
# get_counts(res_simp)
## see the first few rows of the costs and QALYs for each state
head(get_values(res_simp))
## markov_cycle .strategy_names value_names value
## 1 1 sim cost 78000.0
## 2 2 sim cost 145400.0
## 3 3 sim cost 197036.0
## 4 4 sim cost 235567.3
## 5 5 sim cost 263257.2
## 6 6 sim cost 282028.6
# Plot the populations of each state over time
plot(res_simp)
summary(res_simp)
## 1 strategy run for 87 cycles.
##
## Initial state counts:
##
## healthy = 1000L
## ill = 0L
## dead = 0L
##
## Counting method: 'life-table'.
##
##
##
## Counting method: 'beginning'.
##
##
##
## Counting method: 'end'.
##
## Values:
##
## cost QALYs
## sim 7043120 13021.95
The reason for building a Markov model in health economics is to assess which treatment is the most cost-effective. Therefore, let’s see how we would do this. We will also explore some more detail in the heemod
functions.
Suppose we are performing a cost-effectiveness analysis for two treatments of a disease, ‘drugA’ (the current standard) and ‘drugB’ (a newer, more effective but more costly treatment).
In this illness, there are three states: ‘mild’, ‘severe’ and ‘dead’. The illness is chronic, and there is no returning from ‘severe’ to ‘mild’.
The transition matrix while on drug A is
\[\begin{pmatrix} 0.75& 0.23 & 0.02\\ 0 & 0.9 & 0.1\\ 0 & 0 & 1 \end{pmatrix}\]
We can specify this in R as follows:
= define_transition(
tm_drugA state_names = c("mild", "severe", "dead"),
0.75, 0.23, 0.02,
0, 0.9, 0.1,
0, 0, 1
) tm_drugA
## A transition matrix, 3 states.
##
## mild severe dead
## mild 0.75 0.23 0.02
## severe 0.9 0.1
## dead 1
It is thought that while on drug B, the probability of moving to a worse state is generally less. So that we can specify this clearly, and potentially investigate the effect of how much these probabilities are reduced, we represent this reduction with a parameter. Since this sets the risk for one treatment relative to those for the other treatment, we call this the relative risk \((rr)\). So, to find the transition probabilities for ‘drugB’ we multiply the transition probabilities (ie. the off-diagonal entries in the transition matrix, which represent moving to a different state) by the relative risk. This then automatically increases the diagonal entries (the probabilities of remaining in the same state).
For our example, our transition matrix for drug B would be
\[\begin{pmatrix} C& 0.23rr & 0.02rr\\ 0 & C & 0.1rr\\ 0 & 0 & 1 \end{pmatrix}\]
Where \(C\) is the complement - the value necessary for the probabilities to sum to one.
In the code below, we define relative risk rr
, and then multiply the probabilities of transitioning to worse states while on drug A by rr
to find the probabilities for drug B. The C
term on the diagonal specifies the complement - one minus the sum of the other probabilities in that row. This ensures that the rows will sum to one.
= 0.7 ## Our relative risk
rr = define_transition(
tm_drugB state_names = c("mild", "severe", "dead"),
0.23*rr, 0.02*rr,
C, 0, C, 0.1*rr,
0, 0, 1
)
tm_drugB
## A transition matrix, 3 states.
##
## mild severe dead
## mild C 0.23 * rr 0.02 * rr
## severe C 0.1 * rr
## dead 1
In this example we will split the costs of drugs and ongoing healthcare, and this will make defining our states slightly more complicated. We will also have to specify costs for each drug for each state - we will see how to do this shortly.
In the following, we also use the function discount
. Discounting is a general health economic principle, where something’s value in the future is worth less than it is now. For example here, the ongoing health costs depreciate by 3% each year. Discounting can also be applied to the effect, for example to QALYs.
In
heemod
we can make use of the functiondispatch_strategy
, which specifies a different return value for each strategy (each drug, in this example). We will only use this to distinguish between the drug costs for the different strategies, but it can be used in any place where the value returned depends on the strategy. The name used needs to be the same as inrun_model
.
# setting up some initial cost variables
= 2000
cost_drugA = 2700
cost_drugB
= define_state(
state_mild# health and drugs costs decrease by a factor of 0.03 (3%) each year
cost_health = discount(1500, 0.03),
cost_drugs = discount(
dispatch_strategy(
drugA = cost_drugA,
drugB = cost_drugB),
0.03),
cost_total = cost_health + cost_drugs,
QALYs = 0.9 # QALYs is our choice of effect here.
## We could use a different measure, so long as we linked this to the
## 'effect' argument in run_model.
)
= define_state(
state_severe cost_health = discount(3000, 0.03),
cost_drugs = discount(
dispatch_strategy(
drugA = cost_drugA,
drugB = cost_drugB),
0.03),
cost_total = cost_health + cost_drugs,
QALYs = 0.6
)
= define_state(
state_dead cost_health = 0,
cost_drugs = 0,
cost_total = 0,
QALYs = 0
)
So, in this model, the different drugs are distinguished by:
We now combine this information together to generate our two Strategies. The information distinguishing drug A from drug B is contained in the different transition matrices, and using the dispatch_strategy
function while defining the states.
= define_strategy(
strat_drugA transition = tm_drugA,
mild = state_mild,
severe = state_severe,
dead = state_dead
)
= define_strategy(
strat_drugB transition = tm_drugB,
mild = state_mild,
severe = state_severe,
dead = state_dead
)
Now we can run the model, to compare the different drugs:
<- run_model(
res_drugsAB drugA = strat_drugA,
drugB = strat_drugB,
cycles = 143,
cost = cost_total,
effect = QALYs
)
and summarize the results. First of all, we check that our cycle length is such that both final ‘dead’ counts round to 1000, then change the ‘cycles’ number in run_model
.
We can see (at the bottom) that in this case, the incremental cost-effect ratio (ICER) is £5890, meaning that drug B is cost-effective if our willingness-to-pay threshold is £5890 or greater.
summary(res_drugsAB)
## 2 strategies run for 143 cycles.
##
## Initial state counts:
##
## mild = 1000L
## severe = 0L
## dead = 0L
##
## Counting method: 'life-table'.
##
##
##
## Counting method: 'beginning'.
##
##
##
## Counting method: 'end'.
##
## Values:
##
## cost_health cost_drugs cost_total QALYs
## drugA 24645721 19649231 44294952 8669.997
## drugB 31508935 34547463 66056398 12578.177
##
## Efficiency frontier:
##
## drugA -> drugB
##
## Differences:
##
## Cost Diff. Effect Diff. ICER Ref.
## drugB 21761.45 3.908179 5568.18 drugA
There are also several useful plotting functions in heemod
, which can help us to visually compare the effects of our different strategies.
We can compare the counts in each state grouped by strategy:
plot(res_drugsAB, type = "counts", panel = "by_strategy")
or grouped by state:
plot(res_drugsAB, type = "counts", panel = "by_state")
We can also plot the costs and effects over time:
plot(res_drugsAB, type = "values", panel = "by_value",
free_y = TRUE)
These are all built in ggplot2
, so if you would like to change the appearance in any way (eg. axis labels, colours) you can use the usual ggplot2
syntax.
Implement the model described in this section in R using heemod
.
Chancellor et al. (1997) use Markov modelling to compare two treatments for HIV. We will focus on the modelling detail to recreate their analysis. The two treatments were:
In this model, one cycle is one year.
In this model, the patients can be in one of four possible states:
For monotherapy, the transition matrix was estimated to be
\[\begin{pmatrix} 0.721 & 0.202 & 0.067 & 0.010\\ 0 & 0.581 & 0.407 & 0.012\\ 0 & 0 & 0.75 & 0.25\\ 0 & 0 & 0 & 1 \end{pmatrix}\]
For combination therapy, the probability of transitioning to worse states was estimated to be 0.509 (we call this the relative risk).
The study by Chancellor et al. (1997) uses life-years as the effect, with no measure of HRQoL. Therefore the effect
of being in any of the states (apart from D) for a cycle is 1.
The cost of zidovidune is £2278 per patient per year, and the cost of lamividune is £2086 pppy. Patients on combination therapy take both drugs, so the cost is the sum.
The costs of healthcare per person per year in each state were estimated to be:
Compare monotherapy and combination therapy using a Markov model implemented in heemod
.
The key clinical assumption in this (and many) HIV models is that patients can only progress through the states, and cannot return to a less severe state. In more recent models this is not the case, and patients can return from the AIDS state and can have an increasing CD4 count.
If you’d like to see some more realistic examples of Markov models being used in health economics, the following papers might be useful: