Models and Methods in Health Data Science

Lectures 7 & 8: Decision Theory and Diagnostic Testing

Rachel Oughton

22/02/2023

1 Overview

In this lecture we will

2 Decision Theory

An important aspect of health economics problems is that a decision needs to be made in the presence of uncertainty.

Such decisions are often made using decision theory.

Decision Theory

A decision analytic model uses


As well as the decisions we make, there are events that are random


This information is combined to deduce the best decision.

Decision Theory

By combining the probabilities of the random events with the rewards and costs of the outcomes, we can calculate the expected outcome and cost associated with each sequence of decisions.


In general in decision theory we express the outcome in terms of utility.

You can read about the link between utility and the QALY in Whitehead and Ali (2010).

Uncertainty

In health economics, the decisions relate to efficacy and resource allocation.


There is uncertainty inherent in these decisions:

Health economics: the big questions

In health economic decision analysis there are two basic questions:

  1. Should this treatment / technology / intervention etc. be adopted, given the existing evidence and the uncertainty surrounding its outcomes?
       • If so, which strategy should be adopted, and for which cohort(s)?

  2. Is more evidence / information required before question 1 can be answered?

We will look at some aspects of each of these.

2.1 Summary

Decision theory addresses the question

“How should one decide what to do when things are uncertain?”

In decision theory, we

We can also use decision theory to answer questions like “How much should we be willing to pay for more information?”

2.2 The ingredients of a decision problem

The ingredients of a decision problem are:

We will illustrate this throughout with the simple example of a lottery ticket.

2.2.1 Decisions

The first task is to list all the possible decisions you can make, often written \(d_1,\ldots,d_k\).

For example, let’s say you’re deciding whether to buy a lottery ticket. Then we might have \[ \begin{align*} d_1: & \text{ Buy a ticket} \\ d_2: & \text{ Don't buy a ticket} \end{align*} \]

In this example, there is only one ‘phase’ of decision-making.


In many problems there might be several points at which a decision needs to be made, and options might depend on what has happened before.

2.2.2 Events

The second set of ingredients is the events, often labelled \(E_1,\ldots,E_m\).


We use this in the probabilistic sense, to mean the set of possible things that might happen after you have made your decision (or between subsequent decisions, if there are multiple phases).


The events should be defined so that exactly one of them will happen: they are mutually exclusive and exhaustive.

In our example about buying a lottery ticket, we might have:

\[ \begin{align*} E_1: & \text{ You win} \\ E_2: & \text{ You don't win} \end{align*} \]

When we make the decision, we don’t know which event will occur.

Uncertainties

At each point where one of a set of events will occur, we need the probability of each event.


These probabilities are often based on information from studies, trials, or other research.

In our lottery example, we might have:

\[ \begin{align*} p\left(E_1\mid{d_1}\right) & = 0.0001 \text{ (winning, given you bought a ticket)} \\ p\left(E_1\mid{d_2}\right) & = 0 \text{ (winning, given you didn't buy a ticket)} \end{align*} \]

2.2.3 Rewards / payoffs

The rewards (or payoffs) are the consequences following each combination of decisions and events.

Note that although these are called ‘rewards’, they can sometimes be bad!

In health economics, these will often be expressed in QALYs.

2.2.4 Costs

Linked to rewards is the idea of costs.

Usually, each decision will incur a cost.

In our example let’s say the cost of buying a lottery ticket is £1, and the reward if you win is £500.

In health economics, there may be other costs associated with outcomes, e.g. ongoing treatment.

2.3 The decision tree

We can put everything together into a decision tree.

A decision tree is made of nodes and branches, arranged to show

Nodes

There are two types of node:

  • decision nodes, where we choose between options
  • chance nodes, where one of a set of events occurs at random


At each node there are then a number of branches, depending on the number of possible events or decisions.


For any combination of decisions and events, we end up at a reward / outcome.

It is important that the pathways are mutually exclusive.

Lottery example: decision tree

The decision tree for our lottery example.

Time goes from left to right

2.4 Solving the Decision tree

The point of the decision tree was to help us to make a decision.

So, how do we use it to do that?

To solve a decision tree we

  1. Combine the probabilities and outcomes to find the expected value of each decision.
  2. Choose the decision(s) that lead to the best expected outcome.

At each decision node, we rule out the options with lower expected values, leaving one path: the path with the best expected outcome.

2.4.1 Backwards induction

We construct the tree from left (the first decision node) to right (the final outcomes).

We solve the tree from right to left.

We work through the tree, removing each node as we go, from right to left.

When we have reached the root decision node, we will have found the optimal sequence of decisions and its expected value.

2.4.2 Lottery example

For our lottery ticket example, we have

The chance node is the furthest to the right, so we will start there.

At the chance node, we have a probability of 0.0001 of winning (a net gain of £499, since the ticket cost £1) and a probability of 0.9999 of not winning (a net loss of £1).

Therefore our expected outcome (in £) is

\[0.0001 \times{499} + 0.9999 \times{-1} = -0.95 \]
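This backwards-induction step can be checked with a few lines of R. This is a minimal sketch, not part of the original example; the probabilities and payoffs are the ones quoted above.

```r
# Lottery example: expected value at the chance node
p_win  <- 0.0001          # p(E1 | d1), from the text
ticket <- 1               # cost of a ticket (pounds)
prize  <- 500             # reward if you win (pounds)

# Net payoffs after buying a ticket: win -> 499, lose -> -1
ev_buy    <- p_win * (prize - ticket) + (1 - p_win) * (-ticket)
ev_no_buy <- 0            # not buying a ticket: no cost, no reward

ev_buy     # -0.95
ev_no_buy  #  0, so the optimal decision is not to buy
```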

Lottery example

Our tree therefore becomes

The decision tree for our lottery example, with the chance node solved.

Lottery example

We can now look at the two branches from the decision node: buying a ticket (\(d_1\)) has expected value −£0.95, while not buying a ticket (\(d_2\)) has value £0.

Therefore we cross out \(d_1\) and our optimal decision is not to buy a ticket.

The decision tree for our lottery example, with the chance node solved.

2.5 Expected value of perfect information (EVPI)

We can make decisions, but we don’t know what will happen at each chance node.

However, suppose we had the option to find out what will happen at each chance node before deciding: this would almost certainly have a big impact on our decision-making!

Expected value of perfect information (EVPI)

The expected value of perfect information is the difference between the expected outcome when we learn what will happen before deciding, and the expected outcome of the best decision without that information.

EVPI is calculated from the perspective of deciding whether or not to pay to gain the perfect information.

Lottery example: EVPI

For our lottery example:

The probabilities of the information revealing each event are the same as the probabilities of the event.

So our expected outcome with perfect information is:

\[ 0.0001\times{£499} + 0.9999\times{£0} = £0.0499.\]

Our optimal decision before (not to buy a ticket) had expected outcome £0.

So our expected value of perfect information is

\[ EVPI = £0.0499 - £0 = £0.0499. \]
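As a check on this calculation, here is a short R sketch using the same numbers; with perfect information we only buy the ticket when told it will win.

```r
# Expected value of perfect information (EVPI), lottery example
p_win  <- 0.0001
prize  <- 500
ticket <- 1

# With perfect information we only buy when told the ticket will win
ev_perfect <- p_win * (prize - ticket) + (1 - p_win) * 0   # 0.0499

# Best expected value without the information (not buying: 0)
ev_current <- max(p_win * (prize - ticket) + (1 - p_win) * (-ticket), 0)

evpi <- ev_perfect - ev_current
evpi   # 0.0499, i.e. about 5p
```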

3 Decision analysis for treatments

In this section:

3.1 Example: Angina operation

Williams (1985) presents several scenarios for a certain class of angina patients, comparing two options:

  • coronary artery bypass grafting (an operation)
  • standard ongoing medical management.

We will go through a simplified version, but the paper is a useful reference.

Possible outcomes from the operation in QALYs, taken from Williams (1985).

Angina example

These are the probabilities of those outcomes if the operation is performed:

The decision tree for our angina example. The QALY values have been estimated by eye from Williams (1985).

Angina example: backward induction

We can solve this with backward induction as before.

For the chance node following the operation, we have an expected outcome of

\[ \underbrace{0.67\times{9}}_{\text{Improvement}} + \underbrace{0.3\times{4.4}}_{\text{No change}} + \underbrace{0.03 \times{0}}_{\text{Operative mortality}} = 7.35 \text{ QALYs.}\]

The expected outcome of staying on the medical management plan is 4.4 QALYs.

Therefore the optimal decision is to have the operation.

Incorporating costs

So far we have just considered QALYs.


In fact, each decision option has associated with it a cost.

In this case, the operation costs £3000 and ongoing medical management costs £500.

Therefore our incremental cost-effectiveness ratio is

\[ ICER = \frac{3000-500}{7.35 - 4.4} = \frac{2500}{2.95} = 847.46, \]

and if our willingness-to-pay threshold is above £847.46 per QALY, we will perform the operation.
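A minimal R sketch of the same calculation, using the probabilities and QALYs from the tree and the costs implied by the ICER above:

```r
# Angina example: expected QALYs and ICER
p_outcome <- c(improve = 0.67, no_change = 0.30, op_death = 0.03)
qaly_op   <- c(improve = 9,    no_change = 4.4,  op_death = 0)

eq_operation <- sum(p_outcome * qaly_op)   # 7.35 QALYs
eq_medical   <- 4.4                        # QALYs on medical management

cost_operation <- 3000   # pounds, as implied by the ICER formula above
cost_medical   <- 500

icer <- (cost_operation - cost_medical) / (eq_operation - eq_medical)
icer   # about 847 pounds per QALY gained
```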

Angina example: EVPI

How much would we pay in this case for perfect information?

Our outcome with perfect information would be

\[\underbrace{0.67\times{9}}_{\text{Operation successful}} + \underbrace{0.3\times{4.4}}_{\text{No change}} + \underbrace{0.03\times{4.4}}_{\text{Operative mortality}} = 7.482.\]

This is \(7.482 - 7.35 = 0.132\) QALYs more than our expected outcome.

Therefore we will pay up to 0.132 times our willingness-to-pay threshold for perfect information.
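The EVPI can be turned into a monetary figure by multiplying by the willingness-to-pay threshold. The sketch below does this for an illustrative threshold of £20,000 per QALY; that figure is not taken from Williams (1985) and is only there to show the arithmetic.

```r
# Angina example: EVPI in QALYs and in pounds
p_outcome <- c(improve = 0.67, no_change = 0.30, op_death = 0.03)
qaly_op   <- c(improve = 9,    no_change = 4.4,  op_death = 0)
qaly_med  <- 4.4

# Without perfect information: take the better expected outcome
eq_no_info <- max(sum(p_outcome * qaly_op), qaly_med)     # 7.35

# With perfect information: for each outcome, pick the better option
eq_perfect <- sum(p_outcome * pmax(qaly_op, qaly_med))    # 7.482

evpi_qaly <- eq_perfect - eq_no_info                      # 0.132 QALYs

wtp <- 20000            # hypothetical willingness-to-pay (pounds per QALY)
evpi_qaly * wtp         # up to 2640 pounds for perfect information
```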

A much more realistic (and complex!) EVPI example can be found in McCullagh, Walsh, and Barry (2012).

4 Diagnostic testing

In this section:

Diagnostic testing

The aim of performing diagnostic tests is to determine whether or not a person has a particular disease or condition.

We want to avoid both false positives and false negatives.

Diagnostic tests

For a given condition there may be a range of different diagnostic tests, each with different costs and accuracies.

Think of the more costly but more accurate PCR test compared to the less expensive but less accurate LFD test.

Rautenberg, Gerritsen, and Downes (2020) give a review of the use of decision theory in diagnostic testing, as well as setting forward good practice.

Notation

We will use \(T\) to denote the test result and \(D\) to denote the disease status, with superscripts \(+\) and \(-\) for positive/present and negative/absent.

For example, \(T^-\) means that a test is negative, whereas \(D^+\) means the disease/condition is present.

4.1 Measures of test accuracy

Two measures are important when considering diagnostic test accuracy:

  • sensitivity, \(p\left(T^+\mid{D^+}\right)\): the probability of a positive test given that the disease is present
  • specificity, \(p\left(T^-\mid{D^-}\right)\): the probability of a negative test given that the disease is absent

As sensitivity increases, specificity often decreases.

At the extreme, a test that classifies everyone as positive has sensitivity 1 but specificity 0.

Clearly this would not be a very useful test.

4.1.1 The Confusion Matrix

We can visualise these measures using a confusion matrix.

              Test positive           Test negative
Disease       True positives (TP)    False negatives (FN)
No disease    False positives (FP)   True negatives (TN)
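Sensitivity and specificity can be read straight off the confusion matrix. A small R sketch, using illustrative counts that are not from any study in these notes:

```r
# Sensitivity and specificity from a confusion matrix (illustrative counts)
tp <- 90;  fn <- 10    # among people with the disease
fp <- 160; tn <- 640   # among people without the disease

sensitivity <- tp / (tp + fn)   # p(T+ | D+) = 0.9
specificity <- tn / (tn + fp)   # p(T- | D-) = 0.8

sensitivity; specificity
```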

4.2 Predictive value

What we really want to know is

\[p\left(D\mid{T}\right),\]

the probability that someone has the disease, given their test result.


We use Bayes theorem

For two events \(A\) and \(B\), with \(p\left(B\right)>0\),

\[ p\left(A\mid{B}\right) = \frac{p\left(B\mid{A}\right)p\left(A\right)}{p\left(B\right)}.\]

Bayes theorem for predictive value

For us, this becomes (for example)

\[p\left(D^+\mid{T^+}\right) = \frac{p\left(T^+\mid{D^+}\right)p\left(D^+\right)}{p\left(T^+\right)}.\]

Therefore we need:

Bayes theorem for predictive value

Two more measures of diagnostic accuracy:

  • the positive predictive value (PPV), \(p\left(D^+\mid{T^+}\right)\)
  • the negative predictive value (NPV), \(p\left(D^-\mid{T^-}\right)\)

Because of the dependence on the prevalence, these quantities may need to be re-calculated often.

Example: calculating predictive value

Suppose a diagnostic test for a particular disease has sensitivity 0.99 and specificity 0.8. That is, \[p\left(T^+\mid{D^+}\right) = 0.99\] and \[p\left(T^-\mid{D^-}\right)=0.8.\] The prevalence of the disease in the population is 1%, that is \[ p\left(D^+\right) = 0.01.\]

Example: calculating predictive value

We first need to calculate \(p\left(T^+\right)\), using the partition theorem:

\[\begin{align*} p\left(T^+\right) & = p\left(T^+\mid{D^+}\right)p\left(D^+\right) + p\left(T^+\mid{D^-}\right)p\left(D^-\right)\\ &= 0.99 \times{0.01} + \left(1-p\left(T^-\mid{D^-}\right)\right)p\left(D^-\right)\\ & = 0.99 \times{0.01} + 0.2 \times{0.99}\\ & = 0.0099 + 0.198\\ & = 0.2079. \end{align*}\]
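The same calculation in R, using only the quantities given above:

```r
# p(T+) by the partition theorem (law of total probability)
sens <- 0.99    # p(T+ | D+)
spec <- 0.80    # p(T- | D-)
prev <- 0.01    # p(D+)

p_t_pos <- sens * prev + (1 - spec) * (1 - prev)
p_t_pos   # 0.2079
```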

4.2.1 Exercise

  1. Take a guess at what you think \(p\left(D^+\mid{T^+}\right)\) and \(p\left(D^-\mid{T^-}\right)\) will be.
  2. Use the values given and Bayes theorem to calculate them.

Reminder

\[\begin{align*} p\left(T^+\mid{D^+}\right) & = 0.99\\ p\left(T^-\mid{D^-}\right)& =0.8\\ p\left(T^+\right) & = 0.2079 \\ p\left(D^+\right) & = 0.01 \end{align*}\]

Bayes Theorem: \[p\left(D^+\mid{T^+}\right) = \frac{p\left(T^+\mid{D^+}\right)p\left(D^+\right)}{p\left(T^+\right)}\]

The prosecutor’s fallacy

In this example, even though \(p\left(T^+\mid{D^+}\right)\) is high, \(p\left(D^+\mid{T^+}\right)\) is low.

To see this:

\[\begin{align*} p\left(T^+\right) & = p\left(T^+\mid{D^+}\right)p\left(D^+\right) + p\left(T^+\mid{D^-}\right)p\left(D^-\right)\\ & = \underbrace{0.99 \times{0.01}}_{\text{Very few true positives}} + \underbrace{0.2 \times{0.99}}_{\text{Many false positives}} \end{align*}\]
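If you want to check your answers to the exercise, both predictive values follow directly from Bayes theorem; here is a minimal R sketch using the numbers above.

```r
# Positive and negative predictive value via Bayes theorem
sens <- 0.99; spec <- 0.80; prev <- 0.01

p_t_pos <- sens * prev + (1 - spec) * (1 - prev)   # 0.2079
p_t_neg <- 1 - p_t_pos                             # 0.7921

ppv <- sens * prev / p_t_pos                       # p(D+ | T+), about 0.048
npv <- spec * (1 - prev) / p_t_neg                 # p(D- | T-), about 0.9999

ppv; npv
```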

More generally

The sensitivity \(p\left(T^+\mid{D^+}\right)\) is often confused with the positive predictive value \(p\left(D^+\mid{T^+}\right)\).

This is known as the Prosecutor’s fallacy.

4.3 Decision trees for diagnostic testing

The decision tree structure is often used as a calculation tool (i.e. with no decisions to be made).


We will use this to think about diagnostic tests.

Decision trees for diagnostic testing

A ‘decision’ tree for diagnostics.

4.3.1 Example

Suppose some disease has prevalence \(p\left(D^+\right)= 0.2\), and we know that the test has sensitivity \(p\left(T^+\mid{D^+}\right) = 0.86\) and specificity \(p\left(T^-\mid{D^-}\right) = 0.7\).

We can then fill in the tree as follows:

A disease-based approach.

Decisions for diagnostics

These diagnostic results can be used in two (probably more) ways:

The resulting probabilities can feed into another decision analysis.

For example, a gold standard test could be a source of [not quite] perfect information.

4.4 Sequential diagnostic testing

Suppose we have two tests: a cheaper, less accurate test (\(T_1\)) and a more expensive, more accurate test (\(T_2\)).

In sequential diagnostic testing, everyone is given the first test, and only those who test positive go on to the second test.

4.4.1 Example continued:

Let’s imagine that the diagnostic test from our example is the first test (\(T_1\)) in a sequential testing plan.

If 1000 people are tested, we expect:

  • 172 to be true positives
  • 28 to be false negatives
  • 240 to be false positives
  • 560 to be true negatives
              Test positive   Test negative
Disease            172              28
No disease         240             560


Only the \(172 + 240 = 412\) with a positive result would be sent for the second test \(T_2\).

Note that even with this rather high disease prevalence of 0.2, less than half of the positive tests are correct.
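These counts can be reproduced with a few lines of R from the prevalence of 0.2 and a sensitivity of 0.86 and specificity of 0.70 for \(T_1\) (the values implied by the table above):

```r
# Expected counts for 1000 people tested with T1
n     <- 1000
prev  <- 0.2
sens1 <- 0.86   # implied by 172 / 200
spec1 <- 0.70   # implied by 560 / 800

n_dis    <- n * prev            # 200 with the disease
n_no_dis <- n * (1 - prev)      # 800 without

tp <- n_dis    * sens1          # 172 true positives
fn <- n_dis    * (1 - sens1)    #  28 false negatives
fp <- n_no_dis * (1 - spec1)    # 240 false positives
tn <- n_no_dis * spec1          # 560 true negatives

c(TP = tp, FN = fn, FP = fp, TN = tn)
tp / (tp + fp)                  # proportion of positives that are correct, ~0.42
```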

Example - a second test

Let’s say that for the second test \(T_2\) we have sensitivity \(p\left(T^+\mid{D^+}\right) = 0.95\) and specificity \(p\left(T^-\mid{D^-}\right) = 0.8\).

We assume here that the results of \(T_1\) and \(T_2\) are conditionally independent given the disease state.

Example - sequential testing

Sequential testing.

Example: sequential testing

From this we can compare the results after just the first test with the results following both tests.

Test 1 only:

                 D+       D-
T1 positive      172      240
T1 negative       28      560

Sequential testing:

                 D+       D-
Test positive    163.4     48
Test negative     36.6    752
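The sequential-testing counts can be reproduced in the same way, assuming \(T_2\) has sensitivity 0.95 and specificity 0.8 (the values implied by the table) and that the two tests are conditionally independent given disease status:

```r
# Sequential testing: only T1-positives are given T2
sens2 <- 0.95   # implied by 163.4 / 172
spec2 <- 0.80   # implied by 192 / 240

tp1 <- 172; fp1 <- 240; fn1 <- 28; tn1 <- 560   # counts after T1

# Final positives need a positive result on both tests
tp_final <- tp1 * sens2                  # 163.4
fp_final <- fp1 * (1 - spec2)            #  48

# Final negatives: negative on T1, or positive on T1 then negative on T2
fn_final <- fn1 + tp1 * (1 - sens2)      #  36.6
tn_final <- tn1 + fp1 * spec2            # 752

rbind(Test_pos = c(D_pos = tp_final, D_neg = fp_final),
      Test_neg = c(D_pos = fn_final, D_neg = tn_final))
```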

Using the sequential testing

Whether this is acceptable would depend on:

These calculations often feed into health economic models.

There is more on this, and on some of the difficulties surrounding decision making with diagnostic tests, in Sutton et al. (2008).

5 Receiver-operating characteristic (ROC) analysis

An important topic in diagnostics (and classification generally)!

In this section:

Receiver-operating characteristic (ROC) analysis

So far we have assumed a diagnostic test outputs positive or negative.
In most cases, however, a measurement (from which a test is derived) gives a continuous value.

ROC analysis was developed during WW2 to assess the accuracy of radar operators.

Continuous test measurements

Suppose we take some measurement from a number of people.
Some of the people have disease D, the others don’t.

Probability distributions of a measurement for people with (D) and without (No D) a disease.

Continuous test measurements

Suppose we set our decision threshold \(T=0\)

We see

New threshold

Now we set \(T=0.5\)

ROC space

Any value of \(T\) produces a pair (Sensitivity, Specificity), which we can plot

ROC curve

If we vary the decision threshold continuously, we can produce a ROC curve:

The ROC curve for the measurement shown in Figures 5.1-5.3. AUC is ‘area under the curve’, an idea we will explore shortly.

ROC curve

Each point on the ROC curve corresponds to a value of the decision threshold


ROC analysis

We will explore two main aspects of ROC analysis:

  • overall diagnostic performance, summarised by the area under the curve (AUC)
  • choosing a value for the decision threshold \(T\)

5.1 Overall diagnostic performance

The shape of the ROC curve is determined by the degree of separation in the measurements

This shape of ROC curve indicates an ideal diagnostic, and the area under the curve (AUC) is 1 (the best it can be).

Good separation in distributions of a measurement for people with (D) and without (No D) a disease.

The ROC curve for the measurement with good separation.

The diagonal line with \(\text{AUC}=0.5\) corresponds to random guessing, or complete overlap between the two distributions.

AUC - poor separation

Overlapping measurement distributions give a much worse classifier.

Poor separation in distributions of a measurement for people with (D) and without (No D) a disease.

The ROC curve for the measurement with poor separation.

Some values of \(T\) are better than others, but none is perfect.

Area under the curve

In summary:

5.2 Choosing a value for \(T\)

We will often still want to classify someone as ‘positive’ or ‘negative’

This means we need a value for the decision threshold.

The optimal value will depend on:

We will assume an equal balance between sensitivity and specificity, but these methods can be weighted.

5.2.1 Two methods (of many)

Youden’s index:
The best decision threshold according to Youden’s index is the value of \(T\) that maximises \(J(T)\):

\[J\left(T\right) = \operatorname{sensitivity}\left(T\right) + \operatorname{specificity}\left(T\right) - 1\]



Distance from (0,1):
The best decision threshold by this method is the value of \(T\) that minimises the distance \(D(T)\) to the top left corner (the ideal point):

\[D\left(T\right) = \sqrt{\left(1-\operatorname{specificity}\left(T\right)\right)^2 + \left(1-\operatorname{sensitivity}\left(T\right)\right)^2}.\]
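The sketch below evaluates both criteria over a grid of thresholds, assuming (purely for illustration, not from the lecture's figures) that the measurement is N(1, 1) in people with the disease and N(−1, 1) in people without, with a result called positive when it exceeds the threshold.

```r
# Choosing a threshold: Youden's index vs distance from (0, 1)
# Illustrative assumption: measurement ~ N(1, 1) for D, N(-1, 1) for no D.
thresholds  <- seq(-3, 3, by = 0.01)
sensitivity <- 1 - pnorm(thresholds, mean =  1, sd = 1)  # p(T+ | D+)
specificity <-     pnorm(thresholds, mean = -1, sd = 1)  # p(T- | D-)

youden <- sensitivity + specificity - 1
dist01 <- sqrt((1 - specificity)^2 + (1 - sensitivity)^2)

thresholds[which.max(youden)]   # best threshold by Youden's index
thresholds[which.min(dist01)]   # best threshold by distance from (0, 1)
```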

Poor separation example

Our original example

5.3 Example: ROC analysis with data

So far we have been very theoretical. However:

ROC analysis doesn’t rely on any distributional assumptions!

A dataset

100 measurements - coloured by whether each person has the disease.

The ROC curve is not affected by unequal numbers of people with and without the disease.

5.3.1 Example: ROC curve

## Setting levels: control = 0, case = 1
## Setting direction: controls < cases
The ROC curve for our empirical data.
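The original dataset is not included in these notes, but the same kind of curve can be produced with the pROC package on simulated stand-in data; the ‘Setting levels’ and ‘Setting direction’ messages above are the ones pROC prints by default.

```r
# Empirical ROC curve with the pROC package (simulated stand-in data)
library(pROC)

set.seed(1)
n <- 100
disease     <- rbinom(n, 1, 0.3)                     # 0 = no disease, 1 = disease
measurement <- rnorm(n, mean = ifelse(disease == 1, 1, -1), sd = 1)

roc_obj <- roc(response = disease, predictor = measurement)
auc(roc_obj)              # area under the empirical ROC curve
plot(roc_obj)             # draw the ROC curve
coords(roc_obj, "best")   # threshold chosen by Youden's index (pROC default)
```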

We have only skimmed the surface of ROC analysis, but you can read more about it in Zou, O’Malley, and Mauri (2007) and Fawcett (2006).

6 Summary

In this lecture we have studied:

Decision trees

However, there is no natural way to build time into decision trees…


Diagnostic testing

7 References

Fawcett, Tom. 2006. “An Introduction to ROC Analysis.” Pattern Recognition Letters 27 (8): 861–74.
McCullagh, Laura, Cathal Walsh, and Michael Barry. 2012. “Value-of-Information Analysis to Reduce Decision Uncertainty Associated with the Choice of Thromboprophylaxis After Total Hip Replacement in the Irish Healthcare Setting.” Pharmacoeconomics 30 (10): 941–59.
Rautenberg, Tamlyn, Annette Gerritsen, and Martin Downes. 2020. “Health Economic Decision Tree Models of Diagnostics for Dummies: A Pictorial Primer.” Diagnostics 10 (3): 158.
Sutton, Alexander J, Nicola J Cooper, Steve Goodacre, and Matthew Stevenson. 2008. “Integration of Meta-Analysis and Economic Decision Modeling for Evaluating Diagnostic Tests.” Medical Decision Making 28 (5): 650–67.
Whitehead, Sarah J, and Shehzad Ali. 2010. “Health Outcomes in Economic Evaluation: The QALY and Utilities.” British Medical Bulletin 96 (1): 5–21.
Williams, Alan. 1985. “Economics of Coronary Artery Bypass Grafting.” Br Med J (Clin Res Ed) 291 (6491): 326–29.
Zou, Kelly H, A James O’Malley, and Laura Mauri. 2007. “Receiver-Operating Characteristic Analysis for Evaluating Diagnostic Tests and Predictive Models.” Circulation 115 (5): 654–57.