In this practical we will experiment with the exponential distribution which is commonly used to model the time to the occurrence of some event. Such events may be death of a person, failure of a piece of equipment, development of (or remission) of symptoms, health code violation (or compliance) etc. The times to the occurrences of events are known as “lifetimes” or “survival times” or “failure times” according to the event of interest in the fields of study.
c
, seq
,
length
plot
which
command.dexp
and gamma
functions.function
with multiple arguments as input.This is a part of a data set of waiting times (in minutes) before service of 30 Bank customers and examined and analyzed by Ghitany et al. (Lindley distribution and its application, Mathematics and Computers in Simulations, 2008).
Times | 0.8 | 0.8 | 1.3 | 1.5 | 1.8 | 1.9 | 1.9 | 2.1 | 2.6 | 2.7 |
Times | 2.9 | 3.1 | 3.2 | 3.3 | 3.5 | 3.6 | 4.0 | 4.1 | 4.2 | 4.2 |
Times | 4.3 | 4.3 | 4.4 | 4.4 | 4.6 | 4.7 | 4.7 | 4.8 | 4.9 | 4.9 |
As usual, we begin in R by entering the data and examining summaries of the data, and plots of the data, initially.
x = c(0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3, 3.5, 3.6, 4.0, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9)
summary(x)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.800 2.225 3.550 3.317 4.375 4.900
sd(x)
## [1] 1.295638
hist(x, col = 2, main = 'Waiting times (minutes) in banks')
The exponential distribution as a likelihood.
The pdf of the exponential distribution is \[
f(x|\lambda) = \lambda e^{-\lambda x}
\]
for \(\lambda >0\) and \(x\in[0,\infty)\).
The Bayes factor for simple vs. simple hypotheses. As we have seen, for these simple tests finding the Bayes factor \(B_{01}\) in favour of \(H_0\) against \(H_1\) reduces to computing the likelihood ratio. Then finding posterior probabilities also becomes easy as we have the formula. So in the particular case of the exponential we have that the Bayes factor of \(H_0: \lambda = \lambda_0\) vs. \(H_1: \lambda = \lambda_1\) is simply \[ B_{01} = \left(\frac{\lambda_0}{\lambda_1}\right)^n e^{-(\lambda_0-\lambda_1)\sum_{i=1}^nx_i} \] You can check this by yourself later for practice. So, it easy to derive the Bayes factor and then we can also derive posterior probabilities from \[ p_0 = \left[1+\frac{\pi_1}{\pi_0}B_{01}^{-1}\right]^{-1}, \] which of course depends on the prior probabilities.
The gamma distribution as a conjugate likelihood. As we know the conjugate prior for an expontial \(x\) is the gamma distribution: \[ f(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)}\lambda^{\alpha-1} e^{-\beta \lambda} \]
l0
,
l1
for the prior hypotheses, n
for sample
size, and S
for the sum of the data.Of course, in this simple case we can “cheat” a bit, and use the
build-in R function dexp
for calculating the ratio. Type
`?dexp’ to see how this function works.
dexp
(type ?dexp
) and
taking the product of the result via function prod
.It would be desirable to perform multiple comparisons in the region \([0,1]\) under the alternative. However, repeating the above procedure for various values of \(\lambda_1\) is not convenient based on the previous approach. For this reason we will first define our own function in R.
f
with arguments data
, l0
and
l1
that can be used for any given data and set of point
hypotheses. The function should provide as end result \(2\log B_{10}\).data
and
l0
and use the command seq
for defining a grid
of values for the argument l1
.?seq
to remember how to use this command. Then
define an object grid1
based on seq
which will
be an equally spaced grid of 100 values from 0 to 1.logB10
, use
gridl1
as input to your function f
using the
same data
and l0
as before.minl1
and maxl1
,
respectively. Hint: You can use the which
, command
which we used in the previous practicals.
abline
to highlight horizontally where
\(2\log B_{10} >1\) and vertically
the minimum and maximum \(\lambda_1\)s
as the boundaries where this happens.The Bayes factor for simple vs. composite hypotheses. As we have seen (or will see), for these type of tests the Bayes factor is given by \[ B_{01} = \frac{f(x|\theta=\theta_0)}{\int_{\theta\in\Omega} f(x|\theta)f(\theta)\mathrm{d}\theta}. \]
The gamma distribution as a conjugate prior. As we know the conjugate prior for the exponential is the gamma distribution: \[ f(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)}\lambda^{\alpha-1} e^{-\beta \lambda}. \]
The Bayes factor of the Exponential-Gamma model. It can be shown with some mathematical derivations that the resulting Bayes factor for our model in the simple vs. composite case is given by \[ B_{01} = \frac{\lambda_0^n \Gamma(\alpha)(\sum_i x_i + \beta)^{n+\alpha}}{e^{\lambda_0\sum_i x_i}\Gamma(n+\alpha)\beta^\alpha}. \]
f2
that takes as input
data
, l0
, alpha
and
beta
and gives as output the posterior probability of \(H_0\) (use the formula shown in Section 2).
Use the gamma
function (type ?gamma
)l0=0.3
.f3
that takes as further input the
additional argument pi0
for the prior probability of \(H_0\) and produces again as output the
posterior probability of \(H_0\).f3
to find
the posterior probability of the null.Under the assumption \(\alpha =2\), \(\beta=4\), we essentially “centered” the prior around the \(H_0\), because the prior expectation and variance from the gamma prior are given by \[ E(\lambda) = \frac{\alpha}{\beta}=\frac{2}{4}=0.5, ~~ Var(\lambda) = \frac{\alpha}{\beta^2}=\frac{2}{16}=0.125. \]
f1
using these
hyperparameters.