MATH39512 Survival Analysis for Actuarial Science: example sheet 4
*=easy, **=intermediate, ***=difficult
* Exercise 4.1
Consider the following observed values of the survival times of 8 independent homogeneous individuals, where + denotes a censored value:
1.5+ 2.5 4.5 5+ 7+ 8 12+ 14.
Assume the following parametric form of the common hazard function of the individuals:
$\mu(t) = \mu(t; \alpha, \lambda) = \alpha \lambda^{\alpha} t^{\alpha - 1}, \quad t \ge 0,$
with parameters α, λ > 0. Assume that the joint distribution of the times at which individuals are censored does not depend on the parameters α and λ, and that the censoring in this case is random censoring. Find an explicit expression (in terms of α and λ) for the log-likelihood function of the parameters α and λ given the observations. Any constants that do not affect the maximum likelihood estimation can be ignored.
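A hand-derived expression can be checked numerically. The sketch below is only an illustration: it assumes the usual random-censoring log-likelihood, the sum over individuals of δ_i log µ(t_i) − A(t_i) with A(t) = (λt)^α, and the names `DATA` and `weibull_loglik` are hypothetical, not taken from the notes.

```python
import math

# (time, event) pairs from Exercise 4.1; event = 1 for an observed failure,
# event = 0 for a censored value (marked with '+' on the sheet).
DATA = [(1.5, 0), (2.5, 1), (4.5, 1), (5.0, 0),
        (7.0, 0), (8.0, 1), (12.0, 0), (14.0, 1)]


def weibull_loglik(alpha, lam, data=DATA):
    """Log-likelihood (up to an additive constant) for the hazard
    mu(t) = alpha * lam**alpha * t**(alpha - 1), assuming random censoring."""
    total = 0.0
    for t, event in data:
        if event:
            # log mu(t) is only contributed by observed failures
            total += math.log(alpha) + alpha * math.log(lam) + (alpha - 1) * math.log(t)
        # every individual contributes minus the cumulative hazard A(t) = (lam * t)**alpha
        total -= (lam * t) ** alpha
    return total
```

Evaluating `weibull_loglik` on a grid of (α, λ) values and comparing with the closed-form answer is one way to catch algebra slips; with α = 1 the function reduces to the exponential case.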
** Exercise 4.2
The data below gives the failure times of two groups, in days, where + indicates a censored failure time:
Treatment group: 6, 6, 6, 6+, 7, 9+, 10, 10+, 11+, 13, 16, 17+, 19+, 20+, 22, 23, 25+, 32+, 32+, 34+, 35+
Control group: 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23
Assume a setup as in Section 3.1 of the notes and assume the exponential distribution is a valid distribution for each group.
(a) Calculate the maximum likelihood estimate (mle) of the parameter in each group and test whether there is a difference in the distributions of the two groups.
(b) Find an approximate 95% confidence interval for the parameter of the exponential distribution corresponding to group 1 by using the mle of this parameter.
(c) Use the score test to test, at significance level 5%, whether the parameter of the exponential distribution corresponding to group 2 is equal to 0.05 or not.
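For the exponential model with random censoring, the maximum likelihood estimate is the number of observed failures divided by the total observed time. The following sketch computes that quantity for both groups, together with a standard error based on the observed information d/λ². The helper names `parse` and `exponential_mle` are hypothetical, and whether the notes use the observed or the expected information is an assumption to check against Section 3.1.

```python
import math


def parse(text):
    """Turn '6 6+ 7' into [(6.0, 1), (6.0, 0), (7.0, 1)]; '+' marks censoring."""
    out = []
    for token in text.replace(",", " ").split():
        censored = token.endswith("+")
        out.append((float(token.rstrip("+")), 0 if censored else 1))
    return out


TREATMENT = parse("6 6 6 6+ 7 9+ 10 10+ 11+ 13 16 17+ "
                  "19+ 20+ 22 23 25+ 32+ 32+ 34+ 35+")
CONTROL = parse("1 1 2 2 3 4 4 5 5 8 8 8 8 11 11 12 12 15 17 22 23")


def exponential_mle(data):
    """Return (failures, total time at risk, mle, standard error)."""
    failures = sum(event for _, event in data)
    exposure = sum(t for t, _ in data)
    lam_hat = failures / exposure
    # Assumption: standard error from the observed information failures / lam**2
    std_err = lam_hat / math.sqrt(failures)
    return failures, exposure, lam_hat, std_err
```

An approximate 95% interval of the form estimate ± 1.96 × standard error can then be read off from the last two entries of the returned tuple.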
*** Exercise 4.3
Consider a group of n individuals whose genuine failure times T1, T2, ..., Tn are i.i.d. with common hazard function µ(t), which is of a parametric form depending on a vector θ of unknown parameters. Let C1, C2, ..., Cn be the censoring times of the n individuals and assume there is random censoring and the joint distribution of the censoring times does not depend on θ. Allowing for truncation, let ai ≥ 0 be the time at which individual i's survival time is truncated (with the understanding that ai = 0 means that individual i's survival time is not truncated). Let $\tilde{T}_i = T_i \wedge C_i$ and $\Delta_i = \mathbf{1}_{\{T_i \le C_i\}}$.
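(a) Assume in addition that the random variables T1, C1, ..., Tn, Cn are independent. Show that, given the data, the log-likelihood of θ is given by
$\ell(\theta) = \sum_{i=1}^{n} \left[ \Delta_i \log \mu(\tilde{T}_i) - \left( A(\tilde{T}_i) - A(a_i) \right) \right] + K,$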
where $A(t) = \int_0^t \mu(s)\,ds$ and $K \in \mathbb{R}$ does not depend on θ.
(Hint: due to the extra assumption, the likelihood L is equal to $\prod_{i=1}^{n} L_i$, where $L_i$ is the contribution to the likelihood of individual i. In order to determine $L_i$, work with a conditional joint probability function where one conditions on the genuine failure and censoring times of individual i being bigger than the time at which individual i's survival time is truncated.)
(b) Show that the log-likelihood formula in the previous part still holds if the extra assumption in part (a) is removed. (Hint: work with a conditional joint probability function and then follow the steps of the proof of Theorem 3.1 in the notes.)
(c) The data below gives the residual lifetimes (expressed in age in years) of a group of homogeneous independent individuals. Here some of the survival times are left-truncated and some of them are right-censored.
(60.0, 73.2] (62.3, 69.7+] (63.5, 74.9+] (64.8, 72.2] (65.5, 69.7]
(66.0, 79.6] (72.0, 82.0+] (74.1, 79.6] (74.5, 83.8+] (75.6, 77.8]
Assume that the common force of mortality associated with these lives is of the form
µ(t) = 2βt, t ≥ 0,
where β > 0 is unknown and t is time/age in years. Assuming a setup as in Section 3.1 of the notes but now allowing for left-truncation, determine the maximum likelihood estimate of β for this data set.
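A numerical check for part (c) can be sketched as follows. It assumes the left-truncated log-likelihood has the form of a sum over lives of Δ_i log µ(t_i) − (A(t_i) − A(a_i)), with A(t) = βt² for this hazard, so that the maximiser is the number of deaths divided by the sum of t_i² − a_i². The names `LIVES`, `loglik` and `beta_mle` are hypothetical.

```python
import math

# (entry age a_i, exit age t_i, event) for each interval (a_i, t_i] on the sheet;
# event = 0 where the exit age carries a '+', i.e. the life is right-censored.
LIVES = [
    (60.0, 73.2, 1), (62.3, 69.7, 0), (63.5, 74.9, 0), (64.8, 72.2, 1), (65.5, 69.7, 1),
    (66.0, 79.6, 1), (72.0, 82.0, 0), (74.1, 79.6, 1), (74.5, 83.8, 0), (75.6, 77.8, 1),
]


def loglik(beta, lives=LIVES):
    """Assumed log-likelihood for mu(t) = 2*beta*t with left truncation at a_i."""
    total = 0.0
    for a, t, event in lives:
        if event:
            total += math.log(2.0 * beta * t)
        total -= beta * (t * t - a * a)  # A(t) - A(a) with A(t) = beta * t**2
    return total


def beta_mle(lives=LIVES):
    """Closed-form maximiser of loglik: deaths / sum of (t_i**2 - a_i**2)."""
    deaths = sum(event for _, _, event in lives)
    exposure = sum(t * t - a * a for a, t, _ in lives)
    return deaths / exposure
```

Comparing `loglik(beta_mle())` with the log-likelihood at nearby values of β gives a quick sanity check that the closed form really is the maximiser under the assumed likelihood.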
** Exercise 4.4
(a) Show that, if A(t) is the cumulative hazard function of
(i) the exponential distribution with parameter λ, then A(t) = λt;
(ii) the Weibull(λ, α) distribution, then log(A(t)) = α log λ + α log t (recall that the hazard function in this case is given by $\mu(t) = \alpha \lambda^{\alpha} t^{\alpha - 1}$);
(iii) the log-logistic(λ, α) distribution, then $\log\left(e^{A(t)} - 1\right) = \alpha \log \lambda + \alpha \log t$ (recall that the cumulative hazard function in this case is given by $A(t) = \log\left(1 + (\lambda t)^{\alpha}\right)$).
(b) Suppose that you have data available from a study on the survival times of n individuals, some of whom are censored, which enable you to construct the Nelson-Aalen estimate of the cumulative hazard function of the survival time distribution. Can you suggest graphical methods of checking whether the survival times follow the exponential or the Weibull or the log-logistic distribution? Explain also, in the event of any of these three graphical methods indicating that one of the three distributions is an appropriate distribution for the failure times, how you can obtain rough estimates of the parameters of the indicated distributions. (Hint: use the equations established in part (a).)
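One possible way to prepare such plots is sketched below, using only the standard library. It assumes the Nelson-Aalen estimate is the running sum of d_j / n_j over the distinct failure times (d_j failures, n_j at risk), and it uses the small data set of Exercise 4.1 purely as illustrative input. The function names are hypothetical, and the straight-line fit is a rough device, not a method prescribed by the notes.

```python
import math

# Illustrative input: Exercise 4.1 data as (time, event), event = 0 if censored.
SAMPLE = [(1.5, 0), (2.5, 1), (4.5, 1), (5.0, 0),
          (7.0, 0), (8.0, 1), (12.0, 0), (14.0, 1)]


def nelson_aalen(data):
    """Return [(t_j, A_hat(t_j))] at the distinct observed failure times."""
    points, cumulative = [], 0.0
    for tj in sorted({t for t, event in data if event}):
        failures = sum(1 for t, event in data if event and t == tj)
        at_risk = sum(1 for t, _ in data if t >= tj)
        cumulative += failures / at_risk
        points.append((tj, cumulative))
    return points


def plot_coordinates(points, model):
    """Transform Nelson-Aalen points so they are roughly linear under `model`."""
    if model == "exponential":    # A(t) = lambda * t
        return [(t, a) for t, a in points]
    if model == "weibull":        # log A(t) = alpha*log(lambda) + alpha*log(t)
        return [(math.log(t), math.log(a)) for t, a in points]
    if model == "log-logistic":   # log(e^{A(t)} - 1) = alpha*log(lambda) + alpha*log(t)
        return [(math.log(t), math.log(math.expm1(a))) for t, a in points]
    raise ValueError(f"unknown model: {model}")


def least_squares_line(xy):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xy)
    mean_x = sum(x for x, _ in xy) / n
    mean_y = sum(y for _, y in xy) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in xy)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in xy)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

If the transformed points look close to a straight line, the fitted slope and intercept give rough parameter values: for the Weibull and log-logistic plots the slope estimates α and exp(intercept / slope) estimates λ, while for the exponential plot the slope of a line through the origin estimates λ.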