BIOSTAT 708

Homework 7

assigned on 2/27/2024, due on 3/19/2025

Supplementary reading

• Read the tutorial papers on survival analysis.

Homework

• Homework can be done using any word processor, e.g. Word, LaTeX, or R Markdown.

• Please submit your R code with your homework.

Question 1

An investigator would like to conduct a randomized phase III trial to compare a new treatment (arm A) against a standard control (arm B) for patients with advanced colorectal cancer. The primary endpoint will be overall survival (OS), which is the time between treatment initiation and death. Eligible patients will be randomized with equal allocation to the two treatments.

Except for part (f), please answer the following questions using the formulas given in the course slides, and show the main steps that you take to obtain the answers.

(a) Under the exponential model S(t) = exp(-λt), show that the median OS is m = log(2)/λ. Assuming that the median OS for arm A is 17 months and the median OS for arm B is 12 months, calculate the hazard rates for the two treatments and the hazard ratio for arm A over arm B.
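As a starting point for part (a), note that setting S(m) = 1/2 gives exp(-λm) = 1/2, so m = log(2)/λ and therefore λ = log(2)/m. A minimal R sketch of the resulting arithmetic follows; the variable names are illustrative and not taken from the course slides.

```r
# Medians stated in the question (months)
median_a <- 17  # arm A, new treatment
median_b <- 12  # arm B, standard control

# Exponential model: S(m) = exp(-lambda * m) = 0.5  =>  lambda = log(2) / m
lambda_a <- log(2) / median_a
lambda_b <- log(2) / median_b

# Hazard ratio of arm A over arm B; algebraically equal to median_b / median_a
hr_a_vs_b <- lambda_a / lambda_b

round(c(lambda_a = lambda_a, lambda_b = lambda_b, hr = hr_a_vs_b), 4)
```

Under the exponential model the hazard ratio is simply the inverse ratio of the medians, which gives a quick check on the computed value.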

(b) Formulate the one-sided null and alternative hypotheses using the log hazard ratio as the parameter of interest.

(c) Calculate the number of deaths required to test the hypotheses with a one-sided type I error of 0.025 and a type II error of 0.15.
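The course slides are not reproduced here, so the sketch below assumes Schoenfeld's approximation for equal allocation, d = 4 (z_{1-α} + z_{1-β})² / (log HR)². If the slides use a different formula, substitute it; the variable names are illustrative.

```r
alpha <- 0.025   # one-sided type I error (from the question)
beta  <- 0.15    # type II error (from the question)
hr    <- 12 / 17 # hazard ratio of arm A over arm B under the alternative

# Assumed formula: Schoenfeld's approximation with 1:1 randomization
d_raw <- 4 * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / log(hr)^2

# Round up so that the power is at least 1 - beta
d_required <- ceiling(d_raw)
d_required
```

With these inputs the formula gives roughly 296 deaths before rounding up.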

(d) Assume uniform enrollment at a rate of 10 patients per month, and that all patients will be followed from randomization to the time of the final analysis, which is the time at which the required number of deaths is reached. Using the R function stats::uniroot, calculate the number of patients needed to achieve the required number of deaths for the following three scenarios: (i) the accrual length is equal to the study duration; (ii) the number of patients is 150% of the number of deaths (i.e. 50% patients are dead at the final analysis); and (iii) the follow-up time is 24 months. Explain which design you prefer and why.
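One way to set up the root-finding in part (d) is sketched below. It rests on several assumptions that are not stated in the question: deaths are counted under the alternative hypothesis (medians of 17 and 12 months), "follow-up time" means the time between the last enrollment and the final analysis, scenario (ii) is read as N = 1.5 × deaths, and `d_target` is a placeholder for your own answer to part (c). The helper `expected_deaths` is hypothetical, not a function from any package.

```r
# Hypothetical helper: expected number of deaths at the final analysis when
# patients enter uniformly over `accrual` months and the analysis takes place
# `followup` months after the last enrollment (exponential survival per arm).
expected_deaths <- function(accrual, followup, rate = 10,
                            lambda = log(2) / c(17, 12)) {
  # P(death before the analysis) per arm, averaged over uniform entry times
  p_death <- 1 - (exp(-lambda * followup) -
                    exp(-lambda * (accrual + followup))) / (lambda * accrual)
  rate * accrual * mean(p_death)  # equal allocation: average the two arms
}

d_target <- 297  # placeholder: replace with your answer to part (c)

# (i) accrual length equals study duration (no additional follow-up)
acc_i <- uniroot(function(a) expected_deaths(a, 0) - d_target,
                 interval = c(1, 500))$root

# (ii) N = 1.5 * deaths fixes the accrual length; solve for the follow-up
acc_ii <- 1.5 * d_target / 10
fup_ii <- uniroot(function(f) expected_deaths(acc_ii, f) - d_target,
                  interval = c(0, 500))$root

# (iii) 24 months of follow-up after the last enrollment
acc_iii <- uniroot(function(a) expected_deaths(a, 24) - d_target,
                   interval = c(1, 500))$root

# Number of patients = enrollment rate * accrual length, rounded up
ceiling(10 * c(i = acc_i, ii = acc_ii, iii = acc_iii))
```

The probability of death comes from integrating 1 - exp(-λ × time on study) over a uniform entry time. A different reading of the follow-up definition changes only the arguments passed to `expected_deaths`.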

(e) If 10% of patients are expected to drop out, or some patients receive additional treatment after disease progression, before the time of the final analysis, how will this affect the power of the study, and what strategies can you adopt at the trial design stage to overcome this problem?

(f) Use the R function gsDesign::nSurv or other appropriate functions to verify your answers for parts (c) and (d). Please provide the R code and output. Compare the numerical results from gsDesign with your manual calculations.

Question 2

In a fixed sample size trial, a total of n = 378 patients will be randomized with equal allocation to two treatment groups. The outcomes of the patients in treatment 1 form a simple random sample from X1 ~ N(µ1, σ²), and the outcomes of the patients in treatment 0 form a simple random sample from X0 ~ N(µ0, σ²). The fixed sample size design has a two-sided type I error of 0.05 and 90% power in testing H0: ∆ = 0 versus Ha: ∆ = 1/3. To convert this design into a group sequential trial, we decide to conduct five analyses (four interim analyses and one final analysis) to test the mean difference of the two treatment groups, ∆ = µ1 - µ0, with a two-sided test statistic

Zk = (X̄1k - X̄0k) / (σ √(1/n1k + 1/n0k)),

where X̄1k and X̄0k are the sample means and n1k and n0k are the numbers of patients enrolled to treatments 1 and 0, respectively, at analysis k, and we assume that σ² is known and equal to 1 for both groups.

(a) Assuming the five analyses will be conducted at five equally spaced times tk = k × n/5 for k = 1, ..., 5, calculate the critical values for the Pocock and O'Brien-Fleming boundaries and the maximum sample sizes with two-sided α = 0.05. You can use the R function gsDesign::gsDesign to calculate these boundaries with the arguments k = 5, test.type = 2, alpha = 0.025, beta = 0.1, n.fix = 378, and sfu = "Pocock" or sfu = "OF" for the two boundaries. By specifying test.type = 2, we are requesting symmetric boundaries, in which the critical values and significance levels are symmetric in magnitude on both sides of the null hypothesis.
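The call described in part (a) can be written out as follows. This is a sketch that assumes the gsDesign package is installed; the object names `des_pocock` and `des_of` are illustrative.

```r
library(gsDesign)

# Symmetric two-sided designs with five equally spaced looks.
# alpha is the one-sided level, so 0.025 corresponds to two-sided 0.05.
des_pocock <- gsDesign(k = 5, test.type = 2, alpha = 0.025, beta = 0.1,
                       n.fix = 378, sfu = "Pocock")
des_of <- gsDesign(k = 5, test.type = 2, alpha = 0.025, beta = 0.1,
                   n.fix = 378, sfu = "OF")

# Upper critical values on the Z scale at each look
# (the lower bounds are their negatives for test.type = 2)
des_pocock$upper$bound
des_of$upper$bound

# Cumulative sample size at each look; the last entry is the maximum
ceiling(des_pocock$n.I)
ceiling(des_of$n.I)
```

Because `n.fix` is supplied, `n.I` is reported on the sample-size scale rather than as an inflation factor relative to the fixed design.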

(b) For each of the group sequential trial designs in (a), complete a separate simulation study (5,000 simulations) in which enrollment times are uniform over a five-year period, and conduct the interim analyses at the end of each year with the test statistic computed from the data accumulated through the kth year. Calculate the empirical type I errors and comment on your simulation findings.

Hint: To calculate the empirical type I error, you need to generate the data under the null hypothesis H0: ∆ = 0 and conduct the interim analyses using the group sequential test at each of the five looks. For each simulation run, record whether the null hypothesis is rejected (i.e. whether the boundaries are crossed at any look). The proportion of the 5,000 simulations in which H0 is rejected is the empirical type I error for the corresponding boundary.

(c) For each of the two group sequential trial designs, complete a simulation study (5,000 simulations) under Ha: ∆ = µ1 - µ0 = 1/3, with enrollment times uniform over a five-year period. The interim analyses are conducted at the end of each year with the test statistic computed from the data accumulated through the kth year. Calculate the expected sample sizes under Ha: ∆ = 1/3 for both boundary types and comment on your simulation findings.
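The simulations in parts (b) and (c) can share one function, sketched below in base R. Several things here are assumptions rather than part of the assignment: each outcome is taken to be available as soon as the patient is enrolled, the function name `simulate_gs` is hypothetical, and the boundary constants (2.413 for Pocock, 2.040 for O'Brien-Fleming) and maximum sample sizes (457 and 388) are approximate tabulated values used as placeholders. Replace them with the bounds and sample sizes from your own gsDesign output.

```r
# Hypothetical helper: simulate a two-sided group sequential test with looks at
# equally spaced calendar times and uniform enrollment over `years`.
simulate_gs <- function(bounds, n_max, delta = 0, n_sim = 5000, years = 5) {
  look_times <- seq_along(bounds) * years / length(bounds)
  rejected <- logical(n_sim)
  n_at_stop <- rep(n_max, n_sim)
  for (s in seq_len(n_sim)) {
    entry <- runif(n_max, 0, years)                  # uniform enrollment times
    arm <- sample(rep(c(1, 0), length.out = n_max))  # equal allocation
    x <- rnorm(n_max, mean = delta * arm, sd = 1)    # sigma known, equal to 1
    for (k in seq_along(bounds)) {
      seen <- entry <= look_times[k]                 # data available at look k
      n1 <- sum(seen & arm == 1)
      n0 <- sum(seen & arm == 0)
      if (n1 == 0 || n0 == 0) next                   # statistic undefined
      z <- (mean(x[seen & arm == 1]) - mean(x[seen & arm == 0])) /
        sqrt(1 / n1 + 1 / n0)
      if (abs(z) >= bounds[k]) {                     # boundary crossed: stop
        rejected[s] <- TRUE
        n_at_stop[s] <- n1 + n0
        break
      }
    }
  }
  c(reject_rate = mean(rejected), mean_n = mean(n_at_stop))
}

set.seed(708)
pocock_bounds <- rep(2.413, 5)        # placeholder constants, see lead-in
of_bounds <- 2.040 * sqrt(5 / (1:5))  # placeholder constants, see lead-in

simulate_gs(pocock_bounds, n_max = 457)             # part (b): type I error
simulate_gs(of_bounds, n_max = 388)
simulate_gs(pocock_bounds, n_max = 457, delta = 1/3) # part (c): power and E[N]
simulate_gs(of_bounds, n_max = 388, delta = 1/3)
```

Under `delta = 0` the `reject_rate` entry is the empirical type I error; under `delta = 1/3` it is the empirical power, and `mean_n` estimates the expected sample size. Because enrollment is random, the number of patients at each look varies around k/5 of the maximum, so the empirical results will match the nominal design only approximately.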

(d) Compare your simulation results in parts (b) and (c) with the numerical outputs from the R functions gsDesign and gsBoundSummary in the R package gsDesign.