STAT3600 LINEAR STATISTICAL ANALYSIS
May 20, 2024
1. Five observations of weight-adjusted waist index (X) and total bone mineral density (Y) are given as follows.
Consider a linear regression model when total bone mineral density is regressed on weight-adjusted waist index. It is given that
(a) Calculate the least squares estimates of the intercept and the slope. Interpret the estimates quantitatively. [6 marks]
(b) Calculate the standard errors of the estimated intercept and slope. [6 marks]
(c) Construct a 95% confidence interval for each parameter. [4 marks]
(d) Test at the 5% level of significance whether the slope is -1. [1 mark]
(e) Estimate the mean total bone mineral density when the weight-adjusted waist index is 9.8. Construct a 95% confidence interval for the estimate. [4 marks]
[Total: 21 marks]
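Note (illustrative sketch, not part of the question): the quantities asked for in Question 1 follow from the usual simple linear regression formulas, which can be sketched in Python as below. The function names and the sample numbers are hypothetical placeholders and are not the exam data. The t critical value is passed in by hand (for 5 observations the error degrees of freedom are 3, and t(0.975; 3) is approximately 3.182).

```python
import math

def simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1 * x with standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)                 # error variance estimate, df = n - 2
    se_b1 = math.sqrt(mse / sxx)
    se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
            "mse": mse, "se_b0": se_b0, "se_b1": se_b1}

def mean_response_ci(fit, x0, t_crit):
    """Point estimate and confidence interval for E(Y | X = x0)."""
    y_hat = fit["b0"] + fit["b1"] * x0
    se = math.sqrt(fit["mse"] * (1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]))
    return y_hat, y_hat - t_crit * se, y_hat + t_crit * se

# Hypothetical data, NOT the exam data
x_demo = [1, 2, 3, 4, 5]
y_demo = [1, 3, 2, 5, 4]
fit_demo = simple_linear_regression(x_demo, y_demo)
estimate, lower, upper = mean_response_ci(fit_demo, x0=3.5, t_crit=3.182)
```

A confidence interval for either parameter has the same shape: estimate plus or minus t_crit times its standard error. The test in part (d) compares (b1 - (-1)) / se_b1 with the same critical value.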
2. A study is conducted to investigate how daily caffeine intake affects the risk of depression in both the cancer and noncancer populations. The depression scale PHQ-9 is used to measure the severity of depression of the subjects. Caffeine intake is categorized into four levels. The data are given as follows.
(a) Calculate and plot the mean PHQ-9 for the six treatments. Do there appear to be interaction effects between cancer status and caffeine intake? Explain. [6 marks]
(b) Estimate the main effects of cancer status, the main effects of caffeine intake and the interaction effects. [6 marks]
(c) Complete the following two-way ANOVA table.
[6 marks]
(d) Test whether or not main effects for cancer status are present, using a 5% significance level. [3 marks]
(e) Compare the mean PHQ-9 between the two cancer groups for each of the three caffeine intake quartiles by constructing simultaneous confidence intervals for the three quartiles with a family confidence level of at least 95%, using Bonferroni’s method. Hence, describe the difference in depression between cancer and noncancer patients. [7 marks]
[Total: 28 marks]
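Note (illustrative sketch, not part of the question): for a balanced two-factor layout, the effect estimates in Question 2(b) are deviations of row, column and cell means from the grand mean. The sketch below assumes equal cell sizes and works from a table of cell means. The function name and the numbers are hypothetical and are not the exam data.

```python
def factor_effects(cell_means):
    """Effect estimates from an a x b table of cell means (balanced design assumed)."""
    a = len(cell_means)
    b = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    alpha = [r - grand for r in row_means]          # main effects of the row factor
    beta = [c - grand for c in col_means]           # main effects of the column factor
    interaction = [
        [cell_means[i][j] - row_means[i] - col_means[j] + grand for j in range(b)]
        for i in range(a)
    ]
    return grand, alpha, beta, interaction

# Hypothetical 2 x 3 table of cell means, NOT the exam data
demo_means = [[4.0, 6.0, 8.0],
              [2.0, 6.0, 10.0]]
grand, alpha, beta, interaction = factor_effects(demo_means)
```

Each set of estimated effects sums to zero over its index, which gives a quick check on hand calculations. For part (e), Bonferroni's method with g = 3 intervals and family level 95% uses the t quantile at 1 - 0.05 / (2 * 3) for each interval.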
3. A general linear model is employed to study the effects of a 2-level factor A, a continuous regressor x and their interaction on a response variable Y. The model is given as follows.
subject to
The data are summarized as follows. The matrices are arranged according to the above model.
(a) Write down the fitted regression model. [4 marks]
(b) Complete the following ANOVA table.
[6 marks]
(c) Test the interaction effects at the 5% level of significance. [4 marks]
(d) Consider a model without the interaction term.
i. Write down the fitted regression model. [3 marks]
ii. Test the main effects of factor A at the 5% level of significance. [4 marks]
[Total: 21 marks]
4. Consider a regression analysis of Y on X1 to X3 for 30 observations. The SSE values for various sub-models are given below.
(a) Determine the subset of variables that is selected by the backward elimination method, based on the removal level F = 4. Show your steps. Report the MSE and F-value at each step. Report the selected regressors. [7 marks]
(b) Among the four subsets of variables,
determine the subset that is selected by Cp. Produce a Cp plot. Report the selected subset. [7 marks]
[Total: 14 marks]
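Note (illustrative sketch, not part of the question): the two criteria in Question 4 can be sketched as small Python helpers. The partial F-statistic compares a model with the same model minus one or more regressors, and Mallows' Cp is SSE_p / MSE_full - (n - 2p), where p counts the parameters including the intercept. The function names and the SSE values below are hypothetical and are not the exam data.

```python
def partial_f(sse_smaller, sse_larger, df_error_larger, df_diff=1):
    """Partial F for dropping df_diff regressors from the larger model."""
    mse_larger = sse_larger / df_error_larger
    return ((sse_smaller - sse_larger) / df_diff) / mse_larger

def mallows_cp(sse_sub, p_sub, sse_full, p_full, n):
    """Mallows' Cp; p_sub and p_full count parameters including the intercept."""
    mse_full = sse_full / (n - p_full)
    return sse_sub / mse_full - (n - 2 * p_sub)

# Hypothetical SSE values, NOT the exam data
n_obs = 30
sse_all_three = 52.0     # model with X1, X2, X3 (4 parameters)
sse_without_one = 60.0   # same model with one regressor removed (3 parameters)

f_value = partial_f(sse_without_one, sse_all_three, df_error_larger=n_obs - 4)
cp_value = mallows_cp(sse_without_one, 3, sse_all_three, 4, n_obs)
```

In backward elimination, the regressor with the smallest partial F is removed if that F is below the removal level, and the procedure repeats on the reduced model with its own MSE. For Cp, the full model always has Cp equal to its number of parameters, so subsets with Cp close to or below p are the candidates.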
5. Two regression lines are given as follows.
(a) Show that the F-statistic for testing
can be put in the form.
[5 marks]
(b) Obtain SSE, the error sum of squares under H1, and SSE_H0, the error sum of squares under H0. [6 marks]
(c) Show that
[5 marks]
[Total: 16 marks]