
STAT 3312 (Fall, 2019)

Final exam (take-home)

Name (ID):

Instructions

• This take-home exam is due at 3:00 PM on December 17, 2019.

• All of your answers and work must be your own.

• You are NOT allowed to discuss any part of this exam with anyone. If you have any questions, ask me.

• For question #2, R or SAS code along with output must be submitted to support your answer. Please underline the results on the output that are relevant to your answer.

1. True/False questions (1.5 points each)

(1) The diagnosis of a mental illness (e.g., schizophrenia, neurosis, depression) is an ordinal categorical variable.

True ( ) False ( )

(2) If the odds of success equal 0.5 in a binary response, then the probability of success is 0.25.

True ( ) False ( )

(3) In a logistic regression model, logit[π(x)] = α + βx, e^α equals the odds of success when x = 1.

True ( ) False ( )

(4) In a logit model logit[π(x)] = α+βx, the probability increases at a rate of 0.16β when π(x) = 0.4.

True ( ) False ( )

(5) Fisher’s exact test can be used to test whether the odds ratio of a 2 × 2 table equals 1 when the frequency counts are small.

True ( ) False ( )

(6) A classical linear regression model with errors having a normal distribution is a special case of a generalized linear model with the probit link.

True ( ) False ( )

(7) In testing for independence in two-way contingency tables, likelihood ratio tests and Pearson’s χ² tests are equivalent for small sample sizes.

True ( ) False ( )

(8) In a generalized linear model, the link function is used to connect the values of the random component and the systematic component.

True ( ) False ( )

(9) When x1 or x2 is the sole predictor for a binary response y, the likelihood ratio test of the effect has P-value < 0.0001. When both x1 and x2 are in the model, it is possible that the likelihood ratio tests for H0 : β1 = 0 and for H0 : β2 = 0 could both have P-values larger than 0.05.

True ( ) False ( )

(10) For the logistic regression model with the identity link, the estimated probability for any value of the predictor x could exceed one.

True ( ) False ( )

2. The following table is based on an epidemiological survey of 3,000 subjects to investigate snoring as a possible risk factor for heart disease. We use scores (0, 2, 3, 5, 6) for x = snoring level.

                        Heart Disease
Snoring                 Yes      No
Never                    24    1355
Sometimes                35     603
More often than not      21     215
Almost always            30     224
Every night              27     230
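
Before fitting any model, it can help to look at the raw proportions in the table. The following is a minimal Python sketch (not the R or SAS code that the question requires) that stores the counts above with the stated scores and prints the sample proportion of heart disease at each snoring level. The names `snoring`, `sample_proportions`, and `props` are illustrative assumptions.

```python
# Counts copied from the table above: (score, label, yes, no).
# The scores (0, 2, 3, 5, 6) are the ones stated in the question.
snoring = [
    (0, "Never", 24, 1355),
    (2, "Sometimes", 35, 603),
    (3, "More often than not", 21, 215),
    (5, "Almost always", 30, 224),
    (6, "Every night", 27, 230),
]


def sample_proportions(rows):
    """Return (score, label, row total, proportion with heart disease) per row."""
    out = []
    for score, label, yes, no in rows:
        n = yes + no
        out.append((score, label, n, yes / n))
    return out


props = sample_proportions(snoring)
for score, label, n, p in props:
    print(f"x={score}  {label:<20} n={n:<5} p_hat={p:.4f}")
```

These sample proportions are only a descriptive check; the estimated equations in part (a) still have to come from the fitted models.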

(a) Use R or SAS to fit the model with three link functions: the logit, probit, and complementary log-log. Write down the estimated equations for all three models. (12 points)

(b) Find the estimated proportion for the logistic model when the snoring level is 2 and interpret it in terms of the odds. (4 points)

(c) Use the fitted logistic model to calculate an approximate 97% confidence interval for the odds ratio of a person in the “sometimes” category compared to a person in the “every night” category. (5 points)

(d) Find the estimated proportion for the probit model when the snoring level is 3. (4 points)

(e) Find the estimated proportions for the complementary log-log model when the snoring levels are “sometimes” and “almost always”, respectively. Which value is larger? (5 points)

3. Consider the following logistic regression model based on the horseshoe crab data with color and width predictors:

logit[P(Y = 1)] = α + β1c1 + β2c2 + β3c3 + β4x,

where x denotes width and

c1 = 1 for color = medium light, 0 otherwise

c2 = 1 for color = medium, 0 otherwise

c3 = 1 for color = medium dark, 0 otherwise.

Fitting the model yields the following estimated equation:

logit[P̂(Y = 1)] = −13.015 + 1.097c1 + 1.302c2 + 1.254c3 + 0.458x.    (1)

Consider this fit for crabs of width x = 21 cm.

(a) Estimate the probabilities P(Y = 1) for medium-light crabs and for dark crabs, and then calculate the ratio of these two probabilities. (7 points)

(b) Estimate the odds ratio of having a satellite, comparing medium-light crabs with dark crabs. Interpret it in terms of the context. (7 points)

(c) Is there a big difference between the ratio of probabilities in (a) and the odds ratio in (b)? If not, why does this happen? (5 points)

(d) Verify the value of the odds ratio in part (b) using the parameter estimates in Equation (1). (5 points)

4. In order to investigate the effects of AZT in slowing the development of AIDS symptoms, a total of 343 veterans whose immune systems were beginning to falter after infection with the AIDS virus were randomly assigned either to receive AZT immediately or to wait until their T cells showed severe immune weakness. The following table is a 2 × 2 × 2 cross classification of the veterans’ race, whether AZT was given immediately, and whether AIDS symptoms developed during the 3-year study.

                      Symptoms
Race    AZT use   Yes (Fitted)   No (Fitted)   Row total
Black   Yes       14 (A)         90 (B)        104
        No        28 (C)         85 (D)        113
White   Yes       10 (E)         55 (F)         65
        No        14 (G)         47 (H)         61

Let X = AZT treatment (1 for AZT taken, 0 otherwise), Z = race (1 for blacks, 0 for whites), and Y = whether AIDS symptoms developed (1 = yes, 0 = no). The ML fit turned out to be

logit(π̂) = −1.1427 − 0.6537x − 0.0037z.    (2)

(a) Use Equation (2) to find the fitted values (A) - (H). (8 points)

(b) Perform a goodness-of-fit test by calculating the Pearson statistic X² based on the observed and fitted values in the table above. Does the model fit reasonably well? Justify your answer with the P-value. (8 points)
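
The Pearson statistic compares observed and fitted counts cell by cell: X² = Σ (observed − fitted)² / fitted. A minimal Python sketch of that sum follows. The function name `pearson_x2` and the example counts are hypothetical and are not the fitted values for the table above.

```python
def pearson_x2(observed, fitted):
    """Pearson statistic: sum over cells of (observed - fitted)^2 / fitted."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((o - f) ** 2 / f for o, f in zip(observed, fitted))


# Hypothetical counts for illustration only (not the exam's table).
example_observed = [12, 88, 20, 80]
example_fitted = [15.0, 85.0, 17.0, 83.0]
x2 = pearson_x2(example_observed, example_fitted)
print(round(x2, 4))
```

The P-value then comes from comparing X² with a chi-squared distribution whose degrees of freedom depend on the number of binomial settings and the number of model parameters; that step is not shown here.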

5. Does job satisfaction depend on one’s income? The 1991 General Social Survey shows the following results. Note that there are four levels in the job satisfaction categories (dissatisfied, little, moderate, very) and four levels in the income categories (0-5K, 5K-15K, 15K-25K, >25K). The income values are in dollars.

            Job satisfaction
Income      Dissatisfied   Little   Moderate   Very
0-5K                   2        4         13      3
5K-15K                 2        6         22      4
15K-25K                0        1         15      8
>25K                   0        3         13      8

Let Y = job satisfaction and let X = income scores (3K, 10K, 20K, 25K). Consider the baseline-category logit model with “very” as the baseline category:

log(πj/π4) = αj + βjx, j = 1, 2, 3.

The following table shows part of the output giving the estimated coefficients for the baseline-category logit model.

(Intercept):1   (Intercept):2   (Intercept):3
        0.430           0.456           1.704

     Income:1        Income:2        Income:3
       −0.185          −0.054          −0.037

(a) Write down the three predicted equations, log(π̂j/π̂4) for j = 1, 2, 3. (6 points)

(b) Notice that β̂j < 0 for each logit. Interpret the implications in terms of the context. (4 points)

(c) What is the meaning of e^(−0.185) = 0.83? Explain it rigorously in terms of the context. (4 points)

(d) Find the estimated probability that a person falls in the “Moderate” category when his or her income is 20K. (4 points)
