
STAT 3312 (Fall, 2019)

Final exam (take-home)

Name (ID):

Instructions

• This take-home exam is due 3:00PM, December 17, 2019.

• All of your answers and work must be your own.

• You are NOT allowed to discuss any part of this exam with anyone. If you have any questions, ask me.

• For question #2, R or SAS code along with output must be submitted to support your answer. Please underline the results on the output that are relevant to your answer.

1. True/False questions (1.5 points each)

(1) The diagnosis of a mental illness (e.g., schizophrenia, neurosis, depression) is an ordinal categorical variable.

True ( ) False ( )

(2) If the odds of success equal 0.5 in a binary response, the probability of success is 0.25.

True ( ) False ( )

(3) In a logistic regression model, logit[π(x)] = α + βx, e^α equals the odds of success when x = 1.

True ( ) False ( )

(4) In a logit model logit[π(x)] = α+βx, the probability increases at a rate of 0.16β when π(x) = 0.4.

True ( ) False ( )

(5) Fisher’s exact test can be used to test whether the odds ratio of a 2 × 2 table equals 1 when the frequency counts are small.

True ( ) False ( )

(6) A classical linear regression model with errors having a normal distribution is a special case of a generalized linear model with the probit link.

True ( ) False ( )


(7) In testing for independence in two-way contingency tables, likelihood ratio tests and Pearson’s χ² tests are equivalent for small sample sizes.

True ( ) False ( )

(8) In a generalized linear model, the link function is used to connect the values of the random

component and the systematic component.

True ( ) False ( )

(9) When x1 or x2 is the sole predictor for a binary response y, the likelihood ratio test of the effect has P-value < 0.0001. When both x1 and x2 are in the model, it is possible that the likelihood ratio tests for H0 : β1 = 0 and for H0 : β2 = 0 could both have P-values larger than 0.05.

True ( ) False ( )

(10) For the logistic regression model with the identity link, the estimated probability of any value

for predictor x could exceed one.

True ( ) False ( )

2. The following table is based on an epidemiological survey of 3,000 subjects to investigate snoring

as a possible factor for heart disease. We use scores (0, 2, 3, 5, 6) for x = snoring level.

                        Heart Disease
Snoring                 Yes      No
Never                    24    1355
Sometimes                35     603
More often than not      21     215
Almost always            30     224
Every night              27     230

(a) Use R or SAS to fit the model with three link functions: the logit, probit, and complementary log-log. Write down the estimated equations for all three models. (12 points)


(b) Find the estimated proportion for the logistic model when the snoring level is 2 and interpret

it in terms of the odds. (4 points)

(c) Use the fitted logistic model to calculate an approximate 97% confidence interval for the odds ratio of a person in the “sometimes” category compared to a person in the “every night” category. (5 points)

(d) Find the estimated proportion for the probit model when the snoring level is 3. (4 points)

(e) Find the estimated proportions for the complementary log-log model when the snoring levels are “sometimes” and “almost always”, respectively. Which value is larger? (5 points)
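Parts (b), (d), and (e) all turn a linear predictor α + βx into a probability through the inverse of the chosen link. The following is a minimal Python sketch of the three inverse links, using only the standard library. The coefficients `alpha` and `beta` are hypothetical placeholders, not the fitted values for the snoring data, and each link would have its own fitted pair in practice.

```python
import math

def inv_logit(eta):
    """Inverse logit link: pi = exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: the standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Inverse complementary log-log link: pi = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical placeholder coefficients (NOT estimates from the exam data).
alpha, beta = -3.0, 0.4

# Snoring scores as given in the question.
for x in (0, 2, 3, 5, 6):
    eta = alpha + beta * x
    print(x, round(inv_logit(eta), 4), round(inv_probit(eta), 4),
          round(inv_cloglog(eta), 4))
```

Replacing the placeholders with the estimates from the R or SAS output for each link gives the estimated proportions at any snoring score.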

3. Consider the following logistic regression model based on the horseshoe crab data with color and width predictors:

logit[P(Y = 1)] = α + β1c1 + β2c2 + β3c3 + β4x,

where x denotes width and

c1 = 1 for color = medium light, 0 otherwise

c2 = 1 for color = medium, 0 otherwise


c3 = 1 for color = medium dark, 0 otherwise.

Fitting the model yields the following estimated equation:

logit[P̂(Y = 1)] = −13.015 + 1.097c1 + 1.302c2 + 1.254c3 + 0.458x. (1)

Consider this fit for crabs of width x = 21cm.

(a) Estimate the probability of a satellite for medium-light crabs and for dark crabs, and then calculate the ratio of these two probabilities. (7 points)

(b) Estimate the odds ratio of a satellite for medium-light crabs relative to dark crabs. Interpret it in terms of the context. (7 points)

(c) Is there a big difference between the ratio of probabilities in (a) and the odds ratio in (b)? If not, why does this happen? (5 points)

(d) Verify the value of the odds ratio in part (b) using the parameter estimates in Equation (1). (5 points)
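The calculations in parts (a), (b), and (d) can be sketched in Python as follows. This is a rough illustration only. It assumes that dark is the reference color (c1 = c2 = c3 = 0), as the dummy coding above implies, and the names `prob` and `odds` are hypothetical helpers introduced here.

```python
import math

# Coefficients copied from Equation (1); dark is taken as the reference
# color (all dummies zero), which is an assumption read off the coding.
ALPHA = -13.015
COLOR_EFFECT = {
    "medium light": 1.097,
    "medium": 1.302,
    "medium dark": 1.254,
    "dark": 0.0,
}
BETA_WIDTH = 0.458

def prob(color, width):
    """Estimated P(Y = 1) for a crab of the given color and width (cm)."""
    eta = ALPHA + COLOR_EFFECT[color] + BETA_WIDTH * width
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

p_ml = prob("medium light", 21.0)
p_dark = prob("dark", 21.0)

print("ratio of probabilities:", p_ml / p_dark)
print("odds ratio from probabilities:", odds(p_ml) / odds(p_dark))
print("odds ratio from coefficient:", math.exp(COLOR_EFFECT["medium light"]))
```

The last two printed values agree because the width term cancels in the odds ratio, leaving only the exponentiated color coefficient.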


4. In order to investigate effects of AZT in slowing the development of AIDS symptoms, a total of

343 veterans whose immune systems were beginning to falter after infection from the AIDS virus

were randomly assigned either to receive AZT immediately or to wait until their T cells showed

severe immune weakness. The following table is a 2 × 2 × 2 cross-classification of the veterans’ race,

whether AZT was given immediately, and whether AIDS symptoms developed during the 3-year

study.

                        Symptoms
Race    AZT use    Yes (Fitted)    No (Fitted)    Row total
Black   Yes        14 (A)          90 (B)         104
        No         28 (C)          85 (D)         113
White   Yes        10 (E)          55 (F)          65
        No         14 (G)          47 (H)          61

Let X = AZT treatment (1 for AZT taken, 0 otherwise), Z = race (1 for blacks, 0 for whites), and

Y = whether AIDS symptoms developed (1 = yes, 0 = no). The ML fit turned out to be

logit(π̂) = −1.1427 − 0.6537x − 0.0037z. (2)

(a) Use Equation (2) to find the fitted values (A) - (H). (8 points)
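One way to obtain fitted counts from Equation (2) is to convert each row's logit into a probability and multiply by the row total. A minimal Python sketch under that reading follows; the helper name `fitted_counts` is hypothetical.

```python
import math

# Coefficients copied from Equation (2).
ALPHA, BETA_AZT, BETA_RACE = -1.1427, -0.6537, -0.0037

# (race, AZT use, x, z, row total) for the four rows of the table.
ROWS = [
    ("Black", "Yes", 1, 1, 104),
    ("Black", "No", 0, 1, 113),
    ("White", "Yes", 1, 0, 65),
    ("White", "No", 0, 0, 61),
]

def fitted_counts(x, z, n):
    """Return (fitted 'yes' count, fitted 'no' count) for a row of size n."""
    eta = ALPHA + BETA_AZT * x + BETA_RACE * z
    p_yes = 1.0 / (1.0 + math.exp(-eta))
    return n * p_yes, n * (1.0 - p_yes)

for race, azt, x, z, n in ROWS:
    yes, no = fitted_counts(x, z, n)
    print(f"{race:5s} AZT={azt:3s} yes={yes:6.2f} no={no:6.2f}")
```

Within each row the two fitted counts sum to the row total, which is a quick sanity check on the arithmetic.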


(b) Perform a goodness-of-fit test by calculating the Pearson statistic X² based on the observed and fitted values in the table above. Does the model fit adequately? Justify your answer with the P-value. (8 points)
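The Pearson statistic sums (observed − fitted)² / fitted over all cells. The Python sketch below shows the computation on hypothetical counts, not the exam data. It assumes 1 degree of freedom, which is what the usual count gives here (4 binomial settings minus 3 model parameters), and uses the identity P(χ²₁ > x) = erfc(√(x/2)) for the upper tail.

```python
import math

def pearson_x2(observed, fitted):
    """Pearson statistic: sum of (observed - fitted)^2 / fitted over cells."""
    return sum((o - f) ** 2 / f for o, f in zip(observed, fitted))

def chi2_upper_tail_df1(x):
    """P(chi-square with 1 df > x), valid for df = 1 only."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical illustrative counts (NOT the AZT table).
observed = [12, 88, 30, 70]
fitted = [15.0, 85.0, 27.0, 73.0]

x2 = pearson_x2(observed, fitted)
print("X2 =", x2, "P-value =", chi2_upper_tail_df1(x2))
```

A large P-value gives no evidence of lack of fit, while a small one suggests the model misses structure in the table.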


5. Does job satisfaction depend on one’s income? The 1991 General Social Survey shows the following results. Note that there are four levels in the job satisfaction categories (dissatisfied, little, moderate, very) and four levels in the income categories (0-5K, 5K-15K, 15K-25K, >25K). The income values are in dollars.

              Job satisfaction
Income        Dissatisfied    Little    Moderate    Very
0-5K          2               4         13          3
5K-15K        2               6         22          4
15K-25K       0               1         15          8
>25K          0               3         13          8

Let Y = job satisfaction and let X = income scores (3K, 10K, 20K, 25K). Consider the baseline-category logit model with “very” as the baseline category:

log(πj/π4) = αj + βjx, j = 1, 2, 3.

The following table shows part of the output with the estimated coefficients for the baseline-category logit model.

(Intercept):1    (Intercept):2    (Intercept):3
0.430            0.456            1.704

Income:1         Income:2         Income:3
−0.185           −0.054           −0.037

(a) Write down the three predicted equations, log(π̂j/π̂4) for j = 1, 2, 3. (6 points)

(b) Notice that β̂j < 0 for each logit. Interpret the implications in terms of the context. (4 points)

(c) What is the meaning of e^(−0.185) = 0.83? Explain it rigorously in terms of the context. (4 points)

(d) Find the estimated probability of being in the “Moderate” category for a person whose income is 20K. (4 points)
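In a baseline-category logit model, each category probability is the exponentiated linear predictor divided by the sum over all categories, with the baseline contributing exp(0) = 1. A minimal Python sketch follows. It assumes that income enters in thousands of dollars (so 20K is x = 20), matching the scores above, and `category_probs` is a hypothetical helper name.

```python
import math

# Estimated coefficients copied from the output table (j = 1, 2, 3).
ALPHAS = [0.430, 0.456, 1.704]
BETAS = [-0.185, -0.054, -0.037]
CATEGORIES = ["Dissatisfied", "Little", "Moderate", "Very"]

def category_probs(x):
    """Estimated probabilities of the four categories at income score x.

    Assumes x is in thousands of dollars; "Very" is the baseline category,
    so its term is exp(0) = 1.
    """
    terms = [math.exp(a + b * x) for a, b in zip(ALPHAS, BETAS)] + [1.0]
    total = sum(terms)
    return [t / total for t in terms]

probs = category_probs(20.0)
for name, p in zip(CATEGORIES, probs):
    print(f"{name:12s} {p:.3f}")
```

The four probabilities sum to one by construction, so the “Moderate” entry can be read off directly from the third position.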
