MAT 4378 – MAT 5317, Analysis of categorical data

Assignment 3

Due date: in class on Monday, November 18, 2019

Remark: You can use R for your computations for Questions 2 to 4. If you use
R, please provide the output. However, the R output alone is not an answer to
a question. Please provide one or two sentences to properly answer each question.

1. Consider a ratio estimator h(θ̂1, θ̂2) = θ̂1/θ̂2, where the estimated variance-covariance

2. A carefully controlled experiment was conducted to study the effect of the size of

the deposit level on the likelihood that a returnable one-liter soft drink bottle

will be returned. The data to follow show the number of bottles that were

returned (Wi) out of 500 sold (ni) at each of six deposit levels (Xi, in cents):

Deposit level xi        2    5   10   20   25   30
Number sold ni        500  500  500  500  500  500
Number returned wi     72  103  170  296  406  449

An analyst believes that a logistic regression model is appropriate for studying

the relation between the size of the deposit and the probability a bottle will be

returned.

(a) Find the maximum likelihood estimates for β0 and β1. Give the estimated

regression model.

(b) Obtain a scatter plot of the sample proportions against the level of the

deposit, and superimpose the estimated logistic response onto the plot.

Does the fitted logistic response function appear to fit well?

(c) Obtain exp(β̂1) and interpret this number.

(d) What is the estimated probability that a bottle will be returned when the

deposit is 15 cents?

(e) Estimate the amount of deposit for which 75% of the bottles are expected

to be returned.

(f) In part (e), we have an estimate x̂ = g(β̂0, β̂1) for the level of the deposit
at which π = 75% of the bottles are returned. This estimator is a non-linear
function of β̂0 and β̂1. Use the delta method to find an asymptotic estimated
standard error for this estimate. Hint: It will be helpful to use the function
vcov on your glm object. Furthermore, to multiply the matrices A and B in R,
use A %*% B.
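As a rough illustration of the workflow in Question 2, the following R sketch enters the data from the table, fits the grouped-data logistic regression with glm, and applies the delta method to the inverse prediction. The object names (bottles, fit, or1, p15, x.hat, grad, se.x) are assumptions made for this sketch and are not part of the assignment. The printed values still have to be interpreted in words.

```r
# Bottle-return data transcribed from the table in Question 2
bottles <- data.frame(
  deposit  = c(2, 5, 10, 20, 25, 30),
  sold     = rep(500, 6),
  returned = c(72, 103, 170, 296, 406, 449)
)

# Grouped binomial fit: the response is a two-column matrix
# of (number returned, number not returned)
fit <- glm(cbind(returned, sold - returned) ~ deposit,
           family = binomial(link = "logit"), data = bottles)
b <- coef(fit)

# Estimated odds ratio for a one-cent increase in the deposit
or1 <- exp(b[["deposit"]])

# Estimated probability of return at a 15-cent deposit
p15 <- predict(fit, newdata = data.frame(deposit = 15), type = "response")

# Inverse prediction: solve logit(pi0) = b0 + b1 * x for x
pi0 <- 0.75
x.hat <- (qlogis(pi0) - b[[1]]) / b[[2]]

# Delta method: gradient of g(b0, b1) = (logit(pi0) - b0) / b1
# dg/db0 = -1 / b1 and dg/db1 = -(logit(pi0) - b0) / b1^2 = -x.hat / b1
grad <- c(-1 / b[[2]], -x.hat / b[[2]])
se.x <- sqrt(drop(t(grad) %*% vcov(fit) %*% grad))

or1; p15; x.hat; se.x
```

The gradient is written out by hand here so that each matrix in the product t(grad) %*% vcov(fit) %*% grad is visible; a numerical derivative would be another option.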

3. A marketing research firm was engaged by an automobile manufacturer to conduct

a pilot study to examine the feasibility of using logistic regression for

ascertaining the likelihood that a family will purchase a new car during the

next year. A random sample of 33 suburban families was selected. Data on

annual family income (x1, in thousands of dollars) and the current age of the

oldest family automobile (x2, in years) were obtained. A follow-up interview

conducted 12 months later was used to determine whether the family actually

purchased a new car (y = 1) or did not purchase a new car (y = 0) during the

year. The data is found in the file CarPurchase.csv.

(a) Find the maximum likelihood estimates of β0, β1, and β2. State the estimated

logistic regression model.

(b) Obtain exp(β̂1) and exp(β̂2) and interpret these numbers.

(c) What is the estimated probability that a family with an annual income of $50
thousand whose oldest car is 3 years old will purchase a new car next year?
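The steps in Question 3 can be sketched in R as follows. The column names y, x1 and x2 are an assumption about CarPurchase.csv (check them with names() after reading the file), and the data frame built below is a simulated stand-in so that the sketch runs without the file. It is not the assignment data, so its numerical output has no bearing on the answers.

```r
# ASSUMPTION: CarPurchase.csv has columns named y, x1 and x2.
# With the real file, replace the simulated block by:
# car <- read.csv("CarPurchase.csv")

# Simulated stand-in data (NOT the assignment data)
set.seed(1)
car <- data.frame(
  x1 = round(runif(33, min = 20, max = 100)),   # income, thousands of dollars
  x2 = sample(1:8, size = 33, replace = TRUE)   # age of oldest car, years
)
car$y <- rbinom(33, size = 1,
                prob = plogis(-4 + 0.05 * car$x1 + 0.4 * car$x2))

# Logistic regression with two explanatory variables
fit2 <- glm(y ~ x1 + x2, family = binomial(link = "logit"), data = car)
summary(fit2)$coefficients

# Estimated odds ratios for a one-unit increase in x1 and in x2
odds.ratios <- exp(coef(fit2)[c("x1", "x2")])

# Estimated purchase probability at x1 = 50 and x2 = 3
p.new <- predict(fit2, newdata = data.frame(x1 = 50, x2 = 3),
                 type = "response")
odds.ratios; p.new
```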

4. Rather than finding the probability of success at an explanatory variable value,

it is often of interest to find the value of an explanatory variable given a desired

probability of success. This is referred to as inverse prediction. One application

of inverse prediction involves finding the amount of pesticide or herbicide needed

to have a desired kill rate when applied to pests or plants. The lethal dose level

xπ (commonly called “LDz”, where z = 100π) is defined as

xπ = (cloglog(π) − β0) / β1

for the complementary log-log regression model

cloglog(π) = β0 + β1 x.
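For a numerical check of the relationship between the model and xπ, here is a small R sketch. The helper names (cloglog, inv.cloglog, ld) and the coefficient values are hypothetical, chosen only to exercise the functions. They are not estimates from any data set in this assignment.

```r
# Complementary log-log link and its inverse
cloglog     <- function(p)   log(-log(1 - p))
inv.cloglog <- function(eta) 1 - exp(-exp(eta))

# Inverse prediction: the x at which the model gives probability pi0
ld <- function(pi0, b0, b1) (cloglog(pi0) - b0) / b1

# Hypothetical coefficient values (illustration only)
b0 <- -2
b1 <- 1.5

x90 <- ld(0.9, b0, b1)

# Plugging x90 back into the model recovers the target probability
# (0.9 up to floating-point rounding)
inv.cloglog(b0 + b1 * x90)

# The hand-written link agrees with the one used by glm
binomial(link = "cloglog")$linkfun(0.9)
cloglog(0.9)
```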

(a) Show how xπ is derived by solving for x in the complementary log-log

regression model.

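(b) We can obtain a 95% confidence interval for xπ as the set of all x such that

|β̂0 + β̂1 x − cloglog(π)| / sqrt(Var(β̂0) + x² Var(β̂1) + 2x Cov(β̂0, β̂1)) < Z(0.975),

where Var and Cov denote the estimated variances and covariance of β̂0 and β̂1,
and Z(0.975) is the 0.975 quantile of the standard normal distribution.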

Describe how this confidence interval for xπ is derived. (Note that there is

generally no closed-form solution for the confidence interval limits, which

leads to the use of iterative numerical procedures.)

(c) Turner et al. (1992) use logistic regression to estimate the rate at which
picloram, a herbicide, kills tall larkspur, a weed. Their data were collected

by applying four different levels of picloram to separate plots, and the

number of weeds killed out of the number of weeds within the plot was

recorded. The data are in the file picloram.csv. Complete the following:

(i) We will use a cloglog model instead of a logistic regression model. Give

the estimated complementary log-log model.

(ii) Compute exp(β̂1) and interpret this number within the context of the problem.

(iii) Plot the observed proportion of killed weeds and the estimated model.

Describe how well the model fits the data.

Note: Here are some commands that you might find helpful. We are

assuming that the dataframe is called picloram.data and that the

fitted model is called mod.

## Plot the observed proportions against the picloram level
with(picloram.data, plot(x = picloram, y = kill/total,
    xlab = "Picloram", ylab = "Proportion of weeds killed",
    panel.first = grid(col = "gray", lty = "dotted")))
## Put the estimated response on the plot
curve(expr = predict(object = mod,
    newdata = data.frame(picloram = x), type = "response"),
    col = "red", add = TRUE)

(iv) Estimate the 0.9 kill rate level “LD90” for picloram. Add lines to the

plot in (iii) to illustrate how it is found (the segments() function can

be useful for this purpose).

(v) We are assuming that your fitted model is the glm object mod. Use

the following commands to compute a 95% confidence interval for the

0.9 kill rate. Note: The function uniroot solves for the root of a

function over an interval.

# Point estimates of the intercept and slope from the fitted model
b0 <- summary(mod)$coefficients[1, 1]
b1 <- summary(mod)$coefficients[2, 1]
# Point estimate of the 0.9 kill rate level
LD.x <- (log(-log(1 - 0.9)) - b0) / b1

root.func <- function(x, mod.obj, pi0, alpha) {
    beta.hat <- mod.obj$coefficients
    cov.mat <- vcov(mod.obj)
    var.den <- cov.mat[1, 1] + x^2 * cov.mat[2, 2] +
        2 * x * cov.mat[1, 2]
    abs(beta.hat[1] + beta.hat[2] * x - log(-log(1 - pi0))) /
        sqrt(var.den) - qnorm(1 - alpha / 2)
}

lower <- uniroot(f = root.func,
    interval = c(min(picloram.data$picloram), LD.x),
    mod.obj = mod, pi0 = 0.9, alpha = 0.05)
upper <- uniroot(f = root.func,
    interval = c(LD.x, max(picloram.data$picloram)),
    mod.obj = mod, pi0 = 0.9, alpha = 0.05)
lower$root
upper$root

(vi) In part (v), we found a 95% CI for x0.9. Explain in a few sentences

how these commands give us the lower and the upper bound of the

confidence interval.
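To see how uniroot behaves when it is called once on each side of a central point, the following toy example may help. The function toy.func and its arguments (centre, scale, crit) are hypothetical stand-ins invented for this sketch. They only mimic the shape of a function that is negative near the centre and crosses zero once on each side.

```r
# Toy function: f(x) = |x - centre| / scale - crit
# With centre = 3, scale = 2, crit = 1 it is zero at x = 1 and x = 5
# and negative for every x strictly between them.
toy.func <- function(x, centre, scale, crit) {
  abs(x - centre) / scale - crit
}

centre <- 3

# Search to the left of the centre: f(0) > 0 and f(centre) < 0,
# so the interval brackets exactly one sign change
lower <- uniroot(f = toy.func, interval = c(0, centre),
                 centre = centre, scale = 2, crit = 1)

# Search to the right of the centre: f(centre) < 0 and f(10) > 0
upper <- uniroot(f = toy.func, interval = c(centre, 10),
                 centre = centre, scale = 2, crit = 1)

# The two roots are close to 1 and 5
c(lower$root, upper$root)
```

Extra named arguments given to uniroot are passed on to the function f, which is how centre, scale and crit reach toy.func here.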
