
MAT 4378 – MAT 5317, Analysis of categorical data, Assignment 3

MAT 4378 – MAT 5317, Analysis of categorical data

Assignment 3

Due date: in class on Monday, November 18, 2019

Remark: You can use R for your computations for Questions 2 to 4. If you use R, please provide the output. However, the R output alone is not an answer to a question; please provide one or two sentences that properly answer it.

1. Consider a ratio estimator $h(\hat\theta_1, \hat\theta_2) = \hat\theta_1/\hat\theta_2$, where the estimated variance-covariance

2. A carefully controlled experiment was conducted to study the effect of the size of

the deposit level on the likelihood that a returnable one-liter soft drink bottle

will be returned. The data to follow show the number of bottles that were

returned (wi) out of 500 sold (ni) at each of six deposit levels (xi, in cents):

Deposit level xi (cents)     2     5    10    20    25    30
Number sold ni             500   500   500   500   500   500
Number returned wi          72   103   170   296   406   449

An analyst believes that a logistic regression model is appropriate for studying

the relation between the size of the deposit and the probability a bottle will be

returned.

(a) Find the maximum likelihood estimates for β0 and β1. Give the estimated

regression model.

(b) Obtain a scatter plot of the sample proportions against the level of the

deposit, and superimpose the estimated logistic response onto the plot.

Does the fitted logistic response function appear to fit well?

(c) Obtain $\exp(\hat\beta_1)$ and interpret this number.

(d) What is the estimated probability that a bottle will be returned when the

deposit is 15 cents?

(e) Estimate the amount of deposit for which 75% of the bottles are expected

to be returned.

(f) In part (e), we have an estimate $\hat{x} = g(\hat\beta_0, \hat\beta_1)$ for the level of the deposit at which π = 75% of the bottles are returned. This estimator is a non-linear function of $\hat\beta_0$ and $\hat\beta_1$. Use the delta method to find an asymptotic estimated standard error for this estimate. Hint: It will be helpful to use the function vcov on your glm object. Furthermore, to multiply the matrices A and B in R, use A %*% B.
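Parts (a) and (c) to (f) all build on one fitted model, so a minimal R sketch of the workflow may be useful. It uses the data from the table in this question. The object names (bottles, fit, x.hat, se.x) are illustrative choices, not part of the assignment, and the inverse-prediction formula is assumed to come from solving the logistic model logit(π) = β0 + β1 x for x. It is a sketch under those assumptions rather than a model answer, and the written interpretations are still required.

```r
# Data from the table in Question 2 (deposit in cents, 500 bottles sold per level).
bottles <- data.frame(
  deposit  = c(2, 5, 10, 20, 25, 30),
  sold     = rep(500, 6),
  returned = c(72, 103, 170, 296, 406, 449)
)

# Logistic regression on grouped data: (successes, failures) per deposit level.
fit <- glm(cbind(returned, sold - returned) ~ deposit,
           family = binomial(link = "logit"), data = bottles)

b <- unname(coef(fit))   # c(beta0.hat, beta1.hat)
V <- vcov(fit)           # estimated 2 x 2 variance-covariance matrix

# Part (c): estimated odds ratio for a one-cent increase in the deposit.
odds.ratio <- exp(b[2])

# Part (d): estimated return probability at a 15-cent deposit.
p15 <- predict(fit, newdata = data.frame(deposit = 15), type = "response")

# Part (e): inverse prediction, x.hat = (logit(pi0) - beta0) / beta1.
pi0   <- 0.75
x.hat <- (qlogis(pi0) - b[1]) / b[2]

# Part (f): delta method. Gradient of g(beta0, beta1) = (logit(pi0) - beta0) / beta1
# with respect to (beta0, beta1), evaluated at the estimates.
grad <- c(-1 / b[2], -(qlogis(pi0) - b[1]) / b[2]^2)
se.x <- sqrt(drop(t(grad) %*% V %*% grad))
```

Printing coef(fit), odds.ratio, p15, x.hat and se.x shows the quantities the parts ask about. The cbind(successes, failures) response is one common way to fit grouped binomial data with glm; a proportion response with weights = sold would be an equivalent alternative.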

3. A marketing research firm was engaged by an automobile manufacturer to conduct

a pilot study to examine the feasibility of using logistic regression for

ascertaining the likelihood that a family will purchase a new car during the

next year. A random sample of 33 suburban families was selected. Data on

annual family income (x1, in thousands of dollars) and the current age of the

oldest family automobile (x2, in years) were obtained. A follow-up interview

conducted 12 months later was used to determine whether the family actually

purchased a new car (y = 1) or did not purchase a new car (y = 0) during the

year. The data is found in the file CarPurchase.csv.

(a) Find the maximum likelihood estimates of β0, β1, and β2. State the estimated

logistic regression model.

(b) Obtain $\exp(\hat\beta_1)$ and $\exp(\hat\beta_2)$ and interpret these numbers.

(c) What is the estimated probability that a family with annual income of $50

thousand and an oldest car of 3 years will purchase a new car next year?

4. Rather than finding the probability of success at an explanatory variable value,

it is often of interest to find the value of an explanatory variable given a desired

probability of success. This is referred to as inverse prediction. One application

of inverse prediction involves finding the amount of pesticide or herbicide needed

to have a desired kill rate when applied to pests or plants. The lethal dose level

$x_\pi$ (commonly called “LDz”, where z = 100π) is defined as

$$x_\pi = \frac{\operatorname{cloglog}(\pi) - \beta_0}{\beta_1}$$

for the complementary log-log regression model

$$\operatorname{cloglog}(\pi) = \beta_0 + \beta_1 x.$$
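The definition of the lethal dose level follows from inverting the model for x. A short sketch of the algebra, assuming β1 ≠ 0 and using the standard definition cloglog(π) = log(−log(1 − π)):

```latex
\begin{align*}
\operatorname{cloglog}(\pi) &= \beta_0 + \beta_1 x_\pi \\
\operatorname{cloglog}(\pi) - \beta_0 &= \beta_1 x_\pi \\
x_\pi &= \frac{\operatorname{cloglog}(\pi) - \beta_0}{\beta_1},
\qquad \text{where } \operatorname{cloglog}(\pi) = \log\{-\log(1 - \pi)\}.
\end{align*}
```

With π = 0.9 the numerator becomes log(−log(0.1)) − β0, which is the form used for LD.x in the R commands of part (c)(v).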

(a) Show how xπ is derived by solving for x in the complementary log-log

regression model.

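(b) We can obtain a 95% confidence interval for xπ as follows: it is the set of all values of x such that

$$\frac{\lvert \hat\beta_0 + \hat\beta_1 x - \operatorname{cloglog}(\pi) \rvert}{\sqrt{\widehat{\mathrm{Var}}(\hat\beta_0) + x^2\,\widehat{\mathrm{Var}}(\hat\beta_1) + 2x\,\widehat{\mathrm{Cov}}(\hat\beta_0, \hat\beta_1)}} < z_{0.975}.$$

Describe how this confidence interval for xπ is derived. (Note that there is generally no closed-form solution for the confidence interval limits, which leads to the use of iterative numerical procedures.)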

(c) Turner et al. (1992) use logistic regression to estimate the rate at which

picloram, a herbicide, kills tall larkspur, a weed. Their data were collected

by applying four different levels of picloram to separate plots, and the

number of weeds killed out of the number of weeds within the plot was

recorded. The data are in the file picloram.csv. Complete the following:

(i) We will use a cloglog model instead of a logistic regression model. Give

the estimated complementary log-log model.

(ii) Compute $e^{\hat\beta_1}$ and interpret this number within the context of the problem.

(iii) Plot the observed proportion of killed weeds and the estimated model.

Describe how well the model fits the data.

Note: Here are some commands that you might find helpful. We are

assuming that the dataframe is called picloram.data and that the

fitted model is called mod.

## Plot the observed proportions against the picloram level
with(picloram.data, plot(x = picloram, y = kill/total,
    xlab = "Picloram", ylab = "Proportion of weeds killed",
    panel.first = grid(col = "gray", lty = "dotted")))

## Add the estimated response curve to the plot
curve(expr = predict(object = mod,
    newdata = data.frame(picloram = x), type = "response"),
    col = "red", add = TRUE)

(iv) Estimate the 0.9 kill rate level “LD90” for picloram. Add lines to the

plot in (iii) to illustrate how it is found (the segments() function can

be useful for this purpose).

(v) We are assuming that your fitted model is the glm object mod. Use

the following commands to compute a 95% confidence interval for the

0.9 kill rate. Note: The function uniroot solves for the root of a

function over an interval.

# Point estimate of LD90 from the fitted cloglog model
b0 <- summary(mod)$coefficients[1, 1]
b1 <- summary(mod)$coefficients[2, 1]
LD.x <- (log(-log(1 - 0.9)) - b0) / b1

# Function of x whose roots are searched for by uniroot
root.func <- function(x, mod.obj, pi0, alpha) {
  beta.hat <- mod.obj$coefficients
  cov.mat <- vcov(mod.obj)
  # Estimated variance of beta.hat[1] + beta.hat[2] * x
  var.den <- cov.mat[1, 1] + x^2 * cov.mat[2, 2] +
    2 * x * cov.mat[1, 2]
  abs(beta.hat[1] + beta.hat[2] * x - log(-log(1 - pi0))) /
    sqrt(var.den) - qnorm(1 - alpha / 2)
}

# Root between the smallest observed picloram level and LD.x
lower <- uniroot(f = root.func, interval =
    c(min(picloram.data$picloram), LD.x),
    mod.obj = mod, pi0 = 0.9, alpha = 0.05)

# Root between LD.x and the largest observed picloram level
upper <- uniroot(f = root.func, interval =
    c(LD.x, max(picloram.data$picloram)),
    mod.obj = mod, pi0 = 0.9, alpha = 0.05)

lower$root
upper$root

(vi) In part (v), we found a 95% CI for $x_{0.9}$. Explain in a few sentences

how these commands give us the lower and the upper bound of the

confidence interval.
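To see how uniroot is being used in part (v), it may help to look at a toy version of the same kind of root function, in which the numerator is linear in x and the standard error is constant. The names toy.func, centre and se and the values 3 and 0.5 are arbitrary illustrative assumptions, not estimates from picloram.csv. In this simplified setting only, the two roots can be checked against the closed form centre ± z × se.

```r
# Toy root function: |x - centre| / se - z is negative close to the centre
# and positive far from it, so it crosses zero once on each side.
# centre and se are arbitrary illustrative values (assumptions).
toy.func <- function(x, centre, se, alpha) {
  abs(x - centre) / se - qnorm(1 - alpha / 2)
}

centre <- 3
se     <- 0.5

# Search to the left of the centre for the lower root ...
lower <- uniroot(f = toy.func, interval = c(0, centre),
                 centre = centre, se = se, alpha = 0.05)

# ... and to the right of the centre for the upper root.
upper <- uniroot(f = toy.func, interval = c(centre, 6),
                 centre = centre, se = se, alpha = 0.05)

c(lower = lower$root, upper = upper$root)

# Closed-form check, available only because se is constant in this toy case.
centre + c(-1, 1) * qnorm(0.975) * se
```

Each call needs the function to take opposite signs at the two ends of its interval, which is why the point estimate is used as the shared endpoint. In part (v) the standard error depends on x, so no closed form is available and the numerical search is needed.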
