MA308: Statistical Calculation and Software

Assignment 2 (Oct 9 – Nov 6, 2019)

2.1 For the “galton” dataset from the UsingR package,

(a) What is the conclusion of testing the mean height of the child at the α = 0.05 level of significance, H0: μ = 68 vs. H1: μ ≠ 68, given that the variance is known to be 1.7873?

(b) If the variance is unknown in (a), carry out the likelihood-ratio test and draw the conclusion at the α = 0.05 level of significance. Compare the result with that of the t-test.

(c) Test whether the heights of children and parents have the same mean at the α = 0.05 level of significance. What if there is a “pairing” between the heights of the child and the parent?

(d) In order to understand how a parent’s height affects a child’s height, first obtain a scatter plot of child against parent, then obtain the Nadaraya-Watson kernel estimator with two different kernels by implementing Nadaraya-Watson kernel regression analysis.

(e) Test whether the spread of heights is the same for the “parent” group and the “child” group.
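
For part (d), the Nadaraya-Watson estimate at a point x0 is a kernel-weighted average of the responses, with weights K((x0 − xi)/h) for a kernel K and bandwidth h. The following is a minimal Python sketch of that idea, not a prescribed solution (the assignment itself refers to R packages). The function names, the bandwidth of 1.5, and the toy heights are all illustrative assumptions; the numbers are made up and are not the galton data.

```python
import math

def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epanechnikov_kernel(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def nadaraya_watson(x0, xs, ys, bandwidth, kernel):
    """Kernel-weighted average of ys at the point x0.

    Returns None when every weight is zero, which can happen with a
    compactly supported kernel far away from all observations.
    """
    weights = [kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y for w, y in zip(weights, ys)) / total

# Hypothetical toy heights in inches (NOT the galton data).
parent = [64.0, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 72.0]
child = [63.5, 65.0, 66.0, 67.0, 68.0, 69.0, 69.5, 71.0]

# Evaluate both fits on a small grid; the bandwidth 1.5 is an arbitrary choice.
grid = [65.0, 67.0, 69.0, 71.0]
fit_gauss = [nadaraya_watson(g, parent, child, 1.5, gaussian_kernel) for g in grid]
fit_epan = [nadaraya_watson(g, parent, child, 1.5, epanechnikov_kernel) for g in grid]

for g, a, b in zip(grid, fit_gauss, fit_epan):
    print(g, a, b)
```

Comparing the two columns of output shows how the kernel choice changes the fitted curve for a fixed bandwidth; in practice the bandwidth usually matters more than the kernel shape.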

2.2 This question should be answered using the “Carseats” data set.

(a) Test whether Sales follows a normal distribution.

(b) Fit a multiple regression model to predict Sales using Price, Urban, and US.

(c) Provide an interpretation of each coefficient in the model. Be careful: some of the variables in the model are qualitative!

(d) Write out the model in equation form, being careful to handle the qualitative variables properly.

(e) For which of the predictors can you reject the null hypothesis H0: β_j = 0?

(f) On the basis of your response to the previous question, fit a smaller model that only uses the predictors for which there is evidence of association with the outcome.

(g) How well do the models in (b) and (f) fit the data?

(h) Using the model from (f), obtain 95% confidence intervals for the coefficient(s).

(i) Is there evidence of outliers or high leverage observations in the model from (f)?

(j) There is an indicator “Urban” in the “Carseats” data set. Compare the mean Sales of the “Urban” area with that of the “Rural” area, and show the results of the likelihood ratio test and the Mann-Whitney test for the equality of these two mean values. Can we use Wilcoxon’s signed-rank test? Why?
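
The Mann-Whitney test named in part (j) ranks the pooled observations and compares the rank sum of one group with its expected value under the null hypothesis. The sketch below is one way to compute the U statistic in Python, assuming two independent samples and a plain normal approximation for the two-sided p-value (no tie correction or continuity correction, so it will not match software output exactly). The helper names and the sample values are hypothetical; the numbers are not taken from Carseats.

```python
import math

def midranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value

# Hypothetical sales figures for two groups (NOT the Carseats data).
urban = [7.1, 8.3, 6.5, 9.0, 7.8]
rural = [6.9, 7.4, 8.8, 6.2]

u_urban, p_value = mann_whitney_u(urban, rural)
print(u_urban, p_value)
```

Note that the two samples may have different sizes and no observation in one group is matched to a particular observation in the other, which is the setting this statistic is built for.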

2.3 This question should be answered using the weekly.csv data set.

(a) Produce some numerical and graphical summaries of the Weekly data. Do there appear to be any patterns?

(b) Use the full data set to perform a logistic regression with Direction as the response and the five lag variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones?

(c) Compute the confusion matrix and overall fraction of correct predictions. Explain what the confusion matrix is telling you about the types of mistakes made by logistic regression.

(d) Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held-out data (that is, the data from 2009 and 2010).
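
Parts (c) and (d) both reduce to cross-tabulating actual against predicted classes and reading the correct fraction off the diagonal. A small Python sketch of that bookkeeping follows, assuming the classes are labelled "Down" and "Up" and that a fitted probability above 0.5 is classified as "Up"; both are assumptions, and the probabilities are invented for illustration rather than produced by any fitted model.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels=("Down", "Up")):
    """Rows index the actual class, columns the predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def accuracy(matrix):
    """Overall fraction of correct predictions (diagonal over total)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical fitted probabilities of "Up" and true directions.
probs = [0.62, 0.48, 0.55, 0.51, 0.43, 0.58]
actual = ["Up", "Down", "Down", "Up", "Down", "Up"]

# Classify with the assumed 0.5 threshold.
predicted = ["Up" if p > 0.5 else "Down" for p in probs]

cm = confusion_matrix(actual, predicted)
print(cm)
print(accuracy(cm))
```

The off-diagonal cells separate the two kinds of error: weeks that went down but were predicted up, and weeks that went up but were predicted down.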

2.4 The “galaxies” data set from the MASS package contains the velocities of 82 galaxies from six well-separated conic sections of space (Postman et al., 1986; Roeder, 1990). The data are intended to shed light on whether or not the observable universe contains superclusters of galaxies surrounded by large voids. The evidence for the existence of superclusters would be the multimodality of the distribution of velocities. Construct a histogram of the data and add a variety of kernel estimates of the density function. Estimate the density function of the “galaxies” data using histogram smoothing, and uniform, Epanechnikov, biweight, and Gaussian kernels. What do you conclude about the possible existence of superclusters of galaxies?
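
A kernel density estimate at x0 averages K((x0 − xi)/h) over the observations and divides by the bandwidth h. The sketch below writes out the four kernels named in the question in Python and checks numerically that each estimate integrates to roughly one. The bandwidth of 2.0, the evaluation grid, and the sample values are assumptions chosen for illustration; the numbers are made up and are not the galaxies data.

```python
import math

# The four kernels named in the question, each integrating to one.
KERNELS = {
    "uniform": lambda u: 0.5 if abs(u) <= 1.0 else 0.0,
    "epanechnikov": lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0,
    "biweight": lambda u: (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0,
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
}

def kde(x0, data, bandwidth, kernel):
    """Kernel density estimate at x0 with a fixed bandwidth."""
    return sum(kernel((x0 - x) / bandwidth) for x in data) / (len(data) * bandwidth)

# Hypothetical velocities in thousands of km/s (NOT the galaxies data).
data = [9.2, 9.8, 10.1, 19.5, 20.2, 20.8, 21.4, 22.9, 33.0]

# Evaluation grid from 0 to 45 in steps of 0.05.
step = 0.05
grid = [step * i for i in range(901)]

# Riemann-sum check that each estimate integrates to about one.
areas = {}
for name, kernel in KERNELS.items():
    density = [kde(g, data, 2.0, kernel) for g in grid]
    areas[name] = sum(density) * step
    print(name, round(areas[name], 3), round(max(density), 4))
```

Plotting each density over the histogram and counting its local maxima is one way to judge multimodality; the number of visible modes depends strongly on the bandwidth, so it is worth trying several values.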
