
MAST 397B: Introduction to Statistical Computing

ABSTRACT

Notes: (i) This project can be done in groups. If it is done in a group, you must submit
a single copy for the group (not one per individual). In this case the cover page must
list all group members with their ID numbers, along with a statement of the contributions
of each member of the group. (ii) You should provide references to all materials (online
or otherwise) used in your report. (iii) All code should be put in an appendix.
(iv) Answers should be clearly stated; a poorly written report will get only partial credit.

Instructor: Yogen Chaubey

FINAL PROJECT

Due Date: December 2, 2019 [Hard Copies only]

Problem 1. [20 Points]

Fitting distributions to a given dataset is an important problem in statistical analysis. R
contains a package called fitdistrplus that facilitates fitting various known continuous
distributions. In general, fitting a distribution requires knowledge of the form of the
distribution, such as the Gaussian distribution given by the probability density function (pdf)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}; \quad x \in (-\infty, \infty).$$

The vector $\theta = (\mu, \sigma^2)$ is known as the parameter vector and is estimated from a
random sample $(X_1, X_2, \dots, X_n)$. Consider the data named groundbeef, available with the
package fitdistrplus. Fit the following two distributions to this dataset: (a) log-normal
distribution, (b) Gamma distribution.

(i) Use the maximum likelihood (ML) method for the log-normal distribution and the
method of moments (MM) for the Gamma distribution. Note that $X$ is said to have a
log-normal distribution if $Y = \log X$ has a normal distribution, and that the Gamma
pdf with shape parameter $\alpha$ and scale parameter $\beta$ is given by

$$f(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} \exp\left\{-\frac{x}{\beta}\right\}; \quad x \ge 0.$$

Use a standard statistical text for explicit formulae in order to calculate these estimators
using your own defined function in R.
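As a rough illustration of what such user-defined functions might look like, the sketch below codes the textbook closed-form estimators. The function names `lnorm_mle` and `gamma_mm` are hypothetical, the data are simulated stand-ins (for the assignment, `x` would be `groundbeef$serving`), and the variance uses the 1/n convention, which is one of two common choices.

```r
# Sketch only: closed-form estimators written as plain R functions.
# Function names are illustrative, not part of any package.

# ML estimates for the log-normal: mean and (1/n) standard deviation of log(x)
lnorm_mle <- function(x) {
  y <- log(x)
  mu_hat <- mean(y)
  sigma_hat <- sqrt(mean((y - mu_hat)^2))  # divides by n, not n - 1
  c(meanlog = mu_hat, sdlog = sigma_hat)
}

# MM estimates for the Gamma: match mean = alpha * beta and
# variance = alpha * beta^2 to the sample moments
gamma_mm <- function(x) {
  m <- mean(x)
  v <- mean((x - m)^2)                     # second central moment (1/n)
  c(shape = m^2 / v, scale = v / m)
}

# Simulated stand-in data (an assumption, not the course dataset)
set.seed(1)
x <- rgamma(200, shape = 4, scale = 20)
lnorm_mle(x)
gamma_mm(x)
```

By construction the product of the two Gamma estimates reproduces the sample mean, which gives a quick sanity check on the function.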

(ii) Use the package fitdistrplus to find the ML and MM estimators for the two distributions.
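A minimal sketch of the corresponding package calls, assuming fitdistrplus is installed. The object names are illustrative. Note that `fitdist` reports the Gamma fit in terms of shape and rate, so the scale parameter is the reciprocal of the reported rate.

```r
# Sketch assuming the fitdistrplus package is installed.
library(fitdistrplus)

data(groundbeef)                 # data frame with one column, `serving`
x <- groundbeef$serving

# method = "mle" is maximum likelihood, "mme" is moment matching
fit_ln_mle <- fitdist(x, "lnorm", method = "mle")
fit_ln_mme <- fitdist(x, "lnorm", method = "mme")
fit_ga_mle <- fitdist(x, "gamma", method = "mle")
fit_ga_mme <- fitdist(x, "gamma", method = "mme")

# Collect the estimates side by side
rbind(lnorm_mle = fit_ln_mle$estimate,
      lnorm_mme = fit_ln_mme$estimate)
rbind(gamma_mle = fit_ga_mle$estimate,   # columns are shape and rate
      gamma_mme = fit_ga_mme$estimate)
```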

(iii) One method of justifying a given distribution is to perform a Chi-square
goodness-of-fit test. It is given by the test statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}.$$

Here we assume that the data are grouped into $k$ groups ($k$ = number of bins in the histogram),
$O_i$ is the observed frequency in the $i$th group and $E_i$ is the expected frequency in the
$i$th group under the fitted model. The latter is computed by the formula $E_i = n p_i$, where
$p_i$ is the probability of an observation being in group $i$ under the model. If the model
fits, the test statistic $\chi^2$ has approximately a Chi-square distribution with
df $= \nu = k - 1 - p$, where $p$ is the number of estimated parameters.

Compute the $\chi^2$ statistic for the above data for a suitable value of $k$; note that for the
test to be valid each group must have 5 or more observations. Find the upper 5% value of the
appropriate $\chi^2$ distribution and compare the computed value (for both the models) in
deciding if the models fit the data. [Note: An observed value of $\chi^2$ greater than the upper
5% value of $\chi^2$ with df $= \nu$ indicates poor fit.]
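The computation described above could be sketched as follows. This version makes one particular design choice that the assignment does not prescribe: the k bins are taken to have equal probability under the fitted model, so every expected count is n/k. The function name `chisq_gof`, its argument `n_par`, and the simulated data are all hypothetical.

```r
# Sketch: Pearson chi-square goodness-of-fit statistic with k bins of equal
# probability under the fitted model, so that every E_i = n / k.
# `qfun` is the quantile function of the fitted distribution and `...`
# carries its estimated parameters; `n_par` is the number of those parameters.
chisq_gof <- function(x, qfun, k, n_par, ...) {
  n <- length(x)
  # Quantiles at 0, 1/k, ..., 1 (the end points are the limits of the support)
  breaks <- qfun(seq(0, 1, length.out = k + 1), ...)
  O <- as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
  E <- rep(n / k, k)
  stat <- sum((O - E)^2 / E)
  df <- k - 1 - n_par
  c(statistic = stat, df = df,
    critical = qchisq(0.95, df),                      # upper 5% point
    p_value = pchisq(stat, df, lower.tail = FALSE))
}

# Simulated stand-in data (an assumption, not the course dataset)
set.seed(2)
x <- rlnorm(250, meanlog = 4, sdlog = 0.5)
y <- log(x)
res <- chisq_gof(x, qlnorm, k = 10, n_par = 2,
                 meanlog = mean(y), sdlog = sqrt(mean((y - mean(y))^2)))
res
```

With n = 250 and k = 10 each expected count is 25, comfortably above 5. The same function works for a Gamma fit by passing `qgamma` with `shape` and `scale`.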

(iv) The quality of the fits may also be gauged by plotting the histogram with the estimated
density superimposed over it. Provide the histogram with the estimated density
superimposed over it for both the methods for each of the log-normal and Gamma
distributions and comment on the quality of the fit.
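One possible way to draw such a plot in base R is sketched below. The data are simulated stand-ins, the estimates come from the closed-form formulas rather than from the package, and only one method per distribution is drawn to keep the example short.

```r
# Sketch: density-scale histogram with two fitted densities overlaid.
set.seed(3)
x <- rgamma(200, shape = 4, scale = 20)      # simulated stand-in data

# Closed-form estimates (log-normal ML, Gamma MM)
y <- log(x)
ml <- c(mean(y), sqrt(mean((y - mean(y))^2)))
m <- mean(x)
v <- mean((x - m)^2)
mm <- c(shape = m^2 / v, scale = v / m)

# freq = FALSE puts the histogram on the density scale so the curves are comparable
hist(x, freq = FALSE, breaks = "FD",
     main = "Histogram with fitted densities", xlab = "x")
curve(dlnorm(x, meanlog = ml[1], sdlog = ml[2]), add = TRUE, lty = 1, lwd = 2)
curve(dgamma(x, shape = mm["shape"], scale = mm["scale"]), add = TRUE, lty = 2, lwd = 2)
legend("topright", legend = c("log-normal (ML)", "Gamma (MM)"),
       lty = c(1, 2), lwd = 2)
```

For objects returned by `fitdist`, the fitdistrplus function `denscomp` produces a comparable overlay.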

(v) Another qualitative method to judge the fit is the Q-Q plot of the data. Give the Q-Q
plots for both the methods for each of the log-normal and Gamma densities. Comment
on the quality of fit in each case. How does it compare with your conclusion in part (iii)?
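A Q-Q plot against a fitted distribution can be built by hand, as in the sketch below: the sorted data are plotted against the fitted quantiles at standard plotting positions. The data are simulated stand-ins and only the log-normal ML fit is shown.

```r
# Sketch: Q-Q plot of the data against a fitted log-normal.
set.seed(4)
x <- rlnorm(150, meanlog = 4, sdlog = 0.4)   # simulated stand-in data
y <- log(x)
mu_hat <- mean(y)
sigma_hat <- sqrt(mean((y - mu_hat)^2))

n <- length(x)
probs <- ppoints(n)                          # plotting positions in (0, 1)
theo <- qlnorm(probs, meanlog = mu_hat, sdlog = sigma_hat)

plot(theo, sort(x),
     xlab = "Theoretical quantiles (fitted log-normal)",
     ylab = "Sample quantiles", main = "Q-Q plot")
abline(0, 1, lty = 2)                        # points near this line suggest a good fit
```

Replacing `qlnorm` by `qgamma` with the estimated shape and scale gives the Gamma version. For objects returned by `fitdist`, the fitdistrplus function `qqcomp` draws a similar plot.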

Problem 2. [15 Points]

Problem 3. [10 Points]

Consider the following data from Example 7.12

(a) The objective is to determine a line $y = \beta_0 + \beta_1 x$ such that the function

$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} |y_i - \beta_0 - \beta_1 x_i|$$

is minimized. Use the optim( ) function of R with starting values obtained from lm( ).
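A minimal sketch of this minimization is given below. Because the Example 7.12 data are not reproduced here, the `x` and `y` values are simulated placeholders, and the objective function name `sad` is hypothetical.

```r
# Sketch: least-absolute-deviations line via optim(), started at the lm() fit.
# Hypothetical data; the Example 7.12 values are not reproduced here.
set.seed(5)
x <- runif(30, 0, 5)
y <- 1 + 2 * x + rt(30, df = 3)

# Sum of absolute deviations for beta = c(intercept, slope)
sad <- function(beta, x, y) sum(abs(y - beta[1] - beta[2] * x))

ls_fit <- lm(y ~ x)
start <- unname(coef(ls_fit))                 # least-squares estimates as start
lad_fit <- optim(par = start, fn = sad, x = x, y = y)   # Nelder-Mead by default

coef(ls_fit)      # least-squares intercept and slope
lad_fit$par       # least-absolute-deviations intercept and slope
lad_fit$value     # minimized sum of absolute deviations
```

The objective is not differentiable everywhere, which is one reason the derivative-free Nelder-Mead default is a reasonable choice here.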

(b) Plot the least-squares line and the line obtained in part (a) on the scatterplot and
comment on the fit of these lines to the data.

(c) Suppose another point (2.05, 3.23) is added to the data. Compute the two lines again
and comment on the effect of the new point on the estimates.
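Parts (b) and (c) could be approached along the following lines. The helper `fit_both` is hypothetical and the data are again simulated placeholders, so the numerical effect of the added point will differ from what the Example 7.12 data would show.

```r
# Sketch: fit both lines before and after adding the point (2.05, 3.23),
# then draw them on the scatterplot. Data are simulated placeholders.
set.seed(6)
x <- runif(30, 0, 5)
y <- 1 + 2 * x + rnorm(30, sd = 0.5)

# Returns a 2 x 2 matrix: one row per method, columns = intercept, slope
fit_both <- function(x, y) {
  ls <- unname(coef(lm(y ~ x)))
  lad <- optim(ls, function(b) sum(abs(y - b[1] - b[2] * x)))$par
  rbind(least_squares = ls, least_abs_dev = lad)
}

before <- fit_both(x, y)
after <- fit_both(c(x, 2.05), c(y, 3.23))    # data with the extra point

plot(c(x, 2.05), c(y, 3.23), xlab = "x", ylab = "y")
abline(after["least_squares", 1], after["least_squares", 2], lty = 1)
abline(after["least_abs_dev", 1], after["least_abs_dev", 2], lty = 2)
legend("topleft", legend = c("least squares", "least absolute deviations"),
       lty = c(1, 2))

after - before    # change in intercept and slope for each method
```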
