
Homework 4

STAT 5511 (Spring 2020)

Charles R. Doss

Assigned: Sunday, March 22

Due: Monday, March 30

The usual formatting rules:

• Your homework (HW) should be formatted to be easily readable by the grader.

• You may use knitr or Sweave to produce the code portions of the HW. However, the output from knitr/Sweave that you include should be only what is necessary to answer the question, rather than whatever R prints automatically. (You may thus need to avoid using default R functions if they output too much unnecessary material.)

– For example: for output from regression, the main things we would want to see are the estimates for each coefficient (with appropriate labels, of course) together with the computed OLS/linear regression standard errors and p-values.

• Code snippets that directly answer the questions can be included in your main homework document; ideally these should be preceded by comments or text explaining, at a minimum, which question they answer. Extra code can be placed in an appendix.

• All plots produced in R should have appropriate axis labels as well as titles. The accompanying text should explain clearly what each plot shows.

• Plots and figures should be appropriately sized, meaning they should not be too large, so that the page length is not too long. (The arguments fig.height and fig.width to knitr chunks can achieve this.)
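
As an illustration of trimming regression output, the following sketch prints only the labelled estimates, standard errors, and p-values. The data are simulated, and the names x, y, fit, and coef_table are placeholders that are not part of the assignment.

```r
# Simulated data for illustration only.
set.seed(1)
x <- rnorm(50)
y <- 2 + 0.5 * x + rnorm(50)
fit <- lm(y ~ x)

# coef(summary(fit)) is a matrix with one labelled row per coefficient.
# Keep the estimate, standard error, and p-value columns and drop the rest
# of the default summary() printout.
coef_table <- coef(summary(fit))[, c("Estimate", "Std. Error", "Pr(>|t|)")]
print(round(coef_table, 4))
```

For figures, a knitr chunk header with options such as fig.height=4, fig.width=6 (values chosen only for illustration) keeps plots compact.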

1. Find (on Canvas) the file “hw5dat.rsav” (which can be loaded into R using load("hw5dat.rsav")). It contains a time series (“xx”). The series is a “demeaned” monthly revenue stream (in millions of dollars) for a company. There are n = 96 observations.

The series has been “demeaned”; usually that would mean we subtract off $\bar{X}$ from every data point, but pretend for now that we know the mean $\mu$ exactly, so we have subtracted off $\mu$ from every data point and the new series has (theoretically) mean exactly 0. (As a result, its sample mean is not precisely 0.)

We will consider possible ARMA models for the series $X_t$. We assume that the corresponding white noise is Gaussian (so $X_t$ is Gaussian).

We will consider first an AR(2) model. We assume we know the true model exactly: it is

Model 1: $X_t = 1.34 X_{t-1} - 0.48 X_{t-2} + W_t$, $W_t \overset{\text{iid}}{\sim} N(0, \sigma^2)$.

(a) Compute forecasts and backcasts using Model 1, up to 25 time steps into the future and into the past. Write code to do the prediction by hand (i.e., not using the predict() function). Plot the data, forecasts, and 95% prediction intervals [assuming Gaussianity] (all on one plot). (Note: you do not need to do a multiplicity correction for the prediction intervals.)
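
One possible way to organize the by-hand computation in part (a) is sketched below, for forecasts only (backcasts are not shown). It assumes the Model 1 coefficients read as 1.34 and -0.48, takes $\sigma^2 = 1$ purely for illustration, and uses a simulated series in place of xx; all variable names are placeholders. The interval widths use the $\psi$-weights of the AR(2) model through the standard formula $\sigma^2 \sum_{j=0}^{h-1} \psi_j^2$ for the $h$-step prediction error variance.

```r
# Assumptions for illustration: coefficients 1.34 and -0.48, sigma^2 = 1,
# and a simulated series standing in for xx.
phi <- c(1.34, -0.48)
sigma2 <- 1
set.seed(5511)
x <- as.numeric(arima.sim(model = list(ar = phi), n = 96, sd = sqrt(sigma2)))
n <- length(x)
H <- 25

# Forecast recursion: each forecast plugs earlier values or forecasts
# into the AR(2) equation, with the future noise replaced by its mean 0.
ext <- c(x, numeric(H))
for (h in 1:H) {
  ext[n + h] <- phi[1] * ext[n + h - 1] + phi[2] * ext[n + h - 2]
}
fc <- ext[(n + 1):(n + H)]

# psi-weights: psi_0 = 1, psi_1 = phi1, psi_j = phi1*psi_{j-1} + phi2*psi_{j-2}.
# psi[j + 1] stores psi_j.
psi <- numeric(H)
psi[1] <- 1
psi[2] <- phi[1]
for (j in 3:H) psi[j] <- phi[1] * psi[j - 1] + phi[2] * psi[j - 2]

# h-step prediction error variance and 95% Gaussian prediction limits.
pvar <- sigma2 * cumsum(psi^2)
lower <- fc - qnorm(0.975) * sqrt(pvar)
upper <- fc + qnorm(0.975) * sqrt(pvar)

plot(1:n, x, type = "l", xlim = c(1, n + H), ylim = range(x, lower, upper),
     xlab = "Month", ylab = "Demeaned revenue (simulated)",
     main = "AR(2) forecasts with 95% prediction intervals")
lines((n + 1):(n + H), fc, col = "blue")
lines((n + 1):(n + H), lower, lty = 2, col = "blue")
lines((n + 1):(n + H), upper, lty = 2, col = "blue")
```

In practice $\sigma^2$ is not given and would have to be estimated or otherwise supplied before the interval widths are meaningful.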

(b) Give a constant (nonrandom) number that the 100-step-ahead forecast, $X^{96}_{196}$, will be approximately equal to.

(c) If you were to do the one-step-ahead prediction but based on no data, what would your prediction be? (Based on no data, it is the same for predicting at time 97 or at any other time.) What would the mean-squared prediction error (call it $E$) be? Compare $P^{96}_{97}$ to $E$.

(d) Now say that we know the true mean of the company’s revenue series is 0.3 (million dollars). Provide

i. a plot of the company’s (not-demeaned) revenue series (let’s call it $Y_t$), and

ii. the prediction equation for the series $Y_t$ (based on Model 1).

(e) The series $Y_t$ is a monthly revenue stream for a company. The company needs to decide, before the current month is up (i.e., before seeing the one-month-ahead revenue, $Y_{97}$), whether to make an important investment in equipment which will cost 1.1 million dollars. If their revenue next month cannot cover the cost (is less than the cost of the investment) they will go bankrupt (they have exactly 0 cash on hand and cannot take out loans). Explain why they should or should not make the investment.

(f) The second model we will consider (Model 2) is an MA(2) model. Estimate an MA(2) model on the $X_t$ (demeaned) series using the arima() function (there is an include.mean argument; set it to FALSE). Then plot (on one plot): the data, forecasts up to 25 months ahead, and 95% prediction intervals [assuming Gaussianity]. You may use the predict() function.
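
The calls named in part (f) can be sketched as follows. This is an illustration on a simulated series, not on xx: the MA coefficients used for simulation are arbitrary, and the names fit_ma2, pr, lower, and upper are placeholders.

```r
# Simulated stand-in for xx; the generating coefficients are arbitrary.
set.seed(2020)
x <- arima.sim(model = list(ma = c(0.6, 0.3)), n = 96)

# MA(2) is ARIMA(0, 0, 2); include.mean = FALSE fits a mean-zero model.
fit_ma2 <- arima(x, order = c(0, 0, 2), include.mean = FALSE)

# predict() returns point forecasts ($pred) and their standard errors ($se).
pr <- predict(fit_ma2, n.ahead = 25)
lower <- pr$pred - qnorm(0.975) * pr$se
upper <- pr$pred + qnorm(0.975) * pr$se

plot(x, xlim = c(1, 96 + 25), ylim = range(x, lower, upper),
     xlab = "Month", ylab = "Demeaned revenue (simulated)",
     main = "MA(2) forecasts with 95% prediction intervals")
lines(pr$pred, col = "blue")
lines(lower, lty = 2, col = "blue")
lines(upper, lty = 2, col = "blue")
```

Because pr$pred and pr$se are time series that start at month 97, lines() places them directly after the observed data.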

(g) Compare the forecasts for Model 1 and Model 2: specifically, discuss how quickly the two forecasts revert to the long-run average of the series. Provide an explanation of this. [Note: I am not intending for you to discuss the fact that in one case the coefficients were estimated and in the other they were not estimated.]

(h) Now we consider changing the frequency of observation. Imagine someone outside the company observes the company revenue but only on a quarterly basis, by which I mean every 3 months (thus they observe March’s revenue, then June’s revenue, ...). Let this series of demeaned observations be $Z_t$. If the true model for $X_t$ is AR(1), $X_t = \phi_1 X_{t-1} + W_t$ where $W_t \overset{\text{iid}}{\sim} N(0, \sigma^2)$, then what is the (true) model for $Z_t$ (including the distribution of the white noise series)?

2. Shumway and Stoffer (4th ed.), question 3.15

3. Shumway and Stoffer (4th ed.), question 3.20

4. A data analyst is analyzing a time series with 1000 observations. She fits an ARMA(2,1) model (mean 0, i.e., with no intercept), yielding the following estimate output:

Coefficients:
    ar1    ar2    ma1
 -0.062  0.817  0.971

To check the robustness of the fit, the analyst removes the last 100 observations (leaving 900) and fits the ARMA(2,1) model again. The analyst is surprised to see the following very different estimates:

Coefficients:
    ar1    ar2    ma1
  0.752  0.112  0.100

(a) Provide an explanation for why these different results were output.

(b) Provide a diagnostic tool or mechanism or method to assess the answer you gave in the previous part, and explain how you would use it or what you would look for.

(c) Based on the two sets of estimates above and your answer to part (a), provide an estimated model for the data. That is: write down a model, including names/symbols for any unknown parameters, and provide an estimate for any unknown parameters. You do not need to estimate

[Note: your estimate(s) will just be rough estimate(s) based only on the R output presented above. There are a variety of similar possible answers, all of which will get credit.]
