STAT7017 Final Project


Big Data Statistics - Final Project (Deferred)
Total of 45 Marks
Due Friday 28 February 2020 by 23:59
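Let $\mathbb{Z}$ denote the set of integers. A sequence of random vector observations
$(\mathbb{X}_t : t = 1, \ldots, T)$ with values in $\mathbb{R}^p$ is called a p-dimensional (vector)
time series. We denote the sample mean and sample covariance matrix by $\bar{\mathbb{X}}$ and
$\mathbb{S}_0 = [s_{ij}]$, respectively.
The lag-$\tau$ sample cross-covariance (a.k.a. autocovariance) matrix is denoted by $\mathbb{S}_\tau$.
The lag-$\tau$ sample cross-correlation matrix is $\rho_\tau = D^{-1} \mathbb{S}_\tau D^{-1}$ with
$D = \mathrm{diag}(\sqrt{s_{11}}, \ldots, \sqrt{s_{pp}})$, and the values come from $\mathbb{S}_0 = [s_{ij}]$.
Assuming $E[\mathbb{X}_t] = 0$, some authors (e.g., [C]) omit $\bar{\mathbb{X}}$ and consider the
symmetrised lag-$\tau$ sample cross-covariance.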
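As a rough illustration of these quantities, the following Python sketch computes a lag-τ sample cross-covariance, the corresponding cross-correlation, and a symmetrised version. The function names, the 1/T normalisation, and the exact form of the symmetrised estimator are assumptions made for illustration; they may differ from the conventions used in class or in [C].

```python
import numpy as np

def sample_cross_cov(X, tau):
    """Lag-tau sample cross-covariance of a (T, p) array X.

    Assumption: 1/T normalisation (some texts use 1/(T-1) instead).
    """
    T = X.shape[0]
    Xc = X - X.mean(axis=0)  # centre each component by its sample mean
    # Sum over t of (X_t - mean)(X_{t-tau} - mean)^T
    return Xc[tau:].T @ Xc[:T - tau] / T

def sample_cross_corr(X, tau):
    """Lag-tau sample cross-correlation: D^{-1} S_tau D^{-1}."""
    s0 = sample_cross_cov(X, 0)
    d = np.sqrt(np.diag(s0))  # sample standard deviations sqrt(s_ii)
    return sample_cross_cov(X, tau) / np.outer(d, d)

def symmetrised_cross_cov(X, tau):
    """Symmetrised lag-tau cross-covariance without centring.

    Assumption: the mean is taken to be zero and the estimator is
    (C + C^T) / 2 with C the uncentred lag-tau cross-product average.
    """
    T = X.shape[0]
    C = X[tau:].T @ X[:T - tau] / T
    return (C + C.T) / 2
```

At lag 0 the cross-correlation matrix has ones on its diagonal, which gives a quick sanity check on any implementation.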
Question 1 [12 marks]
Simulation is a helpful way to learn about vector time series. Define the matrices
Generate 300 observations from the “vector autoregressive” VAR(1) model
$$\mathbb{X}_t = A\,\mathbb{X}_{t-1} + \varepsilon_t \qquad (1)$$
where $\varepsilon_t \sim N_2(0, \Sigma)$, i.e., they are i.i.d. bivariate normal random vectors with mean zero and
covariance $\Sigma$. Note that when simulating, it is customary to omit the first 100 or more observations.
Also generate 300 observations from the “vector moving average” VMA(1) model
$$\mathbb{X}_t = \varepsilon_t + A\,\varepsilon_{t-1}. \qquad (2)$$
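A minimal simulation sketch for models (1) and (2) might look as follows. The matrices `A` and `Sigma` below are placeholder values chosen only so that the code runs (they are not the matrices of this question), and the function names and the burn-in length of 100 are likewise illustrative.

```python
import numpy as np

def simulate_var1(A, Sigma, n, burn=100, seed=0):
    """Simulate n observations of X_t = A X_{t-1} + eps_t, dropping a burn-in."""
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    eps = rng.multivariate_normal(np.zeros(p), Sigma, size=n + burn)
    X = np.zeros((n + burn, p))
    for t in range(1, n + burn):
        X[t] = A @ X[t - 1] + eps[t]
    return X[burn:]  # discard the first `burn` observations

def simulate_vma1(A, Sigma, n, seed=0):
    """Simulate n observations of X_t = eps_t + A eps_{t-1}."""
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    eps = rng.multivariate_normal(np.zeros(p), Sigma, size=n + 1)
    # Row t of eps[:-1] @ A.T equals (A eps_{t-1})^T
    return eps[1:] + eps[:-1] @ A.T

# Placeholder parameters (hypothetical, NOT those of the assignment)
A = np.array([[0.5, 0.1], [0.0, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

X_var = simulate_var1(A, Sigma, n=300)
X_vma = simulate_vma1(A, Sigma, n=300)
```

The VMA(1) process needs no burn-in because each observation depends only on the current and previous innovation.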
(a)[1] Plot the time series $\mathbb{X}_t$ for the VAR(1) model given by (1).
(b)[1] Obtain the first five lags of sample cross-correlations of $\mathbb{X}_t$ for the VAR(1) model, i.e.,
$\rho_1, \ldots, \rho_5$.
(c)[1] Plot the time series $\mathbb{X}_t$ for the VMA(1) model given by (2).
Dale Roberts - Australian National University
Last updated: February 23, 2020
(d)[1] Obtain the first two lags of sample cross-correlations of $\mathbb{X}_t$ for the VMA(1) model.
(e)[5] Implement the test from [A] and reproduce the simulation experiment given in Section 5.
This means you need to generate Table 1 from [A].
(f)[3] The file q-fdebt.txt contains the U.S. quarterly federal debts held by (i) foreign and
international investors, (ii) federal reserve banks, and (iii) the public. The data are from
the Federal Reserve Bank of St. Louis, from 1970 to 2012 for 171 observations, and not
seasonally adjusted. The debts are in billions of dollars. Take the log transformation and the
first difference for each time series. Let $(\mathbb{X}_t)$ be the differenced log series.
Test $H_0 : \rho_1 = \cdots = \rho_{10} = 0$ vs $H_a : \rho_\tau \neq 0$ for some $\tau \in \{1, \ldots, 10\}$ using the test from
[A]. Draw the conclusion using the 5% significance level.
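A multivariate portmanteau statistic of the kind used for this hypothesis can be sketched as below. The sketch assumes the statistic has the form $Q_m = T \sum_{l=1}^{m} \mathrm{tr}(C_l^\top C_0^{-1} C_l C_0^{-1}) + p^2 m(m+1)/(2T)$ with a $\chi^2$ reference distribution on $p^2 m$ degrees of freedom when applied to a raw series; this form and the function name `portmanteau_stat` are assumptions that should be checked against [A] before use.

```python
import numpy as np

def portmanteau_stat(X, m):
    """Portmanteau statistic over lags 1..m for a (T, p) array X.

    Assumed form (verify against [A]):
        Q = T * sum_l tr(C_l' C_0^{-1} C_l C_0^{-1}) + p^2 m (m + 1) / (2 T)
    Returns the statistic and the assumed degrees of freedom p^2 * m.
    """
    T, p = X.shape
    Xc = X - X.mean(axis=0)
    C0 = Xc.T @ Xc / T                      # lag-0 sample covariance
    C0_inv = np.linalg.inv(C0)
    total = 0.0
    for lag in range(1, m + 1):
        Cl = Xc[lag:].T @ Xc[:T - lag] / T  # lag-l sample cross-covariance
        total += np.trace(Cl.T @ C0_inv @ Cl @ C0_inv)
    Q = T * total + p**2 * m * (m + 1) / (2 * T)
    return Q, p**2 * m
```

The null hypothesis would be rejected at the 5% level when the statistic exceeds the 0.95 quantile of the reference chi-square distribution, which can be obtained, for example, from `scipy.stats.chi2.ppf(0.95, df)`.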
Question 2 [13 marks]
More generally, a p-dimensional time series $\mathbb{X}_t$ follows a VAR model of order $\ell$, VAR($\ell$), if
$$\mathbb{X}_t = a_0 + \sum_{i=1}^{\ell} A_i\,\mathbb{X}_{t-i} + \varepsilon_t \qquad (3)$$
where $a_0$ is a p-dimensional constant vector and $A_i$ are $p \times p$ (non-zero) matrices for $i > 0$, and
i.i.d. $\varepsilon_t \sim N_p(0, \Sigma)$ for all $t$ with $p \times p$ covariance matrix $\Sigma$.
One day you might want to “build a model” using the VAR($\ell$) framework. One of the first things
you need to do is to determine the optimal order $\ell$. Tiao and Box (1981) suggest using sequential
likelihood ratio tests; see Section 4 in [B]. Their approach is to compare a VAR($\ell$) model with a
VAR($\ell - 1$) model and amounts to considering the hypothesis testing problem
$$H_0 : A_\ell = 0 \quad \text{vs.} \quad H_1 : A_\ell \neq 0.$$
We can do this by determining model parameters using a least-squares approach. We rewrite (3)
is the inflation rate, in percentage, of the U.S. monthly
consumer price index (CPI). The data are from the Federal Reserve Bank of St. Louis. The CPI
rate is 100 times the difference of the log CPI index. The sample period is from January 1947 to
December 2012. The data are in the file m-cpib3m.txt.
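The sequential comparison of a VAR(ℓ) fit with a VAR(ℓ − 1) fit can be sketched with ordinary least squares as below. The function names are hypothetical, and the small-sample multiplier $T - P - 1.5 - p\ell$ (with $P$ the maximum order considered) is one commonly quoted choice that is assumed here rather than taken from this document; each statistic would be compared with a $\chi^2$ distribution on $p^2$ degrees of freedom.

```python
import numpy as np

def var_resid_cov(X, order, max_order):
    """Residual covariance of an OLS VAR(order) fit with intercept.

    All orders are fitted on the same rows t = max_order+1, ..., T so
    that the fits are nested and comparable.
    """
    T, p = X.shape
    Y = X[max_order:]
    cols = [np.ones((T - max_order, 1))]        # intercept a_0
    for i in range(1, order + 1):
        cols.append(X[max_order - i:T - i])     # lagged values X_{t-i}
    Z = np.hstack(cols)
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ B
    return resid.T @ resid / (T - max_order)

def sequential_lr_stats(X, max_order):
    """Statistics M(1), ..., M(max_order) for H0: A_l = 0 vs H1: A_l != 0."""
    T, p = X.shape
    stats = []
    prev = var_resid_cov(X, 0, max_order)
    for ell in range(1, max_order + 1):
        cur = var_resid_cov(X, ell, max_order)
        _, logdet_cur = np.linalg.slogdet(cur)
        _, logdet_prev = np.linalg.slogdet(prev)
        # Assumed correction factor; the plain LR statistic would use the
        # number of fitted rows, T - max_order, in its place.
        factor = T - max_order - 1.5 - p * ell
        stats.append(-factor * (logdet_cur - logdet_prev))
        prev = cur
    return stats
```

A large value of M(ℓ) suggests that the lag-ℓ coefficient matrix is needed; one would typically pick the largest ℓ whose statistic is significant.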
(a)[1] Plot the time series $\mathbb{X}_t$.
(b)[6] Select a VAR order for $\mathbb{X}_t$ using the methodology described above.
(c)[6] Drawing on your results obtained in this project and the theory discussed in class, explain
and demonstrate (e.g., simulation study) what might happen with this methodology if the
dimensionality p of the time series becomes large.
Question 3 [20 marks]
The recent paper [C] is concerned with extensions of the classical Marchenko-Pastur law to the time
series case. Reproduce their simulation study, which is found in Section 5 and Figure 1.
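As a starting point before moving to the time series setting, the classical Marchenko-Pastur law for i.i.d. data can be checked numerically. The sketch below is not the simulation design of [C]; it only covers the i.i.d. standard normal case, and the function names and the choice p = 200, T = 800 are illustrative assumptions. For ratio $c = p/T$ and unit variance, the limiting support is $[(1 - \sqrt{c})^2, (1 + \sqrt{c})^2]$.

```python
import numpy as np

def mp_support(c, sigma2=1.0):
    """Endpoints of the Marchenko-Pastur support for ratio c = p/T <= 1."""
    return sigma2 * (1 - np.sqrt(c)) ** 2, sigma2 * (1 + np.sqrt(c)) ** 2

def sample_cov_eigs(p, T, seed=0):
    """Eigenvalues of the (uncentred) sample covariance of i.i.d. N(0, 1) data."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((T, p))
    S = X.T @ X / T
    return np.linalg.eigvalsh(S)  # S is symmetric, so eigvalsh applies

eigs = sample_cov_eigs(p=200, T=800)
lower, upper = mp_support(200 / 800)
```

A histogram of `eigs` should be concentrated on roughly the interval from `lower` to `upper`; replacing the i.i.d. data with a linear time series and the sample covariance with a symmetrised lag-τ autocovariance leads towards the setting studied in [C].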
References
[A] Li and McLeod (1981). Distribution of the Residual Autocorrelations in Multivariate ARMA Time Series Models.
J. R. Stat. Soc. B, 43(2), 231–239.
[B] Tiao and Box (1981). Modelling Multiple Time Series with Applications. Journal of the American Statistical
Association, 76, 802–816.
[C] Liu, Aue and Paul (2015). On the Marchenko-Pastur Law for Linear Time Series. Annals of Statistics, 43(2),
675–712.
[D] Wang, Aue and Paul (2017). Spectral Analysis of Sample Autocovariance Matrices of a Class of Linear Time Series in
Moderately High Dimensions. Bernoulli, 23(4A), 2181–2209.