STA302H1F/STA1001HF: Mini Project 1

Due on 30th July, 2021, 11:59 PM sharp on Crowdmark
Name Student Number
The mini project will be done independently. It will be used to develop your understanding of linear
regression properties as well as your R and coding skills which will be relevant for the final project. For the
mini project you will be asked to do the following:
#install.packages("tinytex")
#tinytex::install_tinytex()
• Install TinyTeX, which is a custom LaTeX distribution based on TeX Live. The R code for installation is
above.
• You will be required to submit the R markdown file with your PDF output. This will be important to
ensure that your script runs.
• Projects should be submitted on time (i.e. by the deadline). Late submissions will receive a 10%
penalty for each day that the project is late.
• In general, extensions will not be given unless a valid reason is provided. In such cases, an extension
of up to 5 days may be granted.
• There are no make-up mini projects. A missed mini project will be given a grade of 0.
Suppose we want to simulate the following linear model:
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$
where $\epsilon_i \sim N(0, 2^2)$. Assume $X_i \sim N(0, 1^2)$, $\beta_0 = 0.5$, $\beta_1 = 2$ and $\beta_2 = 1$.
1. Fix the sample size at 100. The x values are considered known quantities, so we will simulate those
first, and they remain the same for the rest of the assignment. Create values for $x_1$, $x_2$ within the
range of 1 to 15. Find the correlation of your two predictor variables. Does the correlation make sense?
You may find the following function helpful:
#draw a random sample: shuffle n evenly spaced values between a and b
sample(seq(a, b, length.out = n))
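One way the hint above could be used is sketched below. The seed value and the variable names `x1` and `x2` are assumptions made for illustration; they do not come from the assignment.

```r
# A minimal sketch, assuming a seed of 302 (chosen only for reproducibility).
set.seed(302)
n <- 100

# Each predictor is an independent random shuffle of the same grid on [1, 15].
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))

range(x1)    # both predictors stay within 1 to 15
cor(x1, x2)  # sample correlation of the two predictors
```

Because the two vectors are independent permutations of the same values, the sample correlation would usually be expected to be close to zero, although any single draw can differ.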
2. Linear Regression. Generate one realization of the observations. Create a data frame for your
simulations, and display the coefficients from your linear regression model (using the lm function) in a
table format.
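A possible shape for this step is sketched below. The seed and the names `sim_data`, `fit`, and `coef_table` are hypothetical, and the error standard deviation of 2 is taken from the model stated earlier.

```r
# A minimal sketch under assumed names; not a definitive solution.
set.seed(302)
n <- 100
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))

beta <- c(0.5, 2, 1)                 # true beta0, beta1, beta2
eps  <- rnorm(n, mean = 0, sd = 2)   # errors with variance 2^2
y    <- beta[1] + beta[2] * x1 + beta[3] * x2 + eps

sim_data <- data.frame(y = y, x1 = x1, x2 = x2)
fit <- lm(y ~ x1 + x2, data = sim_data)

# Coefficients arranged as a small table (one row per term).
coef_table <- data.frame(term = names(coef(fit)),
                         estimate = unname(coef(fit)))
print(coef_table)
```

In an R Markdown document, the same data frame could be passed to a table-formatting function such as `knitr::kable()` if the knitr package is available.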
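3. Matrix algebra. Recall that the least squares estimate for multiple linear regression is given by:
$$\hat{\beta} = (X'X)^{-1}X'Y$$
Show that this formula is equivalent to the simple linear regression estimates, i.e. show that, for the
model $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$, we have that
$$\hat{\beta} = (X'X)^{-1}X'Y = \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{pmatrix} = \begin{pmatrix} \bar{Y} - \hat{\beta}_1 \bar{X} \\ \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} \end{pmatrix}$$
Show all the details of your proof.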
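As a possible starting point only (a sketch of the setup, assuming the usual design matrix with a leading column of ones), the pieces of the formula for the simple model can be written out as follows. The remaining algebra is left to the proof.

```latex
X = \begin{pmatrix} 1 & X_1 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix},
\qquad
X'X = \begin{pmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{pmatrix},
\qquad
X'Y = \begin{pmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{pmatrix}

% The determinant of X'X simplifies to a multiple of the centered sum of squares:
\det(X'X) = n \sum_{i=1}^{n} X_i^2 - \Big(\sum_{i=1}^{n} X_i\Big)^2
          = n \sum_{i=1}^{n} (X_i - \bar{X})^2
```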
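Hint: The inverse of a 2 × 2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$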
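The hint can be checked numerically before it is used in the proof. The sketch below compares the closed-form inverse with R's `solve()` on an arbitrary example matrix; the entries and the name `cc` (used to avoid masking the function `c`) are assumptions for illustration.

```r
# Arbitrary 2 x 2 example with a nonzero determinant (a*d - b*cc = 10).
a <- 4; b <- 7; cc <- 2; d <- 6
A <- matrix(c(a, cc, b, d), nrow = 2)   # filled column-wise: rows are (a, b) and (cc, d)

# Closed-form inverse from the hint: 1 / (ad - bc) * [d, -b; -c, a].
A_inv_formula <- (1 / (a * d - b * cc)) * matrix(c(d, -cc, -b, a), nrow = 2)

# Should agree with the numerical inverse up to floating-point error.
all.equal(A_inv_formula, solve(A))
```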
4. Matrix Algebra 2. Create the $X$ matrix for the multiple linear regression model with two predictor
variables (i.e. $\beta_1$, $\beta_2$). Find the least squares estimate $\hat{\beta}$, and $s(\hat{\beta})$, using matrices. Compare
the estimates from the lm function in question 2 to the estimates you obtained here. What do you notice?
You may find the following lines of code helpful:
#inverse
solve(X)
#transpose
t(X)
#matrix multiplication
t(X) %*% X
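The matrix computation could be sketched as follows. This assumes that $s(\hat{\beta})$ refers to the estimated standard errors, computed from the mean squared error with $n - p - 1$ degrees of freedom; the seed and all variable names are hypothetical.

```r
# A minimal sketch under stated assumptions; not a definitive solution.
set.seed(302)
n <- 100
p <- 2                                   # number of predictors
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))
y  <- 0.5 + 2 * x1 + 1 * x2 + rnorm(n, mean = 0, sd = 2)

X <- cbind(1, x1, x2)                    # design matrix with an intercept column
XtX_inv  <- solve(t(X) %*% X)
beta_hat <- XtX_inv %*% t(X) %*% y       # (X'X)^{-1} X'Y

resid   <- y - X %*% beta_hat
mse     <- sum(resid^2) / (n - p - 1)    # estimate of the error variance
se_beta <- sqrt(diag(mse * XtX_inv))     # assumed meaning of s(beta-hat)

fit <- lm(y ~ x1 + x2)
cbind(matrix = drop(beta_hat), lm = coef(fit))
```

Both columns of the final comparison should match up to floating-point error, since `lm` solves the same least squares problem.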
5. Distribution theory. Fix the number of simulations at 1000. For each simulation, obtain the least
squares estimates of $\beta_0$, $\beta_1$, $\beta_2$ from the linear regression model (using the lm function). Calculate the
mean for each of the estimates of the multiple linear regression model (i.e. find the mean for $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$).
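A loop along the following lines could carry out the repeated simulation. The predictors are generated once and held fixed, as question 1 requires; the seed and the names `n_sim` and `estimates` are assumptions for illustration.

```r
# A minimal sketch, assuming fixed predictors and fresh errors per simulation.
set.seed(302)
n     <- 100
n_sim <- 1000
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))
beta <- c(0.5, 2, 1)

# One row per simulation, one column per coefficient.
estimates <- matrix(NA_real_, nrow = n_sim, ncol = 3,
                    dimnames = list(NULL, c("b0", "b1", "b2")))

for (s in seq_len(n_sim)) {
  y <- beta[1] + beta[2] * x1 + beta[3] * x2 + rnorm(n, mean = 0, sd = 2)
  estimates[s, ] <- coef(lm(y ~ x1 + x2))
}

colMeans(estimates)   # mean of each estimate across the simulations
```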
6. Distribution theory 2. Plot the histogram for the estimates $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$. In the histogram include both
the actual values for $\beta_0$, $\beta_1$, $\beta_2$ and the means of the estimates $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$. What do you notice about
the mean for each of the estimates in comparison to the true values of the parameters? Explain why the
results make sense.
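The histograms could be drawn with base R graphics, as in the sketch below. The colours, line styles, seed, and variable names are arbitrary choices made for illustration, not requirements of the assignment.

```r
# A minimal sketch: simulate the estimates, then plot one histogram per coefficient.
set.seed(302)
n     <- 100
n_sim <- 1000
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))
beta <- c(0.5, 2, 1)

# replicate() returns a 3 x n_sim matrix, so transpose to n_sim x 3.
estimates <- t(replicate(n_sim, {
  y <- beta[1] + beta[2] * x1 + beta[3] * x2 + rnorm(n, mean = 0, sd = 2)
  coef(lm(y ~ x1 + x2))
}))

op <- par(mfrow = c(1, 3))               # three panels side by side
for (j in 1:3) {
  hist(estimates[, j], main = colnames(estimates)[j], xlab = "estimate")
  abline(v = beta[j], col = "red", lwd = 2)                         # true value
  abline(v = mean(estimates[, j]), col = "blue", lty = 2, lwd = 2)  # mean of estimates
}
par(op)                                  # restore the previous graphics settings
```

Here the solid red line marks the true parameter and the dashed blue line marks the simulation mean, so the two can be compared directly in each panel.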
7. Confidence intervals. Fix the number of simulations at 1000. Compute the confidence intervals for the
parameter estimates $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$. Does each interval contain the true parameter? What is the coverage
probability?
You may find the following code useful:
alpha=0.05
#number of predictors (excluding the intercept)
p=2
#t-quantile for a two-sided (1 - alpha) interval, with n-p-1 degrees of freedom
qt(1 - alpha/2, n-p-1)
#count for any condition
sum(ifelse(test_expression, x, y))
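A coverage check could be sketched as follows. It assumes two-sided 95% intervals of the form estimate ± t-quantile × standard error, with the quantile taken at 1 - alpha/2; the seed and the names `t_crit`, `covered`, and `coverage` are hypothetical.

```r
# A minimal sketch under stated assumptions; not a definitive solution.
set.seed(302)
n     <- 100
n_sim <- 1000
alpha <- 0.05
p     <- 2                                   # number of predictors
x1 <- sample(seq(1, 15, length.out = n))
x2 <- sample(seq(1, 15, length.out = n))
beta <- c(0.5, 2, 1)

t_crit  <- qt(1 - alpha / 2, n - p - 1)      # two-sided critical value
covered <- matrix(FALSE, nrow = n_sim, ncol = 3)

for (s in seq_len(n_sim)) {
  y   <- beta[1] + beta[2] * x1 + beta[3] * x2 + rnorm(n, mean = 0, sd = 2)
  est <- summary(lm(y ~ x1 + x2))$coefficients
  lower <- est[, "Estimate"] - t_crit * est[, "Std. Error"]
  upper <- est[, "Estimate"] + t_crit * est[, "Std. Error"]
  covered[s, ] <- (lower <= beta) & (beta <= upper)   # does each interval contain the truth?
}

coverage <- colMeans(covered)   # proportion of intervals covering each true parameter
coverage
```

Under the model assumptions, each proportion would be expected to sit near the nominal level of 1 - alpha, with some simulation noise.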