Applied Stats and Data Analysis, EN.553.413-613, Spring 2024
Exam 1, Feb 21, 2024
Question 1 (18 pts). The following TRUE/FALSE questions concern the Simple Linear Regression model
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2, \quad \mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0 \text{ for } i \neq j.$$
(a) TRUE or FALSE. For the least squares estimates $b_0$, $b_1$ we require the errors to be normally distributed.
(b) TRUE or FALSE. The estimated mean of the response variable at $X_i$ is defined as $b_0 + b_1 X_i$.
(c) TRUE or FALSE. One of the Gauss-Markov conditions is $\sum_{i=1}^{n} e_i = 0$.
(d) TRUE or FALSE. Plotting $e_i^2$ vs $\hat{Y}_i$ is one of the diagnostic plots.
(e) TRUE or FALSE. A QQ plot of the $Y_i$'s is one of the diagnostic plots.
(f) TRUE or FALSE. Low $R^2$ means that X and Y are not related.
(g) TRUE or FALSE. The $s^2$ is an estimate of the variance of $Y_i$.
(h) TRUE or FALSE. The coefficient of simple determination $R^2$ measures the proportion of the explained variation in Y over the unexplained variation in Y.
(i) TRUE or FALSE. In the correlation model of regression, the $X_i$'s are random variables.
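The algebraic facts behind parts (c) and (h) can be checked numerically. The sketch below fits the least squares line to a small made-up data set (the values in `xs` and `ys` are assumptions, not exam data). It confirms that the residuals sum to zero as a consequence of the normal equations, and that SSTO = SSR + SSE with $R^2$ = SSR/SSTO.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares estimates from the normal equations.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Sums of squares: total, regression (explained), error (unexplained).
ssto = sum((y - y_bar) ** 2 for y in ys)
ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum(e ** 2 for e in residuals)
r_squared = ssr / ssto

print("sum of residuals:", sum(residuals))
print("SSTO - (SSR + SSE):", ssto - (ssr + sse))
print("R^2:", r_squared)
```

The residual sum is zero up to floating-point error for any data set, which is why it is a property of the fitted line rather than an assumption about the errors.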
Question 2 (18 pts). Let $X, Y, Z \sim$ iid $N(0, 1)$, i.e., they are independent, identically distributed standard normal random variables. For each of the following random variables, state whether it follows a normal distribution, a t-distribution, a $\chi^2$ distribution, an F distribution, or none of the above. State the relevant parameters (e.g., degrees of freedom, and means and variances for normal RVs).
(a) $3Y - Z$
(b) $X + Y + Z$
(c) $X^2 + Y^2 + Z^2$
(d) $\dfrac{X^2 + Y^2}{2Z^2}$
(e) $\dfrac{X^2}{\sqrt{Y^2 + Z^2}}$
(f) $\dfrac{(X + Y)^2}{2}$
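A quick Monte Carlo run is one way to sanity-check a proposed answer for some of these. The sketch below uses only the Python standard library; the seed and sample size are arbitrary choices. It compares sample moments with the theoretical values: a linear combination $aY + bZ$ of independent standard normals has variance $a^2 + b^2$, and a sum of $k$ squared independent standard normals is $\chi^2(k)$ with mean $k$.

```python
import random
import statistics

random.seed(553)  # arbitrary seed so the run is reproducible
N = 20000         # number of simulated (X, Y, Z) triples, an arbitrary choice

draws = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
         for _ in range(N)]

# Parts (a), (b), (c), (f) evaluated on every simulated triple.
part_a = [3 * y - z for _, y, z in draws]
part_b = [x + y + z for x, y, z in draws]
part_c = [x * x + y * y + z * z for x, y, z in draws]
part_f = [(x + y) ** 2 / 2 for x, y, _ in draws]

var_a = statistics.variance(part_a)   # theory: 3^2 + 1^2 = 10
var_b = statistics.variance(part_b)   # theory: 1 + 1 + 1 = 3
mean_c = statistics.mean(part_c)      # theory: chi-square(3) has mean 3
mean_f = statistics.mean(part_f)      # theory: ((X+Y)/sqrt(2))^2 is chi-square(1), mean 1

print(var_a, var_b, mean_c, mean_f)
```

Matching moments does not prove a distributional claim, but a clear mismatch is a fast way to catch a wrong answer.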
Question 3 (20 pts). Suppose a data set $\{(X_i, Y_i) : 1 \le i \le n\}$ is fit to a linear model of the form $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where the $\varepsilon_i$ are independent, mean zero, and normal with common variance $\sigma^2$. Here we treat Y as the response variable and X as the predictor variable. The output of the lm function is given. Some values are hidden by ‘XXXXX’. We also provide one additional value: $\bar{X} = 1.11$.

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   1.9412     0.4593   4.226 0.000508 ***
    x             0.7042     0.3697   1.905 0.072911 .
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.9221 on 18 degrees of freedom
    Multiple R-squared:  0.1678,  Adjusted R-squared:  -----
    F-statistic: XXXXX on XX and XX DF,  p-value: XXXXXX

(a) (2 points) How many data points are there (what is n, the sample size)? What is the estimated mean of the response variable Y at $X_h = 2$ for this dataset?
(b) (3 points) Based on all of this output, do you reject $H_0: \beta_1 = 0$ in favour of $H_a: \beta_1 \neq 0$ at significance level $\alpha = 0.05$? What does the test tell us about the relationship between X and Y?
(c) (3 points) Based on all of this output, do you reject $H_0: \beta_1 = 0$ vs $H_a: \beta_1 > 0$ at significance level $\alpha = 0.05$? Briefly explain why, or why not.
(d) (4 points) The degrees of freedom, the p-value and the value of the F statistic are hidden. Is it possible to reconstruct all of them based on the data shown? Recover as many values as you can.
(e) (4 points) Based on the data above, find SSTO, SSR and SSE. Hint: the residual standard error may be useful here.
(f) (4 points) Find the 95% confidence interval for the mean of the response function at $X_h = 2$. Write your answer in the form $A \pm B \cdot t(C, D)$, and specify the values of A, B, C, D as precisely as you can (i.e., find the values of as many terms as you can).
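The hidden quantities can be recovered from the visible output using standard simple linear regression identities. The following is a sketch, not an official solution. It assumes the usual relations: error degrees of freedom $n - 2$, MSE equal to the squared residual standard error, $R^2 = 1 - \text{SSE}/\text{SSTO}$, $F^* = \text{MSR}/\text{MSE}$, and $s^2\{b_1\} = \text{MSE}/\sum (X_i - \bar{X})^2$. The variable names are illustrative.

```python
import math

# Values read directly from the lm output and the problem statement.
b0, b1 = 1.9412, 0.7042
se_b1 = 0.3697
t_b1 = 1.905
p_two_sided = 0.072911
resid_se = 0.9221   # residual standard error, sqrt(MSE)
df_error = 18
r_squared = 0.1678
x_bar = 1.11
x_h = 2.0

# (a) sample size and estimated mean response at X_h
n = df_error + 2
y_hat_h = b0 + b1 * x_h

# (c) one-sided p-value: halving is valid here because b1 > 0 agrees with Ha
p_one_sided = p_two_sided / 2

# (e) sums of squares
mse = resid_se ** 2
sse = mse * df_error
ssto = sse / (1 - r_squared)
ssr = ssto - sse

# (d) F statistic on 1 and n - 2 degrees of freedom; in simple linear
# regression it equals t^2 up to rounding, and shares the two-sided p-value
f_stat = ssr / mse

# (f) standard error of the estimated mean response at X_h
s_xx = mse / se_b1 ** 2   # sum of squared deviations of X, recovered from s{b1}
se_mean_h = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / s_xx))

print(n, y_hat_h, p_one_sided)
print(sse, ssto, ssr, f_stat)
print(se_mean_h)
```

Under these assumptions the interval in (f) takes the form $\hat{Y}_h \pm s\{\hat{Y}_h\} \cdot t(0.975, 18)$, with `y_hat_h` and `se_mean_h` supplying A and B.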
Question 4 (14 pts). Consider the following diagnostic plots for two models (Model 1 and Model 2). A simple linear regression model $Y = \beta_0 + \beta_1 X + \varepsilon$ is fitted to each of two different datasets of (X, Y) observations, one per model. For each model, 3 diagnostic plots are shown: a plot of $Y_i$ vs $X_i$, a plot of the semi-studentized residuals $e_i^*$ versus the fitted values $\hat{Y}_i$, and a QQ-plot of the semi-studentized residuals $e_i^*$.
(a) (5 points) What main issue, if any, do you diagnose with Model 1? Why? Which plot was the most useful in diagnosing this problem? Be as specific in describing the issue as you can.
(b) (5 points) What main issue, if any, do you diagnose with Model 2? Why? Which plot was the most useful in diagnosing this problem? Be as specific in describing the issue as you can.
(c) (4 points) This question is unrelated to the above plots. Explain in what cases a transformation of the predictor variable X is more appropriate than a transformation of the response variable Y.
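For reference, the semi-studentized residuals used in these plots are $e_i^* = e_i / \sqrt{\text{MSE}}$. The sketch below computes them for a made-up data set in which Y is exactly quadratic in X (an assumption chosen to produce a visible pattern, unrelated to the exam's Model 1 or Model 2). The function names are illustrative.

```python
import math

def fit_slr(xs, ys):
    """Return least squares estimates (b0, b1) for Y = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def semi_studentized_residuals(xs, ys):
    """Return (fitted values, e_i / sqrt(MSE)) for a simple linear regression."""
    b0, b1 = fit_slr(xs, ys)
    fitted = [b0 + b1 * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    mse = sum(e ** 2 for e in residuals) / (len(xs) - 2)
    return fitted, [e / math.sqrt(mse) for e in residuals]

# Hypothetical data: a curved relationship fitted with a straight line.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [x ** 2 for x in xs]

fitted, e_star = semi_studentized_residuals(xs, ys)
for f, e in zip(fitted, e_star):
    print(f"{f:8.2f} {e:8.3f}")
```

In this example the residuals are positive at both ends of the fitted range and negative in the middle. That systematic curve in the residuals-versus-fitted plot is what signals a nonlinear regression function.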
Question 5 (20 points). For a dataset of n = 200 observations, a simple linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ is fit. The following estimates are obtained: $b_0 = 2$, $b_1 = 1$. We have listed additional information here.
(a) (2 points) What is the estimated variance $s^2$ of the error term based on the data above?
(b) (3 points) Find a 90% confidence interval for $\beta_1$. Write it in the form $A \pm B \cdot t(C, D)$, and compute the values of A, B, C, D if possible.
(c) (4 points) Find the joint confidence intervals with confidence at least 90% for $\beta_0$, $\beta_1$ in the form $A_i \pm B_i \cdot t(C_i, D_i)$. Compute the values of $A_i$, $B_i$, $C_i$, $D_i$ if possible. Without any computation, how does the interval for $\beta_1$ in this part compare to the one in part (b)?
(d) (4 points) Find the joint confidence intervals using the Bonferroni procedure with confidence at least 90% for the mean of the response variable at $X_h = 2$ and $X_{h'} = 0$. Find them in the form $A_i \pm B_i \cdot t(C_i, D_i)$.
(e) (4 points) Set up a General Linear Test for the data provided: specify the reduced and full models, compute the value of the F-statistic, and specify its distribution under the null hypothesis.
(f) (3 points) An Aspiring Data Scientist (ADS) noticed that one of the observed data points $(X_i, Y_i) = (2, 15)$ lies outside of the 99% Working-Hotelling band (we assume everything was computed correctly). They claim it is an issue. Briefly explain whether their concern is justified.
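Two pieces of this question are mechanical enough to sketch in code: the arguments of the t multiplier, and the general linear test statistic. A Bonferroni family of $g$ intervals at joint level $1 - \alpha$ uses $t(1 - \alpha/(2g),\ n - 2)$. The general linear test statistic is $F^* = \dfrac{(\text{SSE}(R) - \text{SSE}(F))/(df_R - df_F)}{\text{SSE}(F)/df_F}$. The helper names below are illustrative, and the sums of squares in the second example are hypothetical placeholders, since the exam's additional information is not reproduced in this text.

```python
def bonferroni_t_args(alpha, g, n):
    """Return (percentile, df) of the t multiplier for g joint intervals
    in simple linear regression with n observations."""
    return 1 - alpha / (2 * g), n - 2

def general_linear_test_f(sse_reduced, df_reduced, sse_full, df_full):
    """F* comparing a reduced model against a full model."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    return numerator / (sse_full / df_full)

n = 200

# A single 90% interval (g = 1) versus two joint intervals (g = 2).
single = bonferroni_t_args(0.10, 1, n)
joint = bonferroni_t_args(0.10, 2, n)
print(single, joint)

# Testing H0: beta1 = 0. The reduced model Y_i = beta0 + eps_i has
# SSE(R) = SSTO on n - 1 df; the full model has SSE on n - 2 df.
# The two sums of squares below are made-up numbers for illustration only.
hypothetical_ssto = 500.0
hypothetical_sse = 300.0
f_star = general_linear_test_f(hypothetical_ssto, n - 1, hypothetical_sse, n - 2)
print(f_star)
```

With n = 200, the joint multiplier is $t(0.975, 198)$ against $t(0.95, 198)$ for a single interval, so the joint intervals are wider. Under $H_0: \beta_1 = 0$ the statistic $F^*$ follows an $F(1, 198)$ distribution.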
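Question 6 (20 points). Suppose $Y_i$ follows the model $Y_i = \beta X_i + \varepsilon_i$, where the $\varepsilon_i$ are independent, identically distributed $N(0, \sigma^2)$. Note that there is no intercept term. You observe a collection $\{(X_i, Y_i)\}$ of data from this model, $i = 1, \ldots, n$.
(a) (5 points) Write the objective function to be minimized and the equations that need to be solved to get the least squares estimate of $\beta$.
(b) (5 points) Solve the equation in (a) and express the answer as a linear combination of the $Y_i$'s.
(c) (5 points) What is the distribution of $b$? Find the mean and variance. Justify your steps.
(d) (5 points) Write the log-likelihood that needs to be maximized to obtain the estimate of $\beta$. DO NOT MAXIMIZE IT.

Solution.
(a) Function to be minimized:
$$Q(\beta) = \sum_{i=1}^{n} (Y_i - \beta X_i)^2.$$
Equations to be solved:
$$\frac{dQ}{d\beta} = -2 \sum_{i=1}^{n} X_i (Y_i - \beta X_i) = 0.$$
(b) Solving the equation:
$$b = \frac{\sum_{i} X_i Y_i}{\sum_{i} X_i^2} = \sum_{i} c_i Y_i, \qquad c_i = \frac{X_i}{\sum_{j} X_j^2}.$$
(c) Since $b = \sum_i c_i Y_i$ is a linear combination of independent normal RVs, it is itself a normal RV. The mean is
$$E(b) = \sum_{i} c_i E(Y_i) = \sum_{i} c_i \beta X_i = \beta \, \frac{\sum_{i} X_i^2}{\sum_{j} X_j^2} = \beta.$$
The variance is
$$\mathrm{Var}(b) = \sum_{i} c_i^2 \, \mathrm{Var}(Y_i) = \sigma^2 \sum_{i} c_i^2 = \frac{\sigma^2}{\sum_{i} X_i^2}.$$
We have shown that
$$b \sim N\!\left(\beta, \ \frac{\sigma^2}{\sum_{i} X_i^2}\right).$$
(d) The log-likelihood is
$$\ell(\beta, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (Y_i - \beta X_i)^2.$$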
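For the no-intercept model, minimizing $\sum (Y_i - \beta X_i)^2$ gives $b = \sum X_i Y_i / \sum X_i^2$, which can be written as $\sum c_i Y_i$ with weights $c_i = X_i / \sum_j X_j^2$. The sketch below checks these identities numerically on a tiny made-up data set (the values and the function name are assumptions for illustration).

```python
def fit_through_origin(xs, ys):
    """Least squares slope for Y = beta * X, returned with the weights c_i."""
    sum_x_sq = sum(x * x for x in xs)
    weights = [x / sum_x_sq for x in xs]
    b = sum(c * y for c, y in zip(weights, ys))
    return b, weights

# Hypothetical data.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]

b, weights = fit_through_origin(xs, ys)

# Closed form: sum(X*Y) / sum(X^2).
closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# sum(c_i * X_i) = 1 is what makes E(b) = beta.
unbiasedness_check = sum(c * x for c, x in zip(weights, xs))

# Var(b) = sigma^2 * sum(c_i^2) = sigma^2 / sum(X_i^2).
variance_factor = sum(c * c for c in weights)

print(b, closed_form, unbiasedness_check, variance_factor)
```

Here $\sum X_i Y_i = 31$ and $\sum X_i^2 = 14$, so both forms of the estimate equal $31/14$ and the variance factor equals $1/14$.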