Statistical Project, Semester 1, 2018
ECON7300: Statistics for Business and Economics
Instructions for Dataset 3: Simple Regression Analysis (40 marks)
A statistics lecturer wanted to test the theory that the longer one studies, the better one’s
grade. Accordingly, he took a random sample of 100 students and asked each student to
report the average amount of time he or she studies statistics and the final mark received.
The data file dataset3.xls contains data on 100 students, and the variables in the dataset
are:
Mark (Y, final mark in points (out of 100))
Time (X, study time in hours)
The dependent variable for your analysis is Mark.
Answer the following questions using dataset 3.
(a) Create a scatter plot of Y on the vertical axis against X on the horizontal axis
(make sure to label each axis). (1 mark)
(b) Using the scatter plot in (a), does there appear to be a relationship between Y and
X? If so, in what direction is the relationship? How strong is the relationship? (2
marks)
(c) Estimate a simple regression model using X to predict Y (present the output and
write out the estimated equation). (3 marks)
(d) Interpret the intercept coefficient. Does it make sense? (3 marks)
(e) Interpret the slope coefficient. (2 marks)
(f) Compute the coefficient of determination and interpret its meaning. (3 marks)
(g) Compute the standard error of the estimate and interpret its meaning. Judge the
magnitude of the standard error of the estimate. (4 marks)
(h) Perform a residual analysis (plot the residuals) and evaluate whether the
assumptions of regression have been violated. (6 marks)
(i) Test for the significance of the slope coefficient using a t test (follow all the
necessary steps). Assume a 5% level of significance. (4 marks)
(j) Test for the significance of the slope using an F test (follow all the necessary steps).
Assume a 5% level of significance. (4 marks)
(k) Compute a 95% confidence interval estimate of the mean Y for all students when
X = 24 and interpret its meaning. (4 marks)
(l) Compute a 95% prediction interval of Y for a student when X = 24 and interpret its
meaning. (4 marks)
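The quantities requested in (c) to (l) all follow from a handful of sums of squares. The block below is a minimal sketch of those formulas in plain Python, not a prescribed method. The function names (simple_regression, intervals), the five-point data set, and the critical value passed as t_crit are illustrative assumptions; none of them come from dataset3.xls. The critical value would be read from a t table with n - 2 degrees of freedom.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x with the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / ssx                      # slope
    b0 = y_bar - b1 * x_bar             # intercept
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))     # standard error of the estimate
    return {
        "n": n, "x_bar": x_bar, "ssx": ssx, "b0": b0, "b1": b1,
        "r2": 1 - sse / sst,                        # coefficient of determination
        "s_yx": s_yx,
        "t_stat": b1 / (s_yx / math.sqrt(ssx)),     # t statistic for H0: slope = 0
        "f_stat": (sst - sse) / (sse / (n - 2)),    # F statistic for H0: slope = 0
    }

def intervals(fit, x0, t_crit):
    """Confidence interval for the mean of Y and prediction interval for Y at x0."""
    h = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["ssx"]
    y_hat = fit["b0"] + fit["b1"] * x0
    half_ci = t_crit * fit["s_yx"] * math.sqrt(h)
    half_pi = t_crit * fit["s_yx"] * math.sqrt(1 + h)
    return (y_hat - half_ci, y_hat + half_ci), (y_hat - half_pi, y_hat + half_pi)

# Synthetic illustration only: five made-up (hours, mark) pairs.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]
fit = simple_regression(hours, marks)
# t_crit = 3.1824 is the two-tailed 5% table value for 3 degrees of freedom.
ci, pi = intervals(fit, x0=3, t_crit=3.1824)
print(fit["b0"], fit["b1"], fit["r2"], fit["s_yx"])
print(ci, pi)
```

In simple regression the F statistic equals the square of the t statistic for the slope, so the tests in (i) and (j) reach the same conclusion. The prediction interval is always wider than the confidence interval at the same X because it also carries the variability of an individual observation.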
Instructions for Dataset 4: Multiple Regression Analysis (40 marks)
The data file dataset4.xls contains information on monthly earnings, experience, IQ
scores and race for 855 men.
The variables in the dataset are:
Wage (monthly earnings in dollars)
Experience (years of work experience)
Black (1 if black and 0 if nonblack)
IQ (IQ score)
The dependent variable for your analysis is Wage.
Answer the following questions using dataset 4.
(a) Estimate a regression model using Experience and IQ to predict Wage (present the
output and write out the estimated equation). (2 marks)
(b) Interpret the slope coefficients. (2 marks)
(c) Predict Wage when Experience = 24 and IQ = 150. Comment on this prediction.
(2.5 marks)
(d) Plot the residuals to test the assumptions of the regression model. Is there any
evidence of violation of the regression assumptions? Explain. (8 marks)
(e) Determine the variance inflation factor (VIF) for each independent variable
(Experience and IQ) in the model. Is there reason to suspect the existence of
collinearity? (2 marks)
(f) Are all individual variables significant at the 5% level? Use t tests and follow all the
necessary steps. (5 marks)
(g) Test for the significance of the overall multiple regression model at the 5% level of
significance. (2.5 marks)
(h) Estimate a regression model using Experience, IQ and Black to predict Wage
(present the output, and write out the multiple regression equation, the regression
equation for blacks and the regression equation for nonblacks). Interpret the
coefficient for Black. (6 marks)
(i) Test the hypothesis that there is wage discrimination against blacks at the 5% level.
(2.5 marks)
(j) Estimate a regression model using Experience, IQ, Black, and an interaction
between Black and Experience to predict Wage (present the output and write out
the estimated equation). Interpret the coefficients of Experience and the interaction
term. (5 marks)
(k) Test whether the return to experience depends on race at the 5% level. (2.5 marks)
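Two of the ideas above reduce to short formulas: the VIF in (e) and the group-specific equations in (h) and (j). The sketch below is illustrative only. The function names, the five-point data, and every coefficient value are made-up placeholders, not estimates from dataset4.xls. It relies on the fact that, with exactly two independent variables, both share the same VIF, 1 / (1 - r^2), where r is their correlation.

```python
import math

def pearson_r(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def vif_two_regressors(x1, x2):
    """VIF shared by both variables when the model has exactly two regressors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

def group_equations(b0, b_exp, b_iq, b_black, b_inter=0.0):
    """Return (intercept, Experience slope, IQ slope) for each group.

    Setting Black = 1 shifts the intercept by b_black; an interaction term
    Black * Experience additionally shifts the Experience slope by b_inter.
    """
    return {
        "nonblack": (b0, b_exp, b_iq),
        "black": (b0 + b_black, b_exp + b_inter, b_iq),
    }

# Synthetic illustration only: made-up Experience and IQ values.
experience = [5, 10, 15, 20, 25]
iq = [110, 95, 105, 90, 100]
vif = vif_two_regressors(experience, iq)
print(vif)

# Placeholder coefficients chosen for readability, not estimated from any data.
eqs = group_equations(b0=100.0, b_exp=10.0, b_iq=5.0, b_black=-50.0, b_inter=-2.0)
print(eqs["nonblack"], eqs["black"])
```

Leaving b_inter at its default of 0.0 gives the parallel-slopes model of (h), in which the two groups differ only in the intercept.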