ACC207 Econometrics in Accounting & Finance

ASSESSED PROJECT BRIEF

Spring semester 2024

MUST-READ Instructions:

1. This coursework is the sole and final assessment for the module ACC207, and represents 100% of the final grade for the module.

2. This coursework contains two sections, A and B, with each section carrying 100 marks. Your final grade will be the average mark of the two sections. Thus, sections A and B carry equal weight. All parts of both sections must be completed.

3. The answers to all questions should be submitted in a single Word document with clear question numbering (i.e., A.1, A.2, …, B.1, B.2, …). All R code MUST be reported in your solution document. Short pieces of R code should be reported directly in your answer to each question. If your code is long, you may place it in a numbered appendix at the end of the document, and clearly refer to the appendix number in the main text. The R code should clearly indicate the question numbers to which it relates. Failure to report your R code in the report or appendix will result in a low mark.

4. Deadline: Sunday 2nd June at 23:59 (to be submitted on LMO).

5. Submission: rename your file with your FULL name (in Pinyin/English letters: given name followed by surname) and student ID (e.g., San Zhang 123456).

Submit one Microsoft Word file using the relevant submission link on LMO.

6. In case of questions, contact one of the module staff well before the deadline:

7. All answers must be given in English.

8. It is your responsibility to retain a copy of your solution file, in case the submitted file is corrupted.

Section A: Simulation, Multiple Linear Regression and Hypothesis Testing (50%)

Question A1:

Suppose you are given a sample of 100 observations on two variables, X and Y. You want to estimate the relation:

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ (1)

Your main interest is in estimating the slope coefficient, $\beta_1$.

It has been suggested that, rather than estimating Eq. (1) on all 100 observations, it might be better to split the sample into two groups of 50 observations, and estimate Eq. (1) on each of these. If the groups are called Group A and Group B, then it is proposed that the estimator:

(2)

might be a ‘better’ estimator than $\hat{\beta}_1$ for $\beta_1$.

Required:

Which estimator will be better: the OLS estimator $\hat{\beta}_1$, or the new estimator in Eq. (2)?

Make your choice by using simulation in R, based on the hypothesised relation:

$y_i = 180 + 40 x_i + \varepsilon_i$, where $0 \le x_i \le 80$ and $\varepsilon_i \sim N(0, 20000)$ (3)

Simulate 1,000 samples of 100 observations, and use these simulated samples (and any further necessary analysis) to determine which estimator is better.

In your solution, you should explain your logic, and include the R code that you used in generating results. You may also give graphical material, if you deem it helpful in answering the question.

Your simulation-based answer may also be supported by relevant algebraic analysis, if you wish (but should not be replaced by algebraic analysis).

(Total 50 marks for Question A1)
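One way the simulation could be organised is sketched below. It is illustrative only and rests on several assumptions that the brief does not confirm: since Eq. (2) is not reproduced above, the new estimator is taken to be the simple average of the Group A and Group B slopes; 20000 in Eq. (3) is read as the error variance; X is drawn uniformly on [0, 80]; and the first 50 observations form Group A. The helper names `ols_slope` and `simulate_once` are hypothetical.

```r
# Sketch only: Eq. (2) is not shown in the brief, so the "new" estimator is
# ASSUMED to be the simple average of the Group A and Group B OLS slopes.
set.seed(2024)          # arbitrary seed, for reproducibility

n_sims <- 1000          # number of simulated samples
n_obs  <- 100           # observations per sample
beta0  <- 180
beta1  <- 40
sigma  <- sqrt(20000)   # ASSUMPTION: 20000 in Eq. (3) is the error variance

# OLS slope from a simple regression of y on x
ols_slope <- function(x, y) unname(coef(lm(y ~ x))[2])

simulate_once <- function() {
  x <- runif(n_obs, min = 0, max = 80)   # ASSUMPTION: x is uniform on [0, 80]
  y <- beta0 + beta1 * x + rnorm(n_obs, mean = 0, sd = sigma)
  in_a <- seq_len(n_obs) <= n_obs / 2    # ASSUMPTION: first 50 rows are Group A
  c(full  = ols_slope(x, y),
    split = (ols_slope(x[in_a], y[in_a]) + ols_slope(x[!in_a], y[!in_a])) / 2)
}

estimates <- t(replicate(n_sims, simulate_once()))   # 1000 x 2 matrix

# Compare the two sampling distributions: bias, variance, mean squared error
summary_table <- rbind(
  bias     = colMeans(estimates) - beta1,
  variance = apply(estimates, 2, var),
  mse      = colMeans((estimates - beta1)^2)
)
print(summary_table)

# Optional visual comparison, with the true slope as a dashed reference line
boxplot(estimates, main = "Simulated slope estimates",
        ylab = "Estimate of beta1")
abline(h = beta1, lty = 2)
```

Comparing bias, variance and mean squared error across the 1,000 replications is one common way to make "better" operational; other criteria could be argued for in the written answer.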

Question A2:

You have been provided with a dataset in the Excel file QuestionA2.xlsx. The dataset contains observations on three variables relating to a sample of listed companies for the year 2023: MktValue, BkValue, and RI.

The definitions of the variables are as follows:

MktValue = the market value of the company (debt plus equity);

BkValue = the book value of the company (shareholders’ equity accounts plus book value of debt);

RI = the residual income of the company for the year.

All MktValue and BkValue values are year-end values for the year 2023; all RI values relate to the 12-month period for the relevant company, ending on 31st December 2023. You may assume that all companies have the same accounting year-end date.

You are interested in studying the relationship among the three variables. It has been suggested that the following relationship may hold, on average, among the variables:

MktValue = BkValue + PV(Future Residual Income)

It has also been suggested that current residual income might be a good indicator of future residual income for a company, so that:

PV(Future Residual Income) = Y × RI

for some constant Y.

You are thus interested in estimating the following regression equation, using OLS regression:

$\text{MktValue}_i = \beta_0 + \beta_1 \text{BkValue}_i + \beta_2 \text{RI}_i + \varepsilon_i$

where:

$i$ = index representing the firm;

$\varepsilon_i$ = stochastic error term, assumed normally distributed with zero expected value.

Theory suggests that the coefficient on BkValue, $\beta_1$, should have a value of unity (i.e., +1). Your colleague Bob, an investment analyst, thinks that a likely capitalisation rate for residual income is 5 (i.e., that $\beta_2 = 5$).

Required:

(a) Estimate, using OLS, the regression:

$\text{MktValue}_i = \beta_0 + \beta_1 \text{BkValue}_i + \beta_2 \text{RI}_i + \varepsilon_i$

Give a full interpretation of the regression results, which you should obtain from R. Show clearly, in the text of your report, the R code you used to perform the estimation. Perform all relevant regression diagnostics. If you think there are any potential problems with the model, based on your regression diagnostics, then discuss these here. (You can ignore heteroskedasticity and autocorrelation in this part of the question; these will be dealt with in part (d) below.) (20 marks)

Now, for parts (b) and (c), assume that the regression in part (a) is valid and well-specified.

(b) Test the restriction that $\beta_1 = 1$. Separately test the restriction that $\beta_2 = 5$. (10 marks)

(c) Test the joint hypothesis that $\beta_1 = 1$ and $\beta_2 = 5$. (10 marks)

(d) Are there any issues with autocorrelation or heteroskedasticity in this model? If not, discuss how you know this. If so, how can the problem(s) be fixed? (10 marks)

(Total 50 marks for Question A2)
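The mechanics of parts (a) to (d) can be sketched in base R as follows. This is a minimal illustration, not a model answer: the data frame `firms` is simulated placeholder data that only borrows the variable names from the brief, so none of the numbers it produces describe the real dataset. In practice the Excel file could be loaded with, for example, `readxl::read_excel("QuestionA2.xlsx")`, assuming the readxl package is installed. The helper `t_test` is a hypothetical name, and the heteroskedasticity check shown is the n·R² (Koenker) form of the Breusch-Pagan test, which is only one of several possible choices.

```r
# Sketch only: `firms` is SIMULATED placeholder data standing in for
# QuestionA2.xlsx; the results say nothing about the real dataset.
set.seed(1)
n <- 200
firms <- data.frame(BkValue = runif(n, 50, 500), RI = rnorm(n, 10, 5))
firms$MktValue <- 20 + 1 * firms$BkValue + 5 * firms$RI + rnorm(n, sd = 30)

# (a) Unrestricted OLS regression
fit <- lm(MktValue ~ BkValue + RI, data = firms)
est <- coef(summary(fit))   # columns: Estimate, Std. Error, t value, Pr(>|t|)

# (b) Separate two-sided t-tests of H0: beta1 = 1 and H0: beta2 = 5
t_test <- function(term, h0) {
  t_stat <- (est[term, "Estimate"] - h0) / est[term, "Std. Error"]
  p_val  <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)
  c(t = t_stat, p = p_val)
}
t_beta1 <- t_test("BkValue", 1)
t_beta2 <- t_test("RI", 5)

# (c) Joint F-test: impose both restrictions by moving the restricted terms
#     to the left-hand side, then compare residual sums of squares
restricted <- lm(I(MktValue - 1 * BkValue - 5 * RI) ~ 1, data = firms)
rss_r  <- sum(resid(restricted)^2)
rss_u  <- sum(resid(fit)^2)
q      <- 2                                  # number of restrictions
f_stat <- ((rss_r - rss_u) / q) / (rss_u / df.residual(fit))
f_pval <- pf(f_stat, q, df.residual(fit), lower.tail = FALSE)

# (d) Breusch-Pagan check (n * R^2 form): regress squared residuals
#     on the regressors and compare n * R^2 with a chi-squared(2)
firms$u2 <- resid(fit)^2
aux      <- lm(u2 ~ BkValue + RI, data = firms)
bp_stat  <- n * summary(aux)$r.squared
bp_pval  <- pchisq(bp_stat, df = 2, lower.tail = FALSE)

print(round(est, 4))
print(rbind(beta1_eq_1 = t_beta1, beta2_eq_5 = t_beta2))
print(c(F = f_stat, p = f_pval))
print(c(BP = bp_stat, p = bp_pval))
```

The restricted-regression form of the F-test avoids any add-on package. If heteroskedasticity is detected, heteroskedasticity-robust standard errors (for instance via `sandwich::vcovHC`, assuming that package is available) are one commonly used remedy.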

Section B: Regression with Time Series Data (50%)

Download the HSEINV dataset and answer the following questions.

The data in HSEINV are annual observations on housing investment and a housing price index in the United States for 1947 through 1988. Let invpc denote real per capita housing investment (in thousands of dollars) and let price denote a housing price index (equal to 1 in 1982).

1. A simple regression in constant elasticity form, which can be thought of as a supply equation for housing stock, gives:

$\log(\text{invpc}_t) = \beta_0 + \beta_1 \log(\text{price}_t) + \mu_t$

Estimate the above linear regression model by creating two new variables, log(invpc) and log(price), in the dataset.

Report your regression results table using the following format and interpret the economic meaning of $\hat{\beta}_1$. (15 marks)

	Estimate	Standard Error	T Value	p-value
(Intercept)
Varname 1
Varname 2
…

2. Carry out the following t-test and draw a conclusion using two methods: the quantile (critical value) method and the p-value method. Set the significance level to α = 5%.

$H_0: \beta_1 > 1$

$H_1: \beta_1 \le 1$

(15 marks)

3. Interpret the meaning of the p-value results in the above t-test and discuss why one would reject or fail to reject H0 of the above t-test based on the p-value. (Simply stating that the p-value is larger or smaller than the significance level is NOT acceptable.) (10 marks)

4. Plot the log(invpc) series and log(price) series against time (year). Show your plot and discuss what you observe. (You should plot these two series in one figure with proper range, different colours and clear legend.) (15 marks)

5. Define one new variable t, where t is the index of the year. Include the linear trend t in the linear regression model. Estimate the following linear regression model using OLS.

$\log(\text{invpc}_t) = \beta_0 + \beta_1 \log(\text{price}_t) + \beta_2 t + \mu_t$

Report your results table and compare your results with those in Question 1. What do you observe and why does such a difference exist? (15 marks)

6. Detrend the $\log(\text{invpc}_t)$ and $\log(\text{price}_t)$ series using t. Rename the detrended series as de_loginvpc and de_logprice. Report your detrending regression results in a table as before. What do you conclude? You can also try higher-order terms in t (i.e., $t^2$ and $t^3$) in detrending. (15 marks)

7. Re-estimate the regression model in Question 5, but now using de_loginvpc as the dependent variable and de_logprice as the independent variable. No changes should be made to the other variables.

Compare your results with those in Question 5. What do you conclude? (15 marks)
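The workflow in Questions 1 to 7 could be laid out in R roughly as below. This is a hedged sketch rather than a solution: `hse` is simulated placeholder data that merely reuses the HSEINV column names (year, invpc, price), so its output carries no information about the real series. The one-sided test is coded as a left-tail test, which is one reading of the hypotheses stated in Question 2, and the object names (`m1`, `m5`, `m7`, `reject_by_quantile`, `reject_by_pvalue`) are illustrative.

```r
# Sketch only: `hse` is SIMULATED placeholder data with the same column names
# as HSEINV (year, invpc, price); it is not the real dataset.
set.seed(42)
year <- 1947:1988
hse <- data.frame(
  year  = year,
  invpc = exp(-0.9 + 0.008 * (year - 1946) + rnorm(42, sd = 0.15)),
  price = exp(-0.2 + 0.004 * (year - 1946) + rnorm(42, sd = 0.05))
)
hse$loginvpc <- log(hse$invpc)
hse$logprice <- log(hse$price)
hse$t        <- seq_len(nrow(hse))           # linear time trend: 1, 2, ..., 42

# Question 1: constant-elasticity regression; coef(summary()) gives the
# Estimate / Std. Error / t value / p-value columns of the results table
m1 <- lm(loginvpc ~ logprice, data = hse)
print(coef(summary(m1)))

# Question 2: one-sided t-test of H0: beta1 > 1 against H1: beta1 <= 1
b1     <- coef(summary(m1))["logprice", ]
t_stat <- unname((b1["Estimate"] - 1) / b1["Std. Error"])
t_crit <- qt(0.05, df = df.residual(m1))     # left-tail 5% quantile
p_val  <- pt(t_stat, df = df.residual(m1))   # left-tail p-value
reject_by_quantile <- t_stat < t_crit
reject_by_pvalue   <- p_val < 0.05           # same decision as the line above

# Question 4: both series in one figure, common y-range, colours and legend
plot(hse$year, hse$loginvpc, type = "l", col = "blue",
     ylim = range(hse$loginvpc, hse$logprice),
     xlab = "Year", ylab = "Log value")
lines(hse$year, hse$logprice, col = "red")
legend("topleft", legend = c("log(invpc)", "log(price)"),
       col = c("blue", "red"), lty = 1)

# Question 5: add the linear trend
m5 <- lm(loginvpc ~ logprice + t, data = hse)

# Question 6: detrend each series by keeping residuals from a regression on t
hse$de_loginvpc <- resid(lm(loginvpc ~ t, data = hse))
hse$de_logprice <- resid(lm(logprice ~ t, data = hse))

# Question 7: Question 5 model with the detrended series swapped in
m7 <- lm(de_loginvpc ~ de_logprice + t, data = hse)
print(coef(summary(m5)))
print(coef(summary(m7)))
```

Higher-order detrending would only change the detrending formulas, for example `loginvpc ~ t + I(t^2) + I(t^3)`. Comparing the slope on `logprice` in `m5` with the slope on `de_logprice` in `m7` is a useful starting point for the discussion in Question 7, since the Frisch-Waugh-Lovell theorem implies the two coincide.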