ECMT2150 INTERMEDIATE ECONOMETRICS, S1 2024
ASSIGNMENT
Due Date: 19 May 2024 (11:59pm sharp)
Instructions:
. Anonymous marking: Do NOT put your name anywhere on your uploaded do file. Identify yourself only by your student number.
. Answer all questions.
. A total of 100 points are available and marks for each question are indicated throughout.
. The assignment is worth 15% of your final grade for this UoS.
. You will need to use STATA (or another regression software program, e.g. R) to complete this assignment. Do not use Excel.
. Please read the information I provide on the next couple of pages carefully. I am happy to answer questions about the data and the context via Ed if anything is not clear.
Submission Instructions:
. Answers are to be submitted via the Canvas Quiz, “Assignment Quiz …”
. I encourage you to work through all of the data analysis following the questions in this document on Stata or another software package before heading to the quiz to answer the questions there. There are no trick questions, so if you have completed each of the following questions, kept a copy of your output and made a note of your answers, there will be no surprises when you are taking the quiz. You should not need to use Stata during the quiz at all. That said, the quiz is untimed, so you could leave and come back to the quiz if you need to.
. Remember – because it is an untimed quiz, it will NOT automatically submit at the due date. You must click submit yourself.
. You will get only one attempt at the quiz.
. You must upload your Stata do file (or commands & output) in the final question of the Canvas quiz.
. This upload is worth 5 points.
. Think of this as a way of showing your working.
. If you used Stata, the content of your do file is all that is required. You will need to copy it into a Word document or save it as a PDF to upload it.
. If you did not use Stata, then you should upload a document (no longer than 4 pages) showing your commands and output.
Assignment: Multiple Linear Regression
Inference, Heteroskedasticity, Endogeneity and IVs
The topic and information on the dataset
This assignment involves the application of a range of econometric methods to analyse the hypothesis that the protection of property rights is conducive to economic growth and development, and so should positively affect the level of contemporary per capita income. This hypothesis is the subject of a sizeable literature going back to at least Acemoglu, Johnson, and Robinson (AJR) (2001). Indeed, it is the research question considered by AJR.
AJR and many other economists argue that rich countries are rich primarily because they have “institutions” which are more conducive to economic growth and development. “Institutions” refers to a wide set of political and economic arrangements, including democracy versus autocratic rule, the security of property rights and the enforcement of law and contracts.
You have a dataset** that is a sub-sample of countries studied by AJR. AJR collected and collated this dataset. The data contains per capita income (GDP) in 1995 (in logs) as well as a variable indicating the protection of property against expropriation (with larger values indicating more protection). There are also a number of other variables in the dataset. The sample is of 61 non-European countries. Details of each variable are provided in a table below.
The data set is named ‘MiniResearchProj.dta’.
. Download the data from the Assignment tab in our Canvas site.
**Note: I have created a few different versions of the data and each student will have a link to just one of these. I have edited the data slightly for each version, but by enough that you need to work on your own data. If you work on one of your classmate’s data sets, you may answer one or more questions in the quiz incorrectly and lose marks or be referred to the academic integrity office.
References *you are not required to read these, but they could be helpful/interesting.
Acemoglu, Johnson and Robinson (2001) “The Colonial Origins of Comparative Development: An Empirical Investigation.” The American Economic Review, 91(5): 1369–1401. Available at: https://www.aeaweb.org/articles?id=10.1257/aer.91.5.1369
Albouy (2012) “The Colonial Origins of Comparative Development: An Empirical Investigation: Comment.” The American Economic Review, 102(6): 3059–76. Available at: https://www.jstor.org/stable/41724681 (via the library)
Angrist and Pischke (2015), Chapter 3 in Mastering ’Metrics: The Path from Cause to Effect, Princeton University Press. Library link: https://sydney.primo.exlibrisgroup.com/permalink/61USYD_INST/1c0ug48/alma991032063095205106
More info on the variables and data
The data is a cross-section of 61 countries. There are 61 rows (one for each country) and 14 columns. The columns correspond to the variables:
Variable name   Description
longname        Full country name
shortnam        3-letter country name
mort            Original settler mortality rate per thousand.
                AJR state: “we use… the mortality rates of soldiers, bishops, and sailors stationed in the colonies between the seventeenth and nineteenth centuries, largely based on the work of the historian Philip D. Curtin. These give a good indication of the mortality rates faced by settlers”
lnmort          Log of original settler mortality
proprights      An index of protection against expropriation.
                The index takes values from 0 to 10, where a higher score indicates stronger property rights, or in other words more protection against expropriation.
lngdp           Log GDP per capita (PPP) in 1995.
                PPP means the GDP is adjusted for Purchasing Power Parity.
latitude        Absolute value of latitude of the country.
                A measure of distance from the equator, scaled to take values between 0 and 1, where 0 is the equator.
neoeuro         A dummy variable equal to 1 if the country is a 'New-European' country (USA, CAN, AUS, NZL) and 0 otherwise
asia            A dummy variable equal to 1 if the country is in Asia and 0 otherwise
africa          A dummy variable equal to 1 if the country is in Africa and 0 otherwise
aunz            A dummy variable equal to 1 if the country is Australia or New Zealand and 0 otherwise
rainmin         Minimum monthly rainfall in millimetres
meantemp        1987 mean annual temperature in degrees Celsius
malaria         Percent of population in the country living where malaria is endemic in 1994
Part A: Descriptive Statistics for the Sample [8 marks]
Quiz questions 1-4: [6 marks]
Investigate the distribution of the variables:
lngdp, proprights, latitude, mort, lnmort, africa, asia
For each, find the average, standard deviation, minimum, median and maximum of its sample distribution.
. In the quiz you will be asked to report selected summary statistics, rounded to 1 or 2 decimal places. Please note the rounding instructions in each quiz question.
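As a rough illustration of the five summary statistics requested above, here is a minimal Python sketch using only the standard library. The list `toy` holds made-up numbers, not values from the assignment data, and the assignment itself should still be completed in Stata or another regression package.

```python
import statistics

# Made-up values for illustration only; these are NOT from the assignment data.
toy = [6.1, 7.4, 8.0, 8.3, 9.9]

summary = {
    "mean": statistics.mean(toy),
    "sd": statistics.stdev(toy),        # sample standard deviation (n - 1 divisor)
    "min": min(toy),
    "median": statistics.median(toy),
    "max": max(toy),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Note that `statistics.stdev` uses the n - 1 divisor, which is the sample standard deviation that regression software normally reports.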
Quiz question 5: [2 marks]
Pause, review and think about what you learn from these descriptive statistics.
. In the quiz you will be asked to briefly describe one useful, unusual or noteworthy thing you discovered from these descriptive statistics.
Part B: Simple & Multiple Regression Model - Estimation and Testing [23 marks]
Quiz question 6: [3 marks]
(1) Estimate the simple naïve regression model in (EQ.1):
lngdp = β0 + β1 proprights + u (EQ. 1)
. In the quiz you will report selected coefficient estimates, standard errors and the R-squared, rounded to 4 decimal places.
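To make the quantities in a simple regression output concrete, the following sketch computes the OLS intercept, slope, conventional standard errors and R-squared by hand. The function name `simple_ols` and the data in `x` and `y` are hypothetical and chosen only for illustration; they are unrelated to the assignment data.

```python
import math

def simple_ols(x, y):
    """OLS of y on a constant and x, with conventional (homoskedastic) SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * x_bar             # intercept estimate
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in resid)    # sum of squared residuals
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sigma2 = ssr / (n - 2)              # error variance estimate, df = n - 2
    return {
        "b0": b0,
        "b1": b1,
        "se_b0": math.sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx)),
        "se_b1": math.sqrt(sigma2 / sxx),
        "r2": 1 - ssr / sst,
    }

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
print({key: round(value, 4) for key, value in fit.items()})
```

These are the same formulas that a regression package applies when it reports the coefficient table for a one-regressor model.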
Quiz questions 7-8: [4 marks]
What is the sign of your estimated slope coefficient? Based on these estimates, are more secure property rights associated with higher or lower GDP per capita?
Interpret the estimated slope coefficient from (EQ.1).
Quiz questions 9-11: [4 marks]
Is the estimated slope coefficient in (EQ.1) significantly different from zero at the 1% level of significance?
. In the quiz, you will not need to set out all the steps of the hypothesis test, but you will need to write down
o the hypotheses for the test,
o report the p-value, and
o report whether it is or is not statistically significant.
In your quiz answers, writing H0 and H1, beta1, beta1hat, etc. is fine – you are not required to use subscript formatting or typeset maths in your quiz answers. But it is important to distinguish between beta1hat and beta1 and to use each correctly. To write not equal to 0, you can write it out in words, or write neq or not=.
Quiz question 12: [3 marks]
Do you think the estimated slope coefficient in (EQ.1) is a causal estimate? Briefly explain.
Quiz question 13: [3 marks]
Now, we will add some additional control variables to the model. Some scholars have argued for a direct effect of climate, as captured by the absolute value of the latitude at which the country is located (distance from the equator), on economic performance. Others suggest that we should also control for the broad geographic region in which each country is located. So, we will add three additional explanatory variables (latitude, africa and asia) to the model as shown here in (EQ.2):
lngdp = β0 + β1 proprights + β2 latitude + β3 africa + β4 asia + u (EQ. 2)
. In the quiz you will report selected coefficient estimates and the R-squared rounded to 4 decimal places.
Quiz questions 14-16: [5 marks]
Find the 99% confidence interval for the coefficient β1 on proprights in (EQ.2) and use it to determine the statistical significance of proprights in (EQ.2).
. In the quiz, you will report one of the bounds of the confidence interval, rounded to 3 decimal places.
o You can calculate this yourself – if so, be sure to make any calculations using all of the decimal places given in your Stata regression output. If you get your critical value from the tables, use the nearest degrees of freedom you can.
o Or, you can use a Stata command – check out the options on the command regress. To see all the options for the regress command, type help regress in the Stata command window.
. Using your confidence interval, is proprights statistically significant in EQ.2 at the 1% significance level? (Yes/No)
. State how you used the confidence interval you calculated in order to determine whether proprights is statistically significant at the 1% significance level.
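For the manual route, the interval is the estimate plus or minus the critical value times the standard error. The sketch below shows only that arithmetic. The function name `confidence_interval` and all three inputs are hypothetical placeholders: substitute your own estimate, standard error, and the t critical value for your own degrees of freedom.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width

# Placeholder inputs for illustration only; none of these come from the data.
lower, upper = confidence_interval(estimate=0.50, std_error=0.10, critical_value=2.66)
print(f"[{lower:.3f}, {upper:.3f}]")
```

Carrying all available decimal places through the calculation and rounding only at the end avoids small discrepancies in the reported bound.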
Quiz question 17: [1 mark]
Based on your estimated results for (EQ.2), are more secure property rights associated with higher or lower GDP per capita?
Part C: Heteroskedasticity [11 marks]
Quiz questions 18-22: [7 marks]
Apply the Breusch-Pagan (BP) test for the presence of heteroskedasticity to model (EQ.2), using a 5% significance level. What do you conclude?
. Please use an F-test for your test. For full marks, you must conduct all the steps of the test as per the lecture notes or as described in the textbook.
. In the quiz you will:
o report selected coefficient estimates and the R-squared from your auxiliary regression each to 4 decimal places,
o report the test statistic, the degrees of freedom and either the critical value or the p-value for the test,
o provide the conclusion from your test.
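One common textbook form of the F version of this test takes the R-squared from the auxiliary regression of the squared OLS residuals on the k explanatory variables and converts it into an F statistic. The sketch below assumes that form; check it against the lecture notes before relying on it. The function name `bp_f_statistic` and the R-squared value of 0.2 are hypothetical.

```python
def bp_f_statistic(r2_aux, k, n):
    """F statistic for joint significance of the k regressors in the
    auxiliary regression of squared residuals on those regressors.

    Degrees of freedom are (k, n - k - 1).
    """
    return (r2_aux / k) / ((1.0 - r2_aux) / (n - k - 1))

# Placeholder auxiliary R-squared; k = 4 regressors and n = 61 observations.
k, n = 4, 61
f_stat = bp_f_statistic(r2_aux=0.2, k=k, n=n)
print(f"F({k}, {n - k - 1}) = {f_stat:.4f}")
```

The statistic is then compared with the critical value of the F distribution with those degrees of freedom, or equivalently its p-value is compared with the significance level.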
Quiz questions 23-24: [4 marks]
Re-estimate (EQ.2) but now with heteroskedasticity-robust standard errors.
. In the quiz you will report selected coefficient estimates rounded to 4 decimal places and compare the regular and robust standard errors.
** Please use robust standard errors from this point forwards **
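To see why regular and robust standard errors can differ while the coefficient estimate does not, the sketch below computes both for the slope in a one-regressor model. It is an illustration on made-up data, with a hypothetical function name. The robust version applies the n/(n - 2) small-sample scaling, which is the adjustment Stata's robust option uses for a model with two estimated parameters.

```python
import math

def slope_standard_errors(x, y):
    """Return (slope, conventional SE, heteroskedasticity-robust SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d ** 2 for d in dev)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Conventional SE: assumes a constant error variance.
    se_conventional = math.sqrt(sum(u ** 2 for u in resid) / (n - 2) / sxx)

    # Robust SE: weights each squared residual by its squared x deviation,
    # then applies the n / (n - 2) degrees-of-freedom scaling.
    meat = sum(d ** 2 * u ** 2 for d, u in zip(dev, resid))
    se_robust = math.sqrt(meat / sxx ** 2 * n / (n - 2))
    return b1, se_conventional, se_robust

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 2.2, 2.7, 4.6, 4.1, 7.4]
b1, se_conventional, se_robust = slope_standard_errors(x, y)
print(f"slope={b1:.4f}  conventional SE={se_conventional:.4f}  robust SE={se_robust:.4f}")
```

Only the standard errors change between the two calculations; the slope estimate is the same OLS estimate in both.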
Part D: Endogeneity [15 marks]
proprights is likely endogenous in the models we have estimated so far (EQ.1 and EQ.2) despite the additional explanatory variables we included.
Quiz question 25: [4 marks]
Given this endogeneity, does the multiple regression model in (EQ.2) capture a causal relationship between the protection of property rights and the economic performance of a country? Why or why not? What does this imply about E(u | proprights)?
Quiz question 26: [8 marks]
There are a number of reasons why an institution such as property rights is likely endogenous in the model we have estimated. Explain two (2) possible sources of this endogeneity. Your explanations should be clear, careful, and intuitive.
If you are stuck, consider taking a look at the paper by Acemoglu, Johnson and Robinson (2001).
Quiz question 27: [3 marks]
What is the impact of the endogeneity of the variable proprights in (EQ.2) on your estimates and inference if you estimate model EQ.2 using OLS?
Part E: Instrumental Variables [38 marks]
Quiz question 28: [3 marks]
The variable lnmort provides a potential instrumental variable (IV) we could use to cleanly identify the causal effect of the property rights institution on economic performance. What two key conditions must the IV satisfy in order for the IV estimator to be consistent? State whether each of these conditions can be tested.
Quiz question 29: [4 marks]
Briefly discuss whether, and why or why not, we could expect the IV, lnmort, to satisfy these conditions that you gave in Question 28. To do this, use intuition or simple economic theory.
. If you are stuck, consider taking a look at the paper by Acemoglu, Johnson and Robinson (2001).
Quiz question 30: [3 marks]
Now work with (EQ.1) again.
Estimate the first stage equation if we are going to use lnmort as an IV for proprights in (EQ.1).
. Use robust standard errors.
. In the quiz you will report selected coefficient estimates to 4 decimal places.
Quiz questions 31-32: [3 marks]
Using the first-stage results that you estimated for Question 30, test the relevance of the IV, lnmort (also known as a test of identification). Use a 1% level of significance.
. In the quiz, you will
o report the test statistic to 2 decimal places and
o select the correct formal conclusion for your test (MCQ).
Quiz question 33: [4 marks]
Re-estimate model (EQ.1) by 2SLS using lnmort as an IV for proprights.
. In the quiz you will report selected coefficient estimates and standard errors to 4 decimal places.
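With one endogenous regressor and one instrument, the 2SLS slope can be written as a ratio of sample covariances, and it coincides with the slope from regressing the outcome on the first-stage fitted values. The sketch below checks that equivalence on made-up data. The names `iv_slope` and `two_stage_slope` and the lists `z`, `x` and `y` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance with an n - 1 divisor."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

def iv_slope(y, x, z):
    """IV estimate of the slope on x, using z as the instrument."""
    return cov(z, y) / cov(z, x)

def two_stage_slope(y, x, z):
    """The same estimate obtained by running the two stages explicitly."""
    pi1 = cov(z, x) / cov(z, z)               # first stage: x on z
    pi0 = mean(x) - pi1 * mean(z)
    x_hat = [pi0 + pi1 * zi for zi in z]      # first-stage fitted values
    return cov(x_hat, y) / cov(x_hat, x_hat)  # second stage: y on x_hat

# Made-up data for illustration only.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [2.0, 2.5, 3.9, 4.1, 5.2, 5.8]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.7]

b_iv = iv_slope(y, x, z)
b_2sls = two_stage_slope(y, x, z)
print(f"IV slope = {b_iv:.4f}, two-stage slope = {b_2sls:.4f}")
```

The standard errors printed by a manual second-stage regression are not valid, because they ignore that the fitted values were themselves estimated. A dedicated 2SLS command reports the corrected ones.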
Quiz question 34: [3 marks]
Using the results you obtained in Question 33, interpret the 2SLS-IV estimate for β1, the coefficient on proprights.
Quiz question 35: [4 marks]
Comment on the differences between the 2SLS-IV and OLS estimates and their standard errors for β1, the coefficient on proprights, in (EQ.1). For reference, these are the estimates from Question 33 and Question 6, respectively. When we used OLS, did we overestimate or underestimate the effect of more secure property rights on the economic performance of a country?
Quiz question 36: [1 mark]
From the 2SLS-IV estimates for (EQ.1), do more secure property rights have a negative or positive effect on GDP per capita?
Quiz question 37: [4 marks]
Imagine you are an adviser to the UN, World Bank or IMF. What has your research shown and what is your policy advice about the importance of well-protected property rights? Write 2-3 sentences.
. You should assume that the assumptions required for the IV method to be valid hold when answering this question.
Quiz question 38: [3 marks]
Now estimate (EQ.2) by 2SLS-IV. Also estimate the first stage.
. Use robust standard errors.
. In the quiz you will report selected coefficient estimates and standard errors to 4 decimal places.
Quiz question 39: [2 marks]
Consider the relevance of the instrument in the first stage for the 2SLS model for (EQ.2).
. In the quiz you will select the correct formal conclusion for a test of relevance (MCQ).
Quiz question 40: [4 marks]
Briefly comment on the reliability of the estimated effect of property rights when estimating (EQ.2) by 2SLS-IV. Write no more than 3 sentences.
In formulating your answer, you may want to:
. consider the relevance of the instrument lnmort in the first stage, and
. consider and compare your 2SLS-IV estimates of the effect of property rights on economic performance from EQ.2 (from Question 38) and your 2SLS-IV estimates of EQ.1 (from Question 33).
Part F: Do file [5 marks]
Quiz question 41: [5 marks]
Upload your Stata do file in the final question of the Canvas quiz. See the directions on the first page of this document.