EC395 Applied Econometrics

Spring 2020

Individual STATA Assignment # 1

In this assignment, you will use a sample dataset on democracy and a variety of health outcomes (as

used in Besley and Kudamatsu, 2006) to explore the relationship between health, income and

democracy. The assignment is due by 11 pm on Friday, July 3rd.

I have posted ten different datasets on MyLS. You are to download and use the version of the dataset

that corresponds to the second-last digit in your student number. Each dataset contains the following

variables:

1. country – a variable that identifies the country that the observation is related to

2. year – a variable that identifies the year that the observation is related to

3. lifeexp – the life expectancy at birth (in years) for a given country in a given year

4. infmort – infant mortality per 1000 live births for a given country in a given year

5. imm_dpt – percent of children aged 12-23 months that have received DPT immunization in a

given country in a given year

6. imm_msl – percent of children aged 12-23 months that have received measles immunization in

a given country in a given year

7. sanitation – percent of population that has access to improved sanitation facilities in a given

country in a given year

8. income – GDP per capita of a given country in a given year

9. democstm4 – a contemporaneous measure denoting the fraction of democratic years between

the years t - 4 and t

10. region – a variable that identifies the global region that a given country belongs to (eap = East

Asia and Pacific, eca = Eastern Europe and Central Asia, lac = Latin America and the Caribbean,

mena = Middle East and North Africa, ssa = Sub-Saharan Africa).

By the assignment deadline, you are to upload a do file and a report that provides answers to the

questions below. The report needs to include the tables and figures that you generate. All tables and

figures should be clearly labelled and easy to read. The do file should perfectly replicate all aspects of

your analysis, including loading the dataset. Upload your files to the dropbox folder on MyLS by 11 pm

on Friday, July 3rd.
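As a rough guide to what "perfectly replicate" means in practice, a do file might open along the lines of the sketch below. The file names `dataset3.dta` and `assignment1.log` are hypothetical placeholders, not names used on MyLS; substitute the dataset version you actually downloaded.

```stata
* Minimal do-file skeleton (a sketch; all file names are hypothetical)
clear all
set more off

* Close any log left open by an earlier run, then start a fresh text log
capture log close
log using "assignment1.log", replace text

* Assumption: the downloaded dataset is saved as "dataset3.dta"
* in the current working directory
use "dataset3.dta", clear

* Quick checks that the expected variables are present
describe
summarize lifeexp infmort imm_dpt imm_msl sanitation income democstm4

* ... analysis commands for questions 1 to 6 go here ...

log close
```

Running the file from a clean Stata session is a simple way to confirm that nothing depends on commands typed interactively.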

1. Life expectancy as the dependent variable and income as the independent variable: Determine

which of the following models best describes the relationship between life expectancy and income:

(i) simple linear regression model, (ii) quadratic regression model, or (iii) linear-log

model. You should use both scatterplots and regression results to support your argument. [3

marks]
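One possible way to set up the three functional forms in question 1 is sketched below. It assumes the dataset is already in memory and that income is strictly positive so its log is defined; the names `lnincome`, `income2`, and the graph file names are illustrative choices, not required ones.

```stata
* Sketch for question 1 (variable and file names are illustrative)
* capture skips the gen if the variable already exists
capture gen lnincome = ln(income)
capture gen income2  = income^2

regress lifeexp income              // (i) simple linear
display "Adjusted R-squared: " e(r2_a)
regress lifeexp income income2      // (ii) quadratic
display "Adjusted R-squared: " e(r2_a)
regress lifeexp lnincome            // (iii) linear-log
display "Adjusted R-squared: " e(r2_a)

* Scatterplot in levels with linear and quadratic fits overlaid
twoway (scatter lifeexp income, msize(vsmall)) ///
       (lfit lifeexp income) (qfit lifeexp income), ///
       ytitle("Life expectancy at birth (years)") ///
       xtitle("GDP per capita") ///
       legend(order(1 "Data" 2 "Linear fit" 3 "Quadratic fit"))
graph export "lifeexp_income.png", replace

* Scatterplot against log income with a linear fit (the linear-log model)
twoway (scatter lifeexp lnincome, msize(vsmall)) ///
       (lfit lifeexp lnincome), ///
       ytitle("Life expectancy at birth (years)") ///
       xtitle("Log of GDP per capita") ///
       legend(order(1 "Data" 2 "Linear-log fit"))
graph export "lifeexp_lnincome.png", replace
```

Because all three regressions share the same dependent variable, their adjusted R-squared values can be compared directly; the plots help judge whether the fitted shape is sensible over the whole income range.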

2. Life expectancy as the dependent variable: Create a single table to include results from the

following regressions: [5 marks]

(i) Democracy since t-4 as the explanatory variable, without country fixed effects

(ii) Log of income and democracy since t-4 as the explanatory variables, without country

fixed effects

(iii) Log of income and democracy since t-4 as the explanatory variables, with country fixed

effects

(iv) Log of income, democracy since t-4, a dummy variable for the East Asia and Pacific

region and an interaction variable between democracy since t-4 and the East Asia and

Pacific dummy as the explanatory variables, without country fixed effects

(v) Log of income, democracy since t-4, a dummy variable for the East Asia and Pacific

region and an interaction variable between democracy since t-4 and the East Asia and

Pacific dummy as the explanatory variables, with country fixed effects

You should report the number of observations and the adjusted R-squared for each of these

regressions in this table. You should also indicate which regressions include country fixed effects.

Your do file should be able to construct the table in a .txt file.
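The sketch below shows one way the five specifications could be estimated and written to a .txt file. It relies on the user-written `estout` package (`eststo`, `estadd`, `esttab`), which is an assumption about your setup and can be installed with `ssc install estout`. It also assumes that `country` and `region` are string variables; the names `eap`, `dem_eap`, `country_id`, and the output file name are hypothetical.

```stata
* Sketch for question 2 (requires: ssc install estout)
* capture skips a step if the variable already exists
capture gen lnincome = ln(income)
* Assumption: region is a string variable holding the codes eap, eca, ...
capture gen eap = (region == "eap")
capture gen dem_eap = democstm4 * eap
* Assumption: country is a string variable; absorb() needs a numeric id
capture encode country, gen(country_id)

eststo clear
eststo m1: regress lifeexp democstm4
estadd local fe "No"
eststo m2: regress lifeexp lnincome democstm4
estadd local fe "No"
eststo m3: areg lifeexp lnincome democstm4, absorb(country_id)
estadd local fe "Yes"
eststo m4: regress lifeexp lnincome democstm4 eap dem_eap
estadd local fe "No"
* With country fixed effects the time-invariant eap dummy is collinear
* with the country effects, so Stata omits it and keeps the interaction
eststo m5: areg lifeexp lnincome democstm4 eap dem_eap, absorb(country_id)
estadd local fe "Yes"

esttab m1 m2 m3 m4 m5 using "table_lifeexp.txt", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) ///
    stats(fe N r2_a, labels("Country fixed effects" "Observations" "Adjusted R-squared")) ///
    mtitles("(i)" "(ii)" "(iii)" "(iv)" "(v)") nonumbers ///
    title("Dependent variable: life expectancy at birth")
```

Other routes, such as `regress` with `i.country_id` or a different table command, would serve equally well as long as the table reports the items listed above.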

3. Interpret the coefficients on the variable democracy since t-4 in each of the five regressions

from the table in (2). Explain the changes in the coefficient that you observe from one column to

another. [5 marks]

4. What is the total effect of democracy (as captured by the democracy since t-4 variable)

on life expectancy in the East Asia and Pacific region? What is the standard error associated with

this total effect? [3 marks]
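In a specification with the interaction term, the effect of democracy for an East Asia and Pacific country is the sum of the coefficient on democracy since t-4 and the coefficient on the interaction. A minimal sketch using Stata's `lincom` follows; the variable names `eap` and `dem_eap` are hypothetical, and `region` is assumed to be a string variable.

```stata
* Sketch for question 4 (eap and dem_eap are hypothetical names)
capture gen lnincome = ln(income)
capture gen eap = (region == "eap")
capture gen dem_eap = democstm4 * eap

* Specification (iv): interaction model without country fixed effects
regress lifeexp lnincome democstm4 eap dem_eap

* Total effect in the EAP region: b[democstm4] + b[dem_eap].
* lincom also reports the standard error, which equals
* sqrt( Var(b1) + Var(b2) + 2*Cov(b1, b2) ) for the two coefficients.
lincom democstm4 + dem_eap
```

The same `lincom` line can be run after the fixed-effects version of the regression if you want the total effect from specification (v) as well.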

5. Construct a separate table for each of the following dependent variables, each including all five

regression specifications that you used in (2). [2 marks]

(i) Infant mortality

(ii) DPT immunization

(iii) Measles immunization

(iv) Access to improved sanitation facilities
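Since the five specifications are identical across outcomes, a loop can avoid repeating code. The sketch below again assumes the user-written `estout` package is installed, that `country` and `region` are string variables, and that the helper variable names and output file names are free choices rather than required ones.

```stata
* Sketch for question 5 (requires: ssc install estout)
capture gen lnincome = ln(income)
capture gen eap = (region == "eap")
capture gen dem_eap = democstm4 * eap
capture encode country, gen(country_id)

foreach y of varlist infmort imm_dpt imm_msl sanitation {
    eststo clear
    eststo: regress `y' democstm4
    estadd local fe "No"
    eststo: regress `y' lnincome democstm4
    estadd local fe "No"
    eststo: areg `y' lnincome democstm4, absorb(country_id)
    estadd local fe "Yes"
    eststo: regress `y' lnincome democstm4 eap dem_eap
    estadd local fe "No"
    eststo: areg `y' lnincome democstm4 eap dem_eap, absorb(country_id)
    estadd local fe "Yes"

    * One text file per dependent variable, e.g. table_infmort.txt
    esttab using "table_`y'.txt", replace se ///
        stats(fe N r2_a, labels("Country fixed effects" "Observations" "Adjusted R-squared")) ///
        title("Dependent variable: `y'")
}
```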

6. Do you find these results credible? That is, do you believe they represent a causal effect of

democracy on health outcomes? Why or why not? You are encouraged to present any

evidence that you can to support your arguments. [2 marks]

