FE5209 Project Guideline (AY25/26)
The aim of this project is (1) to analyze the distribution and dependence structures of real-world data and (2) to conduct regression analysis.
You are free to choose any dataset, as long as it allows for insightful analysis. There is no page limit.
Important Dates
1) STATA/MATLAB session
- On Nov 4 (during lecture hours)
- Feel free to contact Xiaochen ([email protected]) if you run into any technical difficulty with the code.
- You may use any software, but the computation session will cover STATA and MATLAB code.
2) Deadline for the report: Nov 29 (Tuesday)
A hard copy of the individual report should be submitted by Nov 29 (the day of the final exam). Please bring it with you and place it on the desk at the front of the classroom right before the exam starts.
What to Submit?
The report must include two sessions.
Session 1: Distribution and Dependence Analysis
Focus on the estimation and inference of distributions and dependence structures.
• You may choose two variables for your analysis (to simplify things).
• If you use time-series data, please ensure stationarity.
• If your dataset includes more than two variables (e.g., stock returns A/B/C), you may present pairwise estimation results.
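As a rough illustration of the stationarity point above, the sketch below converts a price series into log returns and compares the lag-1 autocorrelation of the levels with that of the returns. The price series is simulated (a geometric random walk), and the helper names `log_returns` and `lag1_autocorr` are hypothetical. This is only an informal diagnostic under those assumptions; a formal unit-root test (for example, an augmented Dickey-Fuller test) would normally be reported instead.

```python
import math
import random


def log_returns(prices):
    """Log returns r_t = log(P_t / P_{t-1}); the result is one element shorter."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((xi - m) ** 2 for xi in x)
    return num / den


# Simulated prices (assumption: a geometric random walk, not real data).
random.seed(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, 0.01)))

rho_levels = lag1_autocorr(prices)
rho_returns = lag1_autocorr(log_returns(prices))

# Levels of a random walk are highly persistent; returns should not be.
print(f"lag-1 autocorrelation, price levels: {rho_levels:.3f}")
print(f"lag-1 autocorrelation, log returns:  {rho_returns:.3f}")
```

An autocorrelation close to one in the levels and close to zero in the returns suggests working with returns rather than prices in the dependence analysis.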
[Example Data Sets]
• Parental income and child’s income
• Husband’s income and wife’s income (assortative mating)
• Dependence between stock price and trading volume in the Chinese stock market
• Income and consumption
• Exchange rate stability, capital account openness, and monetary policy independence
[Your analysis should include]
• Description of the dataset, source, and variable definitions
• Summary statistics (mean, variance, correlation, and covariance estimates, etc.)
• Distribution tests for the marginal distributions (e.g., a normality test)
• Plot of univariate empirical distributions
• Scatter plot of the probability-integral-transformed data
• Copula estimation results (MLE with parametric, semiparametric, or nonparametric margins and copula), including AIC, BIC, and likelihood values
• Your own interpretation and analysis
• A concise review of the relevant literature, with references
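The transformation and copula items above can be sketched as follows. This is a minimal example under stated assumptions, not a prescribed workflow: the data are simulated from a Clayton copula, the margins are handled nonparametrically through ranks (a semiparametric approach), and the two candidate families (Clayton and Gaussian) are chosen only for illustration. All function names (`pseudo_obs`, `clayton_loglik`, `gaussian_loglik`, `golden_max`) are hypothetical helpers written for this sketch.

```python
import math
import random
from statistics import NormalDist


def pseudo_obs(x):
    """Rank-based probability integral transform: rank / (n + 1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def clayton_loglik(theta, u, v):
    """Log-likelihood of the Clayton copula (theta > 0)."""
    ll = 0.0
    for a, b in zip(u, v):
        s = a ** (-theta) + b ** (-theta) - 1.0
        ll += (math.log1p(theta)
               - (1.0 + theta) * (math.log(a) + math.log(b))
               - (2.0 + 1.0 / theta) * math.log(s))
    return ll


def gaussian_loglik(rho, zx, zy):
    """Log-likelihood of the Gaussian copula, given normal scores zx, zy."""
    ll = 0.0
    for x, y in zip(zx, zy):
        ll += (-0.5 * math.log(1.0 - rho ** 2)
               - (rho ** 2 * (x * x + y * y) - 2.0 * rho * x * y)
               / (2.0 * (1.0 - rho ** 2)))
    return ll


def golden_max(f, lo, hi, tol=1e-5):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) > f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Simulated data (assumption): Clayton dependence, non-uniform margins.
random.seed(1)
true_theta, n = 2.0, 1000
x, y = [], []
for _ in range(n):
    u1, w = random.random(), random.random()
    u2 = (u1 ** (-true_theta) * (w ** (-true_theta / (1.0 + true_theta)) - 1.0)
          + 1.0) ** (-1.0 / true_theta)
    x.append(math.exp(2.0 * u1))  # increasing transforms keep the ranks,
    y.append(u2 ** 3)             # so the copula is unchanged

u, v = pseudo_obs(x), pseudo_obs(y)
inv = NormalDist().inv_cdf
zx, zy = [inv(a) for a in u], [inv(b) for b in v]

theta_hat = golden_max(lambda t: clayton_loglik(t, u, v), 0.01, 20.0)
rho_hat = golden_max(lambda r: gaussian_loglik(r, zx, zy), -0.99, 0.99)
loglik = {"clayton": clayton_loglik(theta_hat, u, v),
          "gaussian": gaussian_loglik(rho_hat, zx, zy)}

k = 1  # one dependence parameter per family
aic = {name: 2.0 * k - 2.0 * ll for name, ll in loglik.items()}
bic = {name: math.log(n) * k - 2.0 * ll for name, ll in loglik.items()}
for name in loglik:
    print(f"{name:8s} loglik={loglik[name]:9.2f} "
          f"AIC={aic[name]:9.2f} BIC={bic[name]:9.2f}")
```

The pairs `(u, v)` are what the scatter plot of transformed data would show. The family with the lower AIC and BIC is preferred. The golden-section search assumes the log-likelihood is unimodal in the parameter, which should be checked (for instance, by plotting it) for real data.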
Session 2: Regression Analysis
Focus on regression modeling and inference using your dataset. You may use the same dataset as in Session 1 or choose a different one.
[Example Data Sets]
• The effect of class size or school resources on student performance (education policy)
• The effect of minimum wage laws on youth employment (labour economics / policy evaluation)
• The effect of cash transfers or microcredit on household consumption (development economics)
• The effect of interest rates, household income, and unemployment rate on housing prices
• The effect of GDP growth, inflation, and monetary policy indicators on stock returns
[Your analysis should include]
• Description of the dataset, source, and variable definitions (if different from Session 1)
• Summary statistics of the variables used in the regression
• Estimation of regression models (linear, nonlinear, instrumental variable (IV), etc.)
• Interpretation of coefficients (magnitude, sign, and statistical significance)
• Your own interpretation and conclusion
• Literature review and references in proper format
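To make the estimation and interpretation items above concrete, here is a small sketch comparing an OLS slope with a just-identified IV estimate on simulated data. Everything in it is an assumption made for illustration: the data-generating process (true slope 2, an unobserved confounder `c`, and an instrument `z`), the helper names `cov`, `ols_simple`, and `iv_simple`, and the homoskedastic standard error formula.

```python
import math
import random


def cov(a, b):
    """Sample covariance with divisor n - 1."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)


def ols_simple(x, y):
    """OLS of y on a constant and x.

    Returns (intercept, slope, se_slope, t_slope), where the standard
    error assumes homoskedastic errors.
    """
    n = len(x)
    slope = cov(x, y) / cov(x, x)
    intercept = sum(y) / n - slope * sum(x) / n
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / (cov(x, x) * (n - 1)))
    return intercept, slope, se, slope / se


def iv_simple(z, x, y):
    """Just-identified IV slope with one instrument: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


# Simulated data (assumption): x is endogenous because the confounder c
# affects both x and y; z shifts x but is unrelated to c.
random.seed(2)
n = 5000
z = [random.gauss(0.0, 1.0) for _ in range(n)]
c = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [zi + ci + random.gauss(0.0, 1.0) for zi, ci in zip(z, c)]
y = [1.0 + 2.0 * xi + 3.0 * ci + random.gauss(0.0, 1.0)
     for xi, ci in zip(x, c)]

intercept, slope, se, t_stat = ols_simple(x, y)
iv_slope = iv_simple(z, x, y)

print(f"OLS slope: {slope:.3f} (SE {se:.3f}, t = {t_stat:.1f})")
print(f"IV slope:  {iv_slope:.3f} (true value in the simulation: 2.0)")
```

In this simulation the OLS slope is statistically significant but biased upward by the omitted confounder, while the IV estimate is close to the true value. This is the kind of sign, magnitude, and significance discussion the report is asked to provide.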
◆ Additional Requirements:
In addition to the required components listed above, you are encouraged to be as creative as possible. You may incorporate:
• Methods learned in this course
• Methods learned in other courses
• Any creative approaches you would like to try
• In particular, newly developed or innovative methodologies of your own
The more you attempt, the higher your score will be. (Note: You will not be penalized even if an attempt does not produce meaningful results.)
Please focus on documenting the wide range of methods you tried rather than on detailed interpretation of results. Under the school’s evaluation policy, the emphasis is placed on demonstrating the range of attempts, since interpretation can be easily automated by AI.