Introduction to Data Analytics
Fall 2024
Final Project: Requirements.
General Instructions:
• The project makes up 20% of your final grade: 5% for the project proposal, 5% for the in-class project presentation (a status update covering most of the project), and 10% for the written summary and Python code.
• Support every claim you make with evidence in the form of tables or plots produced in Python. Best practice is to include Python plots with clear axis labels and a legend. Place plots that would interrupt the reading flow of your report in an Appendix. Your report can be a single Google Colab notebook with Markdown cells (explaining the code and findings).
• If you get stuck, feel free to write e-mails, set up appointments, or use any other communication channel.
• Important dates: the project proposal is due on November 13 at 11:59PM. On November 20, I will hold office hours during the last hour (at 4:30PM) instead of a tutorial and talk to the 9 groups about their proposals. The final presentation is on November 30 at 12PM (DB0006). The final report is due on December 3 at 11:59PM.
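The plotting advice above (clear axis labels and a legend) can be sketched as follows. This is a minimal illustration, assuming matplotlib is available (as it is by default in Google Colab). The helper name labeled_line_plot and all numbers are hypothetical placeholders, not part of any assigned dataset.

```python
import matplotlib.pyplot as plt


def labeled_line_plot(x, series, xlabel, ylabel, title):
    """Draw one line per entry in `series`, with axis labels and a legend."""
    fig, ax = plt.subplots(figsize=(6, 4))
    for name, y in series.items():
        ax.plot(x, y, marker="o", label=name)  # `label` feeds the legend
    ax.set_xlabel(xlabel)  # say what is measured, including units if any
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.legend(title="Group")
    fig.tight_layout()
    return fig, ax


# Placeholder numbers, for illustration only (not from any assigned dataset).
weeks = [1, 2, 3, 4]
fig, ax = labeled_line_plot(
    weeks,
    {"Group A": [10, 12, 15, 19], "Group B": [8, 9, 13, 14]},
    xlabel="Week",
    ylabel="Units sold (count)",
    title="Weekly sales by group",
)
```

In a notebook the figure is displayed when the cell runs; fig.savefig("figure.png") is one way to export it for an appendix.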
Part 1: Project Proposal. (5%)
a. Load the assigned dataset into Python. Explore it and provide a one-page project proposal. The structure of the proposal (short document + appendices):
1. Introduction to the data (whatever can be found online).
2. Data description, including attributes/features.
3. Interesting phenomena found from a basic analysis, if any (in words).
4. Project questions/analyses (at least 2 of them) that you came up with while exploring the data.
5. Appendices - Python plots and outputs that justify your proposal (unlimited number of pages for the appendix).
b. Send the proposal (+appendices) to the instructor and await a response with approval and/or alterations (we must finalize the project by Nov 23).
c. Once the proposal is approved (after iterations with the instructor), you are free to start the project. You will have only one week until the presentation and two weeks until submission. Stay on schedule.
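A first pass at step a (loading and exploring the data) might look like the sketch below. It assumes pandas is available and that the dataset is a CSV file. The inline CSV, its column names, and the helper describe_dataset are hypothetical stand-ins for illustration only.

```python
import io

import pandas as pd

# Hypothetical stand-in for the assigned dataset. In practice, replace this
# with pd.read_csv(...) pointed at your own file.
raw_csv = io.StringIO(
    "age,income,segment\n"
    "34,52000,A\n"
    "45,,B\n"
    "29,48000,A\n"
    "51,61000,C\n"
)
df = pd.read_csv(raw_csv)


def describe_dataset(df):
    """Collect basic facts for the data-description part of a proposal."""
    return {
        "n_rows": df.shape[0],
        "n_columns": df.shape[1],
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing_per_column": df.isna().sum().to_dict(),
    }


overview = describe_dataset(df)
print(overview)
print(df.describe(include="all"))  # summary statistics for every column
```

The row and column counts, attribute types, and missing-value counts map onto item 2 of the proposal structure, and the summary statistics are a natural starting point for item 3.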
Part 2: Documenting the Project. (10%)
a. Use Python as your programming tool to answer the approved research questions. If you find new interesting phenomena or want to predict new response variables, you are not limited to the approved project proposal; in fact, you can substitute an approved analysis with an alternative one (write an email to the instructor).
b. Once you are done answering the questions, write a report summarizing the project, with the following structure (use the project proposal for the first four parts):
1. Introduction.
2. Data description.
3. Exploratory data analysis.
4. Formulate one classification question that seems to be relevant for the dataset.
5. Techniques used to answer the question.
6. The analysis itself (the core part of the project) - including Python plots/output as evidence for the presented results.
7. Discussion.
8. Self-assessment: every student in the group will state their individual contribution to the project. Students who contributed less than their peers will receive a lower mark.
9. Appendix - for things that are important, but can disturb the fluent reading of the summary.
c. Most of the project, but not all of it, should be ready prior to presentation day.
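For the classification question in the report structure above, one possible sanity check is a majority-class baseline: the accuracy obtained by always predicting the most common label in the training data. The sketch below uses only the Python standard library. The function names and the placeholder rows are hypothetical, and this is one illustrative reference point, not a required method.

```python
import random
from collections import Counter


def train_test_split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_class(labels):
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Placeholder labelled rows (feature value, label), for illustration only.
rows = [(i, "yes" if i % 3 == 0 else "no") for i in range(20)]
train, test = train_test_split_rows(rows)

baseline_label = majority_class([label for _, label in train])
test_labels = [label for _, label in test]
baseline_acc = accuracy(test_labels, [baseline_label] * len(test_labels))
print(f"Majority class: {baseline_label}, baseline accuracy: {baseline_acc:.2f}")
```

Reporting a classifier's test accuracy next to this baseline shows whether it learned anything beyond the class frequencies. If scikit-learn is used, DummyClassifier(strategy="most_frequent") provides a comparable baseline.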
Part 3: Project Presentation. (5%)
• Projects will be presented in class.
• Each presentation will last 20 minutes, approximately 4 minutes per group member.
• All group members MUST attend and speak.
• Every group member must present an equal part and the assessment for the talks will be on an individual basis (grades will be differentiated).
• Presentations should include but will not be limited to: motivation, data description, exploratory data analysis, research/application questions, techniques used, results and a discussion.
• All group members must present their part (even those who may typically not attend class); members that do not attend will receive 0 for the presentation component.