FIT5147 Data Exploration and Visualisation

Semester 1, 2025

Data Exploration Project

Part 1: Data Exploration Project Proposal

Part 2: Data Exploration Project Report

You are asked to explore and analyse data about a topic of your choice. It is an individual assignment and worth 35% of your total mark for FIT5147. Part 1 Project Proposal contributes 2% and Part 2 Project Report contributes 33%.

Relevant Learning Outcome

●    Perform exploratory data analysis using a range of visualisation tools.

Overview of the Assessment Tasks

1.    Identify the project topic, some related questions that you want to address, and the data source(s) that you will be using to answer those questions.

2.    Submit your Project Proposal (Part 1) in the Assessments section of Moodle in Week 3.

3.    Discuss with your tutor in your Week 3 Applied Session (after the submission in Moodle) and wait for approval from your tutor before proceeding further. Do not seek approval from the lecturer.

4.    Collect data and wrangle it into a suitable form for analysis using whatever tools you like (e.g., Excel, R, Python).

5.    Explore the data visually to answer your original questions and/or to find other interesting insights using Tableau or R. The exploration must rely on visualisations and visual analysis, but can include analytical methods or statistical analysis where appropriate.

6.    Write a report detailing your findings and the methods that you used. This must include properly captioned figures demonstrating your visual analysis (i.e., your visualisations must be referred to correctly in your report).

7.    The Project Report (Part 2) is due in Week 7.

Read the rest of this document before deciding on your project topic, as the proposal is for the entire Data Exploration Project and Data Visualisation Project, which is the second major assignment of this unit. See the end of this document for an example proposal and potential data sources to get started. Be careful not to copy this proposal; it is an example proposal, not template text.

Choosing a Topic and Data

The choice of topic, data, and the questions you seek to answer should allow for interesting and detailed analysis in the Data Exploration Project (DEP) and the subsequent Data Visualisation Project (DVP, due at the end of semester), which involves presenting the findings from your DEP in a specifically designed narrative interactive visualisation format.

Good questions are general and not linked to specific parts of the data, allowing for more open-ended and exploratory analysis. For instance, asking “Where is the safest part of the network?” is a good question that lets you explore various interpretations of how to link terms like “where” and “safest” to the data about a network, whereas “Which region has the lowest value of number-of-deaths?” is not a very good question as it is very specific to the data, is easy to answer with one visualisation and therefore limits the exploration and visualisation possibilities.

It is strongly recommended that you avoid questions that are:

●    too easy to answer (e.g., what is the correlation between x and y, what is the average value of z variable, what are the top/bottom N values), or

●    too difficult to answer (the work would take longer than the time available in the unit), or

●    not relevant to the unit (e.g., training a machine learning model), or

●    not possible to answer from the available data.

Proposals with such questions will be rejected. If you are in doubt, talk to teaching staff during face-to-face teaching times or ask for confirmation on Ed.

How do you know if you have appropriate data? This depends on your topic and questions. You should ensure your data is big enough, i.e., has enough breadth and depth to invite interesting exploration.

Combining data from different data sources is an ideal way to help add to the originality of the topic. To encourage different visualisation techniques your data will likely have a mixture of different data types. Time series (whether aggregated or detailed, such as months and years, or milliseconds) may be useful for your topic, and spatial, relational or text-based data add useful complexity. If in doubt, talk to teaching staff during face-to-face teaching times or in a consultation before the due date.
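As an illustration of combining sources, the following is a minimal sketch in Python (one of the wrangling tools mentioned above) that joins two small tables on a shared key. The records, the field names `month`, `trips` and `rain_mm`, and the helper `left_join` are all hypothetical; real data would be loaded from your own sources and may need a different join key.

```python
# Hypothetical records from two separate sources, both keyed by month.
ridership = [
    {"month": "2024-01", "trips": 1200},
    {"month": "2024-02", "trips": 1350},
    {"month": "2024-03", "trips": 1100},
]
rainfall = [
    {"month": "2024-01", "rain_mm": 48.0},
    {"month": "2024-02", "rain_mm": 12.5},
]


def left_join(left, right, key):
    """Attach matching right-hand fields to each left-hand row; keep unmatched rows."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key], {})
        extra = {k: v for k, v in match.items() if k != key}
        joined.append({**row, **extra})
    return joined


combined = left_join(ridership, rainfall, "month")

# Rows with no partner in the second source are worth reporting, not hiding.
unmatched = [row["month"] for row in combined if "rain_mm" not in row]
```

Listing the unmatched keys makes gaps between the sources visible, which is useful to know before any visual exploration begins.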

The chosen topic should be topical and some of the data should be recently collected, ideally from the last two or three years. The data must be accessible to the teaching staff, so the use of open data is encouraged (see the list of suggested data sources at the end of this document). Use of closed or proprietary data is allowed as long as explicit permission for use in this assignment is granted by the original authors or copyright holders. If you have closed data, you must still make it available for your teaching staff to access, e.g., via a shared Google Drive.

Avoid common topics. Common topics including COVID-19, Netflix, AirBnB, car accidents, crime, house sales, car sales, world cup soccer, or electric vehicle sales should be avoided. Topics similar to the proposal example at the end of this document, i.e., traffic accidents and poor weather, must also be avoided. If you do have personal motivation for any of these common topics, you will need to propose a completely new angle for exploring the theme through novel questions with a mixture of new data sources. It is highly recommended to discuss your intentions with the tutor of your Applied Session prior to the proposal submission to avoid immediate rejection of the proposal.

Part 1: Project Proposal (2%)

Write a one-page PDF document consisting of the following sections:

1.    Project Title

A descriptive title for your project.

2.    Topic Introduction

One paragraph introducing the topic. This should include why it is a topical subject (for example, has it been in the news recently), and who might benefit from the insights you seek from your questions.

3.    Motivation

One paragraph describing why you personally are motivated to study this topic.

4.    Questions

Three questions you wish to answer using the data.

5.    Data source(s)

Briefly describe the data source(s) you will use. This should include: URLs of data source(s) and a description for each source: what is the data about, what is the size of the data (e.g., number of rows, number of columns), the type of data (e.g., tabular, spatial, relational, or textual), the type of attributes (e.g., categorical, ordinal, etc.) and the temporal intervals and period (e.g., monthly between 2019 and 2023).

6.    References

The bibliographical details of any references you have cited in the previous sections.

Include your full name, student ID, tutor names, and Applied Session class number. This can be in the document header or footer. There should be no cover page.
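The facts requested under Data source(s), such as the number of rows and columns and the period covered, can be gathered with a short script. The sketch below uses Python's standard `csv` module on a hypothetical in-memory sample; the column names, the values and the `summarise` helper are assumptions for illustration only. It also assumes dates are stored in ISO format (YYYY-MM-DD), so that sorting them as text gives chronological order.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded CSV file.
SAMPLE = """date,station,passengers
2023-01-31,Clayton,1520
2023-02-28,Clayton,1610
2023-03-31,Caulfield,
"""


def summarise(text, date_column):
    """Report row count, column count, column names and the date range."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    # ISO-formatted dates sort correctly as plain strings.
    dates = sorted(row[date_column] for row in rows if row[date_column])
    return {
        "rows": len(rows),
        "columns": len(columns),
        "column_names": columns,
        "period": (dates[0], dates[-1]) if dates else None,
    }


summary = summarise(SAMPLE, "date")
```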

Part 2: Project Report (33%)

The report should have the following structure:

1.    Introduction

Topic detail, problem description, questions, and brief motivation.

2.    Data Wrangling and Checking

Description of the data and data sources with URLs of the data, the steps in data wrangling (including data cleaning and data transformations) and tools that you used. The data checking that you performed, errors that you found, your method and justification for how you corrected errors, and the tools that you used. A comprehensive checking process is expected to justify data correctness, even if the data set is believed to be clean.

3.    Data Exploration

Description of the data exploration process with details of the visualisations (including figures and descriptions of findings) and statistical tests (if applicable) you used, what you discovered, and what tools you used.

4.    Conclusion

Summary of what you learned from the data and how your data exploration process answered (or didn’t answer) your original questions.

5.    Reflection

Brief description of what lessons you learnt in this project and what you might have done differently in hindsight.

6.    Bibliography

Appropriate references and bibliography (this includes acknowledgements to online references or sources that have influenced your exploration) using either the APA or IEEE referencing system.

Include your full name, student ID, tutor names, and Applied Session class number. This may be on a cover page, or in the header or footer of the first page.

The written report should be no longer than 10 pages for all sections mentioned above, excluding cover page, table of contents and appendix. Your written report will be the sole basis for judging the quality of the data checking, data wrangling, and data exploration, as well as the degree of difficulty. Thus, include sufficient information in the report. It should, for instance, contain images of visualisations used for exploration and the results of any statistical analysis. You should include any analysis that you carry out, even if it is incomplete or inconclusive, as it demonstrates that you have thoroughly explored the data set.
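As one way to approach the checking expected in the Data Wrangling and Checking section, the sketch below counts missing values, repeated identifiers and negative values in a small set of records. It is written in Python with the standard library only; the records, the field names `id`, `date` and `count`, and the `check_records` helper are hypothetical, and the checks that matter will depend on your own data.

```python
from collections import Counter

# Hypothetical records; in practice these would be loaded from the data source.
records = [
    {"id": 1, "date": "2024-01-05", "count": 12},
    {"id": 2, "date": "2024-01-06", "count": None},
    {"id": 2, "date": "2024-01-06", "count": None},
    {"id": 3, "date": "2024-01-07", "count": -4},
]


def check_records(rows, key, numeric_field):
    """Return simple findings on missing, duplicated and negative values."""
    key_counts = Counter(row[key] for row in rows)
    return {
        # Number of rows where the numeric field is absent.
        "missing": sum(1 for row in rows if row[numeric_field] is None),
        # Keys that appear on more than one row.
        "duplicate_keys": sorted(k for k, n in key_counts.items() if n > 1),
        # Keys of rows whose numeric value is below zero.
        "negative": [
            row[key]
            for row in rows
            if row[numeric_field] is not None and row[numeric_field] < 0
        ],
    }


report = check_records(records, "id", "count")
```

Findings like these, along with the justification for how each error was corrected, are the kind of evidence the report can present even when the data set turns out to be clean.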


If you wish to provide additional material, an Appendix of up to 5 pages may be added at the end of the document. However, the Appendix will not be marked. Therefore, you should only use it to provide supplementary material that is not essential to the report or the reader's understanding. Be sure to clearly title this section as Appendix.


