Data Analytics for Business Capstone
Semester 1, 2023
Assignment 1 (individual assignment)
1. Key information
Required submissions:
Written report (PDF; due Monday, March 27, by the end of the day).
Confidentiality Deed Poll online form (deadline for submission: March 13).
Submission instructions for the report will be posted on Canvas in Week 5.
Weight: 30% of your final grade.
Length: Your written report should be at most 12 pages long (single-spaced, 11 pt font). The cover
page, references, and appendix (if any) do not count towards the page limit. Please keep in
mind that making good use of your audience’s time is an essential business skill: every
sentence, table or figure should serve a purpose.
2. Problem description
Please start by reading through the Project Outline document for your industry project, which
you can find on the “Learn about our industry projects” page in the Week 1 module on Canvas.
Focus on the Problem Description section of the Project Outline, especially the first and
the third bullet points (EDA and Strategy), which are the most relevant bullet points for
Assignment 1. Both your analysis and your recommendations should be in line with the
requirements/suggestions provided in the Project Outline.
As a business analyst, you will conduct Exploratory Data Analysis (EDA) of the data
corresponding to your industry project. You should aim to find or reveal all relevant properties,
characteristics, patterns, and statistics hidden in the data, supporting your findings with
insightful plots and relevant statistical output.
Use the results from your EDA to outline a preliminary strategy or provide preliminary
recommendations to the management team corresponding to your selected industry project.
You will have a chance to refine these recommendations in Assignment 2. Please refrain from
extensive modelling and model selection – you will do them in Assignment 2. However, feel
free to fit simple models (e.g., linear regression or logistic regression) for the purposes of EDA
and understanding the relationships among the variables in the dataset.
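As an illustration of the kind of simple model mentioned above, the following minimal sketch fits a one-variable linear regression by ordinary least squares using only the Python standard library. The function name `simple_ols` and the toy numbers are assumptions made for this example; they do not come from any project dataset, and any suitable library could be used instead.

```python
import statistics


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared). Hypothetical helper for EDA only.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return slope, intercept, r_squared


# Made-up illustrative data (not from any industry project).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = simple_ols(spend, sales)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

A fit like this is meant only to summarise a relationship during EDA; model comparison and selection belong in Assignment 2.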
BUSINESS SCHOOL
3. Written report
The purpose of the report is to describe, explain, and justify your findings to the management
team corresponding to your selected industry project. You may assume that team members
have training in business analytics; however, they are not experts in statistics or machine
learning. The team’s time is important: please be concise and objective.
Suggested outline for the main parts of the report (further details below):
1. Problem formulation.
2. Data processing.
3. Exploratory Data Analysis (EDA).
4. Conclusions and preliminary recommendations.
You should consider breaking down the longer parts into smaller sections.
4. Marking scheme
Business context and problem formulation: 5 marks
Data processing: 30 marks
Exploratory Data Analysis (EDA): 45 marks
Conclusions and preliminary recommendations: 10 marks
Writing and presentation of the report: 10 marks
Total: 100 marks
5. Rubric (basic requirements)
Business context and problem formulation. Your report gives a detailed description of the
problem that is being investigated, providing the context and background for the analysis.
Data processing. You describe the data processing steps clearly and in sufficient detail,
justifying and explaining your choices and decisions. You handle missing values and other
data issues appropriately. You describe and explain your data transformations and/or your
feature engineering process (if any). Your choices and decisions are justified by data analysis,
domain knowledge, logic, and trial and error (if necessary).
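To make the missing-value step concrete, here is a minimal sketch that reports the share of missing entries per column and applies median imputation to one column. It assumes the data are held as a list of dictionaries with `None` marking a missing value; the helper names `missing_share` and `impute_median`, the column names, and the toy records are all hypothetical. In practice a library such as pandas would likely be more convenient, and median imputation is only one of several defensible choices that would need to be justified in the report.

```python
import statistics


def missing_share(rows, columns):
    """Return the fraction of missing (None or absent) entries for each column."""
    n = len(rows)
    return {c: sum(r.get(c) is None for r in rows) / n for c in columns}


def impute_median(rows, column):
    """Return copies of the rows with missing values in `column` set to the median."""
    observed = [r[column] for r in rows if r.get(column) is not None]
    fill = statistics.median(observed)
    return [
        {**r, column: fill if r.get(column) is None else r[column]}
        for r in rows
    ]


# Made-up illustrative records (not from any industry project).
records = [
    {"age": 30, "spend": 100.0},
    {"age": None, "spend": 80.0},
    {"age": 40, "spend": None},
    {"age": 50, "spend": 60.0},
]

print(missing_share(records, ["age", "spend"]))
cleaned = impute_median(records, "age")
print([r["age"] for r in cleaned])
```

Reporting the missing share before and after each step makes it easier to explain and justify the processing decisions.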
Exploratory data analysis (EDA). Your report provides a comprehensive description of your
EDA process, presenting selected results. Your analysis is sufficiently rich, and your
visualizations are insightful. You study key variables and relationships among them using
appropriate plots and descriptive statistics. You note any features of the data that may be
relevant for model building in Assignment 2. You note the presence of outliers and any other
anomalies that can affect the analysis. You explain the relevance of the EDA results to the
underlying business problem and your subsequent recommendations. You clearly describe
and justify the methods in your analysis. The choice of methods is logically related to the
substantive problem, underlying theoretical knowledge, and data analysis. You interpret the
statistical outputs that you provide. You report crucial assumptions and whether they are
potentially violated.
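One common way to flag potential outliers, shown here only as a sketch, is the interquartile range (IQR) rule: values more than 1.5 IQRs below the first quartile or above the third quartile are marked for inspection. The function name `iqr_outliers`, the multiplier `k=1.5`, and the toy values are assumptions for illustration; other rules may suit a given dataset better, and a flagged value is not necessarily an error.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up illustrative values (not from any industry project).
order_values = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(order_values))
```

Whether flagged observations are kept, transformed, or removed is a data processing decision that should be explained in the report.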
Conclusions and recommendations. The reasoning from the analysis and results to your
conclusions and recommendations is logical and convincing. Your conclusions and
recommendations are written in plain language appropriate for a non-technical audience.
Writing. Your writing is concise, clear, precise, and free of grammatical and spelling errors.
You use appropriate technical terminology. Your paragraphs and sentences follow a clear logic
and are well connected. If you use an abbreviation or label, you define it first.
Report layout. Your report is well organised and professionally presented, as if it had been
prepared for a client later in your career. There are clear divisions between sections and
paragraphs.
Tables. Your tables are appropriately formatted and have a clear layout. The tables have
informative row and column labels. The tables are relatively easy to understand on their own.
The tables do not contain information which is irrelevant to the discussion in your report. The
tables are placed near the relevant discussion in your report. There is no text around your
tables, and your tables are not images.
Figures (plots). Your figures are easy to understand and have informative titles, captions,
labels, and legends. The figures are well formatted and laid out. The figures are placed near
the relevant discussion in your report. Your figures have appropriate definition and quality.
There is no text around your figures, and your figures are not screenshots.
Numbers. All numerical results are reported to suitable precision (typically no more than three
decimal places, in some cases fewer).
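As a small illustration of the precision guideline above, the sketch below formats numbers for a table with Python's built-in format specifiers. The helper name `format_number` and the example values are hypothetical; the appropriate number of decimal places depends on the quantity being reported.

```python
def format_number(value, decimals=3):
    """Format a number with a fixed number of decimal places for reporting."""
    return f"{value:.{decimals}f}"


print(format_number(0.123456789))   # 0.123
print(format_number(1234.5678, 1))  # 1234.6
print(f"{0.4567:.1%}")              # 45.7%
```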
Referencing. You follow the University of Sydney referencing rules and guidelines.
Python code. The text of your report should be entirely free of Python code.
Note: you are strongly encouraged to use Python for all the steps of your data analysis. While
there is no Python code submission for Assignment 1, you should keep your code well organised
so that you can easily extend, modify, or reuse it for Assignment 2 (which will have a Python
code submission requirement).
6. Deductions
Marks may be deducted from each item in the marking scheme in the following cases:
The report is disorganised and/or has a poor layout.
There is an excess of abbreviations or labels that the reader may be unfamiliar with.
The report has an excessive number of grammatical or spelling mistakes.
The tables are difficult to read, for example, due to poor layout or labelling.
The figures are difficult to read, for example, due to poor layout or labelling.
Numbers are not appropriately rounded.
7. Late submission of the report
Late submissions are subject to a deduction of 5% of the maximum mark for each calendar
day after the due date. After ten calendar days late, a mark of zero will be awarded.
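The arithmetic of the late rule above can be sketched as follows. This is an illustration under stated assumptions, not an official statement of the policy: the function name `late_adjusted_mark` is hypothetical, lateness is assumed to be counted in whole calendar days, the maximum mark is taken to be 100 (the total in the marking scheme), and "after ten calendar days late" is read as "more than ten days late".

```python
def late_adjusted_mark(raw_mark, days_late, max_mark=100):
    """Apply a deduction of 5% of the maximum mark per calendar day late.

    Assumption: more than ten calendar days late yields a mark of zero.
    """
    if days_late <= 0:
        return float(raw_mark)
    if days_late > 10:
        return 0.0
    deduction = max_mark * 5 * days_late / 100
    # A mark cannot fall below zero.
    return max(0.0, raw_mark - deduction)


print(late_adjusted_mark(80, 3))   # 65.0
print(late_adjusted_mark(80, 11))  # 0.0
```

For example, under these assumptions a report that would have earned 80 marks and is submitted three calendar days late loses 15 marks and receives 65.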
8. Late submission of the Confidentiality Deed Poll online form
It is a requirement of our QBUS6600 unit that all students complete the Confidentiality Deed
Poll online form before gaining access to the datasets for the industry projects. The datasets
are highly confidential, and you have a responsibility to keep them secure and only use them
for your QBUS6600 coursework. Submission of the Confidentiality Deed Poll online form
after the March 13 deadline is subject to a penalty of 20 points for Assignment 1.
Furthermore, assignments without a submission of the online form will not be marked.