ELEC2103/9103: Simulation and Numerical Solutions in Engineering

School of Electrical and Information Engineering, The University of Sydney

Assignment Description

Modelling, predicting, and verifying the accuracy of models are vital skills in engineering and other fields. This assignment will assess your ability to develop and validate statistical and machine learning models using MATLAB, and in particular, the Statistics and Machine Learning Toolbox.

1 Key information

For this assignment, you need to complete two steps:

1. Complete the MATLAB Machine Learning Onramp and upload your certificate to the assignment box. This is an individual assignment and each person needs to complete the training individually. This is worth 5% of your total mark [1].

2. Perform statistical analysis and machine learning on the given dataset, write a report and submit it to the assignment box. This is group work and you will work with the same group members as in your lab to complete it. Each group only needs to submit one report. This is worth 20% of your total mark.

2 Background

This assignment asks you to explore and analyse a publicly available dataset of your choice from the following list.

1. Ausgrid distribution zone substation data: Ausgrid operates a network with over 180 zone substations. These substations form the boundary between the sub-transmission network and the distribution (11 kV) network. Ausgrid is making available historical interval demand data (in megawatts) for all zone substations not subject to third-party privacy concerns.

The dataset is available here:

https://www.ausgrid.com.au/Industry/Our-Research/Data-to-share/Distribution-zone-substation-data

2. World Bank Education Statistics: The World Bank EdStats Query holds around 2,500 internationally comparable education indicators for access, progression, completion, literacy, teachers, population, and expenditures. The indicators cover the education cycle from pre-primary to tertiary education. The query also holds learning outcome data from international learning assessments (PISA, TIMSS, etc.), equity data from household surveys, and projection data to 2050.

The dataset is available here:

https://databank.worldbank.org/source/education-statistics-%5e-all-indicators

[1] Please follow this link to access the MATLAB Machine Learning Onramp course: https://matlabacademy.mathworks.com/details/machine-learning-onramp/machinelearning

3. World Bank Health Nutrition and Population Statistics: World Bank key health, nutrition and population statistics gathered from a variety of international sources.

The dataset is available here:

https://databank.worldbank.org/source/health-nutrition-and-population-statistics

4. Australian Bureau of Statistics, Causes of Death, Australia: Statistics on the number of deaths, by sex, selected age groups, and cause of death classified to the International Classification of Diseases (ICD).

The data set is available here:

https://www.abs.gov.au/statistics/health/causes-death/causes-death-australia/2020#data-download

5. Kaggle, Retail Analysis with Walmart Sales Data: Historical sales data for 45 Walmart stores located in different regions are available. Certain events and holidays affect sales on particular days. The business faces unforeseen demand and sometimes runs out of stock because of an unsuitable machine learning algorithm. Walmart would like to predict sales and demand accurately. An ideal ML algorithm will predict demand accurately and take into account economic factors such as CPI, the Unemployment Index, etc.

The dataset is available here:

https://www.kaggle.com/rutuspatel/retail-analysis-with-walmart-sales-data

3 The assignment task

You are to explore and analyse some or all of the data files in one of the datasets above. You are to complete your analysis using MATLAB, and present your analysis as a report contained in a script and other files that can be published to a report in HTML using MATLAB’s Publish feature. You are encouraged to share ideas, but your submitted assignment must be uniquely your own.

3.1 The data

The data is mostly contained in CSV files and might be split into separate files for each financial year or month. You need to fully understand the attributes in each dataset and be able to explain them in your report. You can also draw on other data sources to inform your analysis (see the section on higher grades below). If you have something particular in mind, I can advise you on whether it is freely available and where to find it, but the Australian Bureau of Statistics (ABS) and the Bureau of Meteorology (BOM) are good places to consider.
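Because the data may be split across several CSV files, one possible way to combine them into a single MATLAB table is sketched below. This is only an illustration under stated assumptions: the file names (fy2019.csv, fy2020.csv), the column names (interval, demandMW) and the values are made up so that the example runs on its own, and the real files will have a different layout.

```matlab
% Sketch only: two tiny placeholder CSV files are created in a temporary
% folder so that the example is self-contained. File names, column names
% and values are hypothetical, not taken from any of the listed datasets.
folder = tempname;
mkdir(folder);
t1 = table([1; 2], [10.5; 11.2], 'VariableNames', {'interval', 'demandMW'});
t2 = table([3; 4], [12.1; 9.8],  'VariableNames', {'interval', 'demandMW'});
writetable(t1, fullfile(folder, 'fy2019.csv'));
writetable(t2, fullfile(folder, 'fy2020.csv'));

% Find every CSV file in the folder and read each one into a table.
files = dir(fullfile(folder, '*.csv'));
parts = cell(numel(files), 1);
for i = 1:numel(files)
    part = readtable(fullfile(folder, files(i).name));
    % Record which file each row came from (useful for per-year analysis).
    part.sourceFile = repmat(string(files(i).name), height(part), 1);
    parts{i} = part;
end

% Stack the per-file tables vertically into one table.
allData = vertcat(parts{:});
disp(allData);

% Remove the temporary folder and its contents.
rmdir(folder, 's');
```

Stacking with vertcat requires every file to have the same column names and compatible types, so real data may need some cleaning before this step.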

3.2 Submission requirements

Your assignment will be submitted via Canvas in the form of a .zip file named in the following format:

Group_Group Number.zip.

Your .zip file must contain:

1. Your main file, called elec2103a.m (regardless of whether you are an undergraduate or postgraduate student). You are provided with a MATLAB script stub to get you started, which is available on Canvas.

2. Any custom functions that you write.

3. The data that is needed to complete your analysis.

4. A PDF file including your answer to Part 1, the published version of your main file (Part 2), and your answers to Part 3.

4 Assignment criteria and grades

The assignment will be given a grade out of 20. Marks will be allocated in three parts, as follows:

4.1 Part 1 (Total: 4 marks)

Here, you need to clearly explain the dataset and the problem that you want to solve.

1. Problem Statement and Background (2 marks): A high-level statement of the problem you intend to address/business case study. Give a clear and complete statement of the problem.

2. Resources (2 marks): Where do the data come from, and what are their characteristics?

(a) The data source(s), and

(b) characteristics of the data you intend to use (e.g. attributes, data types, etc.)

Marks for part 1: Completing this part reasonably well will earn you 4 marks. Here, “reasonably” means more than copying and pasting information already available on the websites you are downloading the dataset from. You need to provide detailed evidence that you have understood the dataset you are working on and the problem you are going to address.

4.2 Part 2 (Total: 6 marks)

The minimum requirements of this assignment are to:

1. Write a sub-routine to load some or all of the data (from one of the datasets) into a useable format in MATLAB.

2. Write a sub-routine to analyse the data in MATLAB by modelling/fitting it, using regression, classification, ANOVA or other machine learning methods. You may wish to pre-process the data in order to extract some interesting values or variables of merit. Briefly explain your model.

3. Write a sub-routine that makes some assessment of the statistical errors or goodness-of-fit of your model and returns or prints them in your report. Explain these figures.

4. Make appropriate use of plots and/or charts in your report.

5. In your main script, include a call to at least one custom function that you have written in a separate m-file.

6. Put your analysis in a publishable MATLAB script that runs without errors. Build on the provided m-file stub elec2103a.m.

Marks for part 2: Satisfying each of the minimum requirements 1-6 above will earn you 1 mark each. “Satisfying” means more than joining two points at different times with a straight line, and will be satisfied if you make proper use of a tool in the Statistics and Machine Learning Toolbox. In other words, you need to provide evidence that you have learnt how to use some new MATLAB tools.
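The load, fit and assess steps listed above could look roughly like the sketch below. It is a minimal illustration, not a model answer: the data is synthetic, the variable names (temperature, demand) and the helper fitSummary are hypothetical, and linear regression with fitlm is just one of many tools in the Statistics and Machine Learning Toolbox. In a real submission the helper would be a function in its own m-file; an anonymous function is used here only to keep the example self-contained.

```matlab
% Sketch only: synthetic data stands in for a real dataset file.
rng(1);                                        % reproducible random numbers
n = 200;
temperature = 15 + 10*rand(n, 1);              % hypothetical predictor (deg C)
demand = 20 + 1.5*temperature + randn(n, 1);   % hypothetical response (MW)
data = table(temperature, demand);

% With a real CSV file you would instead load the data with something like
%   data = readtable('myfile.csv');            % placeholder file name

% Fit a linear regression model (Statistics and Machine Learning Toolbox).
mdl = fitlm(data, 'demand ~ temperature');

% Hypothetical helper returning [RMSE, R-squared] of a fitted LinearModel.
% In the assignment this would live in a separate file, e.g. fitSummary.m.
fitSummary = @(m) [m.RMSE, m.Rsquared.Ordinary];

% Goodness-of-fit figures for the report.
stats = fitSummary(mdl);
rmse = stats(1);
r2 = stats(2);
fprintf('RMSE = %.3f MW, R^2 = %.3f\n', rmse, r2);

% Plot the data together with the fitted line and its confidence bounds.
plot(mdl);
xlabel('Temperature (deg C)');
ylabel('Demand (MW)');
```

RMSE is in the units of the response and R^2 is the fraction of variance explained, so reporting both gives the reader an absolute and a relative measure of fit.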

4.3 Part 3 (Total: 10 marks)

To earn higher grades, you need to complete Part 2 extremely well (4 marks), and include one advanced form of statistical analysis and/or prediction, performed on the same dataset, with justification for your choice (6 marks).

You may consider one of the following advanced sub-routines:

1. Make a prediction using your model, perhaps into the future (for time series data) or across a new subset. Discuss your prediction, including making an assessment of the reliability of the prediction.

2. Complete a formal statistical comparison of more than one model or method of analysis.

3. Make use of advanced statistical analysis testing the assumptions of your modelling choice, such as tests of heteroscedasticity, multicollinearity, etc.

4. Bootstrapping, jackknifing, k-folds or some other resampling-based validation of the predictive ability of your model.

5. Sophisticated use of more than one dataset (i.e. incorporating additional data beyond the dataset you chose in your Part 2 analysis).

6. Use of an advanced statistical estimation or machine learning technique, with justification. This could include:

(a) Using MATLAB’s neural network tools (which doesn’t take much effort);

(b) Estimating a stochastic volatility or hidden Markov model;

(c) Using Bayesian models;

(d) Advanced clustering and/or hierarchical analysis;

(e) If you have an interest in signal processing, you could investigate non-parametric kernel estimators (akin to kernel smoothing techniques), principal component analysis, or apply a series of bandpass filters over a time series and see what you get.

Marks for part 3: You will need to include detailed discussions and justifications to obtain full marks.
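As one illustration of the resampling-based validation mentioned in the list above, the sketch below runs a 5-fold cross-validation of a linear regression model. It is a hedged example on synthetic data: the variables x and y, the choice of k = 5 and the use of fitlm are all assumptions made for illustration, and your own model and data will differ.

```matlab
% Sketch only: synthetic data so that the example runs on its own.
rng(2);                               % reproducible random numbers
n = 300;
x = 10*rand(n, 1);                    % hypothetical predictor
y = 3 + 2*x + randn(n, 1);            % hypothetical response, noise std = 1

% Split the n observations into k folds.
k = 5;
cvp = cvpartition(n, 'KFold', k);

% Train on k-1 folds, test on the held-out fold, and record the test RMSE.
foldRmse = zeros(k, 1);
for i = 1:k
    trainIdx = training(cvp, i);      % logical index of training rows
    testIdx  = test(cvp, i);          % logical index of held-out rows
    mdl  = fitlm(x(trainIdx), y(trainIdx));
    yhat = predict(mdl, x(testIdx));
    foldRmse(i) = sqrt(mean((y(testIdx) - yhat).^2));
end

fprintf('Mean %d-fold RMSE = %.3f (std across folds %.3f)\n', ...
    k, mean(foldRmse), std(foldRmse));
```

The spread of the per-fold errors gives a rough indication of how stable the predictive performance is, which is the kind of evidence a discussion of reliability can draw on.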

4.4 Assignment length

There are no minimum or maximum lengths to the submission, but treat this like you are trying to convince a busy person that you have something important to say. Being terse and direct is not a bad thing in engineering and business communication.

4.5 Late submission penalties

Late assignments will be penalised by deducting 2 marks and by reducing the maximum grade achievable by 2 marks for each 24 hours overdue, including weekends. Don’t be late!

5 Useful Resources

The following tutorials and resources are available online to help you learn how to use the MATLAB Statistics and Machine Learning Toolbox.

1. Introducing Machine Learning

https://www.mathworks.com/content/dam/mathworks/ebook/gated/machine-Learning-ebook.pdf

2. MATLAB for Machine Learning

https://au.mathworks.com/solutions/machine-learning.html

3. Mastering Machine Learning: A Step-by-Step Guide with MATLAB

https://au.mathworks.com/content/dam/mathworks/ebook/gated/machine-learning-workflow-ebook.pdf

4. Applied Machine Learning

https://au.mathworks.com/videos/series/applied-machine-learning.html

5. What Is Deep Learning? 3 things you need to know

https://au.mathworks.com/discovery/deep-learning.html

6. Predictive Analytics: 3 Things You Need to Know

https://au.mathworks.com/discovery/predictive-analytics.html
