COMP5310 Project Stage 2B
Develop and Evaluate Predictive Model
Due: 11:59pm on 14th of May 2023 (end of Week 11)
Value: 15% of the unit
This stage is usually done with the same group members as you worked with for Stage 2A.
However, under exceptional circumstances an alternative group may be created by the unit
coordinator when a group is reduced in size because a member has discontinued this unit. If this
applies to you, please urgently email Nazanin.borhan@sydney.edu.au to discuss this.
DISPUTE RESOLUTION
If, during the course of the assignment work, there is a dispute among group members that
you can’t resolve, or that will impact your group’s capacity to complete the task well, you
need to inform the unit coordinator, Nazanin.borhan@sydney.edu.au. Make sure that your
email includes your group number and tutorial session, and is explicit about the difficulty.
Also, make sure this email is copied to your tutor and all the members of the group (including
anyone you are complaining about). We need to know about problems in time to help fix
them and deal with non-performance promptly (don’t wait until a few days before the work
is due to complain that someone is not delivering on their tasks). If necessary, the unit
coordinator will split a group, and leave anyone who didn’t participate effectively in a group
by themselves (they will need to achieve all the outcomes on their own). This option is only
available up until Monday May 1st, which is the last day with time to resolve the issue before
the due date. For any group issues that arise after this time, you will need to try to resolve
the problem on your own, and you will continue to be treated as a single group. If someone
doesn’t provide the material required for the report, or their material is not of the agreed
standard, you should still have the report show what that person did. Their section of the
report may be empty if they don’t produce anything, or it may have material but not enough.
In such cases, please put a “Note to marker” on the front page of the report, which describes
the circumstances. That way, we can consider how best to apply the marking scheme. Note
that it is not expected or sensible for other members to do the work that someone failed to
deliver.
TASKS
GROUP TASKS:
1. Identify an attribute that you will all make predictions about and find a dataset that
contains this attribute. The attribute you are predicting may be quantitative or nominal.
The dataset may be one from the previous stages of this project.
2. Decide on the measure of success for the predictive models you will be producing. You
will need to justify your choice of measure and describe its strengths and limitations.
3. Divide the dataset into a training set and a test set. We suggest having at least one-
tenth of the original dataset in the test dataset.
4. Coordinate in choosing the methods you will use, to each produce a predictive model for
this attribute, using the training dataset (the coordination is needed to avoid duplication
between members, and to enable a good conclusion for your report).
5. Write Part B of the report, which discusses the different models and their strengths and
weaknesses. This should be written for a reader who is interested in your research or
business question.
Note: The models created in this Stage must ALL be predicting (in different ways) one
common attribute in the one common dataset. You are allowed to use a dataset you already
have from Stage 1 or 2A, but you are equally free to change dataset and even domain. However,
keep in mind that many machine learning techniques do not work well unless the dataset is
large enough and quite clean. We recommend that you do some preliminary data analysis to
convince yourself that there is some relationship between the other attributes and the one you
are going to predict (otherwise predictions will not be very effective). You also need to choose
how you will measure the effectiveness of predicting. We recommend that you use one of the
measures built into scikit-learn, which are calculated from the test data and the predictions
made for those items. For higher levels than pass, you need to calculate more than one measure
on each model.
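As an illustration of group tasks 2 and 3, the following minimal sketch splits a dataset and scores predictions with two of scikit-learn's built-in measures. The dataset (load_breast_cancer), the 20% test fraction, the random seed and the logistic regression model are all stand-in assumptions chosen only to make the example runnable; your group will choose its own.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (an assumption): replace with the group's own data.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% as the test set (the brief suggests at least one-tenth).
# A fixed random_state lets every member reproduce the same split;
# stratify keeps the class proportions similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Placeholder model, used only so that there are predictions to score.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Two built-in measures, each computed from the test labels and predictions.
test_accuracy = accuracy_score(y_test, y_pred)
test_f1 = f1_score(y_test, y_pred)
print(f"accuracy={test_accuracy:.3f}  F1={test_f1:.3f}")
```

Sharing one fixed split (for example, by saving the training and test sets to files) is one way to make sure every member's model is evaluated on exactly the same test items.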
INDIVIDUAL TASKS:
1. Use Python (for example, the scikit-learn library) to produce a predictive model for the
chosen attribute from the training dataset, using the kind of model and training method
allocated to you by the group. If your method for training has hyper-parameters, you
should adjust them as well as possible, but only using parts of the training dataset in
doing so (you must not use any of the test dataset for this).
2. Evaluate the quality of the predictive model you produced, in terms of the measure of
success that the group chose.
3. Write your section in Part A of the report, in which you present the work you have done
individually.
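One possible way to meet the requirement in individual task 1, that hyper-parameters are adjusted using only the training dataset, is cross-validated grid search. The sketch below is an assumption about how this might be done, not a required method: the dataset, the SVC model, the parameter grid and the F1 scoring choice are all illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling lives inside the pipeline so that it is re-fitted on each
# cross-validation training fold and never sees the validation fold.
pipeline = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Illustrative grid only; sensible ranges depend on the method and the data.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]}

# 5-fold cross-validation on the training set chooses the hyper-parameters.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

print("best hyper-parameters:", search.best_params_)
print(f"cross-validated F1: {search.best_score_:.3f}")

# The test set is used once, at the end, for the final estimate.
test_f1 = search.score(X_test, y_test)
print(f"test F1: {test_f1:.3f}")
```

Because the test set plays no part in choosing between grid entries, the final test score remains an honest estimate of performance on unseen data.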
WHAT TO SUBMIT
There are TWO deliverables in this stage of the project, and both should be submitted by
ONE PERSON on behalf of the whole group.
1. A written report on your work, as a PDF document. There is a maximum length for the
report of 2500 words for groups of 2 and 3000 words for groups of 3. The report
should have a front page, that gives the group name and lists the members involved
(giving their SID and unikey, not their name), and then the body of the report has a
structure as follows (this corresponds to the marking scheme):
Part A: It should be targeted at a tutor or lecturer whose goal is to see what you
achieved, so they can allocate a mark. In this section you must:
a. State your research or business question.
b. State the domain and the dataset you are using.
c. Indicate how you split your dataset into training and test data.
d. Then, there should be one section for each member (the section should state the
SID/unikey of the group member who did the work reported in this section). In
this section, there should be the following sub-sections:
o A description of the way you produced the predictive model, including the
Python code you wrote that produces the model and any pre-processing
(e.g., rescaling some attributes). If possible, you should also give the
predictive model itself (e.g., for a linear regression, you would report what
coefficients each attribute has in the model; for a decision tree you would
state the different decision points).
o The evaluation of how well your predictive model does in predicting. This
must include the Python code you wrote that calculates some measure of
effectiveness (on the test data), as well as stating the actual value of this
measure for your predictive model. For higher marks, textual discussion
is also needed (see the mark scheme below). For example, you may
consider using significance testing, confidence intervals, regression
R-squared, clustering V-measure, classification F1-score, etc.
Part B: Targeted at someone who is interested in your research or business question,
and wants to understand how well various machine learning approaches work for
producing predictive models in the context of your research or business question. This
part is written as a group, and you must:
a. Describe the different ways the members produced predictive models.
b. Comment on the evaluations to draw conclusions about the strengths and
limitations of the different approaches, tying this back to your business question
(see the marking scheme for more guidance on what is expected here).
2. The code and dataset you used to produce your predictive model and calculate some
measure of effectiveness of the model. If you have done any further transforms on
attributes before training/testing, this code should also be included. The code should be
submitted as a single zip or tar.gz file which contains a subfolder for each group
member.
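For the Part A sub-section that asks you to give the predictive model itself where possible, the following sketch shows one way to print the coefficients of a linear regression and the decision points of a decision tree. The datasets (load_diabetes, load_iris) and the tree depth are assumptions for illustration, and the models are fitted on the whole stand-in datasets purely for brevity; in the project the model would be fitted on the training set only.

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Linear regression: one coefficient per input attribute, plus an intercept.
# (Fitted on all rows of a stand-in dataset only to keep the sketch short.)
diabetes = load_diabetes()
linear = LinearRegression().fit(diabetes.data, diabetes.target)
print(f"intercept: {linear.intercept_:.2f}")
for name, coef in zip(diabetes.feature_names, linear.coef_):
    print(f"{name:>4}: {coef:9.2f}")

# Decision tree: export_text lists the decision points as nested rules.
# max_depth=2 is an arbitrary choice that keeps the printed tree small.
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

Output of this kind can be pasted into the report next to the code that produced it, so the marker can see both the method and the resulting model.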
MARKING
Here is the mark scheme for this assignment. The score (out of 15) is the sum of separate
scores for each of four components. Note that there is an individual and a group component
to each member’s mark.
Predictive Models [3 points] [Individual Mark]
This component is assessed based on the corresponding subsection of the separate member
section in Part A of the report; the uploaded data and code may be checked by the marker as
supporting evidence for claims made in the report.
[Full marks]: The Distinction criteria hold, and also there is a clear explanation of any
method that is not presented in the tutorials, including an argument for why this is a
reasonable approach to consider for the task (this discussion should go well beyond simply
reporting that the model predicts well, to argue that one could reasonably hope that it might
be good, in several ways).
[Distinction]: The Pass criteria hold, and also at least one of the methods used must go
beyond what is covered in the tutorials.
[Pass]: The group member uses Python and the agreed training dataset and correctly
produces a predictive model for the agreed attribute. The code that each member wrote to
produce their model (including doing any preliminary attribute transformations) must be
explicitly shown in the report. The ways in which the various members’ models are produced
should all be different from one another (this could be different algorithmic training
techniques, different choice of hyper-parameters, different scaling, or choice of input
attributes, etc.).
[Flawed]: Some predictive model is produced using Python.
Evaluation of Predictive Models [4 points] [Individual Mark]
This component is assessed based on the corresponding sub-section of the separate member
section in Part A of the report. The uploaded data and code may be checked by the marker as
supporting evidence for claims made in the report.
[Full marks]: The Distinction criteria hold, and also, for each approach, there is a reasonable
discussion relating the outcome of the measurements to the nature of the training approach,
characteristics of the dataset and any transformations done.
[Distinction]: The group member has correctly reported on more than one measure of
performance of the model on the test dataset. The code that does this measurement must be
explicitly shown in the report. Also, for each approach there is a sensible discussion of the
interpretation of the measurements (for example, whether it is indicating overfitting or
underfitting, whether the accuracy/precision/recall/F1 score differs between different
classes in your data).
[Pass]: The group member has correctly reported on some measure of performance of the
model on the test dataset. The code that does this measurement must be explicitly shown in
the report. The ways in which the various members’ models are produced should all be
different from one another (this could be different algorithmic training techniques, different
choice of hyper-parameters, different scaling, or choice of input attributes, etc.).
[Flawed]: Some reasonable attempt to evaluate the effectiveness of a predictive model.
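The Distinction criterion above asks for more than one measure and an interpretation of the measurements, such as signs of overfitting or differences between classes. The sketch below shows one way such figures might be produced. The dataset and the deliberately unpruned decision tree are assumptions chosen to make a training/test gap visible; they are not a recommended model.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()  # stand-in dataset (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42, stratify=data.target
)

# An unpruned tree is used deliberately: it tends to memorise the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A large gap between training and test scores is one symptom of overfitting;
# low scores on both would instead suggest underfitting.
train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy={train_accuracy:.3f}  test accuracy={test_accuracy:.3f}")

# Per-class precision, recall and F1 show whether some classes are
# predicted noticeably better than others.
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
report = classification_report(y_test, y_pred, target_names=data.target_names)
print(report)
```

The training score is computed only for diagnosis; the figures reported as the model's performance should still be those measured on the test dataset.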
Discussion [7 points] [Group Mark]
This component is assessed based on Part B of the report. Material in Part A, or the submitted
data and code may be checked by the marker as supporting evidence for claims made in this
part of the report.
[Full marks]: The Discussion section meets the Distinction criteria and suggests at least one
reasonable improvement that can be made to each member’s predictive model. The structure
needs to be logical and well-organised.
[Distinction]: The Discussion section provides some accurate and clear information about the
different machine learning methods that were used for this task, and provides useful insight
into strengths and weaknesses of the different machine learning methods for answering the
business or research question. It also indicates features of the dataset that affect the
outcomes. It also discusses, honestly and with insight, the strengths, limitations and
uncertainties of the comparisons made between different machine learning techniques
(for example, what are strengths and limitations of the measurements which were used).
[Pass]: The Discussion section provides some accurate and clear information about the
machine learning techniques that were used for this task, and how the resulting predictive
models performed.
[Flawed]: The Discussion section describes the machine learning techniques that were used.
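When discussing uncertainties in the comparisons between techniques, one option is a bootstrap confidence interval for the difference between two models' scores on the shared test set. The following is a minimal sketch under stated assumptions: the dataset, the two models, the F1 measure, the 1000 resamples and the 95% level are all illustrative choices, and the interval reflects only the sampling variability of the test set, not the variability from retraining.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Two hypothetical members' models, trained on the same training set.
model_a = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model_b = DecisionTreeClassifier(max_depth=3, random_state=0)
pred_a = model_a.fit(X_train, y_train).predict(X_test)
pred_b = model_b.fit(X_train, y_train).predict(X_test)

# Paired bootstrap: resample test rows with replacement and score both
# models on the same resampled rows each time.
rng = np.random.default_rng(0)
n = len(y_test)
differences = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    differences.append(
        f1_score(y_test[idx], pred_a[idx]) - f1_score(y_test[idx], pred_b[idx])
    )

# Percentile interval for the difference in F1 (model A minus model B).
lower, upper = np.percentile(differences, [2.5, 97.5])
print(f"95% interval for F1 difference: [{lower:.3f}, {upper:.3f}]")
```

If the interval contains zero, the test set alone does not clearly separate the two models, which is itself a limitation worth stating in the discussion.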
Conclusion [1 point] [Group Mark]
This component is assessed based on Part B (group component) of the report. Material in
Part A, or the submitted data and code, may be checked by the marker as supporting evidence
for claims made in the report.
[Full marks]: The Conclusion section meets the Distinction criteria and makes reasonable
suggestions for future work on your analysis and predictive models that can help achieve the
recommended course of action.
[Distinction]: In addition to the Pass criteria, the Conclusion section describes the extent of
support for this course of action, based on the information in the Discussion section,
identifying what risks, limitations and caveats apply.
[Pass]: The Conclusion section describes a recommended course of action in relation to your
research or business question, that is supported by the information in the Discussion section.
[Flawed]: The Conclusion section describes a recommended course of action in relation to
your research or business question.
Penalties
10% of the overall mark will be deducted if your report is unnecessarily long-winded and
does not address the marking criteria within the word limits.
Late Work
As announced in the unit outline, late work (without approved special consideration or other
arrangements) suffers a penalty of 5% of the maximum marks, for each calendar day after
the due date. No late work will be accepted more than 10 calendar days after the due date.
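The late penalty arithmetic can be sketched as follows. This is an illustration only and rests on assumptions that the text above does not confirm: that "maximum marks" means the 15 marks available for this stage, that part days are already rounded to whole calendar days, that a mark cannot fall below zero, and that work not accepted receives zero. The function name is hypothetical.

```python
def mark_after_late_penalty(raw_mark, days_late, max_marks=15.0):
    """Return the mark after a 5%-of-maximum penalty per calendar day late.

    Assumptions (not confirmed by the brief): max_marks is 15, the result
    is floored at zero, and work more than 10 days late scores zero.
    """
    if days_late > 10:
        return 0.0  # no late work is accepted after 10 calendar days
    penalty = 0.05 * max_marks * max(days_late, 0)
    return max(raw_mark - penalty, 0.0)


# Example: a raw mark of 12 submitted 2 days late loses 2 * 0.75 = 1.5 marks.
print(mark_after_late_penalty(12.0, 2))
```

The unit outline remains the authority on how the penalty is actually applied.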