DATA ANALYSIS AND MACHINE LEARNING 4 – COURSEWORK 2 RUBRIC
Content: Data Analysis (20% of total mark)
Outstanding 100% The student satisfies the criteria for “Excellent” and goes beyond. Their analysis
is extremely detailed and insightful.
Excellent 80% The student concretely establishes the problem of sentiment analysis in
context, and justifies its importance, citing relevant literature. The “Sentiment
Soup” dataset is analysed in depth and clearly summarised using helpful figures
and tables. Close attention will be given to how text samples differ by
source, and by sentiment. To perform this analysis the student will draw upon
data analysis techniques. They will use, and demonstrate an understanding of, a
data analysis technique that wasn’t taught on the course.
Good 60% The student tries to establish the problem of sentiment analysis and why it is
important. The discussion will draw on literature, although the sources cited may not be particularly relevant.
The “Sentiment Soup” dataset is analysed in some depth and summarised using
figures and tables. There will be a reasonable attempt at analysing how samples
differ by source, and by sentiment. The student will draw upon data analysis
techniques for this summary. They will not consider techniques beyond those
taught on the course, or will use such techniques without demonstrating full
understanding.
Satisfactory 40% The student provides little context for the problem of sentiment analysis. The
“Sentiment Soup” dataset is analysed at a surface level and summarised using
figures and tables. The student won’t consider how samples differ by source or
by sentiment. There will be some use of data analysis techniques although it
will be obvious to the reader that the student doesn’t fully understand how
these techniques work.
Fail 20% The student provides no context for the problem of sentiment analysis. The
“Sentiment Soup” dataset undergoes very little analysis. The dataset is not
summarised using any figures or tables, and any references to data analysis
techniques are superficial and demonstrate little understanding.
Bad Fail 0% There is little or no content relating to data analysis.
Content: Machine Learning (40% of total mark)
Outstanding 100% The student satisfies the criteria for “Excellent” and goes beyond. Their
experimental setup is flawless, and they have clearly established the extent to
which machine learning can be used for sentiment analysis on “Sentiment
Soup” and have concretely shown how their findings can be deployed
elsewhere.
Excellent 80% The student constructs a set of appropriate, clearly described classification tasks
using “Sentiment Soup”. There is a strict separation of training and test data for
each task with no leakage. The student explores the performance of different
models on these tasks using held-out validation data, or through cross-
validation, before evaluating a final chosen model on test data. The student will
explore the interpretability of their chosen models where possible. The student
considers, and demonstrates an understanding of, a classification model that
was not taught on the course. The student will compare several different vector
representations for text and establish which parts of the classification pipeline
have the largest effect on performance. They will show an awareness of how
different performance criteria may be more suitable for certain tasks. The
student examines the effectiveness of their chosen models on relevant external
data that they have sourced.
Good 60% The student constructs a set of appropriate tasks using the “Sentiment Soup”
dataset which are reasonably well described. The student makes a concerted
effort to separate training and test data but there may be some unintentional
leakage. The student explores the performance of different models on these
tasks using held-out validation data, or through cross-validation. There will be
some consideration of the interpretability of their chosen models. The student
will attempt to use a classification model that was not taught on the course but
there won’t be evidence of clear understanding. The student will consider a few
different vector representations of text but this analysis may not be particularly
in-depth and the student may not identify how important this part of the
classification pipeline is relative to the model used. They won’t consider performance
criteria beyond accuracy. The student will apply their models to some external
data they have sourced.
Satisfactory 40% The student constructs a set of classification tasks using the “Sentiment Soup”
dataset. Some of these tasks might not be appropriate, and the descriptions of
the tasks may be lacking detail. There is some effort to measure generalisation
by separating training and test data although this won’t be strictly enforced. The
student explores the performance of a few models on these tasks, but
evaluation may be performed directly on the test set, leading to unintentional
overfitting. The student won’t comment on the interpretability of their chosen
models. They won’t examine any representations beyond Bag-of-words, and
won’t consider performance criteria beyond accuracy. The student will not
apply their models to any external data.
Fail 20% The student tries to construct some classification tasks from “Sentiment Soup”
but these are not appropriate. The experiments conducted are substantially
flawed such that there is no meaningful way of measuring the generalisation
performance of any classification model.
Bad Fail 0% There is little or no content relating to machine learning.
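As an illustration of two of the criteria above (strict separation of training, validation and test data, and performance criteria beyond accuracy), the following is a minimal sketch using only the Python standard library. The label list, the split fractions, the helper names (split_indices, accuracy, macro_f1) and the majority-class baseline are all hypothetical choices made for this example; they are not part of the coursework specification or the “Sentiment Soup” dataset.

```python
import random
from collections import Counter


def split_indices(n, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle indices once, then cut them into disjoint train/val/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, deliberately imbalanced labels (assumption for illustration).
labels = ["positive"] * 90 + ["negative"] * 10
train, val, test = split_indices(len(labels))

# "Fit" a majority-class baseline on the training split only, so nothing
# from the validation or test splits leaks into the model.
majority = Counter(labels[i] for i in train).most_common(1)[0][0]

# Compare criteria on validation data; the test split stays untouched
# until a final model has been chosen.
val_true = [labels[i] for i in val]
val_pred = [majority] * len(val_true)
print("validation accuracy:", accuracy(val_true, val_pred))
print("validation macro-F1:", macro_f1(val_true, val_pred))
```

On imbalanced data a baseline like this can score a high accuracy while its macro-F1 stays low, which is one reason a task may call for criteria other than accuracy.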
Report (20% of total mark)
Outstanding 100% The report satisfies the criteria for “Excellent” and goes beyond. It is
immaculate, and of a publishable standard.
Excellent 80% The report will be easy for the reader to understand. It will be well-written and
use paragraphing. Sentences will be grammatically correct and contain minimal
typos. The report will be tidy and well-formatted. It will be partitioned into
sections with clear titles and begin with an abstract. It will have a clear
narrative. The report will contain high-quality figures and tables with detailed
captions that are clearly referenced in the text. It will have a tidy, consistent
bibliography that is referenced by the main text. The report will be of the
correct length. Overall, it will be aesthetically pleasing.
Good 60% The report will be straightforward for the reader to understand. Writing will be
largely clear, but there may be some sentences that cause confusion.
Paragraphs will be present. There may be minor grammatical errors, or
excessive typos. The report will be largely tidy but may have minor formatting
issues. The report will be partitioned into sections but may be missing an
abstract. There will be a good attempt at forming a narrative. Figures and tables
will be present, although these may be untidy with short captions. There will be
a bibliography although there may be inconsistent formatting between entries.
The report will be of the correct length.
Satisfactory 40% The writing in the report will get across the essence of what the writer is trying
to convey but will only be of passable quality, and unclear in places.
Paragraphing will not be used effectively. Grammatical errors and typos will be
noticeable and occur quite frequently. The report will look quite messy and
have formatting issues. The report won’t have a clear structure with different
sections and there won’t be a clear narrative. Figures and tables will be present
but these will be messy, low quality, and lack captions. There won’t be a
bibliography, and the report may be slightly over- or under-length.
Fail 20% The report will be of poor quality, and badly written. In many places it won’t
be clear to the reader what the writer is trying to convey. There will be no
paragraphs, and the report will only consist of walls of text. Grammatical errors
and typos will be commonplace. The report will be messy and poorly formatted
and will lack any structure. It will be devoid of a narrative. Figures and
tables will not be present, or be of such a quality that they don’t provide any
value. There won’t be a bibliography and the report may be significantly over-
or under-length.
Bad Fail 0% The report looks unfinished and does not convey anything of value.
Code (20% of total mark)
Outstanding 100% The code satisfies the criteria for “Excellent” and goes beyond. It is of the same
standard as code produced by a professional software engineer.
Excellent 80% The code in the appendix will be clear and easy to read. It will not be in the form
of screenshots taken from a Jupyter notebook or IDE. The code will be high
quality, efficient, and will easily generalise to other text datasets. It will adhere
to the PEP 8 style guide for Python code. The code will be well commented. It
will be presented and structured in such a way that it is very easy for the reader
to understand which part of the code produced which figures and results in the
main report.
Good 60% The code in the appendix will be reasonably clear. It will not be in the form of
screenshots taken from a Jupyter notebook or IDE. It will be relatively high
quality but may contain some inefficiencies. It would take a reasonable amount
of work to make the code generalise to other text datasets. The code will
contain comments although some of these may be unhelpful. The code will be
presented and structured in such a way that it doesn’t take too much effort for
the reader to identify which parts of the code produced which figures and
results in the main report.
Satisfactory 40% The code in the appendix will be readable. However, it may be in the form of
screenshots taken from a Jupyter notebook or IDE such that it isn’t possible for
the reader to isolate the text. The code will work but will be messy and
inefficient. It would take a lot of work to make the code generalise to other text
datasets. Comments will be missing or too sparse to be of any help. It won’t be
clear how different parts of the code produced the figures and results in the
main report.
Fail 20% The code in the appendix will be difficult to read and may be in the form of a
low-quality screenshot. The code will work but will be messy and borderline
indecipherable. It would be more beneficial to start from scratch than try to
modify the code to generalise to other text datasets.
Bad Fail 0% There is no code provided, or the code provided does not work.