Deep Learning
COSC 2779/2972 | Semester 2 2022
Assignment 2: Sequence Processing with Deep Learning
Assessment Type Individual assignment. Submit online via Canvas → Assignments → Assignment 2. Marks awarded for meeting requirements as closely as possible. Clarifications/updates may be
made via announcements/relevant discussion forums.
Due Date Week 12, Friday 14 October 2022, 05:00pm
Marks 50%
1 Overview
In this assignment you will design and create an end-to-end deep learning system for
a real-world problem. This assignment is designed for you to apply and practice skills
of critical analysis and evaluation to circumstances similar to those found in real-world
problems. This is an individual project.
In this assignment you will:
• Design and Create an end-to-end deep learning system.
• Analyse and Evaluate the output of the algorithms.
• Research into extending techniques that are taught in class.
• Provide an ultimate judgement of the final trained model(s) that you would use in
a real-world setting.
This assignment has the following deliverables:
1. A report (of no more than 3 pages, plus up to 2 pages for appendices) critically
analysing your approach and ultimate judgement.
2. Your Python scripts, Jupyter notebooks, and software used to build your learning
system and produce the models and results.
2 Learning Outcomes
This assessment relates to all of the learning outcomes of the course which are:
• Discuss and critically analyse a variety of neural network architectures; Evaluate
and Compare approaches and algorithms on the basis of the nature of the problem/task being addressed.
• Synthesise suitable solutions to address particular machine learning problems based
on analysis of the problem and characteristics of the data involved.
• Communicate effectively with a variety of audiences through a range of modes
and media, in particular to: interpret abstract theoretical propositions, choose
methodologies, justify conclusions and defend professional decisions to both IT
and non-IT personnel via technical reports of professional standard and technical
presentations.
• Develop skills for further self-directed learning in the general context of neural networks and machine learning; Research, Discuss, and Use new and novel algorithms
for solving problems; Adapt experience and knowledge to and from other computer
sciences contexts such as artificial intelligence, machine learning, and software design.
3 Assessment details
3.1 Task
Using deep learning in real-world settings involves more than just running a data set
through a particular algorithm. In this assignment, you will design, analyse and evaluate
a complete machine learning system.
The key aspect of this assignment is the design, analysis, and evaluation of your
methodology, investigation, and results. This assignment focuses on both the accuracy
of your model, and your understanding of your approach and model.
For this assignment you have a choice of your project. You may select this project
from the list in Section 4, or you may negotiate a project with the course co-ordinator.
Regardless of the problem you choose, you must conduct the following tasks:
1. Conduct a review to identify the most suitable approaches to solve the problem.
2. Investigate various Deep Learning solutions to the problem.
3. Make an ultimate judgement.
4. Evaluate your ultimate judgement against independent testing data.
5. Produce a report of your design, investigation, evaluation and findings.
4 Suggested Projects
4.1 Detection of Persuasion Techniques in Texts and Images
Warning: This project contains meme examples and wording that might be offensive
to some readers. If you are concerned, please skip this project and attempt the other
option.
The internet and social media have amplified the impact of disinformation campaigns.
Such propaganda campaigns are often carried out using posts spread on social media,
with the aim to reach a very large audience. Some of the most influential posts in social
media use memes where visual cues are being used, along with the text. In this project,
you will use the data set from “SemEval-2021 Task 6: Detection of Persuasion Techniques
in Texts and Images” to develop a deep learning system that can automatically detect
Persuasion Techniques used in memes. There are three sub-tasks you can perform on this
data set. However, for this assignment, you only have to complete one of the two
sub-tasks mentioned below to get full marks.
• Subtask 2 (ST2) Given the textual content of a meme, identify which techniques
(out of 20 possible ones) are used in it together with the span(s) of text covered by
each technique. This is a multi-label sequence tagging task.
• Subtask 3 (ST3) Given a meme, identify which techniques (out of 22 possible ones)
are used in the meme, considering both the text and the image. This is a multi-label
classification problem.
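The difference between the two target formats can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the corpus loader: the three technique names are placeholders for the full label inventory, the example text is made up, and the `(start, end, technique)` span format with an exclusive `end` is an assumption rather than the file format of the data set.

```python
# Placeholder label subset; the full inventory comes with the data set.
TECHNIQUES = ["Loaded Language", "Smears", "Name calling/Labeling"]


def multi_hot(labels, vocabulary=TECHNIQUES):
    """ST3-style target: one 0/1 entry per technique; several may be 1."""
    present = set(labels)
    return [1 if name in present else 0 for name in vocabulary]


def char_tags(text, spans, vocabulary=TECHNIQUES):
    """ST2-style target: one multi-hot row per character of the text.

    `spans` is a list of (start, end, technique) with `end` exclusive
    (assumed format). Spans may overlap, which is what makes the
    tagging problem multi-label.
    """
    index = {name: i for i, name in enumerate(vocabulary)}
    tags = [[0] * len(vocabulary) for _ in text]
    for start, end, technique in spans:
        for pos in range(start, end):
            tags[pos][index[technique]] = 1
    return tags


text = "THEY ARE LIARS"  # made-up example, not taken from the corpus
spans = [(9, 14, "Name calling/Labeling"), (0, 14, "Smears")]

meme_level = multi_hot([technique for _, _, technique in spans])
char_level = char_tags(text, spans)
print(meme_level)      # one vector for the whole meme
print(char_level[9])   # one vector for the character at position 9
```

In practice the per-character rows would usually be mapped to the tokens produced by your tokenizer; that step depends on the tokenizer you choose and is not shown here.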
For more information on the tasks, refer to the reference of the data set: “Dimitrov, D.,
Ali, B.B., Shaar, S., Alam, F., Silvestri, F., Firooz, H., Nakov, P. and Da San Martino,
G., 2021, August. SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts
and Images. In Proceedings of the 15th International Workshop on Semantic Evaluation
(SemEval-2021) (pp. 70-98).”
1. Paper Link: https://arxiv.org/pdf/2105.09284.pdf
2. Github Link: https://github.com/di-dimitrov/SEMEVAL-2021-task6-corpus
The data set is available on Canvas. This data set can be combined with other data
sets that you might obtain from the internet to improve performance.
Licence agreement: The dataset can only be used for the purpose of this assignment.
Sharing or distributing this data or using this data for any other commercial or non-commercial purposes is prohibited.
Requirements
• Develop a DL-based solution and demonstrate your knowledge of advanced DL
techniques. Only neural network-based techniques can be used in the assignment.
Other ML techniques such as SVM, RF cannot be used.
• You may use pre-trained networks as part of your solution. However, there needs
to be a “clearly identifiable” network segment(s) that is designed and trained by
you. You should show how this segment is developed (tuned) in your code.
• (Report) You need to come up with a deep learning system, where each element of
the system is justified using data analysis, performance analysis and/or knowledge
from relevant literature.
• (Report) You should clearly explain your evaluation framework, including how you
selected appropriate performance measures, and how you determined the data splits.
• (Report & code) A thorough investigation should be conducted to check the strengths
and weaknesses of your model when applied to real-world data. You should use independent test data for this investigation; this may be data you collect from the
internet yourself or memes you create to simulate real scenarios.
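As one possible starting point for the evaluation framework mentioned above, the sketch below shows a reproducible train/validation/test split and a micro-averaged F1 score for multi-label predictions. The split fractions, the seed, and the choice of micro-F1 are assumptions made for illustration; you are expected to choose and justify your own splits and measures.

```python
import random


def split_indices(n_items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle indices once with a fixed seed, then cut into three disjoint parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = int(n_items * test_fraction)
    n_val = int(n_items * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over multi-hot rows: pool TP/FP/FN across all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


train, val, test = split_indices(100)
print(len(train), len(val), len(test))

# Toy multi-hot targets and predictions, for illustration only.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_f1(y_true, y_pred))
```

A plain shuffled split ignores label imbalance. If some techniques are rare in your data, that is worth checking in each split and discussing in the report.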
4.2 Negotiated Project
You may propose and negotiate a project and machine learning problem to investigate,
with the course co-ordinator. This project must meet a number of constraints:
• Should be suitable for application of deep learning.
• The project must be of a suitable complexity and challenge that is similar to the
suggested projects. As part of the negotiation, the scope and deliverables of the
project will be set.
• The data set to be used in the project must be available without restrictions before
the start of the negotiation process.
• The proposed project must be independent of previously or concurrently assessed
work. You may not conduct a project if you have already been assessed on the
work, or are concurrently being assessed on the work.
In general, negotiations will take place via email, during consultation hours, or by
appointment. Please note that the course co-ordinator is not available outside of business
hours.
All negotiated projects must be finalised by no later than 5pm Thursday Week
9. This is the absolute deadline. If you wish to conduct a negotiated project, begin the
negotiation process early. A negotiated project may be denied before the deadline if
there is insufficient time for the negotiation process.
5 Submission
You have to submit all the relevant material as listed below via Canvas.
1. A report (of no more than 3 pages, plus up to 2 pages for appendices) critically
analysing your approach and ultimate judgement. The report should be in PDF format.
2. Your code (Jupyter notebooks) used to perform your analysis. This should be a ZIP
file containing all the support files.
The submission portal on Canvas consists of two sub-pages: page one for report
submission and page two for code and other file submission. More information
is provided on Canvas. Include only source code in a zip file containing your name. We
strongly recommend that you attach a README file with instructions on how to run your
application. Make sure that your assignment can run using only the code included in your
zip file! Include a PDF version of your report.
After the due date, you will have 5 business days to submit your assignment as a late
submission. Late submissions will incur a penalty of 10% per day. After these five days,
Canvas will be closed and you will lose ALL the assignment marks.
Assessment declaration:
When you submit work electronically, you agree to the assessment declaration:
https://www.rmit.edu.au/students/student-essentials/assessment-and-exams/assessment/assessment-declaration
6 Academic integrity and plagiarism (standard warning)
Academic integrity is about honest presentation of your academic work. It means acknowledging the work of others while developing your own insights, knowledge and ideas.
You should take extreme care that you have:
• Acknowledged words, data, diagrams, models, frameworks and/or ideas of others
you have quoted (i.e. directly copied), summarised, paraphrased, discussed or mentioned in your assessment through the appropriate referencing methods
• Provided a reference list of the publication details so your reader can locate the
source if necessary. This includes material taken from Internet sites. If you do not
acknowledge the sources of your material, you may be accused of plagiarism because
you have passed off the work and ideas of another person without appropriate
referencing, as if they were your own.
RMIT University treats plagiarism as a very serious offence constituting misconduct.
Plagiarism covers a variety of inappropriate behaviours, including:
• Failure to properly document a source
• Copyright material from the internet or databases
• Collusion between students
For further information on our policies and procedures, please refer to the following:
https://www.rmit.edu.au/students/student-essentials/rights-and-responsibilities/academic-integrity
7 Marking guidelines
A detailed rubric is attached on Canvas. In summary:
• Approach 50%;
• Ultimate Judgment & Analysis 30%;
• Report & Code 20%;
Approach: You are required to use a suitable deep learning based approach to solve
the problem. Each element of the approach needs to be justified using data analysis,
performance analysis and/or published work in literature. This assignment isn’t just
about your code or model, but the thought process behind your work. The elements of
your approach may include:
• Setting up the evaluation framework
• Selecting CNN architecture, loss function and optimization procedure.
• Hyper-parameter setting and tuning
• Identifying problem-specific issues/properties and solutions
• Demonstrating your skills on advanced concepts in deep learning.
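For the hyper-parameter tuning element, one simple and easily documented option is an exhaustive grid search scored on the validation split. The sketch below only enumerates configurations and keeps a log of every trial. The search space is hypothetical, and `validation_score` is a placeholder for a function you would write yourself that trains a model with the given configuration and returns a validation metric.

```python
from itertools import product

# Hypothetical search space; names and values are placeholders.
SEARCH_SPACE = {
    "learning_rate": [1e-3, 1e-4],
    "dropout": [0.1, 0.3],
    "hidden_units": [64, 128],
}


def grid(space):
    """Yield one dict per combination of hyper-parameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def select_best(configs, validation_score):
    """Score every configuration; return the best one plus the full trial log.

    `validation_score` is a user-supplied callable (config -> float) that is
    assumed to train a model and evaluate it on the validation split only.
    """
    log = [(validation_score(config), config) for config in configs]
    best_score, best_config = max(log, key=lambda item: item[0])
    return best_config, best_score, log


configs = list(grid(SEARCH_SPACE))
print(len(configs), "configurations to evaluate")
```

Keeping the full log, rather than only the winner, gives you the evidence needed to justify the final setting in the report. The test split should not be used inside `validation_score`.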
Ultimate Judgement: You must make an ultimate judgement of the “best” model that
you would use and recommend in a real-world setting for this problem. It is up to you to
determine the criteria by which you evaluate your model and determine what it means to
be “the best model”. You need to provide evidence to support your ultimate judgement
and discuss any limitations of your approach/ultimate model. You should use
independent test-data to conduct this investigation (may not apply to some negotiated
projects).
Critical Analysis & Report: Finally, you must compile a report describing and
analysing the approach that you have taken to find a suitable model and make your
ultimate judgement. Your report must be no longer than 3 pages, plus an additional
2 pages for appendices. The appendices must only contain references, figures, diagrams,
or data tables that provide evidence to support the conclusions and statements in your
report.
Any over-length content, or content outside of these requirements, will not
be marked. For example, if your report is too long, ONLY the first 3 pages of text
will be read and marked.
In this report you should describe elements such as:
• Your final selected approach
• Why you selected this approach
• Parameter settings and other approaches you have tried.
• Limitations and improvements that are required for real-world implementation.
This will allow us to understand your rationale. We encourage you to explore this
problem and not just focus on maximising a single performance metric. By the end of
your report, we should be convinced about your ultimate judgement and that you have
considered all reasonable aspects in investigating this problem.
Remember that good analysis provides factual statements, evidence and justifications
for conclusions that you draw. A statement such as:
“I did xyz because I felt that it was good”
is not analysis. This is an unjustified opinion. Instead, you should aim for statements
such as:
“I did xyz because it is more efficient. It is more efficient because . . . ”