COMP3425/COMP8410 Data Mining, Semester 1 2023
20 February to 26 May 2023. On-campus mode (in-person, synchronous).
This course is an introduction to data mining and the broad skills for selecting and applying data
mining algorithms. We cover a breadth of common and emerging techniques from statistics and
computer science with the aim of understanding where they might be useful and how to use them
properly, while understanding limitations. There is an emphasis in the course on data-driven
practical work over the mathematical and statistical foundations, although a conceptual
understanding of methods is expected. Detailed course content will be made available during the
course at the Wattle site.
Quick Reference
Mode of Delivery               On-campus: 12 weeks plus exam. Course material is presented
                               online, supplemented with a 2-hour (real-time online and
                               recorded) lecture plus a 1.5-hour (in-person) lab class in
                               most weeks. A small number of students may attend an online
                               lab class for which evidence of inability to attend in-person
                               classes may be required.
Prerequisites                  COMP8410: (COMP7240 or COMP6240) and (COMP6730 or COMP7230
                               or COMP6710).
                               COMP3425: (COMP1100 or COMP1130 or COMP1730) and COMP2400.
Incompatible courses           COMP3420, COMP8400, COMP8910
Co-Taught courses              COMP3425, COMP8410
Course Convenor                Prof Kerry Taylor
Phone                          6125 8560
Email                          comp8410@anu.edu.au
Office hours for consultation  Thursdays from 2pm until lecture commencement
Research Interests             Semantic Web, Machine Learning, Spatial and IoT data analysis
Administrator                  CECS Student Services
Email                          studentadmin.cecs@anu.edu.au
Lecturer                       Kerry Taylor
Lead Tutor                     N/A
Email                          comp8410@anu.edu.au
Phone                          No phone contact
Other Tutors                   TBA
Textbook (not required)        Han, Pei and Tong, Data Mining: Concepts and Techniques, 4th
                               edition, 2023.
                               https://www.elsevier.com/books/data-mining/han/978-0-12-811760-6
                               It is available in the university library in soft and hard
                               copy and at the on-campus bookshop. The third edition would
                               also be adequate.
Other recommended references   Graham Williams, Data Mining with Rattle and R: The Art of
                               Excavating Data for Knowledge Discovery, Springer 2011.
                               http://www.springer.com/gp/book/9781441998897
                               Witten, Frank, Hall and Pal, Data Mining: Practical Machine
                               Learning Tools and Techniques, 4th Edition, Elsevier 2017.
                               https://www.elsevier.com/books/data-mining/witten/978-0-12-804291-5
Mode of delivery
We aim to offer an effective learning environment for students attending activities in person.
Lectures will be delivered in-person. The format is interactive and we need attendance at those
lectures. Please refer to the University timetable for time and location. Lectures will also be
livestreamed and recorded. Most laboratories will be conducted in person with tutor support
although an option for on-line remote laboratory attendance is available for students who can
supply documentary evidence of inability to attend in-person due to unavoidable international travel
restrictions or visa delays. Exams will be conducted remotely online at a time to be announced. As
the global environment and our own understanding of needs and best solutions change, there may
be operational changes during the semester.
Throughout the course all dates and times are given in the Canberra time zone, i.e. at first AEDT
(UTC+11) but changing to AEST (UTC+10) on Sunday 2nd April 2023, see
https://www.timeanddate.com/worldclock/australia/canberra .
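For students converting deadlines to another time zone, the offset change can be sketched as follows. This is only a minimal illustration, assuming fixed offsets of UTC+11 and UTC+10 and assuming the changeover occurs at 3:00am AEDT on 2 April 2023; the function name canberra_to_utc is hypothetical and not part of any course material.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as stated in the text above.
AEDT = timezone(timedelta(hours=11), "AEDT")
AEST = timezone(timedelta(hours=10), "AEST")

# Assumption: the change happens at 3:00am AEDT on Sunday 2 April 2023.
SWITCH = datetime(2023, 4, 2, 3, 0)


def canberra_to_utc(local: datetime) -> datetime:
    """Convert a naive Canberra wall-clock time (Semester 1 2023) to UTC.

    Simplification: the repeated hour from 2:00am to 2:59am on 2 April
    is always treated as AEDT.
    """
    tz = AEDT if local < SWITCH else AEST
    return local.replace(tzinfo=tz).astimezone(timezone.utc)


# The same 11:59pm Wednesday deadline, before and after the change.
before = canberra_to_utc(datetime(2023, 3, 1, 23, 59))
after = canberra_to_utc(datetime(2023, 4, 5, 23, 59))
print(before.strftime("%a %d %b %H:%M UTC"))  # Wed 01 Mar 12:59 UTC
print(after.strftime("%a %d %b %H:%M UTC"))   # Wed 05 Apr 13:59 UTC
```

A deadline that reads the same on the Canberra clock therefore falls one hour later in UTC after the change, which matters for anyone working from a fixed remote time zone.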
COMP8410 Learning Outcomes
Upon successful completion of this course, students will:
1. Critically analyse and justify the steps involved in the data mining process,
2. Anticipate and identify data issues related to data mining,
3. Research, test and apply the principal algorithms and techniques used in data mining,
4. Justify suitable techniques to use for a given data mining problem,
5. Appraise and reflect upon the results of a data mining project using suitable measurements,
6. Investigate application areas and current research directions of data mining,
7. Reflect upon ethical and social impacts of data mining.
COMP3425 Learning Outcomes
Upon successful completion of this course, students will:
1. Critically analyse and justify the steps involved in the data mining process,
2. Anticipate and identify data issues related to data mining,
3. Test and apply the principal algorithms and techniques used in data mining,
4. Justify suitable techniques to use for a given data mining problem,
5. Appraise and reflect upon the results of a data mining project using suitable measurements,
6. Reflect upon ethical and social impacts of data mining.
Assessment Scheme
Assessment components, weighting and due dates
Assessment Task      Weight %   Min to pass hurdle   Proposed Due Date                     Learning outcomes
Weekly online quiz   1          N/A                  11:59pm Wednesdays                    All
Assignment 1         14         30%                  9am Monday Week 4                     1, 6, 7
Mid-term exam        15         30%                  6pm Wednesday Week 5, TBC             1, 2, 3, 4, 5
Assignment 2         25         30%                  9am Monday Week 10                    1, 2, 3, 4, 5
Final exam           45         40%                  Exam period 1 June to 17 June, TBA    All
OVERALL MARK         100
For assignment topics please see the Wattle site.
Online quizzes will be offered for most of the learning weeks. Quizzes open at 8am on
Monday of the relevant week and close at 11:59pm of the Wednesday of the following week
(i.e., open for 10 days). Quizzes are open-book and primarily intended for self-assessment.
Automated feedback on answers is given and multiple attempts are permitted. Marks for all
such quizzes will be totalled and scaled to contribute 1% to the overall course mark.
Combined, they are also intended as exam practice. If you do not attempt the quiz before
closing time it will not be available to you for subsequent exam revision.
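The quiz window and the 1% scaling described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the example Monday is chosen arbitrarily, and the simple proportional scaling is an assumption, since the exact scaling method is not specified here.

```python
from datetime import date, datetime, time, timedelta


def quiz_window(week_monday: date) -> tuple[datetime, datetime]:
    """Return (opens, closes) for the quiz of the week starting on week_monday."""
    if week_monday.weekday() != 0:
        raise ValueError("week_monday must be a Monday")
    opens = datetime.combine(week_monday, time(8, 0))
    # Wednesday of the following week is 9 days after the Monday.
    closes = datetime.combine(week_monday + timedelta(days=9), time(23, 59))
    return opens, closes


def quiz_contribution(marks: list[float], max_marks: list[float]) -> float:
    """Total all quiz marks and scale to a contribution out of 1 (% of course mark).

    Assumption: scaling is proportional to the total marks available.
    """
    return 1.0 * sum(marks) / sum(max_marks)


# Example: a quiz for the week starting Monday 20 February 2023.
opens, closes = quiz_window(date(2023, 2, 20))
print(opens, closes, closes - opens)
```

Under these assumptions the window runs for just under 10 days, from 8am on the Monday to 11:59pm on the Wednesday nine days later.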
The Mid-term (1 hour) and Final (3 hour) exams will be closed-book: no personal notes
or other online materials are permitted, except that access to the online course material
on Wattle will be permitted. Exams will be conducted remotely online, secured by Proctorio, the ANU’s
remote examination product. Detailed information will be provided via the Wattle News
Forum.
Overall course mark
All assessment components, apart from the weekly quizzes, are “hurdles” under the ANU
Rules. A student must achieve at least the “min to pass” mark given in the table above for
every hurdle assessment to pass the course.
At least 50% overall is required to pass the course.
The contribution of each assessment component to the overall mark is given as a weight in
the table above.
If a deferral is granted for the Mid-term exam, no deferred exam will be held and the Mid-
term exam weight will be added to the Final exam weight for the overall mark.
A supplementary exam will be offered to any student who has an overall mark of at least
45% and who either
o has an overall mark of less than 50%; or
o has failed a hurdle assessment.
Marks may be moderated so that raw marks for assessment components as well as overall
marks may be scaled by the convenor or as a result of school or college academic review.
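The rules in this section can be sketched as a small calculation. This is one reading of the scheme, not an official calculator: it assumes each component mark is expressed as a percentage of that component, that the "min to pass" values are percentages of the component, that a deferred Mid-term carries no hurdle of its own, and that no moderation or scaling is applied. All names are hypothetical.

```python
# Weights (% of overall mark) and hurdle minimums (% of the component),
# copied from the assessment table. The weekly quiz is not a hurdle.
WEIGHTS = {"quiz": 1, "assignment1": 14, "midterm": 15, "assignment2": 25, "final": 45}
HURDLES = {"assignment1": 30, "midterm": 30, "assignment2": 30, "final": 40}


def overall_mark(marks, midterm_deferred=False):
    """Weighted overall mark out of 100; marks are percentages per component."""
    weights = dict(WEIGHTS)
    if midterm_deferred:
        # A granted Mid-term deferral moves its weight onto the Final exam.
        weights["final"] += weights.pop("midterm")
    return sum(w * marks[name] / 100 for name, w in weights.items())


def outcome(marks, midterm_deferred=False):
    """Return 'pass', 'supplementary' or 'fail' under the assumptions above."""
    total = overall_mark(marks, midterm_deferred)
    # Assumption: the Mid-term hurdle is not applied when it was deferred.
    hurdles = {k: v for k, v in HURDLES.items()
               if not (midterm_deferred and k == "midterm")}
    failed_hurdle = any(marks[name] < minimum for name, minimum in hurdles.items())
    if total >= 50 and not failed_hurdle:
        return "pass"
    if total >= 45:
        return "supplementary"
    return "fail"


# Example: an overall mark above 50% but a Final exam mark below its 40% hurdle.
example = {"quiz": 100, "assignment1": 70, "midterm": 60, "assignment2": 65, "final": 35}
print(round(overall_mark(example), 2))  # 51.8
print(outcome(example))                 # supplementary
```

The example shows why the hurdles matter: a student can exceed 50% overall and still not pass outright if any single hurdle component falls below its minimum.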
Policy on late assessment and re-marking
Extensions to the due date for submission will be considered if requests are made to the
convenor at comp8410@anu.edu.au at least 3 days in advance, stating the reasons for
requiring the extension and the extension period requested, and attaching formal evidence
to support the reasons (usually a medical certificate). Only in very exceptional circumstances will
an extension be given without a request 3 days in advance of the due date. Extensions may
delay the release of marks.
Students are strongly encouraged to submit multiple drafts prior to the submission deadline,
so that in cases of last-minute difficulties, a submitted draft can be accepted and marked.
No other extensions are permitted. Late submissions will be neither accepted nor marked.
Students may consider applying for special consideration. An application form must be
completed and lodged online within three business days of the original due date of the
assessment.
Any request for re-consideration of the marking for an assessment item must be submitted
within two weeks (14 elapsed days) of the assessment result being released. The procedure
for such requests will be advised on the Wattle News forum.
Academic Misconduct
Students are expected to have read the ANU Academic Integrity Rule before commencement of the
course. No group work is permitted in any part of the assessment in this course. Plagiarism will not
be tolerated and University procedures will be applied ruthlessly. Therefore your contributions are
expected to be yours alone, except for work that is clearly and appropriately attributed. You may find
this a helpful guide to understanding what constitutes plagiarism and how seriously various
violations will be treated:
http://thevisualcommunicationguy.com/2014/09/16/did-i-plagiarize-the-types-and-severity-of-plagiarism-violations/
Every student is expected to be able to explain and defend a submitted assessment item. The course
convenor may initiate an additional personal interview or assessment process about any submitted
assessment item at any time. If there is a significant discrepancy between the submitted item and
the student’s performance in that process, it will be treated as a case of suspected academic misconduct.
Support for Students
The University offers a number of support services for students. Information on these is available.
Course Organisation
Please spend a while familiarising yourself with the Wattle course site.
You will see that there is a section for each week of the course. Sections may not be visible until the
respective time period commences, to help you pace your way through the course. You are expected
to work through the course notes by self-study or in self-organised study groups if you prefer.
Each section includes a description of the topic to be covered and extensive course notes. Most,
but not all, of the course material is sourced from the course text, Han, Pei and Tong. References to
relevant sections of the text are given so that you may refer to the text for alternative explanations
and extension material. For some topics, additional reading or detailed video explanations are
prescribed.
Usually there are paper-based exercises embedded within the notes to assist you to understand the
course topics. All the exercises are considered mandatory and examinable and you will have trouble
if you do not keep up with them. Some exercises build on the results of previous exercises.
There are also software-based practical exercises (titled “practical exercises”) embedded in the
course notes. You may do these exercises at your own pace if you prefer, or they may be
undertaken in labs with the support of your tutor. If you do not have time to complete them in your
scheduled lab, please do complete them outside classes. Either way, the practical exercises are
mandatory components of the course. In week 1 you will be asked to enrol in a lab at a time to suit
you and no labs will be held in week 1. Most labs will be of 1.5 hours’ duration but some should be
completed within an hour. Because extra lab work may be scheduled during the semester,
you must be prepared to attend 1.5-hour laboratories every week.
Most sections also include an open-book self-assessment online quiz that you are advised to
attempt as many times as you need to gain confidence in more theoretical aspects of the course
topic. Each quiz is automatically marked and your final mark for it contributes to the weekly quiz
component of the assessment scheme. You should attempt the quiz prior to your allocated lab class
and your tutor will work with you to clarify any issues in the class.
Lectures are scheduled for 2 hours on Thursdays at 4pm. The first hour of the lecture will normally
be an interactive Q&A session for the topic of the week, and, falling after the weekly labs, is
expected to be the final activity for that week’s topic, aimed at revision and consolidation.
Attendance and participation in the lectures are highly recommended for your own learning, as this
activity has been strongly appreciated by students in the past. However, the session will not be effective
for learning if there is insufficient participation. In that case, future lectures will be cancelled and
replaced by similar on-line video material. Remote students can participate fully via live streaming.
The second hour on Thursdays will introduce the topic for the following week and marks the
beginning of that topic in your weekly study pattern. Participation or viewing for this second hour is
optional if you prefer to rely only on the extensive on-line material.
Sometimes there will be an additional lecture on Monday at 9am, to extend the current topic or to
discuss assessment tasks. Notice will be given on the Wattle forum.
Please be aware that there is an additional 2-hour lecture scheduled for 8am Monday of Week 1
only. This lecture will orient you to the course and provide valuable concepts for understanding data
mining and for succeeding in the course. In addition, the first assignment will be discussed.
Communication and getting help
The course Discussion forum is the primary mechanism to raise questions or observations on the
course material and this will be monitored very frequently by the course convenor or tutors. Please
pay attention to course announcements on the News forum as these are sometimes critical for
course completion and assessment. Feedback will be provided for submitted assignments, generally
within two weeks of the due date.
Unless you are specifically directed, please do not contact course tutors outside scheduled classes
as they have been engaged to assist on specific tasks in the course. You may contact the course
convenor for private or personal matters using the contact information given at the top of this
document, but, to repeat, the Discussion forum is to be used as the primary method for engagement
with course staff. Generally, if you find that you do not understand something, or that it might be
erroneous, or that something is particularly interesting, many of your co-students will also find it
confusing, wrong or interesting, and we can all benefit from your post.
ANU is committed to the demonstration of educational excellence and regularly seeks feedback
from students. One of the key ways students have to provide feedback is through Student
Experience of Learning Support (SELS) surveys. The feedback given in these surveys is anonymous
and provides the Colleges, University Education Committee and Academic Board with opportunities
to recognise excellent teaching, and opportunities for improvement. For more information on
student surveys at ANU and reports on the feedback provided on ANU courses, see
http://unistats.anu.edu.au/surveys/selt/students/ and the course Programs and Courses page.
Once or twice during the course students may also be asked to complete a survey on specific
matters that can inform the remainder of the course or future course design.
Workload
An ANU 6-unit course is designed for around 130 hours of student effort over the 12 weeks. For this
course, this includes 3 hours per week of semester for self-study when you are expected to work
through the extensive course materials posted on Wattle. Typically, 3-4 hours of lecture and
laboratory work is also required, although there is less in some weeks. The time budget also
includes assignment work. Any remaining time should be used for additional reading such as the
textbook and recommended papers, for further self-study, and for review and reflection.
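As a rough guide, the time budget above can be worked through as simple arithmetic. The sketch assumes the 130 hours are spread evenly over the 12 weeks, which is a simplification, since assignment work tends to cluster around due dates.

```python
TOTAL_HOURS = 130       # stated effort for a 6-unit course
WEEKS = 12
SELF_STUDY = 3          # hours per week on the Wattle course materials
CONTACT_RANGE = (3, 4)  # hours per week of lecture and laboratory work

# Assumption: effort is spread evenly across the 12 weeks.
per_week = TOTAL_HOURS / WEEKS
remaining = [per_week - SELF_STUDY - contact for contact in CONTACT_RANGE]

print(round(per_week, 1))                 # 10.8
print([round(r, 1) for r in remaining])   # [4.8, 3.8]
```

On these assumptions, roughly 4 to 5 hours per week remain for assignments, additional reading, and review.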
Required Resources
A laptop or desktop with a reliable internet connection is required for accessing the course material
on Wattle and for completing the practicals, assignments and labs. Rattle and R will be used
extensively in this course so being able to install freely available software will be necessary. An
alternative is to have access to a laptop or desktop where appropriate software is already installed,
such as ANU CSIT student laboratories. Course software is also available by installing the VMware
Horizon software and logging in to the CS Virtual Desktop Infrastructure (VDI), and this may be the
most convenient option for students with a good internet connection. A smart phone or tablet is unlikely to be sufficient.
Additional Course Costs
You could purchase one or more of the recommended books for this course. Successful completion
of the course does not require such purchase.
And finally, the convenor’s expectations of you as a learner
Kindly refer to the Learning Expectations for students of the School of Computing, provided on the
course Wattle site. Despite best efforts, some errors or confusing messages will slip through in the
course materials and in the course administration, and we encourage you to assist in their resolution
or improvement. If you seek clarification or correction of administrative matters or the
course material, please ask publicly via the Discussion forum or in lectures or labs so that
we can share the answer, for the benefit of all of us. We are intolerant of questions that have
already been addressed in this course outline, lectures, or forums, as we consider such questions to
demonstrate a lack of responsibility for your learning and disrespect for the teaching staff. We
expect this semester will hold new challenges for the learning and teaching for all of us engaged in
this endeavour, and we look forward to working with you as a team!