Module code and Title: DTS201TC Pattern Recognition and Computer Vision
School Title: School of AI and Advanced Computing
Assignment Title: Coursework (Individual technical report)
DTS201TC Coursework
Students
Please save your assignment in a PDF document, and package your code as a ZIP file. Submit both the technical report and the code file via Learning Mall Core to the appropriate drop box. Electronic submission is the only method accepted; no hard copies will be accepted.
You must download your file and check that it is viewable after submission. Documents may become corrupted during the uploading process (e.g., due to slow internet connections). However, students themselves are responsible for submitting a functional and correct file for assessments.
Weight for the individual coursework: 50%
Overview
In this coursework, the student needs to complete a technical report on wine origin prediction using a K-means clustering classifier and the mean feature ranking method.
Learning Outcomes:
C. Carry out classification vs. description, parametric and nonparametric classification, supervised and unsupervised learning
D. Utilise contextual evidence, clustering, recognition with strings, and small sample-size problems
Avoid Plagiarism
• Do NOT submit work from others.
• Do NOT share code/work with others.
• Do NOT copy and paste directly from sources without proper attribution.
• Do NOT use paid services to complete assignments for you.
Technical Report Requirements:
Machine learning (ML) and artificial intelligence (AI) models learn from training data and have demonstrated high accuracy in nonlinear classification and regression tasks. The student needs to load the wine dataset using Python packages, determine the origin of the wines, and use the mean feature ranking method to rank the 13 features of the dataset.
The wine dataset contains the results of a chemical analysis of wines grown in three different regions in Italy. It includes 13 attributes derived from measurements of various constituents found in the wines, such as alcohol content, acidity levels, and concentrations of chemical compounds such as phenols and flavonoids. These attributes describe the chemical composition of the wines and can be used for wine classification tasks. The dataset has 178 samples, each with 13 features. The task is a multi-class classification problem with three classes (label=0, label=1 and label=2).
The student needs to rank the features using N-fold cross validation (e.g., N=3). One machine learning model (the K-means clustering classifier) is used to rank the 13 features with the mean feature ranking method. Please note that K-means clustering is an unsupervised learning model; the true labels of the samples are used to calculate the prediction performance. Finally, the student needs to write a technical report (around 1000 words) that includes the following sections:
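As a starting point, the dataset described above can be loaded and checked as in the minimal sketch below. It assumes the copy of the wine dataset bundled with scikit-learn (`sklearn.datasets.load_wine`) is the intended one; the brief does not name a specific source or package.

```python
import numpy as np
from sklearn.datasets import load_wine

# Assumption: the scikit-learn copy of the wine dataset is the one intended by the brief.
wine = load_wine()
X, y = wine.data, wine.target

print("Samples and features:", X.shape)         # number of rows and columns
print("Samples per class:", np.bincount(y))     # count of label 0, 1 and 2
print("Feature names:", wine.feature_names)
```

Checking the shape and the label set first makes it easier to confirm that later results refer to the 178 samples, 13 features and three classes described above.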
Report Title: Wine Origin Prediction Using K-means Clustering with Mean Feature Ranking Method
Section 1: Introduction (10 marks)
The student needs to give a clear project background and clear project objectives in this section. The student needs to cite references (at least 5) for the literature review in the report.
Section 2: The student needs to give the classification system design using the mean feature ranking method. (20 marks)
2.1 The student needs to give a flowchart of the classification system design and a description of the main steps. (10 marks)
2.2 The student needs to give a correct description of the mean feature ranking method. (10 marks)
Section 3: Experimental results with analysis (40 marks)
3.1 The student needs to write Python code to plot the first two dimensions of the features, using a different color for each of the three class labels. (10 marks)
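One possible way to produce the plot in 3.1 is sketched below. The use of matplotlib, the color choices, and the output file name `wine_first_two_features.png` are illustrative assumptions, not requirements of the brief.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine

wine = load_wine()
X, y = wine.data, wine.target

fig, ax = plt.subplots(figsize=(6, 4))
# One scatter call per class so that each label gets its own color and legend entry
for label, color in zip(range(3), ["tab:red", "tab:green", "tab:blue"]):
    mask = y == label
    ax.scatter(X[mask, 0], X[mask, 1], color=color, s=20, label=f"label={label}")

ax.set_xlabel(wine.feature_names[0])
ax.set_ylabel(wine.feature_names[1])
ax.set_title("Wine dataset: first two features by class label")
ax.legend()
fig.savefig("wine_first_two_features.png", dpi=150)  # hypothetical file name
```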
3.2 The number of clusters is fixed at K=3 for the K-means clustering classifier. The student needs to give the classification results using K-means clustering and all 13 original features. The student needs to write Python code to implement the mean feature ranking method using the K-means (K=3) clustering classifier and 3-fold (N=3) cross validation, and to list the results in a table (e.g., Table 1). (20 marks)
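Because K-means does not output class labels, some rule is needed to turn clusters into predictions before accuracy can be computed. The sketch below shows one common approach, stated here as an assumption because the brief does not prescribe it: each cluster is mapped to the most frequent true label among its training-fold members, and the test fold is scored with that mapping. The function name `kmeans_cv_accuracy`, the stratified shuffled folds, and the seed are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.model_selection import StratifiedKFold


def kmeans_cv_accuracy(X, y, n_clusters=3, n_splits=3, seed=0):
    """Mean test accuracy of K-means used as a classifier under stratified N-fold CV."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_acc = []
    for train_idx, test_idx in skf.split(X, y):
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        train_clusters = km.fit_predict(X[train_idx])
        # Assumed mapping rule: each cluster takes the majority true label
        # of the training samples assigned to it.
        cluster_to_label = np.zeros(n_clusters, dtype=int)
        for c in range(n_clusters):
            members = y[train_idx][train_clusters == c]
            if members.size:
                cluster_to_label[c] = np.bincount(members).argmax()
        y_pred = cluster_to_label[km.predict(X[test_idx])]
        fold_acc.append(np.mean(y_pred == y[test_idx]))
    return float(np.mean(fold_acc))


X, y = load_wine(return_X_y=True)
overall_acc = kmeans_cv_accuracy(X, y)
print(f"3-fold accuracy with 13 original features: {overall_acc:.3f}")
```

The value printed here corresponds to the "Overall accuracy" column of Table 1. Other mapping rules (for example, an optimal one-to-one assignment between clusters and classes) are equally defensible and may give slightly different numbers.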
Table 1: Wine data feature ranking results using K-means (K=3) clustering classifier and mean feature ranking method
| Accuracy using 3-fold cross validation and 13 original features | Accuracy using 3-fold cross validation and mean feature ranking method | Accuracy difference | Feature ranking |
| e.g., Overall accuracy = 80% | e.g., accuracy (feature 1) = 60% | e.g., 20% | e.g., 2 |
| e.g., Overall accuracy = 80% | e.g., accuracy (feature 2) = 70% | e.g., 10% | ... |
| ... | ... | ... | ... |
| e.g., Overall accuracy = 80% | e.g., accuracy (feature 13) = 65% | e.g., 15% | ... |
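The columns of Table 1 can be filled in with a loop over the 13 features, as in the sketch below. It rests on an assumed reading of the mean feature ranking method, which this brief does not define: feature j is replaced by its training-fold mean in every sample (so it carries no information), the 3-fold K-means accuracy is recomputed, and features are ranked by the resulting drop in accuracy, largest drop first. The helper `cv_accuracy`, its `mean_feature` parameter, and the majority-label cluster mapping are hypothetical choices for illustration; the definition given in the module material should take precedence.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.model_selection import StratifiedKFold


def cv_accuracy(X, y, mean_feature=None, n_clusters=3, n_splits=3, seed=0):
    """3-fold K-means accuracy; optionally replace one feature by its training mean."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        X_train, X_test = X[train_idx].copy(), X[test_idx].copy()
        if mean_feature is not None:
            # Assumed definition: the feature is set to its training-fold mean everywhere
            mu = X_train[:, mean_feature].mean()
            X_train[:, mean_feature] = mu
            X_test[:, mean_feature] = mu
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_train)
        # Map each cluster to the majority true label of its training members
        mapping = np.array([
            np.bincount(y[train_idx][km.labels_ == c], minlength=n_clusters).argmax()
            for c in range(n_clusters)
        ])
        scores.append(np.mean(mapping[km.predict(X_test)] == y[test_idx]))
    return float(np.mean(scores))


wine = load_wine()
X, y = wine.data, wine.target

overall = cv_accuracy(X, y)
rows = []
for j, name in enumerate(wine.feature_names):
    acc_j = cv_accuracy(X, y, mean_feature=j)
    rows.append((name, acc_j, overall - acc_j))

# Rank 1 = largest accuracy drop when the feature is replaced by its mean
order = sorted(range(len(rows)), key=lambda j: rows[j][2], reverse=True)
rank = {j: r + 1 for r, j in enumerate(order)}

print(f"Overall accuracy (13 features): {overall:.3f}")
for j, (name, acc_j, diff) in enumerate(rows):
    print(f"{name:<30} accuracy={acc_j:.3f}  difference={diff:+.3f}  rank={rank[j]}")
```

Features with equal accuracy differences are ordered by their original index here; how ties are reported in the table is a choice worth stating explicitly in the report.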
3.3 The student needs to give a correct analysis based on test results. (10 marks)
The student needs to analyse the experimental results. If the mean feature ranking algorithm works well, the student needs to give a detailed analysis of why. In this section, please also discuss why K-means clustering is a nonparametric classification model and what the advantages and disadvantages of using the K-means clustering classifier are for these experiments.
Section 4: Conclusion (20 marks)
4.1 The student needs to conclude what the advantages and disadvantages of the mean feature ranking method are, based on the experimental results. (10 marks)
4.2 The student needs to discuss whether there are any better feature ranking methods. The student needs to show a deep understanding of the project. (10 marks)
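For the comparison in 4.2, one widely used alternative is a univariate ANOVA F-test ranking, sketched below with scikit-learn's `f_classif`. It is offered only as an example of a method that could be compared against; whether it is actually better than mean feature ranking for this dataset is for the report to argue from results.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import f_classif

wine = load_wine()
X, y = wine.data, wine.target

# ANOVA F-statistic per feature: larger values mean the class means differ more
f_scores, p_values = f_classif(X, y)
order = np.argsort(f_scores)[::-1]  # feature indices from highest to lowest F-score

for position, j in enumerate(order, start=1):
    print(f"{position:>2}. {wine.feature_names[j]:<30} F={f_scores[j]:.1f}")
```

Unlike the accuracy-based ranking, this score uses the true labels directly and does not depend on K-means, so the two rankings need not agree.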
Section 5: References and report quality. (10 marks)
The student needs to read at least 5 reference papers for the technical report. Please use a consistent format for all references. The student needs to avoid typing errors in the report and follow the instructions to write the report in clear, correct English.
Note: The student needs to write around 1000 words for the technical report and provide the Python code.