COMP5318/COMP4318 Machine Learning and Data Mining
s1 2025
Week 10 Tutorial exercises
Clustering 2
Exercise 1. DBSCAN clustering
Use the DBSCAN algorithm to cluster the items A1, A2, …, A8. The distance matrix is given below. Assume that Eps=2 and MinPts=2.
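The DBSCAN procedure can be sketched in a few lines of Python. This is a minimal illustration, not the worked solution. The function name `dbscan` and the six points are hypothetical and are not the A1 to A8 data from the exercise. The sketch assumes that a point's Eps-neighbourhood includes the point itself and uses "distance <= Eps"; check this against the convention used in the lectures.

```python
def dbscan(dist, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(dist)
    # Eps-neighbourhood of every point (assumption: the point itself is included).
    neighbours = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    # A core point has at least min_pts points in its neighbourhood.
    core = [len(nb) >= min_pts for nb in neighbours]

    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        # Start a new cluster from an unlabelled core point and grow it.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            if not core[p]:
                continue  # border points join the cluster but do not expand it
            for q in neighbours[p]:
                if labels[q] is None:
                    labels[q] = cluster
                    frontier.append(q)
        cluster += 1
    # Anything never reached from a core point is noise.
    return [-1 if lab is None else lab for lab in labels]


# Hypothetical 1-D points, used only to build an example distance matrix.
positions = [0, 1, 2, 10, 11, 20]
dist = [[abs(a - b) for b in positions] for a in positions]

print(dbscan(dist, eps=2, min_pts=2))  # [0, 0, 0, 1, 1, -1]
```

With these hypothetical points, the first three form one cluster, the next two form a second cluster, and the last point is noise because its neighbourhood contains only itself.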
Exercise 2. Evaluating clustering quality using the silhouette coefficient
Four items, P1, P2, P3 and P4, were clustered using a clustering algorithm. The cluster labels and the distance matrix are shown below. Evaluate the quality of the clustering by computing the silhouette coefficient for each point, for each of the 2 clusters and for the overall clustering.
Distance matrix:
Cluster labels:
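As a reminder of the calculation, the sketch below computes the silhouette coefficient s = (b - a) / max(a, b) for each point, where a is the mean distance to the other points in the same cluster and b is the smallest mean distance to the points of any other cluster. The distance matrix and labels in the code are hypothetical values chosen for illustration and are not the P1 to P4 data of this exercise. The function name `silhouette_values` and the choice of s = 0 for a single-point cluster are also assumptions.

```python
def silhouette_values(dist, labels):
    """Return the silhouette coefficient of every point."""
    n = len(dist)
    clusters = sorted(set(labels))
    scores = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # assumed convention for single-point clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return scores


def mean(values):
    return sum(values) / len(values)


# Hypothetical symmetric distance matrix and cluster labels (illustration only).
dist = [
    [0, 1, 4, 6],
    [1, 0, 5, 5],
    [4, 5, 0, 2],
    [6, 5, 2, 0],
]
labels = [0, 0, 1, 1]

scores = silhouette_values(dist, labels)
per_cluster = {
    c: mean([s for s, lab in zip(scores, labels) if lab == c])
    for c in sorted(set(labels))
}
overall = mean(scores)

print([round(s, 3) for s in scores])                     # [0.8, 0.8, 0.556, 0.636]
print({c: round(v, 3) for c, v in per_cluster.items()})  # {0: 0.8, 1: 0.596}
print(round(overall, 3))                                 # 0.698
```

The cluster value is taken here as the average over the cluster's points and the overall value as the average over all points. Values close to 1 indicate compact, well-separated clusters.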
Exercise 3. Evaluating clustering quality using correlation
For the data from the previous exercise, evaluate the clustering quality using the correlation between the similarity matrix derived from the distance matrix (given below) and the similarity matrix derived from the clustering results (i.e. the matrix whose ij entry is 1 if objects i and j belong to the same cluster and 0 otherwise).
The similarity matrix derived from the distance matrix is given below. It was computed from the distance matrix as s = 1 - (d - dmin)/(dmax - dmin), where dmin and dmax are the minimum and maximum distances in the matrix: dmin = 0.1 and dmax = 0.7.
Similarity matrix:
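The correlation-based evaluation can be sketched as follows. The code converts distances to similarities with s = 1 - (d - dmin)/(dmax - dmin), builds the 0/1 matrix from the cluster labels, and computes the Pearson correlation between the two. The distances and labels are hypothetical and are not the data of this exercise. Because both matrices are symmetric, the sketch assumes that only the entries above the diagonal are compared, and it takes dmin and dmax over those entries. The helper names are illustrative.

```python
from math import sqrt


def upper_triangle(matrix):
    """Entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical symmetric distance matrix and cluster labels (illustration only).
dist = [
    [0, 1, 4, 6],
    [1, 0, 5, 5],
    [4, 5, 0, 2],
    [6, 5, 2, 0],
]
labels = [0, 0, 1, 1]
n = len(dist)

# Similarities from distances: s = 1 - (d - dmin) / (dmax - dmin).
d = upper_triangle(dist)
dmin, dmax = min(d), max(d)
sim = [1 - (value - dmin) / (dmax - dmin) for value in d]

# Clustering-derived matrix: 1 if the two objects share a cluster, else 0.
same = [[1 if labels[i] == labels[j] else 0 for j in range(n)] for i in range(n)]
incidence = upper_triangle(same)

r = pearson(sim, incidence)
print(round(r, 2))  # 0.93
```

A correlation close to 1 means that objects placed in the same cluster also tend to be the most similar ones, which indicates a good clustering.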