Homework 3
This homework is worth a total of 100 points.
Each question is worth 20 points.
1. What are the main motivations for reducing a dataset’s dimensionality? What are the main drawbacks?
2. Suppose you perform PCA on a 1,000-dimensional dataset, setting the explained variance ratio to 95%. How many dimensions will the resulting dataset have?
3. How can you evaluate the performance of a dimensionality reduction algorithm on your dataset?
4. How would you define clustering? Can you name a few clustering algorithms?
5. What are some of the main applications of clustering algorithms?