Midterm Exam
Q1 Exam Logistics
0 Points
This exam is written to be completed in about 1-2 hours. To maximize flexibility, it will be available between Friday, 04/25 at 9:00 am and Wednesday, 04/30 at 3:30 pm. It is due strictly at 3:30 pm PDT on Wednesday, April 30. You may NOT use late days on this exam.
You may work in multiple sessions and submit as many times as you'd like (we grade only your last submission).
You may work alone, or in groups of up to 2 (and submit a single exam). Group submission instructions:
First, have one person click "Save Answer" on any question. Then, scroll to the very bottom and click "Submit & View Submission".
You can now add your group members in the upper right corner, just like a programming assignment.
Then, you can click "Resubmit" to edit your answers as many times as you want, with the same group.
Be careful not to have two people edit answers at the same time (Gradescope may lose something). We recommend coordinating through a different platform and having one person submit for the group.
This test is open-note and open-internet. However, you are not permitted to share these questions or get help from anybody not enrolled in CSE/STAT 416 25sp, such as former students. Additionally, any work you turn in should be your own or your groupmates'; you should not turn in work written by other human beings or by AI agents such as ChatGPT.
During the exam, you may ask clarification questions on Ed or in office hours. Course staff will not answer questions about course concepts or give hints on specific exam questions.
Any significant exam clarifications will be posted on Ed.
This problem asks you to affirm that you and your group have read all of these policies and those on the website. Failure to fill out this question may result in your exam not being marked for credit.
Question 1: My group (if applicable) and I affirm that we have read over all exam instructions posted here and on the course website. Write the names of all group members here.
Q2
13 Points
Q2.1 Mean-Squared Error (MSE)
2 Points
A linear regression model $y = f(x) = 2h_1(x) + 5h_2(x) - 2$ is used to evaluate the following test set:
Example   $h_1(x)$   $h_2(x)$   $y$
1         2          1          8
2         4          5          28
3         7          4          32
4         3          5          34
What is the mean-squared error (MSE) across these datapoints? Please give your answer to 2 decimal places.
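As a reminder of how MSE is computed, here is a minimal Python sketch. The model coefficients and the three datapoints are hypothetical and deliberately differ from the test set above; the function names `predict` and `mean_squared_error` are illustrative only.

```python
def predict(h1, h2, w1=1.0, w2=3.0, intercept=-1.0):
    """Hypothetical linear model: w1*h1 + w2*h2 + intercept."""
    return w1 * h1 + w2 * h2 + intercept


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical datapoints: (h1, h2) feature pairs and their true targets.
features = [(1, 2), (3, 1), (2, 2)]
targets = [7, 5, 6]

predictions = [predict(h1, h2) for h1, h2 in features]
mse = mean_squared_error(targets, predictions)
print(f"{mse:.2f}")  # 0.67
```

The same two steps (predict each example, then average the squared residuals) apply to any linear model and test set.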
Q2.2 L2 Regularized Error
1 Point
Recall that the formula for L2 regularized error is $\text{MSE}(\hat{w}) + \lambda \|\hat{w}_{1:D}\|_2^2$, where $\hat{w}_{1:D}$ refers to all the weights except the intercept.
What is the value of $\lambda \|\hat{w}_{1:D}\|_2^2$ for the model in Q2.1 above, where $\lambda = 2$?
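The penalty term in this formula can be sketched in Python as follows. The weight vector and the value of λ here are hypothetical (they are not the model from Q2.1), and the convention that `weights[0]` holds the intercept is an assumption made for this example.

```python
def l2_penalty(weights, lam):
    """Return lam * (sum of squared weights), excluding the intercept.

    Assumes weights[0] is the intercept, so only weights[1:] are penalized.
    """
    return lam * sum(w ** 2 for w in weights[1:])


# Hypothetical weights: intercept 0.5, then two feature weights.
weights = [0.5, 3.0, -1.0]
penalty = l2_penalty(weights, lam=0.5)
print(penalty)  # 5.0
```

Note that the intercept does not contribute to the penalty, matching the $\hat{w}_{1:D}$ notation.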
Q2.3 Hyper-Parameter Tuning for LASSO
1 Point
How should we select which value of λ to use for LASSO?
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w})$ on the training set
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w})$ on the test set
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w})$ on the validation set
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w}) + \lambda \|\hat{w}\|_1^1$ on the training set
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w}) + \lambda \|\hat{w}\|_1^1$ on the test set
Choose the setting of λ that has the smallest $\text{MSE}(\hat{w}) + \lambda \|\hat{w}\|_1^1$ on the validation set
Choose the setting of λ that results in the smallest coefficients.
Choose the setting of λ that results in the largest coefficients.
Q2.4
2 Points
In your own words, explain your answer to the above question. For full credit, you must justify your choice for the set to be used (train, test, or validation), and your choice for the quality metric to be used ($\text{MSE}(\hat{w})$ or $\text{MSE}(\hat{w}) + \lambda \|\hat{w}\|_1^1$).
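For reference, the two quantities named above can be computed as in the sketch below. All values are hypothetical, the helper name `l1_penalty` is illustrative, and the example only shows how the numbers are formed; it says nothing about which metric or dataset split to use.

```python
def l1_penalty(weights, lam):
    """Return lam * (sum of absolute values of the weights)."""
    return lam * sum(abs(w) for w in weights)


# Hypothetical values for illustration only.
mse_value = 2.0               # MSE of some fitted model on some dataset
feature_weights = [3.0, -1.0]
lam = 0.5

regularized_error = mse_value + l1_penalty(feature_weights, lam)
print(mse_value, regularized_error)  # 2.0 4.0
```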
Q2.5 Changing Units
2 Points
James has a dataset, $D_1$, with features about houses and their prices. He does not normalize these features. One such feature, $h_3(x)$, is "Age of House (years)." James trains a Ridge Regression model, $f_1$, on $D_1$.
Later, James decides to change the units of that feature $h_3(x)$ to "Age of House (decades)," and he calls the new dataset $D_2$. In other words, $D_2$ is the same dataset as $D_1$, with only one feature changed. He then trains another Ridge Regression model, $f_2$, on $D_2$.
What do you expect to happen to the weight $w_3$ in $f_2$ corresponding to "Age of House (decades)" compared to the weight $w_3$ in $f_1$ corresponding to "Age of House (years)"?
$w_3$ in $\hat{f}_2$ will be larger
$w_3$ in $\hat{f}_2$ will be smaller
$w_3$ in $\hat{f}_2$ will be the same
Q2.6 Feature Selection (Part 1)
2 Points
Taylor has a dataset with a large number of features, and she wants to use an algorithm to help select the most important features. Out of the following choices, select all of the reasonable approaches that she could take.
Greedy forward stepwise algorithm
Greedy backward stepwise algorithm
Ridge Regression for feature selection
LASSO Regression for feature selection
Q2.7 Feature Selection (Part 2)
2 Points
In your own words, explain your answer to the above question. For full credit, you must justify your selection for all four options above (greedy forward stepwise algorithm, greedy backward stepwise algorithm, Ridge regression for feature selection, and LASSO regression for feature selection).
Q2.8 Bias and Variance
1 Point
What happens to bias and variance as a model becomes more complex?
They both decrease
They both increase
Bias increases, variance decreases
Bias decreases, variance increases