
STA 141A: Fundamentals of Statistical Data Science

STA 141A Final Project

In this final project, you will learn and apply a key machine learning algorithm: the ridge regression model, which generalizes the ordinary linear regression model by introducing a regularization term.
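For orientation, the ridge objective can be written as below, following the form given in ISL Section 6.2.1. The first term is the usual residual sum of squares from linear regression, and the second is the regularization (shrinkage) penalty. The notation (n observations, p covariates, intercept β₀ left unpenalized) follows the book rather than anything specific to the assignment data.

```latex
\hat{\beta}^{R}_{\lambda}
  = \arg\min_{\beta_0,\,\beta}\;
    \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Bigr)^{2}
    + \lambda \sum_{j=1}^{p}\beta_j^{2},
  \qquad \lambda \ge 0 .
```

When λ = 0 the penalty term vanishes and the objective is exactly the least squares criterion.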

Reading

● The conceptual part is in Section 6.2.1, Ridge Regression, of the book An Introduction to Statistical Learning (ISL).

● The coding session is in Section 6.5.2, Ridge Regression and the Lasso, of the same book.

Instructions

● Clean the given data set.

● Plot the standardized ridge regression coefficients against the hyperparameter λ (refer to Figure 6.4, left panel, in the ISL book).

■ Note that "standardized" means you need to standardize the covariates before fitting the model.

● Answer the following discussion questions.

Grading (20 pts total)

● Data cleaning: 5 points (>= 4 issues)

● Modeling: 5 points (Ridge Regression and Linear Regression)

● Plotting: 5 points (Visualizations must be correct, clearly labeled, aesthetically clean)

● Discussion: 5 points

● Readability (deduction)

■ Code should be well-commented and clear.

■ Up to 2 points deduction for poor readability (e.g., unexplained code, no comments, hard to follow).

In [ ]:

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

# Import any packages you want to use below

Data Cleaning

Clean the given dataset first.

● Indicate the potential problems in the data (hint: >= 4 issues).

● Apply a reasonable method to address each of these problems.

In [ ]: # add more cells when needed
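As a rough illustration of what such a cleaning pass might look like, here is a minimal sketch on a toy DataFrame. The assignment's data set is not reproduced here, so the columns `x1`, `x2`, `y`, the planted issues, and the helper `clean` are all hypothetical; the issues in the real data may be different, and other fixes (for example imputation instead of dropping rows) can be equally reasonable.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the real data set and its column names are not shown here.
raw = pd.DataFrame({
    "x1": ["1.0", "2.5", "2.5", "bad", "4.0", "3.0", "3.5", "1.5"],  # numbers stored as text
    "x2": [10.0, 12.0, 12.0, 11.0, np.nan, 9999.0, 13.0, 9.0],       # missing and implausible values
    "y":  [1.2, 2.3, 2.3, 1.9, 3.1, 2.8, 3.0, 1.4],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply four illustrative cleaning steps and return a new DataFrame."""
    out = df.copy()
    # 1. Wrong dtypes: coerce every column to numeric; unparseable entries become NaN.
    out = out.apply(pd.to_numeric, errors="coerce")
    # 2. Exact duplicate rows: keep only the first occurrence.
    out = out.drop_duplicates()
    # 3. Implausible values: flag points outside 1.5 * IQR of the quartiles and blank them.
    q1, q3 = out.quantile(0.25), out.quantile(0.75)
    iqr = q3 - q1
    outlier = (out < q1 - 1.5 * iqr) | (out > q3 + 1.5 * iqr)
    out = out.mask(outlier)
    # 4. Missing values: drop incomplete rows (imputation is another option).
    return out.dropna().reset_index(drop=True)

cleaned = clean(raw)
print(cleaned)
```

Whatever steps you choose, it helps to state each detected problem in a comment or markdown cell next to the code that fixes it.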

Plotting

Make the plots below

In [ ]: # add more cells when needed
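One possible way to produce a coefficient path like the one in ISL Figure 6.4 (left) is sketched below using only NumPy and the closed-form ridge solution. The synthetic data and the helper names `standardize`, `ridge_path`, and `plot_ridge_path` are assumptions made for illustration. Note also that this λ is on the scale of the unnormalized penalty in the ridge objective; libraries such as scikit-learn or glmnet may scale the penalty differently, so λ values are not directly comparable across tools.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def ridge_path(X, y, lambdas):
    """Return an array of shape (len(lambdas), p) of ridge coefficients.

    Uses the closed form (Z'Z + lambda * I)^{-1} Z'y on standardized
    covariates Z and a centered response, so the intercept is not penalized.
    """
    Z = standardize(X)
    yc = y - y.mean()
    p = Z.shape[1]
    gram, rhs = Z.T @ Z, Z.T @ yc
    return np.array([np.linalg.solve(gram + lam * np.eye(p), rhs) for lam in lambdas])

def plot_ridge_path(lambdas, coefs, names):
    """Draw one line per covariate, with lambda on a log-scaled x axis."""
    import matplotlib.pyplot as plt  # imported here so the numeric part runs without it
    fig, ax = plt.subplots()
    for j, name in enumerate(names):
        ax.plot(lambdas, coefs[:, j], label=name)
    ax.set_xscale("log")
    ax.set_xlabel("λ (log scale)")
    ax.set_ylabel("Standardized coefficient")
    ax.set_title("Ridge coefficient paths")
    ax.legend()
    return fig

# Synthetic stand-in for the assignment data: three covariates on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])
y = 2.0 * X[:, 0] - 0.3 * X[:, 1] + 0.01 * X[:, 2] + rng.normal(size=100)

lambdas = np.logspace(-2, 4, 50)
coefs = ridge_path(X, y, lambdas)
```

Calling `plot_ridge_path(lambdas, coefs, ["x1", "x2", "x3"])` followed by `plt.show()` would then display the figure; every curve should shrink toward zero as λ grows.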

Discussion

1. What's the connection between the linear regression model and the ridge regression model? (hint: think about the additional term in ridge regression)

2. How should the parameter λ be interpreted? (Hint: think about how the model changes when the value of λ changes)

3. Why are we interested in the standardized coefficient? (Hint: think about what happens when it is not standardized)

4. Interpret your coefficient for x6 when λ=0. Is it the same as the linear regression coefficient? (You need to run a linear regression model on the same data and compare the two.) Explain why.

In [ ]:

# Run your linear regression model here

# Add more cells when needed
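A comparison along the lines of question 4 might be set up as in the sketch below. It uses synthetic data and a hypothetical helper `ridge_coefs` based on the closed-form ridge solution, not the assignment data. With an exact solver the λ = 0 ridge coefficients and the least squares slopes coincide; iterative library solvers, or libraries that scale the penalty differently, may show small numerical differences.

```python
import numpy as np

# Synthetic stand-in for the assignment data (four covariates, known true coefficients).
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 4))
y = 1.0 + X @ np.array([0.5, -1.0, 0.0, 2.0]) + rng.normal(scale=0.5, size=80)

# Standardize the covariates so both models see the same inputs.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

def ridge_coefs(Z, y, lam):
    """Closed-form ridge slopes on standardized covariates and a centered response."""
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ yc)

# Ordinary least squares with an explicit intercept column.
A = np.column_stack([np.ones(len(y)), Z])
ols = np.linalg.lstsq(A, y, rcond=None)[0]
ols_intercept, ols_slopes = ols[0], ols[1:]

ridge0 = ridge_coefs(Z, y, 0.0)    # no penalty
ridge10 = ridge_coefs(Z, y, 10.0)  # moderate penalty, for contrast

print("OLS slopes:       ", ols_slopes)
print("Ridge, lambda=0:  ", ridge0)
print("Ridge, lambda=10: ", ridge10)
print("lambda=0 matches OLS:", np.allclose(ridge0, ols_slopes))
```

Because the covariates are centered, the fitted intercept equals the mean of the response, which is why centering y in `ridge_coefs` gives the same slopes as fitting an explicit intercept.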


