
Homework 12

MATH 504

1. • Section 5.4 in Elements of Statistical Learning covers spline fitting with

penalty terms.

• Sections 12.3 and 12.4 in Sauer cover SVD.

2. This problem repeats the spline regression problems of hw 11, but now we will use many knots. Specifically, consider the case of 1000 knots equally spaced between the minimum and maximum x values (ages) of the dataset. As before, for this problem let's consider solely the female portion of the BoneMass dataset (file attached).

B-splines are a choice of basis functions for splines that produces design/model matrices that are not badly conditioned. In the setting of 1000 knots, such a basis is essential. Choosing basis functions of the form $h_i(x) = ([x - \zeta_i]_+)^3$, as in the last homework, will lead to huge condition numbers and the regression will not be possible. This is an example of the importance of a good basis in computation.
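The conditioning claim above can be checked numerically. The following is a minimal sketch, not the assignment's solution: the ages are simulated (the BoneMass file is not used), only 20 knots are taken, and the truncated power basis is assumed to consist of 1, x, x^2, x^3 plus one cubic term per interior knot. The names `H`, `Bs`, `kappa_H` and `kappa_Bs` are illustrative.

```r
library(splines)

set.seed(1)
# Hypothetical stand-in for the ages in the dataset
x <- sort(runif(200, min = 9, max = 25))

# A modest number of knots is already enough to see the effect
K <- 20
knots <- seq(min(x), max(x), length.out = K)

# Truncated power basis (assumed form): 1, x, x^2, x^3 and one cubic
# "hinge" per interior knot. The two end knots are skipped because they
# would add an exactly redundant column (at min(x)) and an all-zero
# column (at max(x)).
interior <- knots[2:(K - 1)]
H <- cbind(1, x, x^2, x^3,
           sapply(interior, function(z) pmax(x - z, 0)^3))

# B-spline basis on the same knots
Bs <- splineDesign(knots = knots, x = x, outer.ok = TRUE)

# 2-norm condition numbers (ratio of largest to smallest singular value)
kappa_H  <- kappa(H,  exact = TRUE)
kappa_Bs <- kappa(Bs, exact = TRUE)
print(c(truncated_power = kappa_H, b_spline = kappa_Bs))
```

Increasing `K` in this sketch should make the gap between the two condition numbers grow further.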

To manipulate B-splines, R provides the function splineDesign as part of the

splines package. Here are the two ways you will need to call splineDesign

to apply spline regression with a penalty method.

B <- splineDesign(knots=myknots, x=x, outer.ok=TRUE)

Bpp <- splineDesign(knots=myknots, x=mygrid, derivs=2, outer.ok=TRUE)

Above, B will be a design matrix, meaning that $B_{ij} = b_j(x_i)$, where $b_j(x)$ is the jth spline function in the B-spline basis and $x_i$ is the ith x sample from the dataset. Bpp is similar, except that given the flag derivs=2, $Bpp_{ij} = b_j''(z_i)$ (i.e. the second derivative of $b_j(x)$, evaluated at the ith point $z_i$ of mygrid). See below for the meaning of mygrid.

Note: Using splineDesign, the dimension of the spline space is K − 4, where K is the number of knots, rather than our usual K + 4. splineDesign places some constraints on the behavior of the splines at the end points that restrict the dimension to K − 4 rather than K + 4. This issue is minor and does not change the computations that must be done. B will have dimension N by K − 4, where N is the dimension of x. Bpp will have K − 4 columns, but the number of rows will depend on the number of values in mygrid.
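As a quick check of the dimensions described in the note, here is a small sketch using the two splineDesign calls from the text. The x values are simulated stand-ins for the ages, and the length of `mygrid` (5000) is an arbitrary choice, not a value given in the assignment.

```r
library(splines)

set.seed(1)
# Hypothetical stand-in for the ages in the female BoneMass data
x <- sort(runif(200, min = 9, max = 25))

K <- 1000
myknots <- seq(min(x), max(x), length.out = K)   # equally spaced knots
mygrid  <- seq(min(x), max(x), length.out = 5000) # dense grid (size assumed)

B   <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

print(dim(B))    # N rows, K - 4 columns
print(dim(Bpp))  # length(mygrid) rows, K - 4 columns
```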

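(a) Consider minimizing the following penalized least squares, which we discussed in class:

$$\min_{\alpha} \sum_{i=1}^{N} \Big( y_i - \sum_{j} \alpha_j b_j(x_i) \Big)^2 + \rho \int \Big( \sum_{j} \alpha_j b_j''(x) \Big)^2 \, dx$$

The integral in the penalty can be approximated by a Riemann sum, which gives a matrix Ω with entries

$$\Omega_{j\ell} \approx \sum_{i} b_j''(z_i) \, b_\ell''(z_i) \, \Delta z$$

where the $z_i$ are the points you specify in mygrid, $b_\ell''(z_i)$ is the $i, \ell$ entry of Bpp, and $\Delta z$ is the distance between the points in mygrid. Make sure mygrid forms a dense grid of values so that you get accurate estimates using Riemann integration.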
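The Riemann-sum step can be sketched as follows, assuming the penalty matrix Ω has entries given by the integral of b_j''(x) b_l''(x), so that its approximation is the cross-product of Bpp scaled by ∆z. The age range, the reduced knot count (200 instead of 1000, for speed) and the grid size are illustrative assumptions.

```r
library(splines)

# Hypothetical age range; fewer knots than the assignment to keep this fast
x_min <- 9
x_max <- 25
K <- 200

myknots <- seq(x_min, x_max, length.out = K)
mygrid  <- seq(x_min, x_max, length.out = 2000)  # roughly 10 grid points per knot interval
dz <- mygrid[2] - mygrid[1]                      # spacing between grid points

# Second derivatives of the basis functions on the grid
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

# Riemann sum: Omega[j, l] = sum_i Bpp[i, j] * Bpp[i, l] * dz
Omega <- crossprod(Bpp) * dz

print(dim(Omega))
```

A simple way to judge whether the grid is dense enough is to double its length and see how much the entries of `Omega` change.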

(c) Show that the matrix $B^T B$ is not invertible but that $B^T B + \rho\Omega$ is for ρ > 0. You can do this by just computing the determinant for various values of ρ in R or through theoretical arguments.
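One possible numerical check is sketched below on simulated data, with fewer knots than the assignment (200 knots, 100 points) so that it runs quickly. Because a raw determinant of a large matrix can underflow or overflow, this sketch reports the log of the absolute determinant via `determinant()` together with the smallest eigenvalue. Both are a choice made here, not something the assignment prescribes, and the helper names `min_eig` and `log_det` are hypothetical.

```r
library(splines)

set.seed(1)
N <- 100   # hypothetical sample size
K <- 200   # reduced knot count, so that K - 4 > N still holds
x <- sort(runif(N, min = 9, max = 25))

myknots <- seq(min(x), max(x), length.out = K)
mygrid  <- seq(min(x), max(x), length.out = 2000)
dz <- mygrid[2] - mygrid[1]

B   <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

BtB   <- crossprod(B)          # t(B) %*% B, rank at most N < K - 4
Omega <- crossprod(Bpp) * dz   # Riemann-sum penalty matrix

# Smallest eigenvalue of a symmetric matrix
min_eig <- function(M) {
  min(eigen(M, symmetric = TRUE, only.values = TRUE)$values)
}
# log|det(M)|, which avoids overflow and underflow of det(M)
log_det <- function(M) {
  as.numeric(determinant(M, logarithm = TRUE)$modulus)
}

rhos <- c(0, 0.01, 1, 100)
res <- data.frame(
  rho            = rhos,
  min_eigenvalue = sapply(rhos, function(r) min_eig(BtB + r * Omega)),
  log_abs_det    = sapply(rhos, function(r) log_det(BtB + r * Omega))
)
print(res)
```

The theoretical side of the argument is visible in the shapes alone: B has N rows and K − 4 columns, so when K − 4 > N the matrix $B^T B$ cannot have full rank.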

(d) Calculate α for ρ = 0.01, ρ = 1, and ρ = 100. In each case plot the smoothing spline (hint: B provides discretized versions of the $b_j(x)$; linearly combine them using α to form the fitted spline) together with the data points. Comment on the fit in each case.
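A sketch of the fitting step follows. It assumes α solves the normal equations of the penalized problem, that is, $(B^T B + \rho\Omega)\alpha = B^T y$. The data are simulated (a sine curve plus noise, not the BoneMass values) and the knot count is reduced to 200 for speed; `fit_alpha`, `alphas`, `fits` and `rough` are illustrative names.

```r
library(splines)

set.seed(1)
N <- 100
K <- 200
x <- sort(runif(N, min = 9, max = 25))   # hypothetical ages
y <- sin(x / 3) + rnorm(N, sd = 0.2)     # hypothetical response

myknots <- seq(min(x), max(x), length.out = K)
mygrid  <- seq(min(x), max(x), length.out = 2000)
dz <- mygrid[2] - mygrid[1]

B     <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp   <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)
Omega <- crossprod(Bpp) * dz

# Solve (B'B + rho * Omega) alpha = B'y for a given rho
fit_alpha <- function(rho) {
  solve(crossprod(B) + rho * Omega, crossprod(B, y))
}

rhos   <- c(0.01, 1, 100)
alphas <- sapply(rhos, fit_alpha)   # (K - 4) x 3, one column per rho
fits   <- B %*% alphas              # N x 3, fitted spline at the data points

# Roughness alpha' Omega alpha of each fit
rough <- colSums(alphas * (Omega %*% alphas))
print(data.frame(rho = rhos, roughness = rough))

cols <- c("red", "blue", "darkgreen")
plot(x, y, pch = 16, col = "grey50", main = "Penalized spline fits")
matlines(x, fits, lty = 1, col = cols)
legend("topright", legend = paste("rho =", rhos), lty = 1, col = cols)
```

Points at the very ends of the knot range may be fitted poorly in this sketch because of the end-point constraints that splineDesign imposes, as mentioned in the note on the K − 4 dimension.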

3. This problem will demonstrate an application of the SVD. In the problem below, the rows of the A matrix are formed from a single vector, with some added noise. The rank of a matrix is the dimension of the span of its row or column vectors (it turns out that the dimension is the same for row span and column span). The noise means that the A matrix has rank 10, but without the noise the A matrix would have rank 1. The SVD allows us to compute the lower rank matrix and extract the underlying signal from A.

(a) The script signals.R constructs a 500 × 10 matrix, A, which is saved to the file A.txt. Each row of A has the form (q*sig + noise). sig is a fixed signal, where the signal is a 10 dimensional vector. Also saved, in the file no_noise_A, is a matrix with rows given by q*sig (i.e. no noise). Finally, the file q gives the q value used for each row. Look through signals.R and make sure you understand how A is constructed and the form of sig. Plot the values in the first row of A, the first row of no_noise_A, and the underlying signal. Can you tell what the signal is by looking at A? By averaging the columns of A?
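Since signals.R is not reproduced here, the following sketch builds a matrix of the same shape under stated assumptions: `sig` is taken to be half a sine wave, the q values are uniform on [1, 3], and the noise is Gaussian with standard deviation 0.2. None of these choices come from the actual script. "Averaging the columns" is read here as taking the mean of each column over the 500 rows.

```r
set.seed(42)
n_rows <- 500

sig <- sin(seq(0, pi, length.out = 10))   # hypothetical signal shape
q   <- runif(n_rows, min = 1, max = 3)    # hypothetical scale factor per row

no_noise_A <- outer(q, sig)               # row i is q[i] * sig
A <- no_noise_A + matrix(rnorm(n_rows * 10, sd = 0.2), n_rows, 10)

# First row of A, first row of no_noise_A, and the signal itself
cols <- c("black", "blue", "red")
matplot(1:10, cbind(A[1, ], no_noise_A[1, ], sig),
        type = "b", pch = 1:3, lty = 1, col = cols,
        xlab = "component", ylab = "value")
legend("topright", legend = c("A[1, ]", "no_noise_A[1, ]", "sig"),
       pch = 1:3, lty = 1, col = cols)

# Column averages: the noise averages out, leaving roughly mean(q) * sig
col_avg <- colMeans(A)
print(round(rbind(col_avg, scaled_sig = mean(q) * sig), 3))
```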

(b) Perform an svd on A using R's svd function. Plot the singular values and comment on their values given what we know about A. Consider the approximation $A_1$ of A, where $A_1 = s_1 u^{(1)} (v^{(1)})^T$. Use image(A), image(no_noise_A) and image(A_1) to visualize the three matrices and confirm that $A_1$ removes the noise from A. Compare the first row of A, no_noise_A, and $A_1$. Given a row of A in the form q*sig + noise, what role do $v^{(1)}$, $s_1$ and $u^{(1)}$ have in capturing q and sig?
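The rank-one approximation can be sketched as below, again on a simulated matrix, because A.txt and signals.R are not available here. The signal shape, the q distribution and the noise level are assumptions, and `A_1`, `v1`, `cos_v1_sig` and `cor_q` are illustrative names.

```r
set.seed(42)
n_rows <- 500

sig <- sin(seq(0, pi, length.out = 10))   # hypothetical signal shape
q   <- runif(n_rows, min = 1, max = 3)    # hypothetical scale factor per row
no_noise_A <- outer(q, sig)
A <- no_noise_A + matrix(rnorm(n_rows * 10, sd = 0.2), n_rows, 10)

s <- svd(A)
plot(s$d, type = "b", xlab = "index", ylab = "singular value")

# Rank-one approximation A_1 = s_1 * u^(1) * t(v^(1))
A_1 <- s$d[1] * outer(s$u[, 1], s$v[, 1])

# Visual comparison of the three matrices
image(A); image(no_noise_A); image(A_1)

# Distance to the noise-free matrix, before and after truncation
err_A   <- norm(A   - no_noise_A, type = "F")
err_A_1 <- norm(A_1 - no_noise_A, type = "F")
print(c(err_A = err_A, err_A_1 = err_A_1))

# v^(1) should point along sig (up to sign), and s_1 * u^(1) should track q
v1 <- s$v[, 1]
cos_v1_sig <- abs(sum(v1 * sig)) / sqrt(sum(sig^2))
cor_q <- abs(cor(s$d[1] * s$u[, 1], q))
print(c(cos_v1_sig = cos_v1_sig, cor_q = cor_q))
```

The sign of the singular vectors returned by `svd` is arbitrary, which is why the comparisons in this sketch use absolute values.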
