
Homework 12

MATH 504

1. • Section 5.4 in Elements of Statistical Learning covers spline fitting with
     penalty terms.
   • Sections 12.3 and 12.4 in Sauer cover SVD.

2. This problem repeats the spline regression problems of hw 11, but now we
will use many knots. Specifically, consider the case of 1000 knots equally
spaced between the minimum and maximum x values (ages) of the dataset. As
before, for this problem let's consider solely the female portion of the
BoneMass dataset (file attached).

B-splines are a choice of basis functions for splines that produces design/model
matrices that are not badly conditioned. In the setting of 1000 knots, such a
basis is essential. Choosing basis functions of the form
$h_i(x) = ([x - \zeta_i]_+)^3$, as in the last homework, will lead to huge
condition numbers and the regression will not be possible. This is an example
of the importance of a good basis in computation.
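The conditioning claim can be checked on a small scale. The following is a minimal sketch, assuming synthetic x values in place of the BoneMass ages and only 20 knots; the names H, inner and cond are illustrative and do not come from hw 11. The two bases do not span exactly the same space, so the comparison is only meant to show the order of magnitude.

```r
library(splines)

set.seed(1)
x <- sort(runif(200, 10, 25))                  # synthetic stand-in for the ages
knots <- seq(min(x), max(x), length.out = 20)
inner <- knots[-c(1, length(knots))]           # interior knots only

# Truncated power basis: 1, x, x^2, x^3 and ([x - zeta_i]_+)^3 per interior knot
H <- cbind(1, x, x^2, x^3, sapply(inner, function(z) pmax(x - z, 0)^3))

# B-spline design matrix on the same knot sequence (cubic, ord = 4 by default)
B <- splineDesign(knots = knots, x = x, outer.ok = TRUE)

# Condition number as the ratio of largest to smallest singular value
cond <- function(M) { s <- svd(M)$d; s[1] / s[length(s)] }
cond_H <- cond(H)
cond_B <- cond(B)
```

Printing cond_H and cond_B side by side should show the truncated power basis to be worse conditioned by many orders of magnitude, and the gap grows with the number of knots.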

To manipulate B-splines, R provides the function splineDesign as part of the

splines package. Here are the two ways you will need to call splineDesign

to apply spline regression with a penalty method.

    B <- splineDesign(knots=myknots, x=x, outer.ok=TRUE)
    Bpp <- splineDesign(knots=myknots, x=mygrid, derivs=2, outer.ok=TRUE)

Above, B will be a design matrix, meaning that $B_{ij} = b_j(x_i)$, where
$b_j(x)$ is the jth spline function in the B-spline basis and $x_i$ is the ith
x sample from the dataset. Bpp is similar, except that given the flag derivs=2,
$\mathrm{Bpp}_{ij} = b_j''(z_i)$ (i.e. the second derivative of $b_j(x)$),
evaluated at the ith point $z_i$ of mygrid. See below for the meaning of mygrid.

Note: Using splineDesign, the dimension of the spline space is $K - 4$, where
$K$ is the number of knots, rather than our usual $K + 4$. splineDesign places
some constraints on the behavior of the splines at the end points that restrict
the dimension to $K - 4$ rather than $K + 4$. This issue is minor and does not
change the computations that must be done. B will have dimension $N$ by
$K - 4$, where $N$ is the length of x. Bpp will have $K - 4$ columns, but the
number of rows will depend on the number of values in mygrid.
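The dimensions described in the note can be confirmed on a toy problem. This is a sketch under assumed values: 50 sample points, 12 knots and a 200 point grid, none of which come from the assignment.

```r
library(splines)

x <- seq(0, 1, length.out = 50)                   # hypothetical sample points
myknots <- seq(min(x), max(x), length.out = 12)   # K = 12 knots
mygrid <- seq(min(x), max(x), length.out = 200)   # hypothetical evaluation grid

B <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

dim_B <- dim(B)       # N rows, K - 4 columns
dim_Bpp <- dim(Bpp)   # length(mygrid) rows, K - 4 columns
```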

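(a) Consider minimizing the following penalized least squares, which we
discussed in class:

$$\min_{\alpha} \; \sum_{i=1}^{N} \Big( y_i - \sum_{j=1}^{K-4} \alpha_j b_j(x_i) \Big)^2 + \rho \int \Big( \sum_{j=1}^{K-4} \alpha_j b_j''(x) \Big)^2 dx.$$

The penalty term equals $\rho\, \alpha^T \Omega \alpha$, where
$\Omega_{j\ell} = \int b_j''(x)\, b_\ell''(x)\, dx$ can be approximated by the
Riemann sum

$$\Omega_{j\ell} \approx \sum_i b_j''(z_i)\, b_\ell''(z_i)\, \Delta z,$$

where the $z_i$ are the points you specify in mygrid, $b_\ell''(z_i)$ is the
$i, \ell$ entry of Bpp, and $\Delta z$ is the distance between the points in
mygrid. Make sure mygrid forms a dense grid of values so that you get accurate
estimates using Riemann integration.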
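In matrix form, a Riemann sum over mygrid of products of second derivatives is just a scaled cross product of Bpp with itself. A minimal sketch, assuming 12 equally spaced knots on [0, 1] and a 2001 point grid (both hypothetical), with Omega denoting the matrix of integrals of $b_j''(x)\, b_\ell''(x)$:

```r
library(splines)

myknots <- seq(0, 1, length.out = 12)     # hypothetical knots
mygrid <- seq(0, 1, length.out = 2001)    # dense, equally spaced grid
dz <- mygrid[2] - mygrid[1]               # spacing between grid points

Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)

# Omega[j, l] is approximated by sum_i Bpp[i, j] * Bpp[i, l] * dz
Omega <- dz * crossprod(Bpp)

# By construction Omega is symmetric and positive semidefinite
eig_Omega <- eigen(Omega, symmetric = TRUE)$values
```

Rerunning with a coarser mygrid and comparing the entries of Omega is one way to judge whether the grid is dense enough.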

(c) Show that the matrix $B^T B$ is not invertible but that
$B^T B + \rho\Omega$ is for $\rho > 0$. You can do this by just computing the
determinant for various values of $\rho$ in R or through theoretical arguments.
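One numerical way to look at invertibility, besides the determinant, is the smallest eigenvalue of each symmetric matrix. The sketch below uses an assumed toy setting (8 data points, 20 knots, so more basis functions than data points, as in the 1000 knot case) and is not the BoneMass computation itself.

```r
library(splines)

x <- seq(0.1, 0.9, length.out = 8)          # hypothetical data locations
myknots <- seq(0, 1, length.out = 20)       # 16 basis functions > 8 points
mygrid <- seq(0, 1, length.out = 2001)
dz <- mygrid[2] - mygrid[1]

B <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)
Omega <- dz * crossprod(Bpp)                # Riemann approximation of the penalty

BtB <- crossprod(B)                         # rank at most 8, so singular
rho <- 1
M <- BtB + rho * Omega

eig_BtB <- eigen(BtB, symmetric = TRUE)$values
eig_M <- eigen(M, symmetric = TRUE)$values
min_eig_BtB <- min(eig_BtB)                 # zero up to rounding error
min_eig_M <- min(eig_M)                     # strictly positive
```

The theoretical argument behind the first result is that $B^T B$ has rank at most the number of rows of B, which is smaller than the number of columns here.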

(d) Calculate $\alpha$ for (a) $\rho = 0.01$, (b) $\rho = 1$ and (c)
$\rho = 100$. In each case plot the smoothing spline (hint: B provides you with
discretized versions of the $b_j(x)$; linearly combine them using $\alpha$ to
form the fitted spline) and the data points. Comment on the fit in each case.
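Setting the gradient of the penalized objective to zero gives the linear system $(B^T B + \rho\Omega)\alpha = B^T y$. The following sketch solves it for the three values of $\rho$ on synthetic data; the data, the knot count of 60 and the names fit_alpha, Bg and roughness are assumptions for illustration, not part of the assignment.

```r
library(splines)

set.seed(504)
n <- 40
x <- sort(runif(n))                          # synthetic predictor, not BoneMass
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)    # synthetic response

myknots <- seq(min(x), max(x), length.out = 60)
mygrid <- seq(min(x), max(x), length.out = 3000)
dz <- mygrid[2] - mygrid[1]

B <- splineDesign(knots = myknots, x = x, outer.ok = TRUE)
Bg <- splineDesign(knots = myknots, x = mygrid, outer.ok = TRUE)
Bpp <- splineDesign(knots = myknots, x = mygrid, derivs = 2, outer.ok = TRUE)
Omega <- dz * crossprod(Bpp)

# Solve (B^T B + rho * Omega) alpha = B^T y for a given rho
fit_alpha <- function(rho) solve(crossprod(B) + rho * Omega, crossprod(B, y))

rhos <- c(0.01, 1, 100)
alphas <- sapply(rhos, fit_alpha)            # one column of coefficients per rho
fits <- Bg %*% alphas                        # fitted curves evaluated on mygrid

# Penalty alpha^T Omega alpha for each fit; it cannot increase as rho grows
roughness <- apply(alphas, 2, function(a) drop(t(a) %*% Omega %*% a))
```

Calling plot(x, y) followed by lines(mygrid, fits[, k]) overlays the fit for rhos[k] on the data. Evaluating the basis on mygrid rather than on x gives a smoother looking curve than plotting B %*% alpha at the data points alone.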

3. This problem will demonstrate an application of the SVD. In the problem
below, the rows of the A matrix are formed from a single vector, with some
added noise. The rank of a matrix is the dimension of the span of its row or
column vectors (it turns out that the dimension is the same for the row span
and the column span). The noise means that the A matrix has rank 10, but
without the noise the A matrix would have rank 1. The SVD allows us to compute
the lower-rank matrix and extract the underlying signals from A.
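The effect of noise on rank can be seen by counting non-negligible singular values. This is a minimal sketch with a made-up signal and made-up q values; it does not reproduce signals.R, whose contents are not shown here.

```r
set.seed(12)
sig <- c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1)     # hypothetical 10 dimensional signal
q <- runif(500, 1, 2)                      # hypothetical scale for each row
clean <- outer(q, sig)                     # row i is q[i] * sig
A <- clean + matrix(rnorm(500 * 10, sd = 0.5), 500, 10)

# Numerical rank: number of singular values above a relative tolerance
num_rank <- function(M, tol = 1e-8) { d <- svd(M)$d; sum(d > tol * d[1]) }
rank_clean <- num_rank(clean)
rank_noisy <- num_rank(A)
```

Here rank_clean is 1 because every row is a multiple of sig, while independent noise in every entry pushes rank_noisy up to the full column count of 10.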

(a) The script signals.R constructs a 500 × 10 matrix, A, which is saved
to the file A.txt. Each row of A has the form (q*sig + noise). sig is a
fixed signal, where the signal is a 10 dimensional vector. Also saved, in the
file no_noise_A, is a matrix with rows given by q*sig (i.e. no noise).
Finally, the file q gives the q value used for each row. Look through
signals.R and make sure you understand how A is constructed and
the form of sig. Plot the values in the first row of A, the first row of
no_noise_A, and the underlying signal. Can you tell what the signal is
by looking at A? By averaging the columns of A?

(b) Perform an SVD on A using R's svd function. Plot the singular values
and comment on their values given what we know about A. Consider
the approximation $A_1$ of A, where $A_1 = s_1 u^{(1)} (v^{(1)})^T$. Use
image(A), image(no_noise_A) and image(A_1) to visualize the three matrices
and confirm that $A_1$ removes the noise from A. Compare the first row of
A, no_noise_A, and $A_1$. Given a row of A in the form q*sig + noise,
what role do $v^{(1)}$, $s_1$ and $u^{(1)}$ have in capturing q and sig?
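The rank 1 approximation is built from the first singular triple returned by svd. The sketch below again uses a made-up signal and q values rather than the files produced by signals.R, so the numbers are only indicative.

```r
set.seed(12)
sig <- c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1)     # hypothetical signal
q <- runif(500, 1, 2)                      # hypothetical per-row scale
no_noise_A <- outer(q, sig)
A <- no_noise_A + matrix(rnorm(500 * 10, sd = 0.5), 500, 10)

s <- svd(A)                                # s$d, s$u, s$v
A_1 <- s$d[1] * outer(s$u[, 1], s$v[, 1])  # s_1 * u^(1) * (v^(1))^T

# Distance of A and of A_1 from the noise free matrix (Frobenius norm)
err_A <- norm(A - no_noise_A, "F")
err_A1 <- norm(A_1 - no_noise_A, "F")

# v^(1) should align with sig, and u^(1) with q, up to sign and scale
cor_v_sig <- abs(cor(s$v[, 1], sig))
cor_u_q <- abs(cor(s$u[, 1], q))
```

The absolute value is taken because singular vectors are only determined up to a simultaneous sign flip of $u^{(1)}$ and $v^{(1)}$.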
