首页 >
> 详细

EECS6127–Winter 2020–2021

Assignment 1

Due: 11:59:59 PM, February 6, 2021.

A full score on this assignment is 25 marks. However there are 28 marks achievable on

this assignment (up to 3 bonus marks). To gain full marks, explain your answers carefully

and prove your claims.

1. Bayes classifier and Bayes risk

Recall that we model the data generation as a probability distribution D over X ×

{0, 1}. D can be defined in term of its marginal distribution over X , which we will

denote by DX and the conditional labeling distribution, which is defined by the

regression function

µ(x) = P

(x,y)∼D

[y = 1 | x]

Let’s consider a 2-dimensional Euclidean domain, that is X = R

2

, and the

following process of data generation: The marginal distribution over X is uniform

over two square areas [1, 2] × [1, 2] ∪ [3, 4] × [1.5, 2.5]. Points in the first

square Q1 = [1, 2] × [1, 2] are labeled 0 (blue) and points in the second square

Q2 = [3, 4] × [1.5, 2.5] are labeled 1 (red), as in the illustration below.

(a) Describe the density function of DX, and the regression function, Bayes predictor

and Bayes risk of D.

(b) Consider the two distributions D1 and D2

that we obtain by “projecting” onto

each of the axes. Formally, we are marginalizing out one of the features to

obtain D1 and D2

. Both are distributions over R× {0, 1}. Describe the density

functions of D1

X

and D2

X

, and the regression functions, Bayes predictors and

Bayes risks of D1 and D2

.

1

(c) Consider the hypothesis classes Hinit and Hdecst from Question 2 on this assignment.

Determine the approximation errors of Hinit for D1 and D2 and the

approximation error of Hdecst for D.

(d) For the classes Hinit and Hdecst consider their closures under function complements

and again determine the approximation errors on D, D1 and D2

.

3 + 3 + 3 + 3 marks

2. VC-dimension

We define hypothesis classes Hinit of initial segments over domain X = R and Hdecst

of decisions stumps over domain X = R

2 as follows:

Hinit = {ha | a ∈ R}, where ha(x) = 1[x ≤ a] ,

and

Hdecst = {h

i

a

| a ∈ R, i ∈ {1, 2}}, where h

i

a

((x1, x2)) = 1[xi ≤ a] .

Further, for a hypothesis class H ∈ {0, 1}

X , we define the class of complements of

H, denoted by Hc

, as the class where we flip all the labels of the functions in H,

that is

H

c = {h

c

| h ∈ H} where h

c

(x) = |h(x) − 1|.

Finally , we let Hcc = H ∪ Hc denote the closure of H under complements.

(a) Determine the VC-dimensions of Hinit and Hdecst.

(b) Show that for every hypothesis class H, we have VC(H) = VC(Hc

).

(c) Determine the VC-dimensions of Hcc

init and Hcc

decst.

(d) Show that, for every k, there exists a hypothesis class H with VC-dimension

k and VC(Hcc) = 2k + 1.

Hint: consider domain X = N and classes

Hk = {h ∈ {0, 1}

X

| |h

−1

(1)| ≤ k}

That is Hk is the class of functions that map at most k natural numbers to 1

and the remaining ones to 0.

3 + 3 + 3 + 3 marks

3. Empirical and true risk

Recall Claim 2, from Lecture 2. We showed that, over an uncountable domain,

there exists a learner A (the “stubborn learner”) and a distribution D, such that

for every sample size m and all samples S from Dm:

|LS(A(S)) − LD(A(S))| = 1

This is not true for countable domains (as we will show later in the course). For

this exercise, we will show a similar, but weaker statement for countable domains.

Without loss of generality, we can assume X = N. Prove that, for every sample size

m, and every > 0, there exists a learner A and a distribution D over X × {0, 1},

such that for all samples S from Dm we have

|LS(A(S)) − LD(A(S))| ≥ 1 −

联系我们

- QQ：99515681
- 邮箱：99515681@qq.com
- 工作时间：8:00-23:00
- 微信：codinghelp2

- Cs2461-10实验程序代做、代写java，C/C++，Python编程设 2021-03-02
- 代写program程序语言、代做python，C++课程程序、代写java编 2021-03-02
- Programming课程代做、代写c++程序语言、Algorithms编程 2021-03-02
- 代写csc1-Ua程序、代做java编程设计、Java实验编程代做 代做留学 2021-03-02
- 代做program编程语言、代写python程序、代做python设计编程 2021-03-02
- 代写data编程设计、代做python语言程序、Python课程编程代写 代 2021-03-02
- Cse 13S程序实验代做、代写c++编程、C/C++程序语言调试 代写留学 2021-03-02
- Mat136h5编程代做、C/C++程序调试、Python，Java编程设计 2021-03-01
- 代写ee425x实验编程、代做python，C++，Java程序设计 帮做c 2021-03-01
- Cscc11程序课程代做、代写python程序设计、Python编程调试 代 2021-03-01
- 代写program编程、Python语言程序调试、Python编程设计代写 2021-03-01
- 代做r语言编程|代做database|代做留学生p... 2021-03-01
- Data Structures代写、代做r编程课程、代做r程序实验 帮做ha 2021-03-01
- 代做data留学生编程、C++，Python语言代写、Java程序代做 代写 2021-03-01
- 代写aps 105编程实验、C/C++程序语言代做 代写r语言程序|代写py 2021-03-01
- Fre6831 Computational Finance 2021-02-28
- Sta141b Assignment 5 Interactive Visu... 2021-02-28
- Eecs2011a-F20 2021-02-28
- Comp-251 Final Asssessment 2021-02-28
- 代写cs1027课程程序、代做java编程语言、代写java留学生编程帮做h 2021-02-28