
1. (Wine Data Set) These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents (Alcohol, Malic acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315 of diluted wines, and Proline) found in each of the three types of wines. The sample size is 178. The dataset is available on the course site. The main interest of this dataset is the multiclass classification of the three types of wines. Let $y$ denote the observed class of an observation and $\hat{y}$ its predicted class.

(a) Use the nominal logistic regression from Section 2.3 to carry out the multiclass classification. The R function is multinom (from the nnet package). Then summarize the confusion table for $y$ and $\hat{y}$, compute the macro-averaged recall, precision, and F-measure, and use them to assess the classification performance.

(b) Use linear discriminant analysis and quadratic discriminant analysis to obtain $\hat{y}$. Then summarize the confusion table for $y$ and $\hat{y}$, compute the macro-averaged recall, precision, and F-measure, and use them to assess the classification performance.

(c) Use the support vector machine method to obtain $\hat{y}$. Then summarize the confusion table for $y$ and $\hat{y}$, compute the macro-averaged recall, precision, and F-measure, and use them to assess the classification performance.

(d) Summarize your findings in (a)-(c).
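Parts (a) to (c) all end with the same evaluation step. As a minimal sketch in Python (the assignment itself works in R, so this is an illustration rather than the expected solution), the macro-averaged metrics could be computed from a confusion table as follows. The function name `macro_metrics`, the row/column orientation, and the choice to average the per-class F-measures (instead of taking the harmonic mean of macro precision and macro recall) are assumptions; the assignment does not specify them.

```python
def macro_metrics(confusion):
    """Macro-averaged recall, precision and F-measure.

    Assumed layout: confusion[i][j] counts observations whose true class
    is i and whose predicted class is j.
    """
    k = len(confusion)
    recalls, precisions, f_measures = [], [], []
    for c in range(k):
        tp = confusion[c][c]
        n_true = sum(confusion[c])                       # row sum: true class c
        n_pred = sum(confusion[r][c] for r in range(k))  # column sum: predicted c
        recall = tp / n_true if n_true else 0.0
        precision = tp / n_pred if n_pred else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f_measures.append(f_measure)
    # Macro averaging: unweighted mean over classes.
    return {
        "recall": sum(recalls) / k,
        "precision": sum(precisions) / k,
        "f_measure": sum(f_measures) / k,
    }
```

Calling `macro_metrics` on the 3 x 3 table obtained from each method gives three numbers per method, which can then be compared directly in part (d).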


2. (Simulation studies) Consider the following linear model:

$$y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + X_4\beta_4 - 4\sqrt{\rho}\,X_5\beta_5 + \varepsilon, \tag{1}$$

where $X = (X_1, \dots, X_p)$ is a $p$-dimensional vector of covariates, each $X_k$ is generated from $N(0, 1)$, and $\varepsilon$ is the error term. Any two covariates other than $X_5$ have correlation $\rho$, while $X_5$ has correlation $\sqrt{\rho}$ with each of the other $p - 1$ covariates. Suppose that the sample size is $n = 200$.

(a) Show that $X_5$ is marginally independent of $y$.
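The claim in (a) can be checked numerically before it is proved. From model (1), the covariance between the fifth covariate and the response is $\sqrt{\rho}(\beta_1 + \beta_2 + \beta_3 + \beta_4) - 4\sqrt{\rho}\,\beta_5$, which is zero when all five coefficients are equal; under joint normality, zero covariance implies independence. The Python sketch below simulates only the first five covariates using a shared-factor construction. That construction, the standard normal error independent of the covariates, the unit coefficients, and all function names are assumptions made for illustration.

```python
import math
import random

def draw_first_five(rho, rng):
    """Draw (X1, ..., X5): X5 is a shared N(0,1) factor, X1..X4 load on it."""
    z = rng.gauss(0.0, 1.0)
    x = [math.sqrt(rho) * z + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
         for _ in range(4)]   # corr(Xj, Xk) = rho, corr(Xj, X5) = sqrt(rho)
    x.append(z)               # X5
    return x

def response(x, rho, rng):
    """Model (1) with all coefficients equal to 1 and an assumed N(0,1) error."""
    return sum(x[:4]) - 4.0 * math.sqrt(rho) * x[4] + rng.gauss(0.0, 1.0)

def sample_cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

def check_marginal_covariance(rho=0.7, n=20000, seed=0):
    rng = random.Random(seed)
    xs = [draw_first_five(rho, rng) for _ in range(n)]
    ys = [response(x, rho, rng) for x in xs]
    x1 = [x[0] for x in xs]
    x2 = [x[1] for x in xs]
    x5 = [x[4] for x in xs]
    return {
        "cov_x5_y": sample_cov(x5, ys),    # should be close to 0
        "cov_x1_x5": sample_cov(x1, x5),   # should be close to sqrt(rho)
        "cov_x1_x2": sample_cov(x1, x2),   # should be close to rho
    }
```

With this construction the shared factor cancels out of the response exactly, which mirrors the algebra of the covariance calculation.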

(b) Now consider $p = 1500$ and generate artificial data from model (1) for 1000 repetitions. Specifically, let $\beta_i = 1$ for every $i = 1, \dots, 5$ and set $\rho = 0.7$. Then use the sure independence screening (SIS) and iterated SIS methods to perform variable selection and estimate the parameters associated with the selected covariates. Finally, summarize the estimators in the following table:

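Table 1: Simulation result for (b)

| Method | $\lVert\Delta\beta\rVert_1$ | $\lVert\Delta\beta\rVert_2$ | #S | #FN |
|---|---|---|---|---|
| SIS | | | | |
| Iterated SIS | | | | |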
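For orientation, the screening step of SIS is usually described as ranking covariates by the absolute marginal correlation with the response and keeping the top $d$. The Python sketch below shows only that step; it is not the iterated variant and not the R routine the course may expect. The function names and the default $d = \lfloor n / \log n \rfloor$ (a commonly used choice) are assumptions.

```python
import math

def marginal_correlations(X, y):
    """Absolute sample correlation between each column of X and y.

    X is a list of n rows, each a list of p covariate values.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        col_mean = sum(col) / n
        col_c = [v - col_mean for v in col]
        col_norm = math.sqrt(sum(v * v for v in col_c))
        dot = sum(a * b for a, b in zip(col_c, y_c))
        scores.append(abs(dot) / (col_norm * y_norm) if col_norm and y_norm else 0.0)
    return scores

def sis_screen(X, y, d=None):
    """Return the (sorted) indices of the d covariates with the largest scores."""
    n = len(X)
    if d is None:
        d = int(n / math.log(n))   # assumed default; requires n >= 2
    scores = marginal_correlations(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])
```

Because the fifth covariate in model (1) is marginally uncorrelated with the response, a purely marginal ranking like this one may well drop it, which is worth keeping in mind when comparing the #FN columns for SIS and iterated SIS.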

(c) Here we consider a scenario different from (b). Let $p = 40$ and $X \sim N(0, \Sigma_X)$, where entry $(j, k)$ of $\Sigma_X$ is $0.5^{|j-k|}$ for $j, k = 1, \dots, p$. Generate artificial data from model (1) for 1000 repetitions with $\beta_i = 1$ for every $i = 1, \dots, 5$. Then use the lasso, adaptive lasso, and elastic net (with $\alpha = 0.5$) methods to estimate the parameters. Finally, summarize the numerical results in the following table.

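Table 2: Simulation result for (c)

| Method | $\lVert\Delta\beta\rVert_1$ | $\lVert\Delta\beta\rVert_2$ | #S | #FN |
|---|---|---|---|---|
| Lasso | | | | |
| Adaptive lasso | | | | |
| Elastic net ($\alpha = 0.5$) | | | | |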
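The covariance structure in (c), with entry $(j, k)$ equal to $0.5^{|j-k|}$, is that of a stationary first-order autoregressive sequence, so one row of covariates can be drawn without any matrix factorization. The Python sketch below illustrates this; the function names are hypothetical, and drawing each row through the AR(1) recursion is one possible approach, not one prescribed by the assignment.

```python
import math
import random

def ar1_covariance(p, r=0.5):
    """Matrix with entry (j, k) equal to r ** |j - k|."""
    return [[r ** abs(j - k) for k in range(p)] for j in range(p)]

def draw_ar1_row(p, r, rng):
    """One draw of (X1, ..., Xp) from N(0, Sigma) with Sigma = ar1_covariance(p, r).

    Uses X1 ~ N(0, 1) and Xj = r * X(j-1) + sqrt(1 - r^2) * e_j, which keeps
    every variance equal to 1 and gives Cov(Xj, Xk) = r ** |j - k|.
    """
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - r * r)
    for _ in range(1, p):
        x.append(r * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x
```

For example, `[draw_ar1_row(40, 0.5, random.Random(k)) for _ in range(200)]` would give one 200 x 40 design matrix for repetition `k` (here all rows share one generator seeded by `k`; the seeding scheme is again only an assumption).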

(d) Summarize your findings for parts (b) and (c), respectively.

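Note: Let $\hat{\beta}$ be the estimator. Then $\Delta\beta$ is defined as $\Delta\beta = \hat{\beta} - \beta$, with $i$th component $\hat{\beta}_i - \beta_i$. Therefore, $\lVert\Delta\beta\rVert_1$ and $\lVert\Delta\beta\rVert_2$ are defined as

$$\lVert\Delta\beta\rVert_1 = \sum_{i=1}^{p} \lvert\hat{\beta}_i - \beta_i\rvert, \qquad \lVert\Delta\beta\rVert_2 = \Bigl\{\sum_{i=1}^{p} (\hat{\beta}_i - \beta_i)^2\Bigr\}^{1/2}.$$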

Hint: Regarding simulation studies with 1000 repetitions.

In Question 2, you are asked to use simulation studies with 1000 repetitions to estimate the parameters. Specifically, from the $k$th independently generated artificial dataset you can obtain an estimator, denoted by $\hat{\beta}^{(k)}$. As a result, with 1000 repetitions.
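As a rough illustration of how the entries of Tables 1 and 2 might be assembled, the Python sketch below computes the four quantities for a single repetition and then averages them over repetitions. Several points are assumptions, because the text does not define them: #S is taken to be the number of nonzero estimated coefficients, #FN the number of truly nonzero coefficients estimated as zero, and the table entries are taken to be averages over the 1000 repetitions. The function names are hypothetical.

```python
import math

def replicate_summary(beta_hat, beta_true):
    """Error norms and selection counts for one repetition.

    beta_hat and beta_true are length-p lists; unselected covariates are
    represented by a zero in beta_hat.
    """
    delta = [bh - bt for bh, bt in zip(beta_hat, beta_true)]
    l1 = sum(abs(d) for d in delta)
    l2 = math.sqrt(sum(d * d for d in delta))
    # Assumed meaning of #S: number of selected (nonzero) coefficients.
    n_selected = sum(1 for bh in beta_hat if bh != 0)
    # Assumed meaning of #FN: truly active covariates that were missed.
    n_false_neg = sum(1 for bh, bt in zip(beta_hat, beta_true)
                      if bt != 0 and bh == 0)
    return {"l1": l1, "l2": l2, "S": n_selected, "FN": n_false_neg}

def average_summaries(summaries):
    """Average each quantity over repetitions (an assumed reporting rule)."""
    keys = summaries[0].keys()
    return {key: sum(s[key] for s in summaries) / len(summaries) for key in keys}
```

In a full study, `replicate_summary` would be called once per repetition with the estimate from that repetition, and `average_summaries` would be applied to the resulting list of 1000 dictionaries.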
