
1. (Wine Data Set) These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents (Alcohol, Malic acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315 of diluted wines, and Proline) found in each of the three types of wines. The sample size is 178. The dataset is available on the course site. The main interest of this dataset is the multiclass classification of the three types of wine. Let ŷ denote the predicted class of an observation.

(a) Use the nominal logistic regression from Section 2.3 to carry out the multiclass classification. The R function is multinom (in the nnet package). In addition, summarize the confusion table for y and ŷ, and use the macro-averaged recall, precision, and F-measure to assess classification performance.

(b) Use linear discriminant analysis and quadratic discriminant analysis to obtain ŷ. In addition, summarize the confusion table for y and ŷ, and use the macro-averaged recall, precision, and F-measure to assess classification performance.

(c) Use the support vector machine method to obtain ŷ. In addition, summarize the confusion table for y and ŷ, and use the macro-averaged recall, precision, and F-measure to assess classification performance.

(d) Summarize your findings in (a)-(c).
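The confusion table and macro-averaged metrics requested in (a)-(c) can be sketched as follows. This is a minimal illustration in plain Python, not the course's reference implementation. The function names `confusion_table` and `macro_metrics` and the toy label vectors are hypothetical. The macro F-measure is taken here as the average of the per-class F-measures, which is an assumption: some texts instead compute it from the macro-averaged precision and recall.

```python
def confusion_table(y_true, y_pred, labels):
    """Count table with true classes as rows and predicted classes as columns."""
    index = {label: i for i, label in enumerate(labels)}
    table = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        table[index[truth]][index[pred]] += 1
    return table


def macro_metrics(table):
    """Macro-averaged recall, precision and F-measure (mean of per-class values)."""
    k = len(table)
    recalls, precisions, f_measures = [], [], []
    for c in range(k):
        tp = table[c][c]
        n_true = sum(table[c])                       # observations truly in class c
        n_pred = sum(table[r][c] for r in range(k))  # observations predicted as class c
        recall = tp / n_true if n_true else 0.0
        precision = tp / n_pred if n_pred else 0.0
        total = precision + recall
        recalls.append(recall)
        precisions.append(precision)
        f_measures.append(2 * precision * recall / total if total else 0.0)
    return {
        "recall": sum(recalls) / k,
        "precision": sum(precisions) / k,
        "f_measure": sum(f_measures) / k,
    }


# Toy labels for illustration only (not the wine data).
y = [1, 1, 1, 2, 2, 3, 3, 3]
y_hat = [1, 1, 2, 2, 2, 3, 3, 1]
table = confusion_table(y, y_hat, labels=[1, 2, 3])
scores = macro_metrics(table)
```

The same two functions apply unchanged to the predictions from each classifier, so the three methods can be compared on identical metrics.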


2. (Simulation studies) Consider the following linear model:

y = X1β1 + X2β2 + X3β3 + X4β4 − 4√ρ X5β5 + ε, (1)

where X = (X1, …, Xp) is a p-dimensional vector of covariates, each Xk is generated from N(0, 1), and ε is the error term. The correlation between any two covariates other than X5 is ρ, while X5 has correlation √ρ with each of the other p − 1 covariates. Suppose that the sample size is n = 200.
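One way to generate covariates with this correlation structure is a shared-factor construction: draw a standard normal factor, use it directly as X5, and set every other Xk to √ρ times the factor plus √(1 − ρ) times independent standard normal noise. The sketch below is one possible approach, not necessarily the intended one. The function names, the N(0, 1) error distribution, and the small n and p used for the quick check are all assumptions.

```python
import math
import random


def generate_design(n, p, rho, rng):
    """Rows of X with unit variances, corr(Xj, Xk) = rho for j, k != 5,
    and corr(X5, Xk) = sqrt(rho)."""
    rows = []
    for _ in range(n):
        factor = rng.gauss(0.0, 1.0)  # shared factor, reused as X5
        row = [
            math.sqrt(rho) * factor + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(p)
        ]
        row[4] = factor  # index 4 holds X5
        rows.append(row)
    return rows


def generate_response(X, beta, rho, rng):
    """y from model (1); the N(0, 1) error term is an assumption."""
    y = []
    for row in X:
        signal = sum(row[j] * beta[j] for j in range(4))
        signal -= 4.0 * math.sqrt(rho) * row[4] * beta[4]
        y.append(signal + rng.gauss(0.0, 1.0))
    return y


def sample_corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Quick check with a small p and a large n so the sample correlations are stable.
rng = random.Random(0)
rho = 0.7
X = generate_design(n=5000, p=8, rho=rho, rng=rng)
y = generate_response(X, beta=[1.0] * 5, rho=rho, rng=rng)
x1 = [row[0] for row in X]
x2 = [row[1] for row in X]
x5 = [row[4] for row in X]
corr_12 = sample_corr(x1, x2)  # should be close to rho
corr_15 = sample_corr(x1, x5)  # should be close to sqrt(rho)
corr_5y = sample_corr(x5, y)   # small when all five beta values are equal
```

The last quantity is only a numerical sanity check related to part (a); it does not replace the requested proof.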

(a) Show that X5 is marginally independent of y.

(b) Now consider p = 1500 and generate artificial data from model (1) for 1000 repetitions. Specifically, let βi = 1 for every i = 1, …, 5 and set ρ = 0.7. Then use the SIS and iterated SIS methods to perform variable selection and estimate the parameters associated with the selected covariates. Finally, summarize the estimators in the following table:

Table 1: Simulation results for (b)

Method          ‖Δβ‖1    ‖Δβ‖2    #S    #FN
SIS
Iterated SIS

(c) Here we consider a scenario different from (b). Let p = 40 and X ∼ N(0, ΣX), with entry (j, k) of ΣX being 0.5^|j−k| for j, k = 1, …, p. Generate artificial data from model (1) for 1000 repetitions with βi = 1 for every i = 1, …, 5. Then use the lasso, adaptive lasso, and elastic net (with α = 0.5) to estimate the parameters. Finally, summarize the numerical results in the following table.

Table 2: Simulation results for (c)

Method                   ‖Δβ‖1    ‖Δβ‖2    #S    #FN
lasso
adaptive lasso
Elastic net (α = 0.5)

(d) Summarize your findings for parts (b) and (c), respectively.
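For reference, the screening step of SIS in part (b) ranks covariates by the absolute value of their marginal correlation with y and keeps the top d. The sketch below shows only that single screening step, in plain Python on toy data with independent covariates. The function name `sis_screen`, the toy model, and the choice d = ⌊n / log n⌋ (a common default, not stated in the assignment) are assumptions. The iterated variant and the estimation step after screening are not shown.

```python
import math
import random


def sis_screen(X, y, d):
    """Indices of the d covariates with the largest absolute marginal
    correlation with y (one screening pass, no iteration)."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_centered))
    scores = []
    for j in range(p):
        column = [row[j] for row in X]
        x_mean = sum(column) / n
        x_centered = [v - x_mean for v in column]
        x_norm = math.sqrt(sum(v * v for v in x_centered))
        if x_norm == 0.0 or y_norm == 0.0:
            scores.append(0.0)  # a constant column carries no marginal signal
            continue
        dot = sum(a * b for a, b in zip(x_centered, y_centered))
        scores.append(abs(dot / (x_norm * y_norm)))
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data (not model (1)): only covariates 0 and 3 affect y.
rng = random.Random(1)
n, p = 100, 50
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 1.0) for row in X]
d = int(n / math.log(n))  # assumed screening size
selected = sis_screen(X, y, d)
```

Because the ranking uses marginal correlations only, a covariate that matters jointly but is marginally uncorrelated with y can be screened out, which is the situation model (1) is designed to create for X5.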

Note: Let β̂ be the estimator; then Δβ is defined as Δβ = β̂ − β, with ith component β̂i − βi. Therefore, ‖Δβ‖1 and ‖Δβ‖2 are defined as
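The defining formulas for the two norms are missing from the text above. Assuming the usual ℓ1 and ℓ2 vector norms, and assuming that #S counts the covariates with a nonzero estimated coefficient and #FN counts the truly nonzero coefficients that were estimated as zero, the four table entries for a single estimate could be computed as in this sketch. The function name `selection_metrics` and the toy vectors are hypothetical.

```python
import math


def selection_metrics(beta_hat, beta_true):
    """l1 and l2 norms of beta_hat - beta_true, plus assumed #S and #FN."""
    delta = [bh - bt for bh, bt in zip(beta_hat, beta_true)]
    selected = {j for j, b in enumerate(beta_hat) if b != 0.0}
    active = {j for j, b in enumerate(beta_true) if b != 0.0}
    return {
        "l1": sum(abs(d) for d in delta),
        "l2": math.sqrt(sum(d * d for d in delta)),
        "num_selected": len(selected),                 # assumed meaning of #S
        "num_false_negatives": len(active - selected),  # assumed meaning of #FN
    }


# Toy vectors for illustration only.
beta_true = [1.0, 1.0, 0.0, 0.0]
beta_hat = [0.8, 0.0, 0.3, 0.0]
metrics = selection_metrics(beta_hat, beta_true)
```

If the course defines #S or #FN differently, only the two set computations need to change.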

Hint: Regarding simulation studies with 1000 repetitions.

In Question 2, you are asked to use simulation studies with 1000 repetitions to estimate the parameters. Specifically, from the kth independently generated artificial dataset you obtain an estimator, denoted by β̂(k). As a result, with 1000 repetitions.
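The hint breaks off before stating how the 1000 estimators are combined. A common reading, assumed here, is that each reported quantity is the average of its per-repetition values. The sketch below shows that pattern with a deliberately simple stand-in estimator (a one-covariate least-squares slope). The names `monte_carlo_summary` and `toy_run` are hypothetical, and 200 repetitions are used only to keep the example fast.

```python
import random
import statistics


def monte_carlo_summary(run_once, repetitions, seed=0):
    """Average every metric returned by run_once over independent repetitions."""
    rng = random.Random(seed)
    results = [run_once(rng) for _ in range(repetitions)]
    return {key: statistics.mean(r[key] for r in results) for key in results[0]}


def toy_run(rng, n=50, beta=1.0):
    """One repetition: simulate y = beta * x + noise and estimate beta by least squares."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
    beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return {"estimate": beta_hat, "abs_error": abs(beta_hat - beta)}


summary = monte_carlo_summary(toy_run, repetitions=200)
```

In the actual study, `toy_run` would be replaced by a function that generates one dataset, fits the chosen method, and returns that repetition's table entries.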
