STAT 511 Exam2 – Spring 2020

Instructions (Please take a moment to read):

1. Students are expected to work independently on the exam. Do NOT discuss the exam with anyone

else. Do NOT post questions or comments about the exam to Canvas. Do NOT share R code or

notes or email regarding the Final Exam. Please consider fairness to your classmates as a guide of

conduct when working on the exam.

2. Consider using the R Markdown Template in Canvas; it is NOT required. However, please be organized

and concise, which is worth 6 points of the exam (no need to spend too much time on the “perfect

document”; most submissions will receive 6/6). Also, the template may be a useful guide for

organization even if copy/pasting to a Word document or knitting to Word.

3. When including figures and tables please make them clear and concise. No need to go overboard on

detail, but correct formatting and essential labeling should be included.

4. Please make an effort to use clear, concise, and coherent grammar in written responses.

5. For any “hand” calculation questions, show your work in order to be eligible for partial credit. As a

general rule, round answers to 4 decimal places.

6. Use α = 0.05 for all questions (unless specified otherwise).

7. You may use any software, reference, or on-line resource that you find helpful.

8. If you have a specific question regarding content of the exam such as: interpretation of a question on

the exam, requirements for a response to a question, software issues with RStudio, or an R function that

continues to give errors, please send an email directly to me (). I will try to

respond in a reasonable time frame. Also, please make sure your Canvas settings allow

notifications when there are announcements on Canvas in case I need to clarify something on the

Final. But, I will likely not be responding to any email after the due date (for a while anyway).

9. The Exam must be submitted to Canvas in pdf format by 11:59 pm Wednesday 4/14/2020 using the

Please include your name on your submitted document as “signature” for Honor Pledge below.

Honor Pledge: I have not given, received, or used any unauthorized assistance on this exam.

Exam Parts:

Multiple Choice (32 pts)

True/False questions are 2 pts each; the remaining questions are 3 pts each, as before.

Matching Question (14 pts)

Chapter 6 Problem: Sleep Data (12 pts)

Chapter 10 problem: Lefties (12 pts)

Chapter 8/9 Problem: Cuckoo Bird Eggs (24 pts)

Organization/Clarity (6 pts)

1. Multiple Choice (32 pts)

For each numbered question in this section, note the best answer choice in your R Markdown or

submitted document.

Questions 1 through 7 (True or False): For each question, just note in your submitted document the

question number and True or False. No need to justify. (Each True/False question is 2 pts,

remaining multiple choice are 3 pts).

1. Managing experiment-wise error rate is especially important when comparing means associated with

a very large number of treatment levels.

2. The LSD (unadjusted) pairwise comparison method helps control the experiment-wise error rate.

3. The HSD (Tukey) pairwise comparison method has lower power than the LSD (unadjusted) method.

4. For many cases, multiple comparison tests using Bonferroni’s adjustment can be considered too

conservative.

5. Dunnett’s method can be used to test all pairwise comparisons from a one-way ANOVA.

6. For a one-way ANOVA, subsequent multiple comparison adjustment methods are only viable when

the response variable is normally distributed across all treatment groups.

7. A survey yielded an estimated proportion of 0.11 based on a sample of size n=55. The large sample

normal approximation is adequate for this scenario. Use the criterion based on 3×SE.

8. In R, the function lm() performs which of the following?

(A) An ANOVA of specified data

(B) A linear model of specified response and predictor variables.

(C) A list of means for a specified response variable

(D) A likelihood maximized estimate for a specified response variable.

9. The multiple testing problem is best described by which of the following:

(A) Testing a hypothesized mean before a treatment, and then testing the mean again after a treatment.

(B) Having a large number of potential Type II errors when comparing many pairs of means between

treatment groups.

(C) Having a large number of potential Type I errors when comparing many pairs of means between

treatment groups.

(D) When performing an ANOVA, the degrees of freedom for the residuals (within) is considered to be

too large.

10. As a variable, the number of CSU graduate students who voted in the 2020 primary election is

best described by which of the following.

(A) Qualitative and Discrete

(B) Quantitative and Discrete

(C) Qualitative and Continuous

(D) Quantitative and Continuous

Suppose you collect data from four different populations and have the following summary statistics.

Use the table below to answer questions 11 - 12.

Group      N    Mean     SD      SE
Group A   45   76.54  19.45    2.90
Group B   44   78.45  32.01    4.83
Group C   43   79.65  57.21    8.72
Group D   42   81.32  84.43   13.03

11. If you performed an ANOVA using the data that generated the summary statistics above, which of

the following outcomes would you expect?

(A) A small F statistic and a small p-value

(B) A small F statistic and a large p-value

(C) A large F statistic and a small p-value

(D) A large F statistic and a large p-value

12. If you performed diagnostics for a fitted model in order to do an ANOVA using the data that

generated the summary statistics above, which of the following would you expect?

(A) There would be no need to perform diagnostics since the ANOVA assumptions are violated with

unequal sample sizes.

(B) The p-value from a Levene’s test is likely to be relatively small

(C) There are certainly going to be problems with the data when plotted on a Q-Q plot

(D) The means are likely to be significantly different

13. A pharmaceutical company's allergy medication is known to provide relief to 75% of the people

who use it. The company wants to see if a new, improved version of the medication works even

better. In a test of the hypotheses H0: p = .75 versus HA: p > .75, the p-value is .32. Which of the

following gives the best interpretation of this p-value?

(A) There is a 32% chance that the new medication is more effective than the old medication.

(B) There is a 32% chance that the new medication and old medication are equally effective.

(C) If the new medication is more effective than the old medication (if HA is true), there is a 32%

chance of obtaining the observed sample proportion or something greater due to natural sampling

variation.

(D) If the new medication and old medication are equally effective (if H0 is true), there is a 32% chance

of obtaining the observed sample proportion or something greater due to natural sampling

variation.
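
For reference, the 3×SE check named in question 7 can be sketched in Python as below. This is a minimal sketch that assumes the criterion means the interval p̂ ± 3·SE must lie strictly inside (0, 1); confirm the exact rule against the course notes. The function name `normal_approx_ok` and the example values are hypothetical and are not the exam's numbers.

```python
from math import sqrt


def normal_approx_ok(p_hat, n):
    """Return (is_adequate, (lower, upper)) for the assumed 3*SE rule.

    Assumption: the normal approximation is treated as adequate when
    p_hat - 3*SE > 0 and p_hat + 3*SE < 1.
    """
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
    lower, upper = p_hat - 3 * se, p_hat + 3 * se
    return (lower > 0 and upper < 1), (lower, upper)


# Illustrative values only (hypothetical, not taken from the exam).
for p_hat, n in [(0.30, 100), (0.02, 50)]:
    ok, (lo, hi) = normal_approx_ok(p_hat, n)
    print(f"p_hat={p_hat}, n={n}: interval=({lo:.4f}, {hi:.4f}), adequate={ok}")
```

The same check can be done by hand: compute SE, multiply by 3, and see whether either endpoint crosses 0 or 1.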

Matching Problem (14 pts)

Below are statistical tests covered in Chapter 6 up to Chapter 10. Match each named test to the most

appropriate scenario below. Assume all data are collected through random sampling methods. Please

list the scenarios in your submitted work and match the corresponding letter. Each letter used once.

No explanation is necessary.

A. Levene’s Test

B. Welch-Satterthwaite Test

C. χ² Test

D. ANOVA

E. Paired T-test

F. Kruskal-Wallis Test

G. Tukey’s Method

Scenario 1: A physical assessment called VO2-max measures fitness levels by determining the

volume of oxygen a person can use in respiration during physical activity. A researcher wants to

see if VO2-max on average is different for those who live at high altitude versus those who live at

lower elevations. A random sample of 45 active people between the ages of 30 and 40 are selected

from several residents in the mountains of Colorado (above 9,000 ft). A random sample of 53

similarly active people are selected who live on coastal areas of California. On inspection of the

boxplot, it appears that the variances of VO2-max from each group are quite different.

Scenario 2: Using the same data in Scenario 1, the researcher would like to use a test and p-value

that strengthens the evidence that the variances of the two samples are really different.

Scenario 3: A researcher wishes to compare means for 6 groups for which the standard deviations

within each group appear very similar. She would like all pairwise comparisons to be based on

honestly significant differences. She finds that the fitted model of response versus group levels has

residuals which are distributed approximately Normal(0, σ).

Scenario 4: A researcher would like to determine if 3 treatments of sample size 9 have the same

central value. The fitted model of response versus group levels has residuals that do not appear to

be normally distributed. But, it does appear that the variances for each group are very similar.

Scenario 5: A researcher would like to determine if caffeine helps sprinters run faster times.

Twelve runners are selected to run one lap as fast as they can, and their time is recorded. Each

runner then drinks 3 double espressos. Thirty minutes after drinking the coffee, each runner then

runs one lap once again as fast as they can, and their time is recorded. Assume the differences

between lap times for each runner before and after being caffeinated are normally distributed.

Scenario 6: A researcher would like to determine if 3 treatments of sample size 13 have the same

central value. The fitted model of response versus group levels has residuals that appear to be

normally distributed and the variances for each group appear to be very similar.

Scenario 7: For quality control, a machine must manufacture a drug within a range for a certain

amount of active ingredient. A random sample of 50 tablets are measured to see if the standard

deviation of the amount of the active ingredient is below a certain value.

There is an R Markdown template available for these questions.

Sleep Data in R (12 pts, 3 pts/part)

1. There are many “built-in” datasets available in Base R. You may have found some that are used

when searching for help on the internet (the iris dataset for example is popular with graphing).

Other packages that you install into R likely have their own example data as well.

For this question we use the sleep data. It is important to first read the help for any dataset you

use! This help is rather limited, but does offer a bit of background as well as references.

To analyze, there is no data to load. For example, view the structure of these data with just:

str(sleep). After checking out this very old data set about a sleep medication, submit answers to the

following questions while including appropriate R code and output. But please make sure the

answer is clear in some narrative form as well (not just a list of output).

(a) Define the parameter(s) of interest. Use appropriate symbol(s) (or at least names of Greek

letters, no mark-up required) to write the hypotheses that were used in this study. (Read the

help carefully)

(b) Provide a boxplot by group. Also, briefly explain (1 or 2 sentences) why this boxplot on its

own might be misleading to a reader (This question is meant to help think about the hypotheses

above to be sure. Sometimes these types of studies will use boxplots still, but clarify what it

shows).

(c) Provide output for the appropriate test. Also, please note the value of the test statistic and

the p-value for the test you defined in part (a).

(d) State the conclusion of the test in terms of the context of the study.

Left-Handed, no data to load here either (12 pts, 3 pts/part)

2. Before the 1980s, school children were encouraged (and sometimes even forced) to write with their

right hand as opposed to their left. As a result, only about 8% of Americans in the 1980s claimed to

be left-handed. Over time, the stigma associated with being left-handed and social pressure against

it have relaxed. To investigate whether the proportion of the population that is left-handed has

increased since the 1980s, a psychologist surveys a random sample of 150 Americans of whom 18

claim to be left-handed. At the 5% significance level, is this evidence that the proportion of

Americans that are left-handed is higher today than in the 1980s?

(a) Define the parameter of interest and state the null and alternative hypotheses.

(b) Find an appropriate test statistic.

(c) Determine an appropriate p-value.

(d) Write a conclusion for this hypothesis test in the context of the study.
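
The general shape of a one-sided, large-sample test for a single proportion can be sketched as follows. This is a minimal Python sketch, assuming the normal (z) approximation with the standard error computed under the null value; an exact binomial test is another reasonable choice. The function name `one_sided_prop_z_test` and the numbers (30 successes out of 200 against a null value of 0.10) are hypothetical and are not the exam's data.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_prop_z_test(successes, n, p0):
    """Large-sample z test of H0: p = p0 versus HA: p > p0.

    Returns (z, p_value). The standard error uses p0, the null value.
    """
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)       # standard error under H0
    z = (p_hat - p0) / se0              # test statistic
    p_value = 1 - NormalDist().cdf(z)   # upper-tail area for HA: p > p0
    return z, p_value


# Illustrative values only (hypothetical, not taken from the exam).
z, p_value = one_sided_prop_z_test(successes=30, n=200, p0=0.10)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

Comparing the returned p-value with α = 0.05 gives the decision; the conclusion should still be written in the context of the study.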

Egg sizes of Cuckoo birds.

3. The European common cuckoo bird is known for laying eggs in nests of other bird species (in

terms of data, these are HostSpecies, though not all are actual bird species descriptions). Nest

categories include meadowlarks, trees, hedges, robins, wagtails, and wrens. Researchers measured

the size of the cuckoo eggs (in mm) relative to the different types of nests in which cuckoos lay

eggs. They hope to see if there are possible differences in size according to host species. Suppose

that measurements taken in this study represent random and independent samples. (total pts = 24)

This dataset is provided in the Exam2 page. Note that these data need to be put in long form. The

sample sizes are not equal, so you’ll need to deal with missing data when loading. Help for this is

provided in the template.

(a) Provide an appropriate summary plot.

(b) Provide appropriate summary statistics.

(c) No need to include them, but check diagnostics for the assumptions of this type of analysis. In

a sentence or three note any concerns that might be troubling and support your answers.

(d) Regardless of model assumption concerns, provide an ANOVA table.

(e) State a conclusion to the F-test in context of the study.

(f) Provide some type of compact display comparing the different host species and in a sentence

or two summarize the findings.

(g) Suppose that cuckoo clutch size (number of eggs laid) in nests of wrens and meadowlarks are

similar, and cuckoo clutch size in nests of robins and wagtails are similar. Researchers are

interested in seeing if the size of the host bird is also related to the average size of cuckoo eggs.

They wish to compare egg size differences between (wren vs meadowlarks) vs (robins vs

wagtails), which can also be thought of as an interaction contrast. In other words, wrens are

smaller than meadowlarks, but robins and wagtails are very similar in size. A significant

difference in these comparisons would suggest bird size may be related to cuckoo egg size while

controlling for clutch size. (6 pts, 3 each for following)

(i) Consider the appropriate contrast coefficients to write the appropriate null hypothesis using

the following parameters (μ_wren, μ_meadow, μ_robin, μ_wagtail).

(ii) Provide an estimate for this contrast. And provide a p-value for this contrast.
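
The arithmetic behind a contrast of group means can be sketched as follows. This is a minimal Python sketch, assuming a one-way ANOVA with a pooled mean squared error (MSE): the estimate is Σ c_i·ȳ_i and its standard error is sqrt(MSE · Σ c_i²/n_i). The function name `contrast_summary`, the group means, the sample sizes, and the MSE are all hypothetical; they are not the cuckoo data.

```python
from math import sqrt


def contrast_summary(means, sizes, coefs, mse):
    """Return (estimate, standard_error, t_statistic) for a linear contrast.

    means, sizes, coefs are parallel lists (one entry per group);
    mse is the pooled within-group mean square from the ANOVA table.
    """
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coefs, means))
    se = sqrt(mse * sum(c * c / n for c, n in zip(coefs, sizes)))
    return estimate, se, estimate / se


# Hypothetical group means, sample sizes, and MSE (not the exam's data).
means = [21.0, 22.5, 23.0, 23.1]
sizes = [10, 10, 10, 10]
coefs = [1, -1, -1, 1]  # (group 1 - group 2) - (group 3 - group 4)
estimate, se, t_stat = contrast_summary(means, sizes, coefs, mse=0.8)
print(f"estimate = {estimate:.4f}, SE = {se:.4f}, t = {t_stat:.4f}")
```

The p-value comes from comparing the t statistic with a t distribution on the error (within) degrees of freedom from the ANOVA table. The Python standard library has no t distribution, so that last step is left to statistical software.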
