STAT 337 ASSIGNMENT 2 Due: 5:00pm EDT Thursday, June 16, 2022

Notes for Submission: Upload your assignment directly to Crowdmark via the link you
receive by email. It is your responsibility to make sure your solution to each question is
submitted in the correct section, that the pages are rotated correctly, and that everything is
legible. Typed solutions are preferred.

Notes on the use of statistical software: Unless specifically told otherwise, you are free
to do your calculations using any software you like (SAS, R, Excel, etc.) but your solutions
should clearly explain the steps you used in the computation, showing intermediate
calculations when necessary, and give the formulas that you used. Any code and output created
should also be submitted.

1. [6 marks] In 2020, a group of eight articles was published in the Journal of Studies on
Alcohol and Drugs summarizing the current scientific literature and evidence related to
the research question: Does exposure to alcohol marketing have a causal influence on
youth drinking? [1] For each statement below (lightly edited from the original source),
indicate which of the seven Bradford Hill criteria discussed in class are related to the
statement. Multiple criteria may be addressed in each case.

(a) Jernigan et al. (2017) conducted a systematic review of longitudinal studies that examined
exposure to advertising and drinking among underage persons. All 12 studies found a positive
association between marketing exposure and one or more alcohol consumption outcomes. For
initiation of alcohol use, the odds ratios for different marketing exposures ranged from 1.00 to
1.69, and for subsequent hazardous or binge drinking, the range was somewhat higher: 1.38 to
2.15.

(b) In recent years, psychologists have developed and tested theoretical models in which marketing
exposures are hypothesized to affect psychological mediators relating to thoughts, cognitions and
attitudes. These marketing-induced changes are hypothesized to predict whether an individual
will engage in drinking behaviour. Jackson and Bartholow (2020) provide a narrative summary
of psychological plausibility using an integrated conceptual model that depicts relevant
psychological processes as they work together in a complex chain of influence.

(c) Hanewinkel et al. (2008) conducted a prospective observational study of 2110 German
adolescents younger than 15 years who had never smoked or drunk alcohol at baseline. During the
follow-up period, 16.3% of students tried smoking, 10.9% initiated binge drinking and 5.0% used
both substances. There was a significant effect of parental movie restriction on each substance
use outcome measure after controlling for covariates. Compared with adolescents whose parents
never allowed them to view FSK-16 movies (movies that only those aged 16 years and over would be
allowed to see in theatres), the adjusted relative risks (RR) for use of both substances were
1.64 for adolescents allowed to view them once in a while, 2.30 for sometimes and 2.92 for all
the time. FSK-16 restrictions were associated with substantially lower exposure to movie
depiction of tobacco and alcohol use.

[1] Sargent, J. D., Cukier, S., & Babor, T. F. (2020). Alcohol marketing and youth drinking: is
there a causal relationship, and why does it matter? Journal of Studies on Alcohol and Drugs,
Supplement, (s19), 5-12.

2. [10 marks]

(a) [4 marks] HIV disease may increase susceptibility to other viral infections. A cohort
study investigated the association of HIV with the occurrence of cytomegalovirus
(CMV) infection, a common herpes virus. Researchers screened infectious disease
clinics to identify a cohort of 400 HIV-positive patients who were seronegative for
CMV. The researchers then identified a comparison cohort of 400 people without
HIV disease from primary care clinics who were also CMV seronegative. Study
personnel conducted annual testing to assess new CMV infections, defined by the
development of antibodies to the virus. The study data are presented in Tables 1
and 2.

For each of the six characteristics listed in Table 1, determine whether or not it is a
potential confounder for the association between HIV and incident CMV infection.
Explain your reasoning.

Table 1: Baseline characteristics of the study participants

Characteristic                            HIV positive   HIV negative
Mean age (years)                          47.3           47.1
African American (%)                      37.3           18.9
Male (%)                                  54.0           52.9
Mean body mass index (kg/m^2)             23.2           27.9
Intravenous drug use (%)                  35.4           4.1
Mean CD4 lymphocyte count (cells/mm^3)    187            1440

Table 2: Associations of study characteristics with incident CMV infection

Characteristic                                        Unadjusted relative risk of CMV infection
HIV disease                                           4.05
Age (per 10-year higher)                              2.92
African American (compared to Caucasian)              1.01
Male (compared to female)                             2.05
Body mass index (per 5 kg/m^2 higher)                 1.03
Intravenous drug use (yes versus no)                  1.86
CD4 lymphocyte count (per 100 cells/mm^3 increase)    2.70

(b) The Heart and Estrogen/Progestin Study (HERS) was a randomized clinical trial of
hormone replacement therapy in post-menopausal women with existing coronary
heart disease (CHD) [2]. We will consider multiple linear regression models fit to
baseline data collected on the cohort of 2,763 women [3]. For the purposes of this
question, you can think of the data as coming from a cross-sectional study.

i. [3 marks] Consider the fitted multiple linear regression model presented in
Table 3. The response is LDL cholesterol and the primary exposure or variable
of interest is body mass index (BMI) (a continuous variable measured in
kg/m^2). A set of potential confounders is also included in the model: age,
ethnicity (nonwhite), smoking, and alcohol use (drinkany). Age is a continuous
explanatory variable and the rest are binary explanatory variables. Give
a precise written interpretation of the regression parameter for the BMI term.
Is this result statistically significant?

ii. [1 mark] Using the model in Table 3, find the predicted LDL cholesterol value
for a 65-year-old woman who is white, doesn't smoke but does occasionally
drink, and who has a BMI of 24 kg/m^2.

iii. [2 marks] Now consider the fitted multiple linear regression model presented
in Table 4. This model includes a binary indicator of statin use (a class of
drugs used to lower cholesterol levels) and the interaction between this variable
and BMIc. Note that the BMI variable has been centred at its mean value
of 28.6 kg/m^2 (i.e. BMIc = BMI - 28.6). This makes the parameter estimate for
statin use more interpretable.

Using estimates from the fitted model, describe the association between BMI
(using BMIc) and LDL among statin users and non-users (2-3 sentences). Is
there evidence that statin use is an effect modifier for the association between
BMI and LDL cholesterol? Explain your reasoning.

[2] Hulley, S., Grady, D., Bush, T., Furberg, C., Herrington, D., Riggs, B. and Vittinghoff, E.
(1998). Randomized trial of estrogen plus progestin for secondary prevention of heart disease
in postmenopausal women. The Heart and Estrogen/progestin Replacement Study. Journal of the
American Medical Association, 280(7), 605-613.

[3] Vittinghoff, E., Glidden, D. V., Shiboski, S. C., & McCulloch, C. E. (2011). Regression
methods in biostatistics: linear, logistic, survival, and repeated measures models. Springer
Science & Business Media.

Table 3: Fitted multiple linear regression model from HERS study

MODEL LDL = BMI age nonwhite smoking drinkany

Parameter Estimates

                   Parameter   Standard
Variable     DF    Estimate    Error      t Value   Pr > |t|
Intercept    1     147.3153    9.2564     15.91     0.000
BMI          1     0.3591      0.1341
age          1     -0.1897     0.1131     -1.68     0.094
nonwhite     1     5.2194      2.3237     2.25      0.025
smoking      1     4.7507      2.2104     2.15      0.032
drinkany     1     -2.7223     1.4989     -1.82     0.069
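As a rough illustration of how the quantities asked about in parts i and ii can be derived from Table 3, the sketch below computes the t statistic for the BMI term as the estimate divided by its standard error, and a fitted value as the intercept plus the sum of each coefficient times its covariate value. The function names, the normal approximation for the p-value (plausible only because the sample is large), and the coding drinkany = 1 for an occasional drinker are assumptions made for this example and are not part of the assignment.

```python
import math

# Estimates copied from Table 3; SE_BMI is the standard error of the BMI term.
COEF = {
    "Intercept": 147.3153,
    "BMI": 0.3591,
    "age": -0.1897,
    "nonwhite": 5.2194,
    "smoking": 4.7507,
    "drinkany": -2.7223,
}
SE_BMI = 0.1341


def t_statistic(estimate, std_error):
    """Wald t statistic for H0: coefficient = 0."""
    return estimate / std_error


def two_sided_p_normal(t):
    """Two-sided p-value from a standard normal approximation (an assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))


def predict_ldl(covariates):
    """Fitted LDL: intercept plus the sum of coefficient * covariate value."""
    return COEF["Intercept"] + sum(COEF[name] * value for name, value in covariates.items())


t_bmi = t_statistic(COEF["BMI"], SE_BMI)
p_bmi = two_sided_p_normal(t_bmi)

# Hypothetical coding of the woman in part ii; drinkany = 1 is an assumption.
woman = {"BMI": 24, "age": 65, "nonwhite": 0, "smoking": 0, "drinkany": 1}
ldl_hat = predict_ldl(woman)

print(f"t = {t_bmi:.2f}, approximate two-sided p = {p_bmi:.4f}")
print(f"predicted LDL = {ldl_hat:.2f}")
```

A written answer would still need the formulas and the interpretation in words; the code only reproduces the arithmetic.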

Table 4: Fitted multiple linear regression model with interaction from HERS study

MODEL LDL = statins BMIc statins*BMIc age nonwhite smoking drinkany

Parameter Estimates

                      Parameter   Standard
Variable        DF    Estimate    Error      t Value   Pr > |t|
Intercept       1     162.4052    7.5833     21.42     0.000
statins         1     -16.2530    1.4688     -11.07    0.000
BMIc            1     0.5821      0.1601     3.64      0.000
statins*BMIc    1     -0.7019     0.2694     -2.61     0.009
age             1     -0.1729     0.1106     -1.56     0.118
nonwhite        1     4.0728      2.2751     1.79      0.074
smoking         1     3.1098      2.1670     1.44      0.151
drinkany        1     -2.0753     1.4666     -1.42     0.157
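In a model with an interaction term, the BMI slope differs by statin group: it is the BMIc coefficient for non-users, and the BMIc coefficient plus the interaction coefficient for users. The sketch below works this out from the Table 4 estimates. The function names are hypothetical and the code only illustrates the arithmetic; judging effect modification still rests on the test of the interaction term.

```python
# Estimates copied from Table 4.
B_STATINS = -16.2530
B_BMIC = 0.5821
B_INTERACTION = -0.7019  # coefficient of the statins-by-BMIc interaction term
BMI_MEAN = 28.6          # centring value, so BMIc = BMI - 28.6


def bmi_slope(statins):
    """Change in fitted LDL per 1 kg/m^2 higher BMI, for statins = 0 or 1."""
    return B_BMIC + B_INTERACTION * statins


def statin_difference(bmi):
    """Fitted LDL difference (statin users minus non-users) at a given BMI."""
    return B_STATINS + B_INTERACTION * (bmi - BMI_MEAN)


slope_nonusers = bmi_slope(0)
slope_users = bmi_slope(1)
diff_at_mean = statin_difference(BMI_MEAN)

print(f"BMI slope, non-users: {slope_nonusers:.4f}")
print(f"BMI slope, statin users: {slope_users:.4f}")
print(f"statin difference at mean BMI: {diff_at_mean:.4f}")
```

Because BMI is centred, the statins coefficient is the fitted difference between users and non-users at the mean BMI of 28.6 kg/m^2, which is why centring makes it easier to interpret.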

3. [10 marks] This question is based on the following paper:

Bulfone, T. C., Blat, C., Chen, Y. H., Rutherford, G. W., Gutierrez-Mock, L.,
Nickerson, A., ... & Reid, M. J. (2022). Outdoor Activities Associated with
Lower Odds of SARS-CoV-2 Acquisition: A Case-Control Study. International
Journal of Environmental Research and Public Health, 19(10), 6126.

You can download the paper from https://doi.org/10.3390/ijerph19106126. The
following questions will lead you through a discussion of the design and a simple
unadjusted analysis of some of the data from this study.

(a) [1 mark] In your own words, state the goal/purpose of this case-control study.

(b) [2 marks] Who are the cases in this study and how were they identified/selected?
Who are the controls and how were they identified/selected?

(c) [2 marks] Give two inclusion or exclusion criteria used in the selection of the cases
and controls above.

(d) [1 mark] What is the primary exposure of interest and how was it assessed?

(e) [2 marks] Using the data given in Table 2 of the paper, calculate and interpret the
(unmatched, unadjusted) Odds Ratio for the primary association of interest in this study.

(f) [2 marks] Describe at least two potential limitations of this study and/or sources
of bias or error.
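For part (e), the unmatched, unadjusted Odds Ratio is the cross-product ratio of the 2x2 table of exposure by case status. The sketch below shows that calculation. The counts are hypothetical placeholders chosen only to exercise the function; they are not taken from the paper and must be replaced with the values in its Table 2.

```python
def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    """Unmatched odds ratio: odds of exposure in cases over odds in controls."""
    return (case_exp * control_unexp) / (case_unexp * control_exp)


# Hypothetical counts for illustration only (NOT from the paper).
example_or = odds_ratio(case_exp=30, case_unexp=70, control_exp=50, control_unexp=50)
print(f"OR = {example_or:.3f}")
```

An Odds Ratio below 1 would mean that the odds of exposure are lower among cases than among controls; the interpretation should be worded in terms of the study's actual exposure and outcome.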

4. [12 marks] In this question you will explore matching in case-control studies. Consider
the data in Table 5 giving case counts for a rare disease D and a common exposure E
in a closed population, stratified by a common binary confounder X. This represents
the full data in your study population and is normally unobservable.

Table 5: Hypothetical study population

                         X+                    X−                  Overall
                    E+        E−          E+        E−          E+         E−
Cases (D+)          80        10          100       200         180        210
Non-cases (D−)      80,000    20,000      20,000    80,000      100,000    100,000
Odds Ratio               2.0                   2.0                   0.86

Source: Pearce, N. (2016). Analysis of matched case-control studies. BMJ, 352.
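As a quick check on Table 5, the stratum-specific and overall Odds Ratios can be recomputed from the counts as cross-product ratios. The sketch below does this; the dictionary layout and function name are choices made for this example.

```python
# Counts copied from Table 5, per stratum of X, in the order:
# (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases).
TABLE5 = {
    "X+": (80, 10, 80_000, 20_000),
    "X-": (100, 200, 20_000, 80_000),
}


def odds_ratio(case_exp, case_unexp, noncase_exp, noncase_unexp):
    """Cross-product odds ratio for a 2x2 table."""
    return (case_exp * noncase_unexp) / (case_unexp * noncase_exp)


stratum_or = {stratum: odds_ratio(*counts) for stratum, counts in TABLE5.items()}

# Collapsing over X: add the two strata cell by cell.
overall_counts = tuple(sum(cells) for cells in zip(*TABLE5.values()))
overall_or = odds_ratio(*overall_counts)

print(stratum_or)
print(f"overall OR = {overall_or:.2f}")
```

The Odds Ratio is 2.0 within each stratum of X, while the collapsed table gives about 0.86, which is the confounding by X that the rest of the question explores.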

(a) [2 marks] You and your colleagues decide to run an unmatched case-control study
to investigate the association between E and D. You include all 390 cases from
your population and a random sample of 390 controls. Recreate Table 5 for this
study. Use the true sample population prevalences to generate your controls. For
example, the number of controls with (E+, X+) will be 390 × P[E+, X+ | D−].

(b) [4 marks] Calculate the stratum-specific and unstratified/overall Odds Ratios for
the data from your unmatched case-control study in (a) and compare them to the
true population values in Table 5. Suppose you ignored (or were unaware of) X
and based your analysis on the unstratified case-control data. Test the significance
of the unstratified Odds Ratio using a χ² test. Be sure to clearly state the null and
alternative hypotheses, give the formula for the test statistic, calculate its value
and find the p-value. What is the conclusion of the test? Would your conclusions
from this study accurately reflect the true association between E and D?

(c) [2 marks] Now suppose you and your colleagues decide to run a matched case-control
study. Once again you include all 390 cases and you match based on X.
Generate stratified and overall matched 2×2 tables from this study. Assume that,
given X, the exposure statuses of a matched pair are independent and based on
the true sample population prevalences. For example, for X+ there will be 90
matched pairs and the number of pairs with both the case and control exposed
will be 90 × P[E+ | D+, X+] × P[E+ | D−, X+].

(d) [4 marks] Using the matched 2×2 table from (c), calculate the matched pair Odds
Ratio and compare it to the true population values in Table 5. Use McNemar's
Test to test the significance of the association between E and D. Be sure to clearly
state the null and alternative hypotheses, give the formula for the test statistic,
calculate its value and find the p-value. What is the conclusion of the test? Would
your conclusions from this study accurately reflect the true association between E
and D?
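Parts (b) and (d) each call for a test statistic that is referred to a chi-square distribution with 1 degree of freedom. The sketch below shows one way both might be computed: the Pearson chi-square for an unmatched 2x2 table, and the matched-pair Odds Ratio with McNemar's statistic from the discordant pairs. Both are written without a continuity correction, which is an assumption; use whichever form was given in class. The counts in the example calls are hypothetical and are not the answers to this question.

```python
import math


def chi2_1df_p_value(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def pearson_chi2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    with rows (a, b) and (c, d)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def mcnemar(case_only, control_only):
    """Matched-pair OR and McNemar statistic (no continuity correction).

    case_only:    pairs where only the case is exposed
    control_only: pairs where only the control is exposed
    """
    matched_or = case_only / control_only
    stat = (case_only - control_only) ** 2 / (case_only + control_only)
    return matched_or, stat


# Hypothetical unmatched table: cases (30 exposed, 70 unexposed),
# controls (50 exposed, 50 unexposed).
chi2_stat = pearson_chi2(30, 70, 50, 50)
chi2_p = chi2_1df_p_value(chi2_stat)

# Hypothetical discordant pair counts.
matched_or, mcnemar_stat = mcnemar(case_only=40, control_only=20)
mcnemar_p = chi2_1df_p_value(mcnemar_stat)

print(f"chi-square = {chi2_stat:.3f}, p = {chi2_p:.4f}")
print(f"matched OR = {matched_or:.2f}, McNemar = {mcnemar_stat:.3f}, p = {mcnemar_p:.4f}")
```

Only the discordant pairs enter the matched analysis; pairs in which the case and the control share the same exposure status carry no information about the association.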
