MAST10010: Data Analysis 1
Assignment 2
Due Date: Friday September 23rd, 11.59pm.
Your assignment must be submitted to Gradescope by 11.59pm Friday
23rd September.
Assignments submitted late will incur a penalty of 5% per hour (or
part thereof).
If you need an extension then you must apply through the link on the
LMS.
Tutors may not help you directly with assignment questions. They
may, however, provide some appropriate guidance.
Please ask on the discussion board if you need clarification on the
wording of questions. Do not include partial answers on the discussion
board.
It is recommended to produce a single Word document which includes
all the relevant graphs, statistics and comments. You will then need
to Export as a PDF to upload to Gradescope. If you need to include
formulas or calculations, you may include photos of handwritten notes
(or use equation editor, or any other method).
This assignment consists of three (3) questions worth a
total of 35 marks. It contributes 5% towards your final
grade.
Instructions
Software:
You must use Minitab to produce any graphs, tables and descriptive
statistics.
Graphs:
- must include your name/student number, which can be added by Editing
  the graph, right-clicking and selecting Add → Footnote or Add →
  Subtitle.
- must be relevant. You may look at many graphs, but you should only
  include the most relevant graph for each question.
- should be clear: ensure that labels and titles are correct and
  appropriate; you can add gridlines/change symbols/colour as appropriate
  to make the graph clearer. There are some marks awarded for improving
  upon the default from Minitab.
- Mac Users: you will need to use myUniApps in order to edit the
  graphs as required above.
Statistics:
Must be relevant: you will be penalised for including statistics which are
not relevant to the questions asked.
Comments:
- must be in the context of the data.
- should be supported by relevant statistics where possible.
- should be concise and informative. Word limits, where given, must
  be strictly adhered to (every word limit is a maximum; you will be
  penalised for exceeding it). You may use dot-points.
Question 1: Is there a difference in immune response between
two delivery methods for monkeypox vaccine?
[1 + 2 + 4 + 3 + 4 + 1 = 15 marks]
This question is based on simulated data for the study by Sharon E. Frey et
al. (2015) ‘Comparison of lyophilized versus liquid modified vaccinia Ankara
(MVA) formulations and subcutaneous versus intradermal routes of
immunisation in healthy vaccinia-naïve subjects’, Vaccine, Volume 33 Issue 39,
5225–5234. You can find this article at:
https://www.sciencedirect.com/science/article/pii/S0264410X15008762, also
linked on the LMS.
You DO NOT need information from this article to answer the
questions; it is provided for interest only.
Monkeypox is a severe illness caused by a virus. Due to its similarity
with other pox-family viruses, the current strain circulating in non-endemic
regions (Monkeypox Clade IIb) is known to be prevented by the modified
vaccinia Ankara (MVA) vaccine, originally developed for smallpox.
This study, in part, examined two treatments:
Subcutaneous (SC) group: a full dose of the MVA vaccine is injected into
the tissue layer between the skin and the muscle.
Intradermal (ID) group: a one-fifth dose of the MVA vaccine is injected
just below the surface of the skin (between the epidermis and the
hypodermis).
We are only interested in the relative effectiveness of these two
treatments. The response variable measured is a measure of the vaccine
effectiveness, L2NT (see footnote 1).
The data is available as Asst2 2022 data.csv on the LMS Assignment
2 page.
(a). Explain why subcutaneous injection was chosen as the second
treatment, rather than a placebo.
(b). Produce an appropriate graph showing vaccine effectiveness for both
groups.
(c). Comment on the effect of vaccination method on L2NT. You should
support your comments with relevant statistics, but do not include
Minitab output.
Your comments must be less than 150 words.
Footnote 1: For those who chose to read the article: the response is
actually log2(neutralisation titre) at the 180th day after vaccination.
This information is not needed for the assignment.
(d). Calculate a 95% Confidence Interval for the difference in mean L2NT
for the two groups. Show all of your calculations (do not include
Minitab output, but you may use Minitab to obtain summary statistics
and relevant distribution values).
(e). What assumptions have you made in calculating this interval? Were
they satisfied? (You need to provide evidence, in the form of one graph
and a calculation.)
(f). Without doing further calculation, would a test of the hypotheses
H0: μ1 − μ2 = 0 and H1: μ1 − μ2 ≠ 0 be significant at the α = 0.05
level? Explain briefly.
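The general arithmetic behind an interval like the one in part (d) can be sketched as below. This is only an illustrative cross-check, not the required method: the assignment asks for calculations shown by hand, with Minitab used for summary statistics and distribution values. The sketch assumes the unpooled (Welch) standard error, which may or may not be the approach expected here; a pooled version would additionally assume equal variances. The function names and all numbers are hypothetical and are not taken from the assignment data. The critical value `t_crit` is passed in, since it would be looked up in Minitab.

```python
from math import sqrt


def two_sample_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """CI for mean1 - mean2 using the unpooled (Welch) standard error.

    t_crit is the t critical value for the chosen confidence level,
    obtained separately (for example from Minitab).
    """
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se


def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


if __name__ == "__main__":
    # Hypothetical summary statistics, NOT the assignment data.
    df = welch_df(sd1=2.0, n1=16, sd2=2.0, n2=16)
    # 2.042 is the 0.975 quantile of a t distribution with 30 df.
    low, high = two_sample_ci(10.0, 2.0, 16, 8.0, 2.0, 16, t_crit=2.042)
    print(f"df = {df:.1f}, 95% CI = ({low:.3f}, {high:.3f})")
```

If the resulting interval excludes 0, that is informative for a two-sided test at the matching significance level, which is the link part (f) asks about.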
Question 2: Political Polling
[2 + 2 + 4 + 2 + 3 = 13 marks]
This question is inspired by the report ‘Guardian Essential poll: Daniel
Andrews in strong position for Labor victory in Victorian election’, Benita
Kovolos (2022) The Guardian Australia
(https://www.theguardian.com/australia-news/2022/sep/11/guardian-essential-poll-daniel-andrews-in-strong-position-for-labor-victory-in-victorian-election),
also linked on the LMS.
You DO NOT need information from this article to answer the
questions; it is provided for context only.
This report discusses the results from a poll of 536 people, where they
were asked about their voting intentions for the upcoming Victorian state
election in November 2022. The poll identified 189 people who were
intending to give their first preference to the Australian Labor Party (ALP, the
current government) in the next election.
(a). Outline how you would collect a representative sample of a similar
size to this poll. Your answer will be assessed both on the statistical
validity and the practicality of your design.
(b). Construct a 95% confidence interval for the proportion of Victorians
who intend to give their first preference to the ALP.
(c). There were several undecided voters, with only 472 expressing an
absolute preference. Conduct an approximate hypothesis test (using
α = 0.05) to determine if there is a difference between decided
voters in this poll, and the result of the previous election (where the ALP
received 43% of first preference votes). Show all of your
calculations and steps.
Your answer needs to (the 5 step process meets these requirements):
- State the hypotheses in terms of the parameter(s) of interest.
- Calculate sd(estimator).
- Calculate the test statistic, and give its distribution under the
  null hypothesis.
- Give the P-value for the test, using Minitab (you should not use
  Minitab for other parts of this question).
- State your conclusion in the context of the data.
(d). An employee of the polling company is looking at the data on a
seat-by-seat basis. They find that in the seat of Ripon in Western Victoria
only 1 out of 7 respondents were planning on putting Labor as their
first preference. Would you advise them to use the same method as
used in part (c) to test if support for the ALP has changed? Why/why
not?
(e). It is believed that the true proportion of first-preference votes for the
ALP at the next election will be no more than 45%. Researchers
would like to estimate this proportion using an 80% confidence interval
based on a normal approximation, with a maximum margin of error
of 0.04. What sample size would be required to achieve this? Show
your calculations as well as your answer.
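The three normal-approximation calculations that appear in this question (an interval for a proportion, a one-sample test of a proportion, and a sample size for a target margin of error) can be sketched in general form as below. This is a hedged illustration only: the function names are hypothetical, the counts in the usage lines are made up and are not the poll figures, and the assignment still requires the working to be shown by hand, with Minitab used only for the P-value. The sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0; sd(estimator) is computed under H0."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


def sample_size(p_guess, margin, level):
    """Smallest n whose normal-approximation margin of error is <= margin."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return ceil(z**2 * p_guess * (1 - p_guess) / margin**2)


if __name__ == "__main__":
    # Hypothetical numbers for illustration, NOT the poll data.
    print(proportion_ci(40, 100))
    print(proportion_z_test(40, 100, p0=0.5))
    print(sample_size(p_guess=0.5, margin=0.05, level=0.95))
```

All three functions rely on the normal approximation to the binomial, so they are only reasonable when the sample is large enough for that approximation to hold.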
Question 3: Interpreting Research [3 + 2 = 5 marks]
This question requires you to interpret the following small section of the
article: Kathleen Gee, Mara Gonzalez and Carrie Cooper (2020) ‘Outcomes
of inclusive versus separate placements: A matched pairs comparison study’,
Research and Practice for Persons with Severe Disabilities, 45 (4) 223–240.
You can obtain this article from the Library website (online Journal search).
You DO NOT need information from this article to answer the
questions; it is provided for context only.
“The students in inclusive classrooms were found to have a
significant, large effect size as compared with the students in separate
classrooms for the following variables: being involved in an
activity that typical peers might be involved in at school (.58, p <
.001), being with typical peers (.62, p < .001), being socially
engaged (.58, p < .001), and being engaged in peer-to-peer learning
(.52, p = .002).”
(a). For one of the four hypothesis tests in the quote, clearly state the
null and alternative hypotheses being tested. You may need to define
appropriate parameter(s); and you may assume that effect size is a
constant multiplied by the mean.
(b). The title of the study mentions it is a “matched pairs” comparison
study. Explain what this means in terms of this particular study.
Note: you do not need any additional information other than what is
given in this question.
Relevance, Formatting & Submission [2 marks]
You can gain an additional 2 marks by:
- only including relevant material;
- submitting a clearly legible assignment (e.g. all pages in the correct
  orientation);
- selecting correct page(s) for each part of each question (when you
  upload your assignment to Gradescope, it will ask you to select pages:
  you can select multiple pages for a question part, you can also select
  the same page for multiple parts).