
MATH1041 Statistics for Life and Social Science

Semester 2, 2018
MATH1041 Computing Assignment
Assignment release date: The assignment will be released to all students on Wednesday
the 26th of September on Moodle (see the "Assessments Information" section).
Submission due date: Thursday 11th October (Week 11) before 6pm (Sydney
time).
Please submit your assignment through Moodle; see the "Assessments Information" section
on Moodle for further information regarding online submission. You must submit
a neatly typed assignment converted to pdf format.
Data: A data set (in text file format) will be sent to you via email at your official
university email address (see page 2 of this document for further details).
Assignment length: No more than SEVEN single-sided A4 pages including this
cover sheet as the first page. Also, please make sure that you include your name and
zID somewhere in the assignment.
Q1 /6
Q2 /19
Q3 /17
Q4 /18
Total /60
Obtaining the data via email and reading it into RStudio
The data (that is, your data set) are available in a text file with a name similar to
"z3141593.txt" (where z3141593 in the text file name is replaced by your unique student
zID number). This text file has been sent to you via email at your official
university email address. PLEASE CHECK YOUR UNIVERSITY EMAILS
REGULARLY TO MAKE SURE THAT YOU HAVE OBTAINED YOUR
DATA SET. Please email Dr Jakub Stoklosa (j.stoklosa@unsw.edu.au) if you haven't
received your data set yet.
The first step is to read the data into RStudio. The data format is simple and similar
to what you have already seen in the Introduction labs. Follow the instructions given in
section R1.4 "How to import a text file into RStudio" of the RStudio "How-To-Manual"
available on Moodle. Once you've imported the data, you are ready to start your
analysis!
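As a rough illustration of the import step (the How-To-Manual remains the reference), the sketch below reads a small whitespace-separated text file with read.table(). The temporary file and its three rows are made up so that the example runs anywhere, and header = TRUE is an assumption about the file layout; check your own file before relying on it.

```r
# Hypothetical stand-in for the emailed file: write a tiny table to a temporary
# location. With the real data you would pass your own file name instead,
# for example "z3141593.txt".
demo_file <- tempfile(fileext = ".txt")
writeLines(c("BW HTL SEX",
             "58.2 1.21 0",
             "61.5 1.30 1",
             "55.9 1.18 0"), demo_file)

# header = TRUE is an assumption: it treats the first line as column names.
sheep <- read.table(demo_file, header = TRUE)

str(sheep)    # check the column names and types
head(sheep)   # look at the first few rows
```

If str() shows a single column or unexpected names, the header or separator assumption probably does not match the file.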
Computing assignment format
Here are some more details that may assist you:
• The overall assignment structure is up to you; just remember to keep
it clear and concise. Answering the questions in the given order (that is, 1a),
b), etc.) is fine. You don't need to re-write the assignment questions.
• You are required to type up your entire assignment (rather than scanning or taking
screenshots). If you are using Word you should use the equation editor for any
maths notation. If you don't have Word then please use the School computers.
Please convert your assignment to pdf before submitting it.
• You are asked to produce SIX graphs/plots for this assignment. You are required to
produce these in RStudio. You may want to use par(mfrow=c(2,3))
to arrange all six graphs in a single figure (this is optional); see Section R1.4 "Transforming
data using RStudio" of the RStudio "How-To-Manual" available on Moodle.
• We recommend adding some working out for the questions involving calculations,
but try to keep your solutions brief and concise (since there is a page limit).
It's good practice for the exam, and if you get the wrong answer you will have some
working to gain marks from. Your working could consist of RStudio commands or
perhaps the main steps showing how you arrived at your answer. You don't need to add
all of your R code!
• Keeping your results to 2 or 3 decimal places should be fine.
• There is no requirement for font size and line spacing, but obviously don't make
things too small.
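For the optional multi-panel layout mentioned above, a minimal sketch follows. It uses simulated values rather than the assignment data, and draws the same histogram six times purely to show how the panels are filled.

```r
# Simulated values only, so the layout can be shown without the real data.
set.seed(1)
x <- rnorm(50)

old_par <- par(mfrow = c(2, 3))   # 2 rows x 3 columns, filled row by row
for (i in 1:6) {
  hist(x, main = paste("Panel", i), xlab = "x")
}
par(old_par)                      # restore the previous layout
```

Saving and restoring the old settings keeps later plots from being squeezed into the same grid.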
Scenario
A team of researchers were interested in studying the impacts of drought on sheep
livestock on farms around New South Wales and Queensland, Australia. In particular, the
researchers wanted to compare the average body weight of sheep from five years ago (when
there was little drought) to now (Spring, 2018), when drought is of serious concern.
To obtain their data, the research team decided to collect a random sample of sheep from
a very large sheep population on a farm affected by the drought. This random sample of
data consists of sheep body weight measurements (measured in kilograms), head-to-tail
length measurements (measured in metres) and their gender (male/female).
The text file contains your unique data set of n rows, each recording 3
variables: BW which corresponds to sheep body weights, HTL which corresponds to sheep
head-to-tail lengths, and SEX which corresponds to gender (0 = Female and 1 = Male).
Your job is to assist the research team by analysing the data set provided to you.
The Analysis Tasks
The questions you need to answer in your assignment submission are given below. Please
make sure your assignment is converted to pdf format.
1. (a) Calculate the sample mean and sample standard deviation of your sample of
sheep body weight (BW) measurements.
(b) Produce a normal quantile plot of your sample of sheep body weight
measurements (see Section R2.6 "How to produce a normal quantile plot using
RStudio"). Include this plot in your submitted assignment, properly labelled.
(c) By referring to the normal quantile plot obtained in Part 1b, briefly discuss whether
the sheep body weights are approximately normally distributed.
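The R functions typically used for tasks like these can be sketched as follows. The body weights here are simulated and are not anyone's assignment data; the axis labels and title are only suggestions.

```r
# Simulated body weights (kg); these are NOT the assignment data.
set.seed(1041)
BW <- rnorm(40, mean = 58, sd = 6)

mean(BW)   # sample mean
sd(BW)     # sample standard deviation (uses the n - 1 divisor)

# Normal quantile plot, with a reference line through the quartiles.
qqnorm(BW, main = "Normal quantile plot of sheep body weight",
       ylab = "Body weight (kg)")
qqline(BW)
```

Points lying close to the reference line suggest approximate normality; systematic curvature suggests skewness or heavy tails.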
2. Let μ be the population mean body weight (in kg) of sheep (of any gender) on the
farm now (Spring, 2018). The research team decided to compare the current sheep
mean body weight with the mean from five years ago. The known mean body weight
for sheep from five years ago was 60 kg.
(a) Test the hypothesis that μ is equal to 60. You must summarize all steps:
state the null (H0) and alternative hypotheses (Ha) relevant to the research
objectives stated in this scenario, the value of a suitable test statistic, the
sampling distribution for this statistic, a P-value, your summary of significance
and conclusion in plain language.
(b) Some assumptions need to be made for the sampling distribution of the test
statistic (as given in Part 2a) to be valid. State these assumptions.
(c) Discuss whether the assumptions from Part 2b are satisfied.
(d) Produce a 95% confidence interval for μ, the mean body weight of sheep. For
this question you may assume that it is appropriate to use a t-distribution.
Make sure you write down all the required steps to calculate this interval.
Does this confidence interval include the value 60?
Explain whether your confidence interval is consistent with your conclusions
from the hypothesis test in Part 2a.
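To show how the pieces of a one-sample t procedure fit together, here is a minimal sketch on simulated data. The two-sided alternative is an assumption made for illustration only; the alternative you state should follow from the research objectives. The manual steps use t = (x̄ − μ0) / (s / √n) with n − 1 degrees of freedom and are cross-checked against the built-in t.test().

```r
# Simulated body weights (kg); these are NOT the assignment data.
set.seed(2018)
BW <- rnorm(40, mean = 58, sd = 6)

n    <- length(BW)
xbar <- mean(BW)
s    <- sd(BW)
mu0  <- 60                                    # hypothesised mean from the scenario

# Test statistic and two-sided P-value (two-sided is an assumption here).
t_stat  <- (xbar - mu0) / (s / sqrt(n))
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# 95% confidence interval: xbar +/- t* x s / sqrt(n)
t_crit <- qt(0.975, df = n - 1)
ci     <- xbar + c(-1, 1) * t_crit * s / sqrt(n)

# Built-in equivalent; the 'alternative' argument changes the direction.
res <- t.test(BW, mu = mu0, alternative = "two.sided", conf.level = 0.95)
res
```

For a two-sided test at the 5% level, the 95% interval excludes 60 exactly when the P-value is below 0.05, which is the consistency that Part 2d asks about.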
3. The research team were also interested in studying:
• the relationship between body weight and gender; and
• the relationship between body weight and head-to-tail length.
(a) Produce a comparative boxplot for sheep body weight against gender. Include
this plot in your submitted assignment, properly labelled.
(b) Describe any differences or similarities in the distribution of body weight of
sheep for the different genders using your comparative boxplot from Part 3a.
(c) Construct an appropriate graphical summary to visualize the relationship
between body weight and head-to-tail length. Include this plot in your
assignment, properly labelled.
(d) Summarize the key features of your plot from Part 3c.
(e) Suggest an appropriate numerical summary to quantify the linear relationship
between body weight and head-to-tail length. Report and comment on this
value.
(f) The research team wanted to predict sheep body weight from head-to-tail
length measurement by fitting a linear regression model. Would you
recommend the research team do this? Explain briefly.
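One possible way to produce plots and summaries of this kind in R is sketched below. The data frame is simulated with the same column names and coding as in the scenario (BW, HTL, SEX with 0 = Female and 1 = Male); the Gender column, the titles and the choice of a scatterplot with a correlation coefficient are illustrative assumptions, not the required answer.

```r
# Simulated data with the scenario's column names; NOT the assignment data.
set.seed(3)
n     <- 40
SEX   <- rep(c(0, 1), each = n / 2)                 # 0 = Female, 1 = Male
HTL   <- runif(n, min = 1.0, max = 1.5)             # metres
BW    <- 20 + 30 * HTL + 4 * SEX + rnorm(n, sd = 3) # kilograms
sheep <- data.frame(BW, HTL, SEX)

# Readable group labels for the boxplot (Gender is a hypothetical helper column).
sheep$Gender <- factor(sheep$SEX, levels = c(0, 1), labels = c("Female", "Male"))

boxplot(BW ~ Gender, data = sheep,
        xlab = "Gender", ylab = "Body weight (kg)",
        main = "Sheep body weight by gender")

plot(sheep$HTL, sheep$BW,
     xlab = "Head-to-tail length (m)", ylab = "Body weight (kg)",
     main = "Body weight against head-to-tail length")

r <- cor(sheep$HTL, sheep$BW)   # sample correlation coefficient
r
```

The correlation only measures the strength of a linear association, so it is worth reading alongside the scatterplot rather than on its own.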
4. The research team decided to investigate the head-to-tail length (HTL) measurement
in more detail.
(a) Produce a five-number summary for the HTL measurements.
(b) Produce a histogram for the HTL measurements. Include this histogram in your
submitted assignment properly labelled.
(c) In MATH1041, we looked at the effect of transforming data. Using the HTL
measurements, perform: (1) a log transformation; and (2) a square-root
transformation, and produce a histogram for each of these. Include these histograms
in your submitted assignment properly labelled.
(d) Summarize the key features of each histogram from Parts 4b and 4c (that is,
the raw data, and each of the transformations). Please comment on central
location, spread, and (any) skewness/symmetry.
(e) Do you think these transformations reduced any skewness? Explain briefly.
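The summaries and transformations in this question can be sketched in R as follows. The lengths are simulated from a right-skewed distribution purely for illustration, and log() here is the natural logarithm, which is an assumption about the intended base.

```r
# Simulated head-to-tail lengths (m), positive and right-skewed; NOT the real data.
set.seed(4)
HTL <- rlnorm(40, meanlog = log(1.2), sdlog = 0.15)

fivenum(HTL)   # minimum, lower hinge, median, upper hinge, maximum

log_HTL  <- log(HTL)    # natural log transformation
sqrt_HTL <- sqrt(HTL)   # square-root transformation

old_par <- par(mfrow = c(1, 3))   # three histograms side by side
hist(HTL,      main = "Raw HTL",         xlab = "Head-to-tail length (m)")
hist(log_HTL,  main = "log(HTL)",        xlab = "log of length")
hist(sqrt_HTL, main = "Square-root HTL", xlab = "square root of length")
par(old_par)
```

Both transformations need positive values (non-negative for the square root), which holds for length measurements.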
 
