
MAST30027: Modern Applied Statistics

Assignment 2, 2022.
Due: 11:59pm Sunday September 11th
• This assignment is worth 7% of your total mark.
• To get full marks, show your working, including 1) the R commands and outputs you use, 2)
mathematical derivations, and 3) a rigorous explanation of why you reach your conclusions or
answers. If you provide only final answers, you will get zero marks.
• The assignment you hand in must be typed (except for math formulas), and be submitted
using LMS as a single PDF document only (no other formats allowed). For math formulas,
you can take a picture of them. Your answers must be clearly numbered and in the same
order as the assignment questions.
• The LMS will not accept late submissions. It is your responsibility to ensure that your
assignments are submitted correctly and on time, and problems with online submissions are
not a valid excuse for submitting a late or incorrect version of an assignment.
• We will mark a selected set of problems. We will select problems worth ≥ 50% of the full
marks listed.
• If you need an extension, please contact the tutor coordinator before the due date with
appropriate justification and supporting documents. Late assignments will only be accepted
if you have obtained an extension from the tutor coordinator before the due date. Under
no circumstances will an assignment be marked if solutions for it have been released. Please
DO NOT email the lecturer with extension requests.
• Also, please read the “Assessments” section on the “Subject Overview” page of the LMS.
Note: There is no unique answer for this problem. The report for this problem
should be typed. Hand-written reports, or reports that include screen-captured R code
or figures, won’t be marked. An example report written by a student in a previous year
has been posted on the LMS.
Data: The dataset comes from the Fiji Fertility Survey and shows data on the number of children
ever born to married women of the Indian race classified by duration since their first marriage
(grouped in six categories), type of place of residence (Suva, urban, and rural), and educational
level (classified in four categories: none, lower primary, upper primary, and secondary or higher).
The data can be found in the file assignment2_prob1.txt. The dataset has 70 rows representing
70 groups of families. Each row has entries for:
• duration: marriage duration of mothers in each group (years),
• residence: residence of families in each group (Suva, urban, rural),
• education: education of mothers in each group (none, lower primary, upper primary,
secondary+),
• nChildren: number of children ever born in each group (e.g. 4), and
• nMother: number of mothers in each group (e.g. 8).
We can summarise the data in a table as follows.
> data <- read.table(file ="assignment2_prob1.txt", header=TRUE)
> data$duration <- factor(data$duration, levels=c("0-4","5-9","10-14","15-19","20-24","25-29"),
+                         ordered=TRUE)
> data$residence <- factor(data$residence, levels=c("Suva", "urban", "rural"))
> data$education <- factor(data$education, levels=c("none", "lower", "upper", "sec+"))
> ftable(xtabs(cbind(nChildren,nMother) ~ duration + residence + education, data))
                             nChildren nMother
duration residence education
0-4      Suva      none              4       8
                   lower            24      21
                   upper            38      42
                   sec+             37      51
         urban     none             14      12
                   lower            23      27
                   upper            41      39
                   sec+             35      51
         rural     none             60      62
                   lower            98     102
                   upper           104     107
                   sec+             35      47
5-9      Suva      none             31      10
                   lower            80      30
                   upper            49      24
                   sec+             38      22
         urban     none             59      13
                   lower            98      37
                   upper           118      44
                   sec+             48      21
         rural     none            171      70
                   lower           317     117
                   upper           200      81
                   sec+             47      21
10-14    Suva      none             49      12
                   lower            99      27
                   upper            58      20
                   sec+             24      12
         urban     none             75      18
                   lower           143      43
                   upper           105      29
                   sec+             50      15
         rural     none            364      88
                   lower           546     132
                   upper           197      50
                   sec+             30       9
15-19    Suva      none             59      14
                   lower           153      31
                   upper            41      13
                   sec+             11       4
         urban     none            108      23
                   lower           225      42
                   upper            92      20
                   sec+             19       5
         rural     none            577     114
                   lower           481      86
                   upper           135      30
                   sec+              2       1
20-24    Suva      none            118      21
                   lower            91      18
                   upper            47      12
                   sec+             13       5
         urban     none            118      22
                   lower           147      25
                   upper            65      13
                   sec+             16       3
         rural     none            756     117
                   lower           431      68
                   upper           132      23
                   sec+              5       2
25-29    Suva      none            310      47
                   lower           182      27
                   upper            43       8
                   sec+              2       1
         urban     none            300      46
                   lower           338      45
                   upper            98      13
                   sec+              0       0
         rural     none           1459     195
                   lower           461      59
                   upper            58      10
                   sec+              0       0
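The table has 6 × 3 × 4 = 72 cells, two of which are empty (0 children, 0 mothers), which is consistent with the 70 rows in the file. As a minimal sketch of how the counts translate into rates, the R code below copies four cells from the table by hand and computes the number of children per mother. The data frame name `few` and the column `rate` are chosen here purely for illustration and are not part of the assignment.

```r
# Four cells copied by hand from the table above (illustration only).
few <- data.frame(
  duration  = c("0-4", "0-4", "25-29", "25-29"),
  residence = c("Suva", "rural", "Suva", "rural"),
  education = c("none", "none", "none", "none"),
  nChildren = c(4, 60, 310, 1459),
  nMother   = c(8, 62, 47, 195)
)

# Observed fertility rate: children ever born per mother in each group.
few$rate <- few$nChildren / few$nMother
print(few)
```

Raw counts are not directly comparable across groups because the groups differ greatly in size, whereas the per-mother rate is.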
Problem: We want to determine which factors (duration, residence, education) and two-way
interactions are related to the number of children per woman (fertility rate). The observed number
of children ever born in each group (nChildren) depends on the number of mothers (nMother) in
each group. We must take account of the difference in the number of mothers (hint: one of the lab
problems shows how to handle this issue). Write a report on the analysis that summarises the
substantive conclusions and includes the highlights of your analysis: for example, data visualisation,
choice of model (e.g., Poisson, binomial, gamma, etc.), model fitting and model selection (e.g., using
AIC), diagnostics, a check for overdispersion if necessary, and a summary/interpretation of your final
model.
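One common way to account for differing group sizes in a count model is to include log(nMother) as an offset in a Poisson log-linear model, so that the coefficients describe the rate rather than the raw count. This is an assumption made here for illustration; the lab's approach may differ, and other model choices are possible. The sketch below uses only the 0-4 and 10-14 duration rows from the table, typed in by hand so that it runs without the assignment file. The names `sub`, `fit_main`, `fit_int` and `phi_hat` are hypothetical, and `duration` is treated as an unordered factor for simplicity.

```r
# Subset of the table (durations 0-4 and 10-14 only), typed in by hand.
# expand.grid varies education fastest, then residence, then duration,
# which matches the row order of the table.
sub <- expand.grid(
  education = c("none", "lower", "upper", "sec+"),
  residence = c("Suva", "urban", "rural"),
  duration  = c("0-4", "10-14"),
  stringsAsFactors = TRUE
)
sub$nChildren <- c(4, 24, 38, 37,   14, 23, 41, 35,    60, 98, 104, 35,
                   49, 99, 58, 24,  75, 143, 105, 50,  364, 546, 197, 30)
sub$nMother   <- c(8, 21, 42, 51,   12, 27, 39, 51,    62, 102, 107, 47,
                   12, 27, 20, 12,  18, 43, 29, 15,    88, 132, 50, 9)

# Poisson log-linear model with log(nMother) as an offset:
# log E[nChildren] = log(nMother) + linear predictor.
fit_main <- glm(nChildren ~ duration + residence + education + offset(log(nMother)),
                family = poisson, data = sub)

# Same model plus one two-way interaction, to illustrate comparison by AIC.
fit_int <- update(fit_main, . ~ . + duration:education)
print(AIC(fit_main, fit_int))

# Rough overdispersion check: Pearson chi-square divided by residual df.
# Values well above 1 would suggest overdispersion.
phi_hat <- sum(residuals(fit_main, type = "pearson")^2) / df.residual(fit_main)
print(phi_hat)

# exp(coefficient) is a rate ratio relative to the baseline levels.
print(exp(coef(fit_main)))
```

With the full dataset the same pattern applies, but the choice of terms, the treatment of `duration` as ordered or unordered, and the final model are for the analyst to justify.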
At each step of your analysis, you should write why you do that and your interpretation/conclusion.
For example, “I make an interaction plot to see whether there are interactions between X and Y”,
show a plot, and “It seems that there is some interaction between X and Y”.
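As a hedged illustration of the kind of interaction plot mentioned above, the sketch below pools the counts over residence and plots the log of the observed rate against education, with one line per duration. It again uses only the 0-4 and 10-14 rows of the table, typed in by hand. The names `sub`, `agg` and `logRate` are hypothetical, and plotting on the log scale is an assumption chosen to match a log-link model.

```r
# Subset of the table (durations 0-4 and 10-14 only), typed in by hand.
sub <- expand.grid(
  education = c("none", "lower", "upper", "sec+"),
  residence = c("Suva", "urban", "rural"),
  duration  = c("0-4", "10-14"),
  stringsAsFactors = TRUE
)
sub$nChildren <- c(4, 24, 38, 37,   14, 23, 41, 35,    60, 98, 104, 35,
                   49, 99, 58, 24,  75, 143, 105, 50,  364, 546, 197, 30)
sub$nMother   <- c(8, 21, 42, 51,   12, 27, 39, 51,    62, 102, 107, 47,
                   12, 27, 20, 12,  18, 43, 29, 15,    88, 132, 50, 9)

# Pool children and mothers over residence, then compute the log rate.
agg <- aggregate(cbind(nChildren, nMother) ~ duration + education,
                 data = sub, FUN = sum)
agg$logRate <- log(agg$nChildren / agg$nMother)

# One line per duration; roughly parallel lines suggest little interaction
# between duration and education on the log-rate scale.
interaction.plot(x.factor = agg$education, trace.factor = agg$duration,
                 response = agg$logRate,
                 xlab = "education", ylab = "log(children per mother)",
                 trace.label = "duration")
```

In a report, the plot would be followed by a sentence stating what it suggests, for instance whether the lines look roughly parallel.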
