
Assignment 3

Instructions:
• Answer all three (3) questions of this assignment. Show all your work.
• Compile your solutions using LaTeX, R Markdown, or Word as your base. Submit your assignment parts to Crowdmark. Crowdmark accepts PDF, JPG, and PNG files. Crowdmark will allow group submission for Part 1 only.
• Presentation of solutions is important. Assignments should be word-processed and presented neatly. Use proper statistical terminology and proper English. Supporting output, such as unrequested R code and extraneous output, is optional and will not be graded. However, if you choose to include it, please place it in a separate appendix at the end of your assignment.
Grading: The grand total is 45 marks, which includes 6 marks for excellent presentation. The best answers
will receive the best marks. A general grading rubric is given below.
Per Question Part
• 3 points: Complete, correct, and clearly written answers. Answers model individual preparation and academic honesty (where applicable).
• 2 points: Good answers that are unclear, contain a few mistakes, or are missing components. Answers demonstrate some individual preparation and some academic honesty (where applicable).
• 1 point: Poor answers or many missing components. Most answers do not demonstrate individual preparation or academic honesty (where applicable).
• 0 points: Missing or incomprehensible answers. Answers lack academic integrity.
Presentation
• 3 points: Well presented, easy to read, proper English used, R code shown only where required.
• 2 points: Good presentation, some unnecessary R code and unformatted output.
• 1 point: Poor presentation, handwritten, hand-drawn diagrams, unnecessary R code and unformatted output.
• 0 points: Illegible, missing, or unclear presentation.
1. (Adapted from Scheaffer et al.) A manufacturer of band saws wants to estimate the average repair
cost per month for the saws he has sold to certain industries. He cannot obtain a repair cost for
each saw, but he can obtain the total amount spent for saw repairs and the number of saws owned
by each industry. Thus, he decides to use cluster sampling, with each industry as a cluster. The
manufacturer selects a simple random sample of n = 20 from N = 96 industries he services. The
data on total cost of repairs per industry and number of saws per industry are as given in the table
below.
Industry   Number of saws   Total repair cost for past month ($)
 1          3                50
 2          7               110
 3         11               230
 4          9               140
 5          2                60
 6         12               280
 7         14               240
 8          3                45
 9          5                60
10          9               230
11          8               140
12          6               130
13          3                70
14          2                50
15          1                10
16          4                60
17         12               280
18          6               150
19          5               110
20          8               120
(a) [3 marks] Estimate the average repair cost per saw for the past month and place a bound on
the error of estimation.
(b) [3 marks] To estimate the average repair cost per saw for the past month, how many clusters
(n) should the manufacturer select for his sample if he wants the bound (B) on the error of
estimation to be less than $2?
(c) [3 marks] Compare your results from part (a) with those from part (b), and comment on the relationship between n and B.
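The calculations behind parts (a) and (b) can be sketched in code. The following is a minimal Python sketch only (the assignment itself expects R). It assumes the ratio estimator for cluster sampling described in Scheaffer et al., a bound of two estimated standard errors, and that the population average cluster size is unknown and is replaced by the sample average. For part (b) it assumes the usual sample-size formula n = N s_r^2 / (N D + s_r^2) with D = B^2 m̄^2 / 4, where s_r^2 is taken from the sample in the table. The names `saws`, `cost`, and `clusters_needed` are illustrative, not part of the assignment.

```python
import math

# Data from the table: number of saws (m_i) and total repair cost (y_i)
# for each of the n = 20 sampled industries.
saws = [3, 7, 11, 9, 2, 12, 14, 3, 5, 9, 8, 6, 3, 2, 1, 4, 12, 6, 5, 8]
cost = [50, 110, 230, 140, 60, 280, 240, 45, 60, 230,
        140, 130, 70, 50, 10, 60, 280, 150, 110, 120]
N, n = 96, len(saws)

# (a) Ratio estimate of the average repair cost per saw.
ybar = sum(cost) / sum(saws)

# Between-cluster residual variance: s_r^2 = sum (y_i - ybar * m_i)^2 / (n - 1).
s2_r = sum((y - ybar * m) ** 2 for m, y in zip(saws, cost)) / (n - 1)

# Assumption: the population mean cluster size is unknown,
# so the sample mean cluster size stands in for it.
m_bar = sum(saws) / n

# Estimated variance with finite population correction, and a 2-SE bound.
var_ybar = (N - n) / (N * n * m_bar ** 2) * s2_r
bound = 2 * math.sqrt(var_ybar)


def clusters_needed(target_bound):
    """(b) Smallest whole number of clusters for the requested bound,
    using s2_r and m_bar from the sample as planning values."""
    d = target_bound ** 2 * m_bar ** 2 / 4
    return math.ceil(N * s2_r / (N * d + s2_r))


n_needed = clusters_needed(2.0)

print(f"estimate = {ybar:.2f}, bound = {bound:.2f}, clusters for B = 2: {n_needed}")
```

Calling `clusters_needed` with several target bounds is one way to explore the relationship between n and B asked about in part (c).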
2. (Adapted from Scheaffer et al.) A market research firm constructed a sampling plan to estimate the
weekly sales of brand A cereal in a certain geographic area. The firm decided to sample cities within
the area and then to sample supermarkets within cities. The number of boxes of brand A cereal sold
in a specified week is the measurement of interest. Five cities are sampled from the 20 in the area.
Using the data given in the accompanying table, answer the following:
(a) [3 marks] Estimate the average sales for the week for all supermarkets in the area. Place a
bound on the error of the estimation. Is the estimator you used unbiased?
(b) [3 marks] Do you have enough information to estimate the total number of boxes of cereal sold
by all supermarkets in the area during the week? If so, explain how you would estimate this
total, and place a bound on the error of estimation.
City   Number of supermarkets   Number of supermarkets sampled   ȳ_i   s_i²
1      45                       9                                102   20
2      36                       7                                 90   16
3      20                       4                                 76   22
4      18                       4                                 94   26
5      28                       6                                120   12
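As an illustration of how the table feeds into part (a), here is a minimal Python sketch (the assignment itself expects R). It assumes the two-stage ratio-type estimator of a population mean from Scheaffer et al., chosen here because the table does not give the total number of supermarkets in the area, with the average number of supermarkets per city estimated from the five sampled cities and a bound of two estimated standard errors. The tuple layout and variable names are illustrative only.

```python
import math

N, n = 20, 5  # cities in the area, cities sampled

# One tuple per sampled city, taken from the table:
# (supermarkets in city M_i, supermarkets sampled m_i, sample mean ybar_i, sample variance s_i^2)
cities = [
    (45, 9, 102, 20),
    (36, 7, 90, 16),
    (20, 4, 76, 22),
    (18, 4, 94, 26),
    (28, 6, 120, 12),
]

M_total = sum(M for M, _, _, _ in cities)

# Assumption: the average number of supermarkets per city in the whole area
# is unknown, so the average over the sampled cities stands in for it.
M_bar = M_total / n

# Ratio-type estimate of mean weekly sales per supermarket.
mu_hat = sum(M * yb for M, _, yb, _ in cities) / M_total

# Between-city component: s_r^2 = sum M_i^2 (ybar_i - mu_hat)^2 / (n - 1).
s2_r = sum((M * (yb - mu_hat)) ** 2 for M, _, yb, _ in cities) / (n - 1)

# Within-city component: sum M_i^2 * ((M_i - m_i) / M_i) * s_i^2 / m_i.
within = sum(M * (M - m) * s2 / m for M, m, _, s2 in cities)

var_mu = ((N - n) / N) * s2_r / (n * M_bar ** 2) + within / (n * N * M_bar ** 2)
bound = 2 * math.sqrt(var_mu)

print(f"estimate = {mu_hat:.2f}, bound = {bound:.2f}")
```

The between-city term dominates the variance in this sketch, which is typical when only a few clusters are sampled.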
3. Use the population data set, hhw21.csv, with N = 210 pairs of measurements of handspan, x, and
height, y, from our class, mainly to compare regression and ratio estimation for estimating the population mean height µy, using information from a sample of size n = 10. Set the seed of your
randomization to the digits of your student number.
(a) [3 marks] Obtain a simple random sample of the data and display it in a table. Include the
‘id’ numbers in your table. Show the R code used to obtain your answers to this part, and use
the sample obtained here to answer the remaining parts of this question.
(b) [3 marks] Using an SRS estimator, estimate µy and place a bound on the error of estimation.
(c) [3 marks] Using a ratio estimator, estimate µy and place a bound on the error of estimation.
(d) [3 marks] Using a regression estimator, estimate µy and place a bound on the error of estimation.
(e) [3 marks] Using a difference estimator, estimate µy and place a bound on the error of estimation.
(f) [3 marks] Find the error of estimation, |µ̂ − µy|, for each of the four estimators in parts (b) to
(e) and compare them.
(g) [3 marks] Which of the three estimators of parts (c) to (e) would you recommend? Explain.
(h) [3 marks] Do you recommend the SRS estimator over the other three estimators? Explain.
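The four estimators in parts (b) to (e) can be compared side by side in code. The following is a minimal Python sketch only (the assignment itself requires R and the real hhw21.csv data). It assumes the standard SRS, ratio, regression, and difference estimators from Scheaffer et al., each with a bound of two estimated standard errors. The population below is made up purely so the sketch runs, the seed 12345678 is a placeholder for a student number, and the function name `estimate_all` is hypothetical.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def bound(variance):
    # Two estimated standard errors; guard against tiny negative round-off.
    return 2 * math.sqrt(max(variance, 0.0))


def estimate_all(xs, ys, mu_x, N):
    """Return {name: (estimate of mu_y, bound)} for the four estimators."""
    n = len(xs)
    fpc = 1 - n / N  # finite population correction
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    # SRS estimator: the sample mean of y.
    srs = (ybar, bound(fpc * syy / (n - 1) / n))

    # Ratio estimator: r * mu_x with r = ybar / xbar.
    r = ybar / xbar
    s2_ratio = sum((y - r * x) ** 2 for x, y in zip(xs, ys)) / (n - 1)
    ratio = (r * mu_x, bound(fpc * s2_ratio / n))

    # Regression estimator: ybar + b * (mu_x - xbar), b = least-squares slope.
    b = sxy / sxx
    mse = (syy - b ** 2 * sxx) / (n - 2)
    regression = (ybar + b * (mu_x - xbar), bound(fpc * mse / n))

    # Difference estimator: mu_x + mean of d_i = y_i - x_i.
    d = [y - x for x, y in zip(xs, ys)]
    dbar = mean(d)
    s2_d = sum((di - dbar) ** 2 for di in d) / (n - 1)
    difference = (mu_x + dbar, bound(fpc * s2_d / n))

    return {"srs": srs, "ratio": ratio,
            "regression": regression, "difference": difference}


# Made-up stand-in for hhw21.csv: N = 210 (handspan, height) pairs.
rng = random.Random(12345678)  # placeholder seed, not a real student number
N = 210
pop_x = [rng.uniform(16, 24) for _ in range(N)]
pop_y = [90 + 4 * x + rng.gauss(0, 5) for x in pop_x]

# Simple random sample of n = 10 id numbers, drawn without replacement.
ids = rng.sample(range(1, N + 1), 10)
xs = [pop_x[i - 1] for i in ids]
ys = [pop_y[i - 1] for i in ids]

results = estimate_all(xs, ys, mean(pop_x), N)
mu_y = mean(pop_y)
for name, (est, bnd) in results.items():
    print(f"{name:10s} estimate={est:7.2f} bound={bnd:5.2f} error={abs(est - mu_y):5.2f}")
```

Because the population here is synthetic, the printed numbers say nothing about the class data; only the structure of the comparison carries over.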
 