MECH 203 Week 4

MECH 203

Week 4 Jupyter Notebook Written Report

Statistical Processes

Due date: Sunday of Week 5

Grading & Weight: This assignment is out of 100 marks, as further specified in the mark breakdown for

each question, and in the rubric below. The assignment is worth 9.5% of your overall final grade in the

course.

Late Penalty: Late submissions will be penalized 10% per day for up to 5 days, after which a grade of zero will be given.

1. Overview

For this assignment, you will model a number of random processes to build a better understanding of them. You will answer four questions and submit a report to onQ, in the form of a Jupyter notebook, before the deadline specified.

This assignment directly aligns with the following Course Learning Outcomes (CLOs):

CLO 2: Explain random processes, including Gaussian, Poisson and binomial

CLO 3: Analyze random processes, including Gaussian, Poisson and binomial

1.1 Time for completion

This assignment will take approximately six (6) hours to complete. You will begin working on this assignment in the two-hour tutorial of Week 4, and finish on your own time before the deadline.

1.2 Instructions

For this assignment, you will answer the questions in order to improve your understanding of the

content that you have become familiar with this week. You will submit a report in the form of a

Jupyter notebook. Your Jupyter notebook must contain the following information:

• Completion of the tasks and discussion

o Include code required to correctly complete the task

o Include comments to explain your thought process and code logic

o Discuss the results of what was found and answer the questions in the task list

• Format

o See the Jupyter Notebook report template in onQ to format your report.

TASKS

Question 1

Context:

Consider the outcome of a hockey game: a win or a loss. Imagine a good team and a poor team. The good team has a higher probability of winning a game (P=0.7) than the poor team (P=0.3), while a middle-of-the-pack team may have a win percentage of 50% (P=0.5). Now consider how these probabilities play out over a single game, a lockout-shortened season (42 games), and a full 82-game season.

Steps:

1. Create a routine that uses a uniform random generator, from 0 to 1. For each game, draw a random number. If the number is smaller than P, suppose that the team won the game. If not, it lost. (Use np.random.random, not a scipy.stats built-in distribution.) (3 marks)

2. Let’s consider individual games. Using the “simulator” developed in step 1, simulate 100

independent games, using

a. P = 0.3 (2 marks)

b. P = 0.5 (2 marks)

c. P = 0.7 (2 marks)

3. Draw three histograms, one for each value of P. Plot the number of times you counted a win and

the number of times you counted a loss. (2 marks)

4. Now, instead of simulating individual games, simulate 100 seasons, each with 42 games, using

a. P = 0.3 (2 marks)

b. P = 0.5 (2 marks)

c. P = 0.7 (2 marks)

5. Draw three histograms of the number of games won in each season, one for each value of P.

6. Simulate 100 seasons, each with 82 games, using

a. P = 0.3 (2 marks)

b. P = 0.5 (2 marks)

c. P = 0.7 (2 marks)

7. Draw three histograms of the number of games won in each season, one for each value of P. (2

marks)

8. Up until now, you’ve drawn 9 histograms. Using a scipy.stats built-in distribution, overlay the

appropriate distribution function over each histogram. (2 marks)

9. Last year, the Toronto Maple Leafs won 46 games. Calculate the probability that, over an 82-

game season, an average-strength team (P=0.5) wins 46 games or more. (2 marks)

10. During the 2015-2016 season, the Montreal Canadiens started the season with 9 consecutive

wins.

a. Calculate the probability for an average team (P=0.5) to win 9 games in a row. (2 marks)

b. Later that year, the team lost 21 out of 26 games. Calculate the probability for an average team (P=0.5) to win 5 games or fewer out of 26 games. (2 marks)

c. What random process did you use to answer (a) and (b)? What are the conditions for

this process to be valid (i.i.d.)? Do you think that these conditions were met? (2 marks)
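The simulation and comparison steps above can be sketched roughly as follows. This is a minimal illustration, not a required structure for your notebook: the function names simulate_games and simulate_seasons, the fixed seed, and the choice of P = 0.5 are assumptions made for the example. Plotting is left out; the expected counts from scipy.stats.binom are what you would overlay on a histogram of the simulated wins.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # assumed fixed seed, only so the sketch is repeatable


def simulate_games(p, n_games):
    """Return a boolean array: True where the uniform draw is below p (a win)."""
    return np.random.random(n_games) < p


def simulate_seasons(p, n_games, n_seasons=100):
    """Return the number of wins in each of n_seasons simulated seasons."""
    draws = np.random.random((n_seasons, n_games))
    return (draws < p).sum(axis=1)


# 100 individual games, then 100 seasons of 42 and 82 games, all at P = 0.5
single_games = simulate_games(0.5, 100)
wins_42 = simulate_seasons(0.5, 42)
wins_82 = simulate_seasons(0.5, 82)

# Binomial model to overlay on the 82-game histogram:
# expected number of seasons (out of 100) with exactly k wins
k = np.arange(0, 83)
expected_counts_82 = 100 * stats.binom.pmf(k, 82, 0.5)

# Tail probabilities under the same binomial model
p_46_or_more = stats.binom.sf(45, 82, 0.5)     # P(X >= 46) = P(X > 45)
p_nine_straight = 0.5 ** 9                     # nine independent wins in a row
p_five_or_fewer = stats.binom.cdf(5, 26, 0.5)  # P(X <= 5) out of 26 games
```

Note that stats.binom.sf(45, ...) is used rather than sf(46, ...), because the survival function gives the probability of strictly more than its argument.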

Question 2

Context:

Let’s think some more about hockey. More specifically, we’ll focus on the number of goals that a team

will score in a game.

Steps:

1. Suppose that a team scores, on average, 3 goals per game.

a. What is the rate of goal-scoring per second? (2 marks)

b. What is the rate of goal-scoring per second if the team scores, on average, 4 goals per

game? (2 marks)

c. What is the rate of goal-scoring per second if the team scores, on average, 5 goals per

game? (2 marks)

2. Build a Python method that generates a uniform random number from 0 to 1 for every second of game played. If the number is smaller than the rate, then a goal is scored. Each game is 60 minutes long. Simulate 82 games. (Use np.random.random; do not use scipy.stats.) Draw a histogram of the number of goals scored each game, if the team scores, on average,

a. 3 goals per game (3 marks)

b. 4 goals per game (3 marks)

c. 5 goals per game (3 marks)

3. Using scipy.stats, generate distributions that capture the simulations performed in step 2. Overlay these distributions over the histograms. (2 marks)

4. How well do the distributions capture the histograms? What could be done to improve the

match between the distributions and the simulations? (2 marks)

5. Jan Bulis, a former Montreal Canadiens player, scored, on average, 0.17 goals per game.

a. What is the probability for him to score 4 goals in a game? (2 marks)

b. In 2006, he scored 4 goals. How would you interpret this result? (2 marks)
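One possible way to set up the per-second simulation in the steps above is sketched below. The function name goals_per_game, the constant SECONDS_PER_GAME, and the fixed seed are assumptions for illustration. Each second is treated as an independent trial with a small probability of a goal, so the goals per game are binomial with 3600 trials, which a Poisson distribution with the same mean approximates closely.

```python
import numpy as np
from scipy import stats

np.random.seed(1)  # assumed fixed seed, only so the sketch is repeatable

SECONDS_PER_GAME = 60 * 60  # a 60-minute game


def goals_per_game(mean_goals, n_games=82):
    """Simulate n_games games second by second; return goals scored per game."""
    rate = mean_goals / SECONDS_PER_GAME  # probability of a goal in one second
    draws = np.random.random((n_games, SECONDS_PER_GAME))
    return (draws < rate).sum(axis=1)


goals = goals_per_game(3)

# Observed number of games with 0, 1, 2, ... goals
observed_counts = np.bincount(goals)

# Poisson model to overlay: expected number of games (out of 82) with k goals
k = np.arange(observed_counts.size)
expected_counts = 82 * stats.poisson.pmf(k, 3)

# Probability of exactly 4 goals in a game at 0.17 goals per game
p_four_goals = stats.poisson.pmf(4, 0.17)
```

With only 82 games the observed counts will be noisy around the Poisson curve; simulating more games is one way to see the match tighten.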

Question 3

Context:

Enough with hockey. Let’s consider the production of televisions. No two are likely to be exactly the same, but they will generally all fall within a set tolerance window in the manufacturing process. They will also be classified under the same Energy Star efficiency rating, even though some TVs are more efficient than the guideline and some are less. Let’s suppose that the power consumption of the TVs is normally distributed. There are three processes to manufacture the TVs (process A, process B and process C). Process A has a mean of 60 W and a standard deviation of 10 W. Process B has a mean of 60 W and a standard deviation of 5 W. Process C has a mean of 55 W and a standard deviation of 10 W.

Steps:

1. For each process, you take 5 TVs and measure their power consumption. Use the function

np.random.normal to generate 5 data points using each process. (1 mark)

a. By looking at the histogram, can you tell which process was used to generate each

dataset? (2 marks)

b. What if you measured 100 samples? (2 marks)

c. What if you measured 1000 samples? (2 marks)

2. Use the scipy.stats.norm method to generate continuous distributions. Overlay them on the histograms generated in step 1. (2 marks)
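As a rough sketch of the steps above, the three processes can be sampled and compared against their continuous densities as follows. The dictionary processes, the helper sample_process, the fixed seed, and the grid x are assumptions made for the example. Plotting is omitted; to overlay a density on a histogram, the histogram would need to be normalized (for example with density=True in matplotlib) so that both are on the same scale.

```python
import numpy as np
from scipy import stats

np.random.seed(2)  # assumed fixed seed, only so the sketch is repeatable

# (mean, standard deviation) of power consumption in watts for each process
processes = {"A": (60.0, 10.0), "B": (60.0, 5.0), "C": (55.0, 10.0)}


def sample_process(name, n):
    """Draw n simulated power measurements from the named process."""
    mean, std = processes[name]
    return np.random.normal(mean, std, n)


# Sample mean and sample standard deviation for increasing sample sizes
for n in (5, 100, 1000):
    for name in processes:
        data = sample_process(name, n)
        print(n, name, round(data.mean(), 1), round(data.std(ddof=1), 1))

# Continuous densities to overlay on normalized histograms
x = np.linspace(20, 100, 401)
pdfs = {
    name: stats.norm.pdf(x, loc=mean, scale=std)
    for name, (mean, std) in processes.items()
}
```

With 5 points the sample statistics can land far from the process values, which is why the processes are hard to tell apart at that size.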

Question 4

Context:

Not all data sets follow a standard distribution. Take, for example, a manufacturing process that requires a machine to be calibrated at the start of each day. If the machine is calibrated too far in either direction from zero offset, then every part produced that day will be slightly oversized or undersized. This can lead to a double-peaked Gaussian distribution.

Steps:

1. Let’s generate some data!

a. Create a numpy array using np.random.normal, with process mean -2, standard

deviation 1.0, and size 10000. (2 marks)

b. Create a numpy array using np.random.normal, with process mean 2, standard

deviation 0.2, and size 2000. (2 marks)

c. Concatenate (using np.concatenate) the two arrays. Congrats, you now have data! (2

marks)

d. Plot a histogram of this data. (2 marks)

2. Can we provide a “smooth” representation of this concatenated data? Yes. There are a few ways

to do this. Here, let’s use a Gaussian kernel density estimate (scipy.stats.gaussian_kde). Plot the

Gaussian kernel density estimate. What does it represent? (2 marks)

3. Consider the concatenated array.

a. Calculate its standard deviation. (2 marks)

b. Draw 4 random data points from it. This is a sample of size n=4. (2 marks)

c. Calculate the sample mean of the sample drawn in step 3-b. (2 marks)

d. Now, draw 5000 samples, each of size n=4. Calculate the mean of each of these samples

(i.e. you’ll have 5000 means). Draw a histogram of these means. (2 marks)

e. Calculate the standard deviation of these 5000 sample means. (2 marks)

f. Draw 5000 new samples, each of size n=24. Calculate the mean of each of these samples (i.e. you’ll have 5000 means). Draw a histogram of these means. (2 marks)

g. Calculate the standard deviation of these 5000 sample means. (2 marks)

h. Compare the standard deviations calculated in 3-e and 3-g to the analytical formula of

the standard error of the mean. (2 marks)
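The steps above can be sketched along the following lines. The helper sample_means, the fixed seed, and the evaluation grid x are assumptions for illustration, and sampling is done with replacement via np.random.choice, which is one reasonable reading of "draw random data points". The analytical standard error of the mean used for comparison is sigma / sqrt(n), where sigma is the standard deviation of the concatenated array.

```python
import numpy as np
from scipy import stats

np.random.seed(3)  # assumed fixed seed, only so the sketch is repeatable

# Two calibration regimes combined into one double-peaked data set
low = np.random.normal(-2.0, 1.0, 10000)
high = np.random.normal(2.0, 0.2, 2000)
data = np.concatenate([low, high])

# Smooth estimate of the probability density of the combined data
kde = stats.gaussian_kde(data)
x = np.linspace(-6, 4, 501)
density = kde(x)

sigma = data.std()  # standard deviation of the full array


def sample_means(n, n_samples=5000):
    """Draw n_samples samples of size n (with replacement); return their means."""
    samples = np.random.choice(data, size=(n_samples, n))
    return samples.mean(axis=1)


# Compare the spread of the sample means with sigma / sqrt(n)
for n in (4, 24):
    means = sample_means(n)
    print(n, means.std(), sigma / np.sqrt(n))
```

Even though the underlying data are double-peaked, the histogram of sample means becomes narrower and closer to a single bell shape as n grows.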

In addition to the marks above, your assignment will be evaluated using the following criterion:

Criteria Mastery (A+) High Quality (A) Developing (B) Marginal (C) Not Demonstrated (

Clarity of code
