Simulation, Semester II 2021-2022
ST3247: Assignment 21
Due date: April 10th (Sunday of Week 12)
There are THREE questions in this question sheet.
Your answer sheet should contain answers to ALL questions. Do not leave answers in
your code file. Marking mainly depends on your answer sheet.
Please submit an answer sheet with an R file in the submission folder separately, and name
them with the same name. Please do NOT use compressed files.
The R file should contain your coding work as a supplementary file. Please make it ready
for reproductions.
For all the questions, you can only use standard uniform random variables unless
there are particular statements in the problem.
1. Suppose X1, X2, · · · , Xn ~ N(0, θ) where θ ~ Gamma(3, 0.5), that
p(θ) =
θ2e?θ/2
23Γ(3)
, θ > 0.
Note. For this question, you can use any random variable generation function in R, such
as rnorm(), rgamma() and so on. You can also use gamma() to calculate Γ(x) for any x.
(a) Find the posterior distribution that p(θ|x1, x2, · · · , xn), subject to a constant cp.
(b) We are interested in the mean of the posterior distribution, which is
E =
∫ ∞
0
θ ? p(θ|x1, x2, · · · , xn)dθ.
The posterior distribution is hard to sample even with the rejection algorithm. So
we want to use the importance sampling algorithm.
Suppose we have the random variables Y1, Y2, · · · , Ym ~ q, where q is a distribution.
Please write out the estimation in terms of m, p(θ) = p(θ|x1, x2, · · · , xn), q, and Yi’s.
1All rights reserved by NUS. Reproduction or distribution of lecture notes/tutorials/quiz/exam without the
written permission of the sponsor is prohibited.
1
(c) To get rid of the constants, let p = cpp0 and q = cqq0, where cp and cq are constants
and p0, q0 are kernel functions. Prove that∫
f(θ)p(θ)
q(θ)
q(θ)dθ =
∫ f(θ)p(θ)
q(θ) q(θ)dθ∫ p(θ)
q(θ)q(θ)dθ
=
∫ f(θ)p0(θ)
q0(θ)
q(θ)dθ∫ p0(θ)
q0(θ)
q(θ)dθ
.
(d) According to part (b) and (c), find the estimation in terms of m, p0, q0, and Yi’s.
Further, express them in terms of log p0, log q0 and Yi’s.
(e) We set the function q to be Gamma(cs2, c), i.e.
g(θ) =
ccs
2
θcs
2?1e?cθ
Γ(cs2)
, θ > 0.
Here, s2 is the sample variance and c > 0 is a parameter to decide. We use c to make
the function flat.
With the density functions of p and q, write R functions to calculate log p0 and log q0.
(f) With all the previous definitions of p, q and the discussion, write a pseudo-code to
estimate E.
(g) Generate 100 data points with distribution N(0, 5), denoted by x1, x2, · · · , x100. It
means the ground truth is θ = 5.
With your generated data, run your code to estimate E with c ranges from 0.04 to
20 with step size 0.04, m = 1000. Draw one plot of E? versus c and comment.
2
2. Messages arrive at a communications facility in accordance with a Poisson process having
a rate of λ = 2 per hour. The facility consists of three channels, and an arriving message
will either go to a free channel if any of them are free or else will be lost if all channels
are busy. The amount of time that a message ties up a channel is a random variable
that depends on the weather condition at the time the message arrives. Specifically, if the
message arrives when the weather is “good,” then its processing time is a random variable
having distribution function
F (x) = x, 0 < x < 1
whereas if the weather is “bad” when a message arrives, then its processing time has
distribution function
F (x) = x3, 0 < x < 1.
Initially, the weather is good, and it alternates between good and bad periods— with the
good periods having fixed lengths of 2 hours and the bad periods having fixed lengths of
1 hour. (Thus, for example, at time 5 the weather changes from good to bad.) Suppose
we are interested in the distribution of the number of lost messages by time T = 100.
(a) Set up a vector with length T , that decide the weather at time point 1 ≤ t ≤ T .
(b) Define the events and variables that enable us to use the DES approach.
(c) Write the pseudo code of how to update the variables for each iteration
(d) Implement the R code. Run your program to give an interval estimate with 95%
confidence of the expected number of lost messages in the first 100 hours of operation.
(No need to include code here.)
(e) Now I want to check the code under a simplified case. For the following, assume that
we are always in GOOD weather and there is only ONE channel.
i. What is the inter-arrival time between two messages?
ii. The status of this channel will be Busy, Idle,Busy, Idle, · · · . Busy means it is
processing a message and Idle means there is no message under processing now.
Let tB denote the length of one busy period. What is the distribution of tB?
What is E[tB]?
iii. Let tI be the length of one idle period. What is the distribution of tI?
Hint. tI is the time until next arriving message.
iv. The messages arrived during the busy period are lost. Let N denote the number
of lost messages. Let TB be the total busy time of the channel until T = 100.
3
Prove that E[N ] = λE[TB].
v. According to the property of stochastic processes, in the long run, the proportion
of all busy periods is approximately E[tB]/(E[tB] + E[tI ]). Let TB be the total
time that the channel is busy until T = 100. What is E[TB]?
vi. Find E[N ] according to parts (iv) and (v).
vii. Adjust your code for this case. Run 1000 simulations for T = 100. Compare
your results with the results in part (vi).
4
3. Let Q be a symmetric matrix, that is, qij = qji for all i, j. It satisfies the requirements for
a transition probability matrix, that every entry is non-negative and the row sum is 1.
Consider a Markov chain based onQ, where the transitions due to the following procedure,
Step 1. Generate Y ~ qi, where i is the current state. qi is the distribution defined by
i-th row of Q.
Step 2. Generate U ~ Unif(0, 1).
Step 3. If U ≤ bYbi+bY , set next step to be Y
Step 4. Otherwise, set next step to be the current state i.
Here, bj ’s are specified positive numbers.
(a) Say X0 = i, what is the distribution of X1?
(b) Find the transition probability matrix of the current Markov Chain.
(c) Actually this MC is time reversible. Please set up the local balanced equations and
find the corresponding limiting distribution pi.
(d) Consider the distribution on the permutations of (1, 2, 3, · · · , 10). The probability of
a permutation (x1, x2, · · · , x10) is that
P ((x1, x2, · · · , x10)) = C ?
10∑
i=1
xi/i,
where C is the normalising constant.
Please identify a procedure according to the results of this problem. Identify Q and
bi’s, and then write down the pseudo code.
(e) With the pseudo code in part (d), please run simulations to find the expectation of∑10
i=1 i ? xi. Write down the steps you get your samples (take that in part (d) as a
single function) and the result.