
ST407 Monte Carlo Methods 2022/23
Assignment 2
Handed out: Thursday, November 24th, 2022
Deadline: 1pm on Thursday, December 1st, 2022
1. Consider the following Markov mixture model:

\[
l(y_1, y_2, \ldots, y_n \mid \mu_1, \mu_2, k) = \prod_{p=1}^{k} \varphi_{\mu_1,1}(y_p) \prod_{q=k+1}^{n} \varphi_{\mu_2,1}(y_q),
\]

where $\varphi_{\mu,\sigma^2}$ denotes the density of a normal random variable with mean $\mu$ and variance $\sigma^2$.
A scientist is interested in identifying a changepoint in the data which they have measured in a scientific experiment. They believe that the change is most likely to have occurred somewhere towards the middle of the measured time series, $y := (y_1, \ldots, y_n)$, and hence impose the following prior distribution on $k$ (the triangular prior computed as logprior.k in the function of Figure 1):

\[
f_{\mathrm{prior}}(k) \propto \min(k, n - k), \qquad k \in \{1, \ldots, n - 1\}.
\]
As they have little idea what change they expect to observe, they employ priors over the two means which contain little information, with the following prior density:

\[
f_{\mathrm{prior}}(\mu_1, \mu_2) = \varphi_{0,10}(\mu_1)\, \varphi_{0,10}(\mu_2).
\]
They believe, a priori, that the location of the changepoint and the value of the two means are
independent of one another and so the overall (joint) prior distribution is simply the product of
these two marginal prior distributions.
(a) What is the posterior distribution[1] of $\mu_1$, $\mu_2$ and $k$ given a sequence of observations $y$, up to a normalising constant?
(b) What are the full conditional distributions of the three parameters of this model under this
posterior distribution?
(c) Implement a Gibbs sampler to sample from this posterior distribution. In doing so you might
find it useful to implement separate functions to sample from each of the three full conditional
distributions.
As an example, Figure 1 shows a function which provides samples from the full conditional distribution of $k$ given $\mu_1$, $\mu_2$ and the data. If you choose to use this function rather than implementing your own, then you should explain why it returns samples from the distribution which you derived in part (b).
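For the two means, a minimal sketch of what such functions might look like, assuming the conjugate normal updates implied by the $\varphi_{0,10}$ priors (prior variance 10) and the unit-variance likelihood; the names gibbs.mu1 and gibbs.mu2 are illustrative rather than prescribed:

gibbs.mu1 <- function(k, y) {
  # Conjugate update: N(0, 10) prior (precision 1/10) combined with the
  # unit-variance observations y[1:k] assigned to the first regime.
  prec <- k + 1/10                    # posterior precision
  rnorm(1, mean = sum(y[1:k]) / prec, sd = sqrt(1 / prec))
}

gibbs.mu2 <- function(k, y) {
  n <- length(y)
  # The same update for the second regime, based on y[(k+1):n].
  prec <- (n - k) + 1/10
  rnorm(1, mean = sum(y[(k + 1):n]) / prec, sd = sqrt(1 / prec))
}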
(d) The scientist obtains a short set of data from a simple proof-of-concept experiment (Data Series 1) and a longer sequence of data from the problem in which they were originally interested (Data Series 2); both of these are illustrated in Figure 2.
This data can be downloaded in .Rdata format from the Moodle website. Once downloaded to the current directory, it can be loaded into your R environment using the load command. The two data series are named dataset1 and dataset2, respectively.
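For example, assuming the downloaded file is called assignment2.Rdata (use whatever name the file actually has on Moodle):

load("assignment2.Rdata")  # creates dataset1 and dataset2 in the workspace
str(dataset1)              # the short proof-of-concept series
str(dataset2)              # the longer series of primary interest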
[1] Recall that the posterior distribution of the parameters in Bayesian inference is the conditional distribution of those parameters given the observed data, while the likelihood is the conditional distribution of the data given the parameters, and the prior distribution corresponds to the marginal distribution of the parameters.
gibbs.k <- function(mu1, mu2, y) {
  n <- length(y)
  # Triangular prior on k: proportional to min(k, n - k) for k = 1, ..., n-1.
  logprior.k <- log(c(1:floor((n - 1)/2), n - ((ceiling(n/2 + 1):n) - 1)))
  logpost.k <- logprior.k
  # cr accumulates the log likelihood ratio of assigning y[1:i] to the first
  # regime rather than the second; adding it to the log prior gives the log
  # full conditional of k, up to an additive constant common to all k.
  cr <- 0
  for (i in 1:(n - 1)) {
    cr <- cr + dnorm(y[i], mean = mu1, sd = 1, log = TRUE)
    cr <- cr - dnorm(y[i], mean = mu2, sd = 1, log = TRUE)
    logpost.k[i] <- logpost.k[i] + cr
  }
  # Exponentiate after subtracting the maximum for numerical stability,
  # then normalise and draw a single value of k from {1, ..., n-1}.
  post <- exp(logpost.k - max(logpost.k))
  post <- post / sum(post)
  sample(n - 1, size = 1, prob = post)
}
Figure 1: An R function to sample from the full conditional distribution of k. (Note: the logarithmic representation is used here for numerical reasons; the product of a large number of small values can underflow in finite-precision arithmetic.)

Figure 2: Two time series obtained by the scientist. [Panels: (a) Data Series 1; (b) Data Series 2.]
i. Run your Gibbs sampler on the first data series.
ii. Produce some simple plots and calculate the effective sample size of your output (using the autocorrelation of the changepoint location) in order to assess convergence. If you think it necessary, run the sampler again for a longer period; justify your decision.
iii. Compare the prior and estimated posterior distributions of the three parameters of interest[2] (you might find it helpful to plot histograms or similar of your samples); what are the posterior mean and variance of each of the parameters of interest? One possible shape for such an analysis is sketched below.
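A minimal sketch, assuming dataset1 has been loaded, using gibbs.k from Figure 1 together with the illustrative gibbs.mu1 and gibbs.mu2 sketched after part (c), and the effectiveSize function from the coda package; the iteration count and initial value are arbitrary choices:

library(coda)  # provides effectiveSize()

n.iter <- 10000
k <- length(dataset1) %/% 2  # arbitrary initial changepoint location
out <- matrix(NA, nrow = n.iter, ncol = 3,
              dimnames = list(NULL, c("mu1", "mu2", "k")))
for (t in 1:n.iter) {
  mu1 <- gibbs.mu1(k, dataset1)        # draw mu1 | k, y
  mu2 <- gibbs.mu2(k, dataset1)        # draw mu2 | k, y
  k   <- gibbs.k(mu1, mu2, dataset1)   # draw k | mu1, mu2, y
  out[t, ] <- c(mu1, mu2, k)
}

# Convergence diagnostics for the changepoint location.
plot(out[, "k"], type = "l", ylab = "k")  # trace plot
acf(out[, "k"])                           # autocorrelation
effectiveSize(out[, "k"])                 # effective sample size

# Posterior summaries of the three parameters of interest.
colMeans(out)
apply(out, 2, var)

# Posterior of k, to be compared against the triangular prior.
hist(out[, "k"], breaks = seq(0.5, length(dataset1) - 0.5, by = 1),
     freq = FALSE, main = "Posterior of k", xlab = "k")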
(e) Repeat part (d) for the second data series; comment on any differences you observe.
(f) When presented with the above analysis, the scientist observes that there might actually be
more than one changepoint in the data sequence.
i. Write down a model in which the number of changepoints is itself unknown. To do this,
you must specify a collection of models: a prior distribution over the set of models as well
as prior distributions and likelihoods for each model.
Assume that there are at most 5 changepoints in a data series and that the number of
changepoints is, a priori, uniformly distributed over {1, 2, 3, 4, 5}.
Assume that the scientist uses a simpler prior distribution over the locations of the changepoints in this setting, one which is uniform over all possible configurations of m changepoints (conditional upon there being m changepoints). They use the natural generalisation of their prior distribution over the means used within the single changepoint model above.
Begin by writing down the prior distributions and likelihoods for this setting.
ii. What is the full posterior distribution over both model and parameters and on what space
is this distribution defined?
iii. Assume that you wish to design a Reversible Jump algorithm which uses the Gibbs sampler
you have already designed to make within model moves. Design a simple pair of moves
which move between a model with a single changepoint and one with two changepoints,
being careful to specify what is sampled from which distribution in each case.
iv. Provide an expression for the acceptance probability of the move from the one changepoint model to the two changepoint model. You may assume that, if the current state of the Markov chain is in the single changepoint model, then a move to a two changepoint model is proposed with probability $\rho_{1\to2}$ and a within model move is proposed otherwise; and that, if the current state is in the two changepoint model, then a move to a single changepoint model is proposed with probability $\rho_{2\to1}$, a move to a three changepoint model with probability $\rho_{2\to3}$, and a move within the current model otherwise.
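Recall the generic form of the reversible jump acceptance probability, for a dimension-increasing move whose reverse is deterministic once selected: a proposed move from state $\theta_1$ in model 1 to state $\theta_2 = h(\theta_1, u)$ in model 2, where the dimension-matching variable $u$ is drawn from a proposal density $g_{1\to2}$, is accepted with probability

\[
\alpha = \min\left\{ 1,\; \frac{\pi(2, \theta_2)\, \rho_{2\to1}}{\pi(1, \theta_1)\, \rho_{1\to2}\, g_{1\to2}(u)} \left| \frac{\partial h(\theta_1, u)}{\partial (\theta_1, u)} \right| \right\},
\]

where $\pi$ denotes the joint posterior over model and parameters. (If the proposal draws the new quantities directly and leaves the existing ones unchanged, then $h$ is the identity mapping and the Jacobian factor equals 1.)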
[2] We should always check, when carrying out a Bayesian analysis, whether there is any information in the data: if the posterior and prior distributions are very similar then we are very dependent upon our specified prior beliefs.
