
STATS 731, 2020, Semester 1

Assignment 3 (5%)

Due: 5pm Friday 29th May, as a Canvas file upload

Question 1 [18 marks]

A few years ago, some colleagues and I measured the masses of the black holes in the centres of some galaxies using a technique called ‘reverberation mapping’. The file black_hole_masses.csv has the measurements. These are actually the log10 of the mass measurements in solar masses, so 6 ≡ one million suns, 7 ≡ ten million suns, and so on. For simplicity, we’ll just call them measurements, and the true log-masses we’ll just call masses. The stdev column is an estimate of the likely size of the measurement error, such that

$$\text{measurement}[i] \sim \text{Normal}\left(\text{true\_mass}[i],\ \text{stdev}[i]^2\right) \tag{1}$$
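To make the log10 convention and the measurement model in equation (1) concrete, here is a minimal sketch in Python (used purely for illustration; the assignment itself uses BUGS). The function names, the true log-mass, and the stdev value are made-up placeholders, not values from the data file.

```python
import random

def solar_masses(log10_mass):
    """Convert a log10 mass (in solar masses) back to solar masses."""
    return 10.0 ** log10_mass

def simulate_measurement(true_mass, stdev, rng):
    """Draw one measurement ~ Normal(true_mass, stdev^2), as in equation (1)."""
    # random.gauss takes the standard deviation, not the variance.
    return rng.gauss(true_mass, stdev)

rng = random.Random(731)   # fixed seed so the sketch is reproducible
true_mass = 7.0            # hypothetical true log-mass: ten million suns
stdev = 0.3                # hypothetical measurement error (in log10 units)
measurement = simulate_measurement(true_mass, stdev, rng)

print(solar_masses(6.0))         # mass in suns for a log-mass of 6
print(solar_masses(true_mass))   # mass in suns for a log-mass of 7
print(measurement)               # one noisy measurement of the log-mass
```

Note that the noise is added on the log10 scale, so an error of 0.3 corresponds to roughly a factor of two in mass.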

In this question you will use two BUGS models, which we’ll call the simple model and the hierarchical model. The simple model is:

    model
    {
        for(i in 1:length(measurement))
        {
            # Casual wide prior for each true mass
            true_mass[i] ~ dnorm(0, 1/1000^2)
            measurement[i] ~ dnorm(true_mass[i], 1/stdev[i]^2)
        }
    }
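In BUGS, the second argument of dnorm is a precision (one over the variance), which is why the model writes 1/1000^2 and 1/stdev[i]^2 rather than a standard deviation. The following Python sketch shows the conversion in both directions; the helper names are my own and are not part of BUGS.

```python
import math

def precision_from_sd(sd):
    """Precision (tau) corresponding to a standard deviation."""
    return 1.0 / sd ** 2

def sd_from_precision(tau):
    """Standard deviation corresponding to a precision (tau)."""
    return 1.0 / math.sqrt(tau)

# The prior in the simple model, dnorm(0, 1/1000^2), has standard deviation 1000.
prior_tau = precision_from_sd(1000.0)
print(prior_tau)
print(sd_from_precision(prior_tau))
```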

The hierarchical model is:

    model
    {
        # Casual wide priors now apply to the hyperparameters
        mu ~ dnorm(0, 1/1000^2)
        log_sigma ~ dnorm(0, 1/10^2)
        sigma <- exp(log_sigma)

        for(i in 1:length(measurement))
        {
            true_mass[i] ~ dnorm(mu, 1/sigma^2)
            measurement[i] ~ dnorm(true_mass[i], 1/stdev[i]^2)
        }
    }

In the hierarchical model, µ and σ are thought to describe the overall population of black holes, for which the observed ones can be considered a representative sample.
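One way to picture the population interpretation is to forward-simulate the hierarchical model for fixed hyperparameters: first draw each true mass from the population, then draw a noisy measurement of it. This is only a sketch in Python; the function name and the values of mu, sigma, and the stdevs are hypothetical and are not taken from the data file.

```python
import random

def simulate_population(mu, sigma, stdevs, rng):
    """Forward-simulate the hierarchical model for fixed mu and sigma."""
    # Population level: each true mass is drawn from Normal(mu, sigma^2).
    true_masses = [rng.gauss(mu, sigma) for _ in stdevs]
    # Measurement level: each measurement is Normal(true_mass, stdev^2).
    measurements = [rng.gauss(m, s) for m, s in zip(true_masses, stdevs)]
    return true_masses, measurements

rng = random.Random(2020)    # fixed seed so the sketch is reproducible
mu, sigma = 7.0, 0.5         # hypothetical population mean and spread
stdevs = [0.1, 0.2, 0.4]     # hypothetical measurement errors
true_masses, measurements = simulate_population(mu, sigma, stdevs, rng)

for t, m, s in zip(true_masses, measurements, stdevs):
    print(f"true_mass={t:.3f}  measurement={m:.3f}  stdev={s}")
```

In the simple model there is no population level at all: each true mass gets its own wide prior, unrelated to the others.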


(a) [6 marks] Draw a PGM for the simple model and a PGM for the hierarchical model. For the latter, I don’t mind whether you explicitly include the deterministic nodes, or merge sigma and log_sigma into one node for presentation purposes.

(b) [2 marks] Run the simple model for a lot of iterations and obtain the posterior distribution for the true log-mass of the first black hole. Summarise it using the posterior mean ± the posterior standard deviation, which for a normal posterior is a 68% central credible interval. The result should be obvious in hindsight.

(c) [2 marks] The hierarchical model’s prior has dependence between σ and the true mass parameters. Modify the model so that it expresses exactly the same prior assumptions, but the prior has independence between all stochastic nodes. Hint: This is the ‘pre-whitening’ idea that we saw for the starling ANOVA model.

(d) [4 marks] Run either version of the hierarchical model for a lot of iterations and summarise the posterior distributions for mu, sigma, and true_mass[1] using any summaries you think are appropriate.

(e) [4 marks] Explain why: (i) sigma has a greater than 50% posterior probability of being smaller than sqrt(mean((data$measurement - mean(data$measurement))^2)); (ii) the true mass of the first black hole is more likely to be above its measurement than below it; and (iii) the uncertainty on the true mass of the first black hole is smaller with the hierarchical model than with the simple model.[1]

Question 2 [18 marks]

Consider the following simulated data, which I generated in R using either runif() or rexp().

    x = c(0.610164901707321, 1.99984208494425, 1.50817369576544, 0.707493807654828,
          1.49413506453857)

In this question, perform model averaging/selection to try to infer whether I used runif() or rexp(). Let U be the proposition that it was runif() and E be the proposition that it was rexp(). If U is true, then the prior and sampling distribution are

$$\log b \sim \text{Normal}(0, 1) \tag{2}$$

$$x_i \mid b \sim \text{Uniform}(0, b) \tag{3}$$

If E is true, then the prior and sampling distribution are

$$\log \lambda \sim \text{Normal}(0, 1) \tag{4}$$

$$x_i \mid \lambda \sim \text{Exponential}(\lambda) \tag{5}$$
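As a sketch of what equations (2) to (5) describe, the following Python code draws one parameter value from each prior and then one data set of five points from the corresponding sampling distribution. It assumes that Exponential(λ) is parameterised by its rate, as in R's rexp(); the function names are my own.

```python
import math
import random

def simulate_U(n, rng):
    """Draw b from its prior, then n points from Uniform(0, b)."""
    b = math.exp(rng.gauss(0.0, 1.0))       # log b ~ Normal(0, 1)
    return b, [rng.uniform(0.0, b) for _ in range(n)]

def simulate_E(n, rng):
    """Draw lambda from its prior, then n points from Exponential(lambda)."""
    lam = math.exp(rng.gauss(0.0, 1.0))     # log lambda ~ Normal(0, 1)
    # random.expovariate takes the rate (assumed to match the lambda in (5)).
    return lam, [rng.expovariate(lam) for _ in range(n)]

rng = random.Random(3)                      # fixed seed so the sketch is reproducible
b, x_U = simulate_U(5, rng)
lam, x_E = simulate_E(5, rng)

print("U: b =", b, "x =", x_U)
print("E: lambda =", lam, "x =", x_E)
```

Under U every data point is bounded above by b, whereas under E the data are unbounded above.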

(a) [3 marks] Both models E and U imply prior predictive distributions for the data, and hence the data mean x̄. Would the two prior predictive distributions for x̄ be the same or different? Explain your answer.

(b) [2 marks] Part (a) implies that learning only x̄ would provide some information about whether E or U is true. Does this seem reasonable to you?

[1] (Footnote to Question 1(e).) The phenomenon in (i) is known as ‘shrinkage’ and (ii) and (iii) are sometimes described as one unknown quantity ‘borrowing strength’ from measurements of others.


(c) [4 marks] Write down analytical expressions for the marginal likelihoods p(x | U) and p(x | E). Retain all constant factors.

(d) [4 marks] Numerically find the values of the two marginal likelihoods.

(e) [2 marks] Find the Bayes Factor (either way around) and also the posterior probabilities of U and E, assuming prior probabilities of 1/2 each.

(f) [3 marks] If p(b | U) were made much wider, the Bayes Factor would strongly favour E. Explain why this occurs.
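For reference, the arithmetic linking two marginal likelihoods to a Bayes Factor and to posterior model probabilities, as used in part (e), can be sketched in Python as follows. The two marginal likelihood values are hypothetical placeholders chosen only to show the calculation; they are not the values for the data above, and the function name is my own.

```python
def posterior_probs(ml_U, ml_E, prior_U=0.5):
    """Posterior probabilities of U and E from their marginal likelihoods."""
    prior_E = 1.0 - prior_U
    # Total probability of the data, summed over the two models.
    evidence = prior_U * ml_U + prior_E * ml_E
    return prior_U * ml_U / evidence, prior_E * ml_E / evidence

ml_U, ml_E = 0.02, 0.01        # hypothetical marginal likelihoods, NOT real results
bayes_factor = ml_U / ml_E     # Bayes Factor in favour of U over E
post_U, post_E = posterior_probs(ml_U, ml_E)

print(bayes_factor)
print(post_U, post_E)
```

With equal prior probabilities, the posterior probability of U reduces to BF / (1 + BF), where BF is the Bayes Factor in favour of U.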
