MSc Financial Mathematics
Statistical Methods and Data Analytics 2018
MATH0099
Problem Sheet 1
Let X_0, X_1, ..., X_n be i.i.d. copies of a random variable X ∼ N(0, 1). The following two distributions, which arise from transformations of standard normal random variables, play a fundamental role in statistics.
• the Chi-squared distribution with n degrees of freedom is the distribution of X_1² + ... + X_n². Its density is

    2^{−n/2} Γ(n/2)^{−1} e^{−z/2} z^{n/2−1},   z ≥ 0.
• the t-distribution with n degrees of freedom is the distribution of X_0 / √((X_1² + ... + X_n²)/n). Its density is

    Γ((n+1)/2) / (√(nπ) Γ(n/2)) · (1 + t²/n)^{−(n+1)/2},   t ∈ ℝ.
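Both definitions can be checked numerically. The following Python sketch is an addition to the sheet, not part of it: it simulates the two transformations above and compares them against the reference chi-squared and t-distributions (the sample size n = 5, the number of replications and the seed are arbitrary choices).

    # Monte Carlo check of the two definitions above; parameters are arbitrary.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, reps = 5, 100_000

    X = rng.standard_normal((reps, n + 1))            # columns: X_0, X_1, ..., X_n
    chi2_samples = (X[:, 1:] ** 2).sum(axis=1)        # X_1^2 + ... + X_n^2
    t_samples = X[:, 0] / np.sqrt(chi2_samples / n)   # X_0 / sqrt((X_1^2+...+X_n^2)/n)

    # Kolmogorov-Smirnov distances to chi^2_n and t_n should be small
    # (large p-values), consistent with the stated distributions.
    print(stats.kstest(chi2_samples, stats.chi2(df=n).cdf))
    print(stats.kstest(t_samples, stats.t(df=n).cdf))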
Problem 1. (t-statistics) Let X_1, ..., X_n be i.i.d. copies of X ∼ N(µ, σ²). Show that the arithmetic mean X̄ and the sample variance S_n², defined by

    X̄ := (1/n) Σ_{i=1}^n X_i,   S_n² := (1/(n−1)) Σ_{i=1}^n (X_i − X̄)²,

are independent random variables. Moreover, show that X̄ ∼ N(µ, σ²/n) and (n−1)S_n²/σ² ∼ χ²_{n−1}. Deduce that the t-statistic

    T_n := √n (X̄ − µ) / S_n

is distributed according to the t-distribution with n − 1 degrees of freedom.
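A numerical companion to Problem 1 (again an addition, not a proof; µ, σ, n and the seed are arbitrary choices): the simulated t-statistic should pass a goodness-of-fit test against the t-distribution with n − 1 degrees of freedom, and the sample correlation between X̄ and S_n² should be close to zero (uncorrelatedness being a necessary consequence of the claimed independence).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu, sigma, n, reps = 2.0, 3.0, 10, 100_000

    X = rng.normal(mu, sigma, size=(reps, n))
    xbar = X.mean(axis=1)                       # arithmetic mean of each sample
    S2 = X.var(axis=1, ddof=1)                  # sample variance, 1/(n-1) convention
    T = np.sqrt(n) * (xbar - mu) / np.sqrt(S2)  # the t-statistic

    print("corr(xbar, S2) ~", np.corrcoef(xbar, S2)[0, 1])   # should be near 0
    print(stats.kstest(T, stats.t(df=n - 1).cdf))            # fit to t_{n-1}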
The median of a cumulative distribution function F is defined by m := F^{−1}(1/2). Let X_1, ..., X_n be i.i.d. with cumulative distribution function F and median m = 0, and suppose that F′(0) exists and is strictly greater than zero. Let Z_n be the sample median, i.e. Z_n := X(k) where k = [n/2 + 1] and X(1) ≤ ... ≤ X(n) is an increasing ordering of the random variables X_1, ..., X_n. To solve the problem below you may assume the result that

    P(√n Z_n ≤ x) → Φ(2F′(0)x),   n → ∞.
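The assumed limit can also be illustrated by simulation. In the sketch below (an addition; the distribution, sample size and evaluation point are arbitrary choices) the X_i are standard normal, so m = 0 and F′(0) = 1/√(2π):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n, reps, x = 1001, 100_000, 0.7

    samples = rng.standard_normal((reps, n))
    Zn = np.median(samples, axis=1)                   # sample median (n is odd)

    lhs = np.mean(np.sqrt(n) * Zn <= x)               # empirical P(sqrt(n) Z_n <= x)
    rhs = stats.norm.cdf(2 * stats.norm.pdf(0) * x)   # Phi(2 F'(0) x)
    print(lhs, rhs)                                   # the two should be close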
Problem 2. (Comparison between the arithmetic mean and the sample median) Let X_1, ..., X_{2n+1} be i.i.d. copies of a random variable X ∼ N(µ, σ²). The parameters µ and σ are unknown and are to be estimated. Two possible estimators for µ are

    µ̂_n^{(1)} := (1/(2n+1)) Σ_{i=1}^{2n+1} X_i,   µ̂_n^{(2)} := X(n+1),

where the random variables X(1) < X(2) < ... < X(2n+1) are arranged in increasing order.

(a) Compute c_n^{(1)} and c_n^{(2)} such that

    c_n^{(i)} (µ̂_n^{(i)} − µ) → N(0, 1) in distribution as n → ∞, for i = 1, 2.
(b) Compute q ∈ R+ such that

    Var(µ̂_n^{(2)}) / Var(µ̂_n^{(1)}) → q,   n → ∞.

How can q be interpreted? (In words.)
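A simulation sketch for Problem 2 (an addition; all parameters are arbitrary choices): for Gaussian data the ratio of the two estimators' variances is known to approach π/2, which the following snippet reproduces empirically.

    import numpy as np

    rng = np.random.default_rng(3)
    mu, sigma, n, reps = 1.0, 2.0, 200, 50_000        # sample size 2n+1 = 401

    X = rng.normal(mu, sigma, size=(reps, 2 * n + 1))
    mean_est = X.mean(axis=1)                         # arithmetic mean
    median_est = np.median(X, axis=1)                 # sample median X(n+1)

    print("Var(median)/Var(mean) ~", median_est.var() / mean_est.var())
    print("pi/2 =", np.pi / 2)                        # reference value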
Problem 3. This exercise proves the Neyman-Pearson Lemma from lectures. The general problem of determining a most powerful statistical test with significance level α is equivalent to determining an optimal 0 ≤ ϕ ≤ 1 such that

    I(ϕp_0) ≤ α,   I(ϕp_1) → max,   (1)

where I denotes the integral with respect to the measure given by P_0 + P_1. Now follow these steps:

(a) Using the technique of Lagrange multipliers, reformulate problem (1) as an optimisation problem involving an appropriately chosen Lagrangian and Lagrange multiplier c;

(b) By considering the cumulative distribution function F of the random variable Q := p_1(X)/p_0(X) for X ∼ P_0, show that there exists a Lagrange multiplier for the optimisation problem (1).

The above steps prove claim 1. in the statement of the Neyman-Pearson Lemma. Claims 2. and 3. now follow easily by once again considering the optimisation problem.
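To make the statement of (1) concrete, here is a small illustrative sketch (an addition, not the lecture's construction) of the resulting likelihood-ratio test for the simple hypotheses P_0 = N(0, 1) versus P_1 = N(1, 1) at level α = 0.05; its simulated size and power are printed at the end.

    import numpy as np
    from scipy import stats

    alpha = 0.05
    # Here Q = p_1(x)/p_0(x) = exp(x - 1/2) is increasing in x, so the
    # likelihood-ratio test rejects for large x; level-alpha cutoff under P_0:
    cutoff = stats.norm.ppf(1 - alpha)                # ~ 1.645

    rng = np.random.default_rng(4)
    x0 = rng.normal(0.0, 1.0, 200_000)                # draws under P_0
    x1 = rng.normal(1.0, 1.0, 200_000)                # draws under P_1

    print("size  ~", np.mean(x0 > cutoff))            # close to alpha
    print("power ~", np.mean(x1 > cutoff))            # ~ 1 - Phi(cutoff - 1) ~ 0.26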