
Tutorial 5 – Probability, Statistics, and Data Analysis

This tutorial presents some of the MATLAB functions used in probability calculations, basic statistics, and data analysis. The discrete Markov chain model is introduced and its stationary points are identified using eigenanalysis, linking linear algebra and this class of stochastic model. Based on this, the PageRank algorithm, which underpins Google's web-search engine, is discussed. Importantly, reading data from and writing data to a file is demonstrated. Curve fitting is introduced, as is the correlation coefficient.

For a more detailed treatment of statistics and machine learning topics, consult The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Hastie, Tibshirani and Friedman (2009, 2nd Ed., Springer Series in Statistics):

http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf

5.1 Combinatorics and Counting

Probability calculations often involve counting combinations of objects, and the study of combinations of objects is the realm of combinatorics. Many formulas in combinatorics are derived simply by counting the number of objects in two different ways and setting the results equal to each other. Often the resulting proof is obtained by a "proof by words." Formulas are not necessarily derived from manipulations of factorial functions, as some students might think. Some of MATLAB's combinatorial functions, which are likely familiar to you, are illustrated in this section.

5.1.1 Permutations and combinations

A permutation is an ordered arrangement of the objects in a set S. If |S| = n, then there are:

n(n-1)(n-2) \cdots (2)(1) = n!

different permutations of these n objects. An example is the size of the set of different orders in which a group of n people could appear in a queue, or in which labeled balls are drawn from a bucket.

For these types of counting problems, MATLAB provides a factorial function. However, it is generally preferable (because it is quicker and can be used for much larger numbers) to use the gamma function to calculate factorials:

\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x} \, dx

and use the fact that \Gamma(n+1) = n! for positive integer values of n. For example,

factorial(5)
gamma(6)

ans =
    120
ans =
    120

More generally, a k-permutation arises when we choose k objects in order from a set of n distinct objects. The total number of different permutations of size k of these n objects is:

n(n-1)(n-2) \cdots (n-k+1) = \frac{n!}{(n-k)!}


Tutorial 4 ELEC2103/ELEC9103

where dividing by (n-k)!, the number of permutations of the remaining items, truncates the factorial product at the appropriate point in the series.
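As a sketch, the two sides of this identity can be checked numerically in MATLAB; the values n = 10 and k = 4 below are illustrative choices, not taken from the tutorial:

```matlab
% Count the k-permutations of n objects two ways (n = 10, k = 4 are
% illustrative values):
n = 10; k = 4;
byProduct   = prod(n-k+1:n);                 % n(n-1)...(n-k+1)
byFactorial = factorial(n)/factorial(n-k);   % n!/(n-k)!
% Both expressions evaluate to 5040 for these values.
```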

Alternatively, we may change how the objects are selected. There are two further cases. First, if we sample k objects with replacement and the order still matters, the number of possible sequences is simply n^k.

The second is referred to as sampling without replacement (i.e. as if we are picking balls from a bucket and not returning them), where we do not care about the order. In this case, we consider a combination of k objects drawn from n, which is written \binom{n}{k}. We can figure out the formula for \binom{n}{k} just by counting.

First, we know there are n! permutations of all of the objects; set this to the LHS of our expression. Next, let us construct the RHS by counting the number of k-sized permutations of the n objects: divide the n objects into a group of k selected objects and the remaining (n-k) objects. We know that there is a total of k! permutations of the k selected objects and, likewise, a total of (n-k)! permutations of the (n-k) remaining objects. Therefore, the total number of permutations of the n objects is:

n! = \binom{n}{k} k!(n-k)!

Rearranging this, we have:

\binom{n}{k} = \frac{n!}{k!(n-k)!}

different combinations of size k of these n objects. Note that we have derived this formula simply by counting, not by expanding factorial functions. However, now that we know the answer, we can also see that the same expression can be found from the size of the set of k-permutations of n objects, divided by the number of permutations of the drawn objects themselves, because we do not care about their order:

\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k(k-1)(k-2) \cdots (2)(1)}

MATLAB provides the nchoosek function for calculating combinations, for example, for n = 5, k = 3:

nchoosek(5, 3)

Exercise 1.

a) Verify the combination formula above using MATLAB's factorial and nchoosek functions for a few example numbers, e.g. (n = 12, k = 7) and (n = 24, k = 6).

b) Test the identity below using MATLAB's nchoosek function and a few example numbers: (n = 10, k = 4), (n = 20, k = 8).

\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}

If you want to know how to prove the relationship above, read the following, otherwise just skip it.

Proof: By definition, \binom{n+1}{k} is the number of ways to choose k items out of n+1 items. We can break the items into one group, Group A, with n items, and another group, Group B, with only 1 item, and consider the number of ways to choose items from these two groups. Suppose we pick the 1 item from Group B; we then have to choose k-1 items from Group A, and the number of ways we can do this is clearly \binom{n}{k-1}. We could also choose not to pick the 1 item in Group B, so that we have to select all k items from Group A; the number of ways we can do this is \binom{n}{k}. We must conclude, then, that the total number of ways to choose k items out of n+1 items is the sum of these two cases, given by:

\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}
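The identity in the proof above can also be spot-checked numerically; n = 5 and k = 2 below are illustrative values, different from those in the exercise:

```matlab
% Numerical spot-check of the identity above (n = 5, k = 2 illustrative):
n = 5; k = 2;
lhs = nchoosek(n, k-1) + nchoosek(n, k);   % 5 + 10 = 15
rhs = nchoosek(n+1, k);                    % C(6,2) = 15
isequal(lhs, rhs)                          % logical 1 (true)
```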


5.2 Stochastic matrices and Markov chains

A Markov chain is a stochastic discrete-time, discrete-state transition system. Markov chains are used extensively to model systems with predictable, stochastic state transitions, including activity models, target tracking, regression models with mode- or regime-switching, the simulation of wind and weather patterns, and finance. They are also used extensively as a computational routine for implementing Bayesian models, which require efficient resampling methods, an approach called the Markov chain Monte Carlo method.

A three-state Markov chain system is illustrated below:

[Figure: three states, drawn as circles labeled 1, 2 and 3, connected by directed edges.]

Here, the three states are illustrated as circles, with chance transitions between them, as indicated by edges.

Each edge has associated with it a probability, such that at each time step, the probability of moving from

state s to state s′ is given by P (s′|s), and edges indicate transitions that have strictly positive probabilities.

Note that, in this example:

- not all states are directly linked with all others, but
- each state can be reached from all others by some path, and
- no two states have a periodic cycle between them.

When the second and third conditions above hold, we say that the Markov chain is ergodic. We are going to

mix an application of probability and eigendecomposition to analyse the steady state behaviour of an ergodic

Markov chain.

Let the state-transition probabilities for the diagram above be given by a matrix:

P = \begin{pmatrix} 0.5 & 0.5 & 0 \\ 0.2 & 0.5 & 0.3 \\ 0 & 0.5 & 0.5 \end{pmatrix}

The entries in P are read as the probability that, starting in an initial state corresponding to a given row,

the system moves to the state indicated by the column. Importantly, note that all rows sum to 1; this is

the definition of a stochastic matrix. Indeed, you can think of each row of a stochastic matrix as a valid

probability distribution, specifically a categorical distribution. (In the lectures, categorical distributions over

k possible outcomes were mentioned as the basic probability model underpinning multinomial distributions,

in the same way that a Bernoulli distribution over binary outcomes underpins the binomial distribution. What

we see here is a much more general model that includes a notion of state.)
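As a minimal sketch of the row-sum property, the matrix Q below is an illustrative two-state example (not the matrix P above):

```matlab
% Each row of a stochastic matrix is a categorical distribution, so every
% row must sum to 1. Q is an illustrative 2-state example:
Q = [0.9 0.1;
     0.4 0.6];
all(abs(sum(Q, 2) - 1) < 1e-12)   % logical 1: Q is row-stochastic
```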

Exercise 2.

a) Write a script encoding P as a MATLAB matrix, and use a logical test to check that it indeed is a

stochastic matrix.

b) An initial state x0 can be encoded as a vector, with a value of 1 indicating the initial state and zeros everywhere else. Write a vector x encoding an initial state of 1.

c) Compute the distribution of states of the Markov chain 2 time steps after starting in state 2 by multiplying again by P. Repeat the multiplication for 4, 10, 20 and 30 time steps. What is happening to the resulting distribution of states?


5.2.1 Stationary distribution of a Markov chain

A stationary distribution of a Markov chain is a probability distribution that remains unchanged as the chain progresses in time. Typically, it is represented as a row vector x whose entries are probabilities summing to 1, and given a transition matrix P, it satisfies:

xP = x

In other words, x is invariant under the matrix P.

An important result in matrix theory known as the Perron–Frobenius theorem applies to stochastic matrices. It concludes that a nonzero solution of the equation above exists and is unique up to a scaling factor. If this scaling factor is chosen so that

\sum_i x_i = 1,

then x is a state vector of the Markov chain, and is itself a stationary distribution over states.

In the exercise above, we saw an iterative approach to computing the stationary distribution of a Markov chain. This process is called the power method, because iterating the update corresponds to computing x0 P^n for increasing powers n.
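A minimal power-method sketch, assuming the three-state matrix P defined earlier and an illustrative starting state:

```matlab
% Power method: repeatedly multiply a state distribution by P.
P = [0.5 0.5 0; 0.2 0.5 0.3; 0 0.5 0.5];
x = [0 1 0];          % start in state 2 (illustrative choice)
for t = 1:50
    x = x * P;        % one time step of the chain
end
disp(x)               % converges towards the stationary distribution
```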

However, note that the equation for the stationary distribution looks very similar to the column-vector equation Pq = \lambda q for eigenvalues and eigenvectors, with \lambda = 1. You can also verify that \lambda = 1 is always an eigenvalue of a stochastic matrix: because every row sums to 1, the all-ones vector is a right eigenvector with eigenvalue 1. For a technical reason, we transpose the matrices to get them into an appropriate form for eigenanalysis. This allows us to find the stationary distribution as a left eigenvector (as opposed to the usual right eigenvectors) of the transition matrix. The operations are as follows:

(xP)^T = P^T x^T = x^T

In other words, the transposed transition matrix P^T has eigenvectors with eigenvalue 1 that, when normalized to sum to 1, are stationary distributions expressed as column vectors. Therefore, if the eigenvectors of P^T are known, then so are the stationary distributions of the Markov chain with transition matrix P. In an ergodic Markov chain, this stationary distribution is unique.

Exercise 3.

a) Write a script to: (i) compute the left eigenvectors of P , (ii) check that there is at least one eigenvalue

of 1, and (iii) display the associated stationary distribution of the Markov chain P .

b) Compare the answers you got in a) to those from Exercise 2 c).

When there are multiple eigenvectors associated with an eigenvalue of 1, each such eigenvector gives rise to an associated stationary distribution. However, this can only occur when the Markov chain is reducible (i.e. can be broken into smaller independent chains).
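The eigenanalysis route described above can be sketched as follows, assuming the matrix P from earlier; note that eig does not guarantee any particular ordering of the eigenvalues, so we search for the one closest to 1:

```matlab
% Stationary distribution as a left eigenvector of P.
P = [0.5 0.5 0; 0.2 0.5 0.3; 0 0.5 0.5];
[V, D] = eig(P');                  % right eigenvectors of P' = left eigenvectors of P
[~, idx] = min(abs(diag(D) - 1));  % pick the eigenvalue closest to 1
v = V(:, idx);
x = (v / sum(v))'                  % normalize entries to sum to 1; row vector
```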

5.2.2 PageRank algorithm (Google)
