
ENGR 3H Midterm

Author(s): Tim Matchen
It’s neural network time! Due Sunday, November 3, at 11:59 PM
Project Overview
You’re going to be responsible for writing a number of functions, each implementing a different component of your neural network. The functions/scripts you are responsible for are:
(a) nn_initialize.m – initializes the neural network’s matrices;
(b) nn_forwardprop.m – computes the forward propagation step for the neural network;
(c) nn_backprop.m – computes the gradients from the backward propagation step for the neural network training;
(d) nn_update.m – uses the gradient values from the backward propagation step to update the weights of the neural network;
(e) nn_evaluate.m – evaluates the accuracy of the neural network’s predictions; and
(f) nn_main.m – script that initializes, trains, and validates the neural network, then saves the result.
Each of these functions/scripts will be detailed below.
Code Specifications
nn_initialize.m
This function will initialize the weight matrices W_i and biases b_i for each layer
of our neural network. The function should take two inputs: a row vector
specifying the number of neurons in each layer and a scalar value specifying
the number of input parameters to the system. The function should return a
cell array containing all of the initialized matrices and vectors. Note that since
our network will only have a single output, the final value in the vector should
always be 1.
For example, the row vector [2 3 5] and input size 6 should produce arrays of the following sizes:
(a) W_1: 2 × 6, b_1: 2 × 1
(b) W_2: 3 × 2, b_2: 3 × 1
(c) W_3: 5 × 3, b_3: 5 × 1.
The bias vectors should be initialized with zeros. To initialize the weight matrices, we’re going to use Xavier initialization. This is a method of ensuring that
we pick weights that aren’t too big or too small to start with. To do this, each
element of the matrix should be selected randomly from a normal distribution
(use normrnd in MATLAB) with mean value µ = 0 and a standard deviation
given by:
σ = sqrt(2 / (n_in + n_out)),    (1)
where n_in and n_out are the input and output dimensions of the matrix.
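A minimal sketch of how nn_initialize.m might look (the argument names layer_sizes and n_inputs are placeholders, not required by the spec):

function params = nn_initialize(layer_sizes, n_inputs)
% NN_INITIALIZE  Build weight matrices and bias vectors for each layer.
%   layer_sizes : row vector of neuron counts per layer (last entry is 1)
%   n_inputs    : number of input parameters to the network
%   params      : cell array, params{i,1} = W_i and params{i,2} = b_i
n_layers = length(layer_sizes);
params = cell(n_layers, 2);
prev = n_inputs;                        % input dimension of the first layer
for i = 1:n_layers
    n_out = layer_sizes(i);
    sigma = sqrt(2 / (prev + n_out));   % Xavier standard deviation, eq. (1)
    params{i,1} = normrnd(0, sigma, n_out, prev);   % W_i
    params{i,2} = zeros(n_out, 1);                  % b_i initialized to zeros
    prev = n_out;                       % this layer's output feeds the next
end
end

Calling nn_initialize([2 3 5], 6) would then reproduce the sizes listed in the example above.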
nn_forwardprop.m
This function will take as inputs a cell array of weights W_i and biases b_i and an array of n inputs X (given as a k × n matrix, where k is the number of input parameters). Using these, it should compute the forward propagation
input parameters). Using these, it should compute the forward propagation
step of the neural network (note: this is also how your neural network will make
predictions once trained). The process for this is as follows:
(a) Compute the value of W_1 X + b_1;
(b) Find the hyperbolic tangent (tanh) of the result of the previous step, yielding the output Y_1 (this is the activation step);
(c) Repeat the process for each additional layer, using the output of the previous layer as the input. For example, next we would calculate W_2 Y_1 + b_2, etc.
This function should return a cell array that contains the values of the function
before applying the activation and after applying the activation. For example,
if your network has 5 layers, the returned cell array should be a 5 × 2 array.
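One possible shape for nn_forwardprop.m, assuming the cell array layout from the nn_initialize sketch above (adding the bias to every column relies on implicit expansion, available in MATLAB R2016b and later):

function cache = nn_forwardprop(params, X)
% NN_FORWARDPROP  Forward pass through every layer of the network.
%   params : cell array of weights and biases from nn_initialize
%   X      : k-by-n matrix of inputs (k parameters, n samples)
%   cache  : n_layers-by-2 cell array; cache{i,1} = Z_i (pre-activation),
%            cache{i,2} = A_i (post-activation)
n_layers = size(params, 1);
cache = cell(n_layers, 2);
A = X;                                   % the input feeds the first layer
for i = 1:n_layers
    Z = params{i,1} * A + params{i,2};   % W_i * A_(i-1) + b_i, bias added to every column
    A = tanh(Z);                         % activation step
    cache{i,1} = Z;
    cache{i,2} = A;
end
end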
nn_backprop.m
This function will take as inputs the following:
(a) A cell array of weights W and biases B;
(b) A cell array of layer outputs Z and post-activation values A;
(c) A vector Y containing the output values you are training on;
(d) A matrix X of the input values you are training on.
The function should return a cell array of gradient values that is the same size
as the cell array of weights and biases. To compute the gradient for each layer,
follow these steps. We will compute the gradients iteratively, starting at the last
layer (the output layer), and working toward the first layer (the input layer).
First, we need to compute the derivative of the cost with respect to the layer’s
activation values A. For simplicity, we’re going to use mean squared error as
our loss metric:
L = (A_last - Y)^2,    (2)
where A_last denotes the output of the final layer’s activation function. The derivative with respect to A (we’ll refer to it as dA) is then:
dA = 2(A_last - Y).    (3)
Note this is only the value of dA for the output layer; the other layers will have
a different expression which we’ll get to in a moment. Using dA and A, we can
compute the derivative with respect to Z; using the chain rule and the derivative
of tanh, we find that this is equal to:
dZ = dA .* (1 - A.^2).    (4)
Note the periods denoting elementwise operations for both the multiplication
and the squaring of A. With dZ computed, we can now address the value of
dA for the other layers; this is computed via the previous layer’s values. It is
calculated as:
dA_(k-1) = W_k^T * dZ_k.    (5)
With these values calculated, we have straightforward expressions for the derivatives of W and b. These are given by:
dW_k = dZ_k * A_(k-1)^T / length(Y)    (6)
and
dB_k = sum(dZ_k, 2) / length(Y).    (7)
If we are at the first layer (k = 1), replace the previous layer’s activation in (6)
with the input X.
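Putting equations (3) through (7) together, nn_backprop.m might be sketched as follows (the loop runs from the output layer back toward the input layer; variable names and the cache layout match the earlier sketches and are assumptions, not requirements):

function grads = nn_backprop(params, cache, Y, X)
% NN_BACKPROP  Gradients of the MSE loss for every weight and bias.
%   params : cell array of weights and biases
%   cache  : cell array of pre-activation (Z) and post-activation (A) values
%   Y      : 1-by-n vector of training outputs
%   X      : k-by-n matrix of training inputs
%   grads  : cell array the same size as params; grads{k,1} = dW_k, grads{k,2} = dB_k
n_layers = size(params, 1);
n = length(Y);
grads = cell(n_layers, 2);
dA = 2 * (cache{n_layers,2} - Y);        % eq. (3): dA at the output layer
for k = n_layers:-1:1
    dZ = dA .* (1 - cache{k,2}.^2);      % eq. (4): chain rule through tanh
    if k > 1
        A_prev = cache{k-1,2};           % previous layer's activation
    else
        A_prev = X;                      % first layer uses the raw input
    end
    grads{k,1} = dZ * A_prev' / n;       % eq. (6): dW_k
    grads{k,2} = sum(dZ, 2) / n;         % eq. (7): dB_k
    dA = params{k,1}' * dZ;              % eq. (5): dA for the next layer down
end
end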
nn_update.m
This function should take three inputs: the cell array containing the current
values for the weights and biases, the cell array containing the gradients, and
the learning rate. It should return a new cell array with the values updated via
the learning rate, for example:
W_new = W_old - α * dW_old.    (8)
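A minimal sketch of nn_update.m under these assumptions (linear indexing walks over every weight matrix and bias vector in the cell array; the minus sign makes each step a gradient-descent step, as in eq. (8)):

function params = nn_update(params, grads, alpha)
% NN_UPDATE  One gradient-descent step on every weight and bias.
%   params : cell array of current weights and biases
%   grads  : cell array of gradients from nn_backprop (same size)
%   alpha  : learning rate
for i = 1:numel(params)
    params{i} = params{i} - alpha * grads{i};   % eq. (8)
end
end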
nn_evaluate.m
This function should take as inputs two row vectors, representing the predictions
and the actual outputs. It should return a number representing the accuracy
of the model. For training, we are using mean squared error, as defined above.
Your function should return the MSE (note that it should be a single, averaged
value).
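Since the metric is just the squared error averaged over all samples, a sketch of nn_evaluate.m can be very short:

function mse = nn_evaluate(predictions, actual)
% NN_EVALUATE  Mean squared error between predictions and true outputs.
%   predictions, actual : 1-by-n row vectors
mse = mean((predictions - actual).^2);
end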
nn_main.m
This script will run through everything we just did, then save the resulting
cell array with the trained parameters. Your script should set a learning rate,
the number of epochs (iterations) to train for, and carry out the training loop.
Additionally, you should normalize your training data. After loading in the
data, find the mean µ_X and standard deviation σ_X of each input variable.
Then, subtract the mean and divide by the standard deviation:
X_norm = (X - µ_X) / σ_X.    (9)
This causes the input to be centered at 0 and relatively evenly distributed to
either side of 0, which is beneficial for training. Note that when you get data that
you haven’t trained on, you will need to carry out the identical normalization,
so you should also store the values of µ_X and σ_X.
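A minimal sketch of what nn_main.m could look like. The data file name, variable names, layer sizes, learning rate, and epoch count below are all placeholder assumptions; substitute the values appropriate for your data set.

% nn_main.m -- initialize, train, and save the network (sketch)
load('training_data.mat');          % assumed to provide X (k-by-n) and Y (1-by-n)

% Normalize each input variable, eq. (9); implicit expansion applies the
% per-row mean and standard deviation to every column of X
mu_X    = mean(X, 2);
sigma_X = std(X, 0, 2);
X_norm  = (X - mu_X) ./ sigma_X;

alpha    = 0.01;                    % learning rate (placeholder)
n_epochs = 1000;                    % training iterations (placeholder)
layers   = [5 3 1];                 % layer sizes; the last entry must be 1

params = nn_initialize(layers, size(X, 1));
for epoch = 1:n_epochs
    cache  = nn_forwardprop(params, X_norm);
    grads  = nn_backprop(params, cache, Y, X_norm);
    params = nn_update(params, grads, alpha);
end

% Report the training MSE, then save the trained parameters together with
% the normalization constants so new data can be normalized identically.
cache = nn_forwardprop(params, X_norm);
fprintf('Final training MSE: %.4f\n', nn_evaluate(cache{end,2}, Y));
save('trained_network.mat', 'params', 'mu_X', 'sigma_X');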