首页 >
> 详细

Computing Science Department

CMPUT 366 - Fall 2021

Problem Set

Unit 2 - Learners

Intelligent Systems (CMPUT 366)

1. Consider the trees below representing zero-sum two-player games. Write the values of x that allow

Minimax search with Alpha Beta pruning to prune the dashed nodes. You should write the word

‘None’ if no value of x allows the nodes to be pruned. You should assume that the search algorithm

visits the children from left to right.

max

min

max

5 8 5 9 5 x 10 20

max

min

max

8 5 7

2 1

x

max

min

max

x 5 5 7

2 1

2. The tree below represents a zero-sum two-player game where the green nodes (root of the tree) are

max’s nodes and yellow nodes are min’s nodes. Assuming a left-to-right ordering of visit, prune the

branches that Alpha Beta pruning would prune. This example created by Ariel Felner.

3. Consider the following training set of a regression problem, where given a value of x one wants to

predict the value of y with the hypothesis hw(x) = wx + b.

1

x y

1 1

2 2

3 3

a) Given that w = b = 0 and α = 0.1, what are the values of w and b after the first iteration of

gradient descent?

b) For α = 0.1 the algorithms takes approximately 300 iterations to converge to values of w and b

for which further gradient updates will not decrease the Mean Squared Error of the model. What

happens if we reduce the value of α to 0.01. Do you think it will converge with more or fewer

iterations than when α = 0.1? What is the difference between the solution encountered with

α = 0.1 and with α = 0.01?

c) Linear regression learns a linear function of the input values x. Explain two ways we could learn

a non-linear function of the input values.

4. Consider the classification problem given by the training data below. In this problem we are trying to

classify images of dogs while considering only three binary features: ears, fur, and hat. The column y

gives the labels: 0 for an image without a dob and 1 for an image with a dog. For example, the first

image contains no ears, fur, nor hat and it is labelled as ‘not dog,’ while the last image in the data set

contains ears, fur, and hat and it is labelled as ‘dog’. Considering the training set below, answer the

following questions.

ears fur hat y

0 0 0 0

1 0 0 0

1 1 0 0

0 1 1 0

1 0 1 1

1 1 1 1

a) Is this problem linearly separable? Justify your answer.

b) What if we modify the training so that [1, 1, 0] has the label of 1 instead of 0 and [1, 1, 1] has the

label of 0 instead of 1, is the problem linearly separable?

c) How many hidden layers does a neural network need for solving the problem given by the data

set above where the instance [1, 1, 0] has the label of 1 instead of 0 and [1, 1, 1] has the label of 0

instead of 1.

5. Consider the XOR problem shown in the table below. One of your classmates is trying to train a

neural network to learn the mapping of x0 and x1 values to the labels in the training data.

Your classmate told you that their model is unable to learn the correct mapping from the input values

to the output values. Suspecting they were using a model with no hidden layers, you asked them to

2

x0 x1 ¬(x0 ⊕ x1)

0 0 1

0 1 0

1 0 0

1 1 1

draw the model they have implemented. Your classmate then draws the following model on the back of

a napkin and they mention that they initialize all weights of the model with the value of zero. Suddenly

you understand why their model is unable to learn the XOR function. Explain to your classmate why

the weights can’t be initialized with the same value.

+1

x0

x1

+1

a0

a1

hw

6. Jorge, the capybara, needs your help to compute the Q(s, a) values for the problem below. Jorge starts

in position (1, 1) (see the figure below). The states with a dashed squared and with the letter G have

only the action “Out” available. All other states have the actions up (U), down (D), left (L), and right

(R) available. Once Jorge applies the “Out” action it goes to a terminal state in the Markov decision

process.

The table shows the experiences Jorge has collected by following an -greedy policy. Assuming that

Q(s, a) = 0 for all s and a, γ = 1 and α = 0.5, what are the values of Q(s, a) after processing the

experience shown in the table below with Q-Learning.

7. Sarsa is an on-policy algorithm that employs an -greedy policy to gather experience by interacting with

the environment. Sarsa then learns an optimal -greedy policy by interacting with the environment.

If the value of decreases over time and eventually converges to = 0, then Sarsa is able to learn

3

Episode 1 Episode 2 Episode 3 Episode 4

(1, 1), D, (1, 2), 0 (1, 1), D, (1, 2), 0 (1, 1), D, (1, 2), 0 (1, 1), D, (1, 2), 0

(1, 2), R, (2, 2), 0 (1, 2), R, (2, 2), 0 (1, 2), R, (2, 2), 0 (1, 2), R, (2, 2), 0

(2, 2), U, (2, 1), 0 (2, 2), R, (3, 2), 0 (2, 2), U, (2, 1), 0 (2, 2), R, (3, 2), 0

(2, 1), Out, Terminal, -10 (3, 2), R, (4, 2), 0 (2, 1), Out, Terminal, -10 (3, 2), R, (4, 2), 0

(4, 2), U, (4, 1), 0 (4, 2), U, (4, 1), 0

(4, 1), Out, Terminal, +50 (4, 1), Out, Terminal, +50

an optimal policy for the problem. Q-Learning also uses an -greedy policy to gather experience by

interacting with the environment. However, instead of learning an -greedy policy, Q-Learning learns

a greedy policy directly. What are the negative effects if Q-Learning also employed a greedy policy to

drive its behavior while interacting with the environment?

联系我们

- QQ：99515681
- 邮箱：99515681@qq.com
- 工作时间：8:00-21:00
- 微信：codinghelp

- 代写 Lab 2: Threads 2022-05-10
- 辅导assessment 1. Present Your Client ... 2022-05-10
- 5Cce2sas辅导、Python，Java程序辅导 2022-05-10
- 代写brae Webb编程 2022-05-09
- 辅导csci 3110 Assignment 1 2022-05-09
- Mth2222 Assignment 2代写 2022-05-09
- Cse3bdc Assignment 2022辅导 2022-05-08
- 辅导cis 468、辅导java，Python编程 2022-05-08
- Comp Sci 4094/4194/7094 Assignment 3 D... 2022-05-07
- Cs 178: Machine Learning & Data Mining... 2022-05-07
- Data7703 Assignment 4 2022-05-07
- 讲解assignment 2: Databases 2022-04-25
- 辅导ait681 Static Analysis 2022-04-25
- Cse121 & Cse121l 编程辅导、辅导c++程序语言 2022-04-25
- 辅导iti1120 Bject-Oriented Programming 2022-04-25
- Cmt304语言辅导、辅导c++，Python编程 2022-04-25
- 辅导comp/Engn4528 Computer Vision 2022-04-24
- 辅导fin 2200 Bloomberg Investment Proj... 2022-04-24
- 辅导bism 7255 Uml Assignment 2022-04-23
- 讲解comp202 Programming Assignment 2022-04-23