首页 > > 详细

MSBD5015 Assignment #2

 MSBD5015 2020 Fall Semester Assignment #2 - The Written Part

Date assigned: Wednesday, Oct 21.
Due time: 23:59 on Tuesday, Nov 3.
How to submit it: Submit your written answers as a pdf file on canvas.ust.hk.
Penalties on late papers: 20% off each day (anytime after the due time is considered
late by one day)
Problem 1 (10%). Given a graph and a node in it as the starting point, the traveling
salesman problem (TSP) is about finding a least cost path that starts and ends at the
starting node, and goes through each other node in the graph once and exactly once. The
edges of the graph are labeled by a number representing the cost of traveling between the
connecting two nodes. Formulate this problem as a search problem by specifying the states,
possible initial states, goal test, operators, and operator costs.
Problem 2. (10%) (Adapted from Russell and Norvig) Consider the problem of con￾structing crossword puzzles: fitting words into a grid of intersecting rows and columns of
squares. Assume that a list of words (i.e. dictionary) is provided, and that the task is to
fill in the rows and columns with words from this list so that if a row intersects with a
column, their intersecting square has the same letter. Formulate this problem as an assign￾ment (constraint satisfaction) problem by specifying the variables, their domains of possible
values, and the constraints.
Problem 3. (10%) Consider the following two investment decisions: buy a certified
deposit (CD) which will pay you a 10% annual interest rate, and buy some stocks which
will either give you a 30% annual return (with 0.7 probability) or incur a 10% loss for
you (with 0.3 probability). In terms of expected utility, you should of course buy stocks.
However, suppose you are conservative and cannot tolerate any loss of your principal. In
this case, you have no choice but to deposit your money into a CD. Now consider this
problem for a 3 years time span. You start with 1 unit of money. Each year you can choose
only one way to invest your entire fund (meaning you are not allowed to diversify, like 50%
CD, 50% stocks). You definitely don’t want to lose any money at the end of the 3 years,
but you are okay with a “temporary” loss during the time (for example, it is okay for you
to lose some money in the first year as long as you can make it up later). Formalize this
problem as a Markov decision process and compute its optimal policy.
Hints A state needs to contain information about how many money you have in this state.
In other words, a state needs to encode the history: which actions you have done so far and
what their outcomes are if the actions are non-deterministic.
Problem 4. (15%) Recall our boundary following robot. The correct behavior is that the
robot should follow a boundary in a consistent way, clockwise if it’s inside the wall of the
room, and counter-clockwise if it is outside an obstacle in the room.
1
A
B C
D E F G H
JK L
M N
3 5 4 I
0 7
578
MAX
MIN
4.1 Formulate this task as a MDP.
4.2 Formulate this task as a reinforcement learning problem.
Problem 5. (15%) Consider the following search problem:
• state space: {S, A, B, G}; • operators: O1 maps S to A with cost 4, O2 maps S to B with cost 2, O3 maps B to
A with cost 1, and O4 maps from A to G with cost 5;
• initial state S; • goal state G; • heuristic function h: h(S) = 7, h(A) = 1, h(B) = 6, and h(G) = 0.
Find a solution using A∗
search by tree. Number the nodes according to their order of
expansion, and for each node give its f(n) = g(n) + h(n) value.
Problem 6. (10%) Perform minimax with perfect decision on the tree given in the
following figure (the leaves are terminal nodes, and the numbers next to them are their
utility values).
Figure 1: A minimax search tree
Problem 7. (10%) Perform a left-to-right alpha-beta prune on the tree in the above
exercise. Perform a right-to-left prune on the same tree. Left-to-right or right-to-left means
the order by which the leaf nodes are generated. So for left-to-right, the first leaf node
considered is D, followed by E, followed by M and so on.
2
联系我们
  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp
热点标签

联系我们 - QQ: 99515681 微信:codinghelp
程序辅导网!