CS 170, Spring 2020 HW 10 A. Chiesa  J. Nelson 
CS 170 HW 10 
Due 2020-04-13, at 10:00 pm
1 Study Group 
List the names and SIDs of the members in your study group. If you have no collaborators, 
you must explicitly write none. 
2 Opting for releasing your solutions 
We are considering releasing a subset of homework submissions written by students for students to see what a full-score submission looks like. If your homework solutions are well written, we may consider releasing your solution. If you wish that your solutions not be released, please respond to this question with "No, do not release any submission to any problems". Otherwise, say "Yes, you may release any of my submissions to any problems".
3 Zero-Sum Games 
Alice and Bob are playing a zero-sum game whose payoff matrix is shown below. The ijth 
entry of the matrix shows the payoff that Alice receives if she plays strategy i and Bob plays 
strategy j. Alice is the row player and is trying to maximize her payoff, and Bob is the 
column player trying to minimize Alice’s payoff. 
Alice \ Bob     1     2
      1         4     1
      2         2     5
Now we will write a linear program to find a strategy that maximizes Alice’s payoff. Let the 
variables of the linear program be x1, x2 and p, where xi is the probability that Alice plays 
strategy i and p denotes Alice’s payoff. 
(a) Write the linear program for maximizing Alice's payoff. (Hint: You should think of setting up the constraints of the program so that it finds the best worst-case strategy. This depends on the strategy Bob plays, assuming he knows Alice's (probabilistic) strategy.)
(b) Eliminate x2 from the linear program and write it in terms of p and x1 alone. 
(c) Draw the feasible region of the above linear program in p and x1. You are encouraged to use plotting software for this (a minimal plotting sketch appears after part (e)).
(d) Write a linear program from Bob's perspective, trying to minimize Alice's payoff. Let the variables of the linear program be y1, y2 and p, where yi is the probability that Bob plays strategy i and p denotes Alice's payoff.
(e) What is the optimal solution and what is the value of the game? 
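For part (c), a minimal plotting sketch in Python (matplotlib) is given below. The two constraint lines it draws are placeholders, not the constraints you should derive in part (b); they only illustrate how to shade a feasible region in the (x1, p) plane.

# Sketch: shading a two-variable feasible region with matplotlib.
# The constraints below are PLACEHOLDERS (p <= 2*x1 + 1 and p <= 3 - x1),
# not the ones from part (b).
import numpy as np
import matplotlib.pyplot as plt

x1 = np.linspace(0, 1, 400)          # Alice's probability of playing strategy 1
upper1 = 2 * x1 + 1                  # placeholder constraint: p <= 2*x1 + 1
upper2 = 3 - x1                      # placeholder constraint: p <= 3 - x1
upper = np.minimum(upper1, upper2)   # p must lie below every upper bound

plt.plot(x1, upper1, label="p = 2*x1 + 1 (placeholder)")
plt.plot(x1, upper2, label="p = 3 - x1 (placeholder)")
plt.fill_between(x1, 0, upper, alpha=0.3, label="feasible region")
plt.xlabel("x1")
plt.ylabel("p")
plt.legend()
plt.show()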
4 Zero-Sum Battle 
Two Pokemon trainers are about to engage in battle! Each trainer has 3 Pokemon, each of 
a single, unique type. They each must choose which Pokemon to send out first. Of course, each trainer's advantage in battle depends not only on their own Pokemon, but also on which Pokemon their opponent sends out.
The table below indicates the competitive advantage (payoff) Trainer A would gain (and 
Trainer B would lose). For example, if Trainer B chooses the fire Pokemon and Trainer A 
chooses the rock Pokemon, Trainer A would have payoff 2. 
                      Trainer B:
                   ice   water   fire
            dragon -10     3      3
Trainer A:  steel    4    -1     -3
            rock     6    -9      2
Feel free to use an online LP solver to solve your LPs in this problem. Here is an example of a Python LP Solver and its Tutorial (a minimal scipy-based sketch also appears after part 2 below).
1. Write an LP to find the optimal strategy for Trainer A. What is the optimal strategy 
and expected payoff? 
2. Now do the same for Trainer B. What is the optimal strategy and expected payoff? 
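For reference, here is one possible way (a sketch, assuming scipy and numpy are installed) to encode Trainer A's max-min LP with scipy.optimize.linprog. The payoff matrix is the table above; the variables are Trainer A's mixed strategy together with the guaranteed payoff v, and since linprog minimizes, we minimize -v.

# Sketch: the row player's (Trainer A's) max-min LP via scipy.
# Variables: z = [x_dragon, x_steel, x_rock, v], where v is A's guaranteed payoff.
# Maximize v  <=>  minimize -v, subject to
#   v - sum_i A[i][j] * x_i <= 0 for every column j (Trainer B's choice),
#   sum_i x_i = 1,  x_i >= 0.
import numpy as np
from scipy.optimize import linprog

A = np.array([[-10,  3,  3],   # dragon vs (ice, water, fire)
              [  4, -1, -3],   # steel
              [  6, -9,  2]])  # rock
n_rows, n_cols = A.shape

c = np.concatenate([np.zeros(n_rows), [-1.0]])                  # minimize -v
A_ub = np.hstack([-A.T, np.ones((n_cols, 1))])                  # v - (A^T x)_j <= 0
b_ub = np.zeros(n_cols)
A_eq = np.concatenate([np.ones(n_rows), [0.0]]).reshape(1, -1)  # probabilities sum to 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * n_rows + [(None, None)]                     # x_i in [0, 1], v free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("Trainer A's mixed strategy:", res.x[:n_rows])
print("value of the game (expected payoff):", res.x[-1])

Trainer B's LP in part 2 can be set up symmetrically, with Trainer B minimizing the value subject to one constraint per row of the table.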
5 How to Gamble With Little Regret 
Suppose that you are gambling at a casino. Every day you play at a slot machine, and your 
goal is to minimize your losses. We model this as the experts problem. Every day you must 
take the advice of one of n experts (i.e. a slot machine). At the end of each day t, if you take 
advice from expert i, the advice costs you some c_i^t ∈ [0, 1]. You want to minimize the regret R, defined as:

R = \frac{1}{T}\left( \sum_{t=1}^{T} c_{i(t)}^{t} \;-\; \min_{1 \le i \le n} \sum_{t=1}^{T} c_{i}^{t} \right)

where i(t) is the expert you choose on day t. Your strategy will be probabilities, where p_i^t denotes the probability with which you choose expert i on day t. Assume an all-powerful adversary (i.e. the casino) can look at your strategy ahead of time and decide the costs associated with each expert on each day. Let C denote the set of costs for all experts and all days. Compute max_C E[R], or the maximum possible (expected) regret that the adversary can guarantee, for each of the following strategies.
(a) Choose expert 1 at every step, i.e. p_1^t = 1 for all t.

(b) Any deterministic strategy, i.e. for each t, there exists some i such that p_i^t = 1.

(c) Always choose an expert according to some fixed probability distribution at every time step. That is, fix some p_1, . . . , p_n, and for all t, set p_i^t = p_i.

What distribution minimizes the regret of this strategy? In other words, what is argmin_{p_1, ..., p_n} max_C E[R]?
This analysis should conclude that a good strategy for the problem must necessarily be 
randomized and adaptive. 
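As a concrete illustration of the regret definition above, the short sketch below computes R for a toy cost table and a fixed sequence of choices; the numbers are made up purely to exercise the formula.

# Sketch: computing R = (1/T) * ( sum_t c^t_{i(t)} - min_i sum_t c^t_i )
# for a made-up cost table.
import numpy as np

costs = np.array([[0.9, 0.1],    # day 1: cost of expert 1, expert 2
                  [0.8, 0.2],    # day 2
                  [0.7, 0.3]])   # day 3
choices = [0, 0, 1]              # i(t): expert chosen on each day (0-indexed)

T = costs.shape[0]
incurred = sum(costs[t, choices[t]] for t in range(T))  # your total cost
best_fixed = costs.sum(axis=0).min()                    # best single expert in hindsight
R = (incurred - best_fixed) / T
print("regret:", R)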
6 Solving Linear Programs using Multiplicative Weights 
In this problem, we will develop an algorithm to approximately solve linear programs using 
multiplicative weights. For simplicity, we will restrict our attention to a specific kind of linear program, given as follows:
Given vectors a_j in R^n for j = 1, . . . , m, the linear program on variables x = (x_1, . . . , x_n) is given by:

\max \quad 0
\text{subject to} \quad \langle a_j, x \rangle \le 0 \quad \text{for all } j = 1, \dots, m
\sum_{i=1}^{n} x_i = 1
x_i \ge 0 \quad \text{for all } i = 1, \dots, n

(The objective is the constant 0, so this LP is purely a feasibility problem.)
Here ⟨x, y⟩ denotes the dot product between vectors; specifically, ⟨x, y⟩ = Σ_{i=1}^{n} x_i · y_i, where x_i, y_i refer to the i-th components of the vectors x, y.
We will now use the multiplicative weights update to solve our linear program. The idea is to execute multiplicative weights update against an appropriate adversary (a sequence of losses) and let the weights of the algorithm determine the solution x. Formally, the rough outline of our algorithm to solve the above LP is as follows:
For t = 1 to a sufficiently large number T:
  1. The multiplicative weights update returns a current probability distribution x^(t).
  2. If x^(t) approximately satisfies the LP, return x^(t).
  3. The adversary A presents a loss vector ℓ^(t) ∈ R^n.
  4. The multiplicative weights algorithm incurs a loss of ⟨ℓ^(t), x^(t)⟩ and updates its weights.
Return x^(T)
We will design an algorithm that will act as the adversary A, so that x^(T) is an approximate solution to the LP.
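To make the outline concrete, here is a minimal sketch of a multiplicative weights (exponential/Hedge-style) update, under the assumption of losses in [-1, 1]. The adversary used here is a hypothetical placeholder returning random losses; designing the real, LP-aware adversary is exactly what part (c) below asks for.

# Sketch of the multiplicative weights update loop from the outline above.
# The adversary is a HYPOTHETICAL placeholder; part (c) asks you to design the real one.
import numpy as np

def placeholder_adversary(x, rng):
    # Hypothetical adversary: returns an arbitrary loss vector in [-1, 1]^n.
    return rng.uniform(-1.0, 1.0, size=x.shape)

def multiplicative_weights(n, T, eps, adversary, seed=0):
    rng = np.random.default_rng(seed)
    w = np.ones(n)                      # one weight per coordinate x_i
    for t in range(T):
        x = w / w.sum()                 # step 1: current distribution x^(t)
        loss = adversary(x, rng)        # step 3: adversary presents loss vector l^(t)
        w = w * np.exp(-eps * loss)     # step 4: incur <l^(t), x^(t)> and update weights
    return w / w.sum()                  # x^(T)

x_final = multiplicative_weights(n=5, T=1000, eps=0.1, adversary=placeholder_adversary)
print(x_final)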
Recall that the multiplicative weights algorithm obeys the following regret bound: 
Theorem. The multiplicative weights algorithm starts with an iterate x^(1) ∈ R^n, and then suffers a sequence of loss vectors ℓ^(1), . . . , ℓ^(T) ∈ R^n such that ℓ_i^(t) ∈ [−1, 1]. It then produces a sequence of iterates x^(2), . . . , x^(T) such that for some 0 < ε ≤ 1/2,

\sum_{t=1}^{T} \langle \ell^{(t)}, x^{(t)} \rangle \;\le\; \min_{1 \le i \le n} \left( \sum_{t=1}^{T} \ell_i^{(t)} \right) + 2\left( \varepsilon T + \frac{\log(n)}{\varepsilon} \right)
(a) Let p = (p_1, . . . , p_n) be a probability distribution over {1, . . . , n}, i.e., p_i ≥ 0 and Σ_{i=1}^{n} p_i = 1. For any vector v ∈ R^n, prove that

\min_{1 \le i \le n} v_i \;\le\; \sum_{i=1}^{n} p_i v_i

(In words, the minimum is smaller than the average under every distribution.)
(b) We will now observe an important property of multiplicative weights. The algorithm not 
only has small regret against any fixed best expert, but also has small regret against any 
fixed probability distribution over experts. 
Let p* = (p*_1, . . . , p*_n) be a probability distribution on {1, . . . , n}, i.e. p*_i ≥ 0 for each 1 ≤ i ≤ n and Σ_{i=1}^{n} p*_i = 1. Using the regret bound of the multiplicative weights algorithm stated above, prove that

\sum_{t=1}^{T} \left( \langle \ell^{(t)}, x^{(t)} \rangle - \langle \ell^{(t)}, p^* \rangle \right) \;\le\; 2\left( \frac{\log(n)}{\varepsilon} + \varepsilon T \right)
(c) At the t-th iteration, the probability distribution x^(t) output by the multiplicative weights update is a candidate solution to our LP. But it is likely to violate some of the constraints of the LP, namely ⟨a_j, x^(t)⟩ ≤ 0.
Describe how to implement the adversary A to nudge the multiplicative weights algorithm towards a feasible solution to the LP.
Hint: What loss vector should the adversary A pick in order to penalize the multiplicative weights algorithm the most for violations?
(d) Let us call a solution x an η-approximate solution to the LP if it satisfies all the constraints within an error of η, where η ≤ 2, i.e.,

⟨a_j, x⟩ ≤ η for all j = 1, . . . , m.
Assume that the LP is feasible, in that there exists some probability distribution x* that satisfies all the constraints. For simplicity, let us assume that all the coefficients in our LP satisfy |(a_j)_i| ≤ 1. Show that if multiplicative weights uses ε = η/4, then after T = 16 log(n)/η² iterations, the distribution x^(T) is an η-approximate solution.
Hint: In every iteration t, if the current distribution x^(t) violates a constraint by more than η, then it accumulates regret for not being the feasible solution x*.
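As a quick sanity check on these parameters (not the full argument part (d) asks for), note that with ε = η/4 and T = 16 log(n)/η², the additive term from the regret bound, averaged over T iterations, is exactly η:

\frac{2}{T}\left( \varepsilon T + \frac{\log(n)}{\varepsilon} \right)
  = 2\varepsilon + \frac{2\log(n)}{\varepsilon T}
  = \frac{\eta}{2} + \frac{2\log(n)}{(\eta/4)\cdot 16\log(n)/\eta^{2}}
  = \frac{\eta}{2} + \frac{\eta}{2}
  = \eta.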
7 Restoring the Balance! 
We are given a network G = (V, E) whose edges have integer capacities c_e, and a maximum flow f from node s to node t. Explicitly, f is given to us as an integer flow value f_e along every edge e.
However, we find out that one of the capacity values of G was wrong: for edge (u, v), we used c_uv whereas it really should have been c_uv − 1. This is unfortunate because the flow f uses that particular edge at full capacity: f_uv = c_uv. We could run Ford-Fulkerson from scratch, but there's a faster way.
Describe an algorithm to fix the max-flow for this network in O(|V |+ |E|) time. Also give a 
proof of correctness and runtime justification.