THE UNIVERSITY OF SUSSEX
BSc and MComp SECOND YEAR EXAMINATION January 2022 (A1)
Program Analysis Assessment Period: January 2022 (A1)
DO NOT TURN OVER UNTIL INSTRUCTED TO BY THE LEAD INVIGILATOR
Candidates should answer TWO questions out of THREE.
If all three questions are attempted only the first two answers will be marked.
The time allowed is TWO hours. Each question is worth 50 marks.
At the end of the examination the question paper and/or answer book, used or unused, will be collected from you before you leave the examination room.
G6017 Program Analysis
1.
(a) Precisely specify the conditions under which the following algorithm returns true, and then discuss, in detail, the running time of the algorithm. If you think it has different best- and worst-case running times then these should be considered separately, and you should explain the conditions under which best and worst cases arise.
You must fully explain your answer and use O, Ω and Θ appropriately to receive full marks.
Algorithm Ex1((𝑎₁, ..., 𝑎ₙ), (𝑏₁, ..., 𝑏ₘ))
    𝑘 ← 0
    for 𝑖 ← 1 to 𝑛 do
        𝑗 ← 1
        while 𝑗 ≤ 𝑚 do
            if 𝑎ᵢ == 𝑏ⱼ
                𝑘 ← 𝑘 + 1
            𝑗 ← 𝑗 + 1
    return 𝑘 > 0
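For experimenting with inputs, Ex1 can be transcribed into Python. This is a minimal sketch, assuming 0-based Python lists stand in for the 1-indexed sequences; the function name `ex1` is illustrative and not part of the paper.

```python
def ex1(a, b):
    """Transcription of Ex1: k counts the (i, j) pairs with a[i] == b[j]."""
    k = 0
    for a_i in a:              # for i <- 1 to n
        j = 0
        while j < len(b):      # while j <= m
            if a_i == b[j]:
                k += 1
            j += 1             # j advances whether or not the test succeeded
    return k > 0


if __name__ == "__main__":
    print(ex1([1, 2, 3], [3, 4]))
    print(ex1([1, 2], [3, 4]))
```

Both loops always run to completion here, which is worth keeping in mind when comparing best and worst cases.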
[10 marks]
(b) Precisely specify the conditions under which the following algorithm returns true, and then discuss, in detail, the running time of the algorithm. If you think it has different best- and worst-case running times then these should be considered separately, and you should explain the conditions under which best and worst cases arise.
You must fully explain your answer and use O, Ω and Θ appropriately to receive full marks.
Algorithm Ex2((𝑎₁, ..., 𝑎ₙ), (𝑏₁, ..., 𝑏ₙ))
    𝑞 ← 𝑡𝑟𝑢𝑒
    for 𝑖 ← 1 to 𝑛 do
        𝑗 ← 𝑛
        while 𝑗 > 0 and 𝑞 == 𝑡𝑟𝑢𝑒 do
            if 𝑎ᵢ == 𝑏ⱼ
                𝑞 ← 𝑓𝑎𝑙𝑠𝑒
            𝑗 ← 𝑗 − 1
    return 𝑞
[10 marks]
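Ex2 can be transcribed into Python in the same way. This sketch rests on two assumptions: that the conditional sets 𝑞 to false while 𝑗 is decremented on every pass of the inner loop, and that both inputs have the same length 𝑛. The name `ex2` is illustrative.

```python
def ex2(a, b):
    """Transcription of Ex2, assuming len(a) == len(b)."""
    q = True
    n = len(a)
    for i in range(n):             # for i <- 1 to n
        j = n - 1                  # j <- n (0-based here)
        while j >= 0 and q:        # while j > 0 and q == true
            if a[i] == b[j]:
                q = False
            j -= 1
    return q


if __name__ == "__main__":
    print(ex2([1, 2], [3, 4]))
    print(ex2([1, 2], [2, 5]))
```

Unlike Ex1, the inner loop here is guarded by 𝑞, so the amount of work depends on where the first match occurs.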
(c) Precisely specify the conditions under which the following algorithm returns true, and then discuss, in detail, the running time of the algorithm. If you think it has different best- and worst-case running times then these should be considered separately, and you should explain the conditions under which best and worst-cases arise.
You must fully explain your answer and use O, Ω and Θ appropriately to receive full marks.
Algorithm Ex3((𝑎₁, ..., 𝑎ₙ), 𝑏)
    𝑞 ← 0
    𝑤 ← 𝑓𝑎𝑙𝑠𝑒
    for 𝑖 ← 1 to 𝑛 − 1 do
        𝑧 ← 𝑎ᵢ + 𝑎ᵢ₊₁
        if 𝑧 < 𝑏
            𝑞 ← 𝑞 + 1
        else
            𝑞 ← 𝑞 − 1
    if 𝑞 < 0
        𝑤 ← 𝑡𝑟𝑢𝑒
    return 𝑤
[10 marks]
(d) A data pattern analyser is to be built that can detect and count the number of occurrences of two-letter and three-letter same-letter sequences in a sequence (e.g. (a,a) or (b,b,b)). The analyser should stop if it encounters * in the sequence and return the number of occurrences found up to that point in the form of a 2-tuple (#2𝐿𝑒𝑡𝑡𝑒𝑟𝑆𝑒𝑞𝑢𝑒𝑛𝑐𝑒𝑠, #3𝐿𝑒𝑡𝑡𝑒𝑟𝑆𝑒𝑞𝑢𝑒𝑛𝑐𝑒𝑠). A 3-letter sequence should only count as a 3-letter sequence, not as two occurrences of a 2-letter sequence. No letter in the input sequence ever occurs more than 3 times in a row.
So, for example:
Input string              2 letter sequences found    3 letter sequences found
(a,b,a,b,b,a,a,*)         2                           0
(x,x,x,y,y)               1                           1
(p,q,p,z,z,*)             1                           0
(*,a,a)                   0                           0
(x,x,a,d,*,s,s,s,l)       1                           0
Produce a formal statement of this problem, and then write an algorithm to solve the problem using a pseudo code style similar to the one shown in parts (a) to (c). State the bounds on the best and worst case performance of your algorithm using O, Ω and Θ appropriately to receive full marks.
[10 marks]
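One way to check a candidate algorithm against the example table is a single-pass run counter. The following is a hedged Python sketch, not a model answer: the name `count_runs` and the representation of the input as a list of one-character strings are assumptions made for illustration.

```python
def count_runs(seq):
    """Return (#2-letter runs, #3-letter runs) seen before the first '*'."""
    twos = threes = 0
    prev = None
    run = 0                       # length of the current run of `prev`

    def close(length):
        # Record a finished run; runs of length 1 (or 0) are ignored.
        nonlocal twos, threes
        if length == 2:
            twos += 1
        elif length == 3:
            threes += 1

    for sym in seq:
        if sym == "*":            # stop symbol: ignore everything after it
            break
        if sym == prev:
            run += 1
        else:
            close(run)
            prev, run = sym, 1
    close(run)                    # the run in progress when the scan ended
    return (twos, threes)


if __name__ == "__main__":
    print(count_runs(list("ababbaa*")))
```

The scan looks at each symbol at most once and stops at the first *, which is the behaviour the best- and worst-case discussion needs to account for.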
(e) A file is protected by a random password consisting of 𝑛 binary bits. All password combinations are equally probable. To access the file we need the correct password. The process of applying the password to the file takes 10ms regardless of the value of 𝑛. Brute force attack is always a viable basic strategy for guessing a password.
To ensure that the file remains sufficiently secure, we need to ensure that there is no more than a 1% chance over 30 days that the password is guessed by a hacker program utilizing brute force working 24 hours a day, 7 days a week. How many bits should be specified for the password?
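The arithmetic can be set up as a short Python sketch. It assumes the attacker tries distinct passwords one after another at one attempt per 10 ms, so that after N attempts the chance of success is N / 2ⁿ; the function name and parameters are illustrative only.

```python
def min_password_bits(days=30, attempts_per_second=100, max_percent=1):
    """Smallest n such that attempts / 2**n <= max_percent / 100.

    Integer arithmetic is used throughout to avoid rounding issues.
    """
    attempts = days * 24 * 60 * 60 * attempts_per_second
    n = 1
    # attempts / 2**n <= max_percent / 100  <=>  100 * attempts <= max_percent * 2**n
    while 100 * attempts > max_percent * 2 ** n:
        n += 1
    return n


if __name__ == "__main__":
    print(min_password_bits())
```

Changing the assumed attack model (for example, guessing with repetition) changes the probability expression and possibly the result.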
[10 marks]
2.
(a) A student has been asked to put some parcels on a shelf. The parcels all weigh different amounts, and the shelf has a maximum safe loading weight capacity of 100 Kg. The weights of the parcels are as follows (in Kg):
𝒑𝒂𝒓𝒄𝒆𝒍    𝒘𝒆𝒊𝒈𝒉𝒕 (𝑲𝒈)
1         8
2         50
3         2
4         15
5         4
6         5
7         20
The student has been asked to load the maximum possible weight of parcels onto the shelf, subject to the maximum safe loading weight.
State two possible approaches for a greedy algorithm solution to solve this problem. In each case, state clearly the result you would get from applying that approach to this problem, stating whether the solution is optimal or not. If your answer does not produce an optimal solution, what algorithm could be employed to find one?
[10 marks]
(b) One example of a greedy algorithm is Dijkstra's algorithm for finding the lowest cost path through a weighted graph. The diagram below shows two weighted graphs that a student wants to investigate using Dijkstra’s algorithm. In each case the task is to find the lowest cost of reaching every node from v1. Each graph has a single negative weight in it.
[Figure: two weighted graphs, labelled Graph (a) and Graph (b)]
One of the graphs will yield a correct analysis of the lowest cost for all vertices, and the other will produce an incorrect analysis. State which of the two graphs will produce the incorrect analysis, and explain why the greedy nature of Dijkstra’s algorithm is responsible for the incorrect analysis. Your answer should include the key concept of an invariant.
[5 marks]
(c) The priority queue is a widely used data structure. Priority queues may be implemented using binary heaps and simple linear arrays. For the basic priority queue operations of:
• Building an initial queue
• Taking the highest priority item off the queue
• Adding a new item to the queue
Compare and contrast the running time complexities (best and worst cases) associated with implementations using binary heaps and simple linear arrays. You may find it helpful to use diagrams to support your answer.
[10 marks]
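The three operations can be tried out side by side in Python. This is an illustrative sketch only: it assumes a smaller number means higher priority (Python's `heapq` is a min-heap), uses an unsorted list as the "simple linear array", and the function names are hypothetical.

```python
import heapq


def heap_queue_demo(items, new_item):
    """Build, add, then take the highest-priority item using a binary heap."""
    heap = list(items)
    heapq.heapify(heap)               # build: linear time
    heapq.heappush(heap, new_item)    # add: O(log n) worst case (sift up)
    best = heapq.heappop(heap)        # take: O(log n) worst case (sift down)
    return best, heap


def array_queue_demo(items, new_item):
    """The same three operations using an unsorted list."""
    array = list(items)               # build: copy the items as they are
    array.append(new_item)            # add: amortised O(1)
    best = min(array)                 # take: linear scan for the best item...
    array.remove(best)                # ...plus a linear-time removal
    return best, array


if __name__ == "__main__":
    print(heap_queue_demo([7, 2, 9, 4], 1)[0])
    print(array_queue_demo([7, 2, 9, 4], 1)[0])
```

A sorted array is another possible linear-array design; it moves the linear cost from removal to insertion.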
(d) A recursive algorithm is applied to some data 𝐴 = (𝑎₁, ..., 𝑎ₘ) where 𝑚 ≥ 2 and 𝑚 is even. The running time 𝑇 is characterised using the following recurrence equations:
𝑇(2) = 𝑐 when the size of 𝐴 is 2
𝑇(𝑚) = 𝑇(𝑚 − 2) + 𝑐 otherwise
Determine the running time complexity of this algorithm. Note that 𝑚 is even and the problem size reduces by 2 for each recursion.
[10 marks]
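Evaluating the recurrence numerically for small even 𝑚 is a quick sanity check on any closed form obtained by unrolling. A minimal Python sketch, where the function name `t` and the default 𝑐 = 1 are assumptions for illustration:

```python
def t(m, c=1):
    """Evaluate T(m) = T(m - 2) + c with T(2) = c, for even m >= 2."""
    if m < 2 or m % 2 != 0:
        raise ValueError("m must be an even integer >= 2")
    if m == 2:
        return c
    return t(m - 2, c) + c


if __name__ == "__main__":
    for m in (2, 4, 8, 16, 32):
        print(m, t(m))
```

Each call adds one constant 𝑐 and there are 𝑚/2 calls in total, so the values grow in proportion to 𝑚.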
(e) Another recursive algorithm is applied to some data 𝐴 = (𝑎₁, ..., 𝑎ₘ) where 𝑚 = 2^𝑥 (i.e. 2, 4, 8, 16, ...) and 𝑥 is an integer ≥ 1. The running time 𝑇 is characterised using the following recurrence equations:
𝑇(1) = 𝑐 when the size of 𝐴 is 1, and 𝑐 is a constant
𝑇(𝑚) = 2𝑇(𝑚/2) + 𝑚 otherwise
Determine the running time complexity of this algorithm. You will find it helpful to recall that:
∑_{𝑘=1}^{∞} 1/2^𝑘 → 1
and
2^(log₂ 𝑥) = 𝑥
[15 marks]
3.
(a) The subset sum problem can be reliably solved optimally using the dynamic programming algorithm shown below:
SubsetSum(𝑛, 𝑊)
    Let 𝐵(0, 𝑤) = 0 for each 𝑤 ∈ {0, ..., 𝑊}
    for 𝑖 ← 1 to 𝑛
        for 𝑤 ← 0 to 𝑊
            if 𝑤 < 𝑤ᵢ then
                𝐵(𝑖, 𝑤) ← 𝐵(𝑖 − 1, 𝑤)
            else
                𝐵(𝑖, 𝑤) ← max(𝑤ᵢ + 𝐵(𝑖 − 1, 𝑤 − 𝑤ᵢ), 𝐵(𝑖 − 1, 𝑤))
Where 𝑛 is the number of requests, 𝑊 is the maximum weight constraint, 𝑤ᵢ is the weight associated with request 𝑖, and 𝐵 is the solution space.
You are given a set of requests and their corresponding weights. The maximum weight constraint 𝑊 is 12.
𝒊    𝒘ᵢ
1    1
2    7
3    10
4    6
5    3
6    2
Copy the following solution space table to your answer book (do not write your answer on the question paper) and complete the table to determine the optimal subset sum.
𝒊 \ 𝒘    0   1   2   3   4   5   6   7   8   9   10  11  12
6
5
4
3
2
1
0
[10 marks]
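The SubsetSum recurrence fills the table row by row, and a short Python sketch can be used to check a hand-completed table. It assumes the request weights are 1, 7, 10, 6, 3, 2 for 𝑖 = 1 to 6 with 𝑊 = 12; the name `subset_sum_table` is illustrative.

```python
def subset_sum_table(weights, W):
    """Return B where B[i][w] is the best sum <= w using the first i requests."""
    n = len(weights)
    B = [[0] * (W + 1) for _ in range(n + 1)]    # row 0 is all zeros
    for i in range(1, n + 1):
        w_i = weights[i - 1]                     # weights are 1-indexed in the paper
        for w in range(W + 1):
            if w < w_i:
                B[i][w] = B[i - 1][w]            # request i does not fit
            else:
                B[i][w] = max(w_i + B[i - 1][w - w_i], B[i - 1][w])
    return B


if __name__ == "__main__":
    table = subset_sum_table([1, 7, 10, 6, 3, 2], 12)
    for i in range(len(table) - 1, -1, -1):      # print rows i = 6 down to 0
        print(i, table[i])
```

The entry in row 𝑛, column 𝑊 holds the optimal subset sum; the two nested loops give a running time proportional to 𝑛 × 𝑊.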
(b) The sequence alignment problem may be solved by the following dynamic programming algorithm:
SequenceAlignment(X, Y):
    Let 𝐵(𝑖, 0) ← 𝑖 × 𝛾 for each 1 ≤ 𝑖 ≤ 𝑛
    Let 𝐵(0, 𝑗) ← 𝑗 × 𝛾 for each 1 ≤ 𝑗 ≤ 𝑚
    For 𝑖 ← 1 to 𝑛
        For 𝑗 ← 1 to 𝑚
            𝐵(𝑖, 𝑗) ← min[𝛿(𝑥ᵢ, 𝑦ⱼ) + 𝐵(𝑖 − 1, 𝑗 − 1),
                          𝛾 + 𝐵(𝑖, 𝑗 − 1),
                          𝛾 + 𝐵(𝑖 − 1, 𝑗)]
Where 𝑋 = (𝑥₁, ..., 𝑥ₙ) and 𝑌 = (𝑦₁, ..., 𝑦ₘ) are two sequences to be aligned, 𝛿(𝑝, 𝑞) is a penalty associated with matching symbol 𝑝 to 𝑞, and 𝛾 is a gap penalty. The sequence alignment algorithm is applied using the following data:
𝑋 =(𝑎,𝑏,𝑐)
𝑌 = (𝑎,𝑏,𝑎,𝑏,𝑏)
𝛾=4
The delta function is defined for symbols in the alphabet {𝑎, 𝑏, 𝑐} :
     a    b    c
a    0    4    5
b    4    0    7
c    5    7    0
Generate the problem space matrix 𝐵 and thus determine the optimal alignment between 𝑋 and 𝑌.
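A hand-built matrix can be checked with a short Python sketch of the recurrence. It assumes 𝐵(0, 0) = 0, which the pseudocode leaves implicit, and reads the delta table as symmetric with rows and columns in the order a, b, c; the names `alignment_table` and `DELTA` are illustrative.

```python
def alignment_table(X, Y, delta, gamma):
    """Return B where B[i][j] is the minimum cost of aligning X[:i] with Y[:j]."""
    n, m = len(X), len(Y)
    B = [[0] * (m + 1) for _ in range(n + 1)]    # B[0][0] = 0 is an assumption
    for i in range(1, n + 1):
        B[i][0] = i * gamma
    for j in range(1, m + 1):
        B[0][j] = j * gamma
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            B[i][j] = min(
                delta[X[i - 1]][Y[j - 1]] + B[i - 1][j - 1],  # match x_i with y_j
                gamma + B[i][j - 1],                          # gap against y_j
                gamma + B[i - 1][j],                          # gap against x_i
            )
    return B


DELTA = {
    "a": {"a": 0, "b": 4, "c": 5},
    "b": {"a": 4, "b": 0, "c": 7},
    "c": {"a": 5, "b": 7, "c": 0},
}

if __name__ == "__main__":
    for row in alignment_table("abc", "ababb", DELTA, 4):
        print(row)
```

The bottom-right entry is the cost of the optimal alignment; the alignment itself is recovered by tracing back from that entry through whichever of the three terms produced each minimum.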
[15 marks]
(c) Draw a Minimum Spanning Tree derived from the graph shown below.
[5 marks]
(d) The Ford-Fulkerson algorithm is used to determine network flow. The diagram below represents a data network that connects a Data Service Provider (DSP) connected to 𝑣₁(𝑠) to a customer connected to 𝑣₆(𝑡). Each edge represents a single data transmission link.
The notation 𝑝/𝑞 indicates a current actual forwards flow 𝑝 measured in Gb/s in a pipe with a maximum capacity of 𝑞 also measured in Gb/s.
At the outset no data is being sent by the DSP to the customer.
i. Show the residual graph that will be created from the initial empty flow. When drawing the residual graph, show a forward edge with capacity 𝑥 and a backward edge with flow 𝑦 by annotating the edge 𝑥⃗; 𝑦⃖ .
[2 marks]
ii. What is the bottleneck edge of the path (𝑠, 𝑣3, 𝑣4, 𝑡) in the residual graph you have given in answer to part i?
[2 marks]
iii. Show the residual graph after incorporating the simple path (𝑠, 𝑣3, 𝑣4, 𝑡) that results from augmenting the flow based on the residual graph you have given in answer to part i.
[4 marks]
iv. Repeat the process outlined above incorporating additionally the simple paths (𝑠, 𝑣3, 𝑣2, 𝑣5, 𝑡) , (𝑠, 𝑣2, 𝑣5, 𝑣4, 𝑡) and (𝑠, 𝑣2, 𝑣5, 𝑡) showing each residual graph, to determine the maximum flow between 𝑠 and 𝑡, and thus the maximum data bandwidth that can be achieved between the DSP and the customer.