CSCC63 Assignment 2 
Induction and Recursive Definitions 
Due 11:59pm, March 6 
Warning: Your electronic submission of a PDF to Crowdmark (through Quercus) affirms that this as- 
signment is your own work and no one else’s, and is in accordance with the University of Toronto Code of 
Behaviour on Academic Matters, the Code of Student Conduct, and the guidelines for avoiding plagiarism 
in CSCC63. Note that using Google or any other online resource is considered plagiarism. 
1. (10 marks) Let INFINITE = 
{〈M〉 ∣∣ |L(M)| =∞}. Show that INFINITE is neither recognizable nor 
co-recognizable by proving that ALLTM 6mINFINITE. 
Solution: 
Let w1, w2, . . . be an effective enumeration of Σ 
∗. 
Let P = “On input 〈M〉: 
1. Let M0 = “On input 〈x〉: 
1. For wi = w1 to w|x|: 
2. Run M on wi. 
3. Accept.” 
2. Return 〈M0〉.” 
Proof that 〈M〉 ∈ ALLTM iff 〈M0〉 ∈INFINITE: 
⇒) Suppose that 〈M〉 ∈ ALLTM : 
⇒ M halts on every input. 
⇒ M halts on the strings w1 to wk for every k ∈ N. 
⇒ M halts on the strings w1 to w|x| for every input 〈x〉 to M0. 
⇒ M0 accepts every input. 
⇒ |L(M0)| =∞. 
⇒ 〈M0〉 ∈INFINITE. 
⇐) Suppose that 〈M〉 /∈ ALLTM : 
⇒ There is some wk on which M loops. 
⇒ There is some input x0 to M0 such that |x0| = k. M will then loop on the input w|x0|. 
⇒ For all inputs x longer than x0, M0 will eventually try to run M on wx0 in line 2, and will 
loop. 
⇒ |L(M0)| < |x0|Σ|x0| 
(This is an upper bound on the number of strings of length x0 — the extra factor is a quick 
way of dealing with the possibility of a unary alphabet.) 
⇒ |L(M0)| <∞. 
⇒ 〈M0〉 /∈INFINITE. 
1 
2. (20 marks) Let L = 
{〈M〉 ∣∣∃n ∈ N,M does not return n2 when run on 〈n〉}. Determine whether L is 
decidable, recognizable, co-recognizable, or neither. Prove your answer without using Rice’s theorem. 
Solution: 
L is neither recognizable nor co-recognizable. We have two proofs below. 
The easy way: 
We’ll reduce from ALLTM . Since ALLTM is neither recognizable nor co-recognizable, this 
will tell us immediately the L is also neither recognizable nor co-recognizable. 
Let P = “On input 〈M〉: 
1. Let M0 = “On input 〈n〉: 
1. Run M on 〈n〉. 
2. Return n2.” 
2. Return 〈M0〉.” 
Proof that 〈M〉 ∈ ALLTM iff 〈M0〉 ∈ L: 
⇐) Suppose that 〈M〉 /∈ ALLTM : 
⇒ M halts on every input. 
⇒ For every n ∈ N, M halts on 〈n〉. 
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2. 
⇒ 〈M0〉 /∈ L. 
⇒) Suppose that 〈M〉 ∈ ALLTM : 
⇒ There is some n0 on which M loops. 
⇒ For this n0, M0 will loop in step 1, and so will not reach step 2. 
⇒ For this n0, M0 will not output n20. 
⇒ 〈M0〉 ∈ L. 
The slightly longer way: 
We could also just repeat the reductions to ALLTM from class, with minor modifications: 
Proof that L is not recognizable (HALT 6m L): 
Let P = “On input 〈M,w〉: 
1. Let M0 = “On input 〈n〉: 
1. Run M on 〈w〉. 
2. Return n2.” 
2. Return 〈M0〉.” 
Proof that 〈M,w〉 ∈ HALT iff 〈M0〉 ∈ L: 
⇐) Suppose that 〈M,w〉 ∈ HALT: 
⇒ M halts on w. 
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2. 
⇒ 〈M0〉 /∈ L. 
⇒) Suppose that 〈M〉 /∈ HALT: 
⇒ M loops on w. 
⇒ M0 will never reach step 2 for any input 〈n〉. 
⇒ M0 will not output n2 for any input 〈n〉. 
⇒ 〈M0〉 ∈ L. 
2 
Proof that L is not co-recognizable (HALT 6m L): 
Let P = “On input 〈M,w〉: 
1. Let M0 = “On input 〈n〉: 
1. Run M on 〈w〉 for n steps. 
2. If it halts loop, otherwise return n2.” 
2. Return 〈M0〉.” 
Proof that 〈M,w〉 ∈ HALT iff 〈M0〉 ∈ L: 
⇐) Suppose that 〈M,w〉 ∈ HALT: 
⇒ M loops on w. 
⇒ For every n ∈ N, M does not halt on 〈w〉 in n steps. 
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2. 
⇒ 〈M0〉 /∈ L. 
⇒) Suppose that 〈M〉 /∈ HALT: 
⇒ M halts on w. 
⇒ There is some n0 such that M halts on w in n0 step. 
⇒ For this n0, M0 will loop in step 2. 
⇒ For this n0, M0 will not output n20. 
⇒ 〈M0〉 ∈ L. 
3. (10 marks) Suppose that, for two languages A and B, A 6m B. If A is recognizable and the mapping 
reduction is surjective, prove that B is also recognizable. 
Solution: 
Suppose that A 6 B, and that the mapping is surjective. 
Then, there is some computable function f such that ∀x, x ∈ A↔ f(x) ∈ B. 
Moreover, since the mapping is surjective, we know that for every y ∈ B, there is some x ∈ A 
such that f(x = y), and for every y /∈ B, there is some x /∈ A such that f(x = y)(*). 
Suppose further that A is recognizable. Then we know that the elements of A can be effectively 
enumerated. 
Let x1, x2, . . . be an effective enumeration of A. 
We can write the following recognizer for B: 
Let RB = “On input 〈y〉: 
1. For i = 1 to ∞: 
2. If f(xi) = y, accept.” 
If y ∈ B, then there is some x ∈ A such that f(x) = y, by (*). Since A is recognizable, there is 
some i ∈ N such that x = xi. Since f is computable, every loop iteration in RB will eventually 
halt, and so RB will eventually reach this xi and accept. 
If y /∈ B, then there is no x ∈ A such that f(x) = y, and so RB will never accept. 
4. (10 marks) Show the class NP is closed under the Kleene star operation. 
Solution: 
Suppose that we have some L ∈ NP. Then there is a polytime verifier V such that, for every 
x ∈ Σ∗, there is a polysize certificate Cx that V accepts iff x ∈ L. 
Note that our indices in questions 2 and 3 are for the breaks between letters, not for the letters 
themselves. That’s why they go from 0 to n, not n− 1. 
3 
To show L∗ ∈ NP, we say: 
1. This is a decision problem. 
2. If x ∈ L∗, then it is either empty, or it can be broken into substrings w1, . . . , wk such that 
every wi ∈ L. If it is empty, we can verify it quickly, so we just need a certificate for non-empty 
x. 
If x 6= ε, the certificate C is: 
• A set of indices showing where to break x into the substrings w1, . . . , wk. 
• A set of certificates Cw1 , . . . , Cwk for L such that Cwi shows that wi ∈ L. 
3. If |x| = n, then we can break x into at most n substrings. So we need to store at most n 
indices and L certificates. Since each Cwi is polynomial in size, each is of size O(|wi|r) for 
some r ∈ N. So the total size of the certificate is O(nr+1), and so the entire certificate is 
polysize. 
4. In order to verify the certificate, we use the following algorithm: 
Let V ′ = “On input 〈x,C〉: 
1. If x = ε, accept. 
2. Else: 
3. For i ∈ [1, . . . k]: 
4. Run V on 〈wi, Cwi〉. 
5. If it rejects, reject. 
6. Accept.” 
5. As we saw in the certificate, we only have to worry about k ≤ n, and so we have O(n) 
iterations of the loop. Since V is a polytime verifier, it runs in O(nr′) time for some r′ ∈ N. 
So each loop iteration takes O(|wi|r′) = O(nr′) time. The other operations are all constant 
time, so the entire program takes O(nr′+1) time. 
So L∗ ∈ NP. 
5. (10 marks) Show the class co-NP is closed under the Kleene star operation. 
Solution: 
Suppose that we have some L ∈co-NP. So we know that L ∈ NP, and we want to show that( 
L∗ 
) ∈NP. 
The language 
( 
L∗ 
) 
is the set of x such that, for all partitions of x into substrings w1, . . . , wk, at 
least one of the wi is in L. 
Since L ∈ NP, there is a polytime verifier V such that, for every x ∈ Σ∗, there is a polysize 
certificate Cx that V accepts iff x ∈ L. 
1. This is a decision problem. 
2. The certificate C will be: 
• A set S of pairs (i, j) of indices of x such that 0 ≤ i < j ≤ n for every i, j. 
• A set Cxi:j of certificates for every substring xi:j of x whose indices are in S. 
3. The certificate C consists of pairs of indices of x, along with a certificate. Since the certificate 
is polysize, there is some r ∈ N such that the size of any index pair and its associated 
certificate is O(nr). There are at most O(n2) possible substrings to choose from, and so the 
overall certificate size is at most O(nr+2), and so the certificate C is also polynomial in size. 
4. The important thing to see here is that we’re trying to show that, for all partitions of x into 
substrings w1, . . . , wk, at least one of the wi is in L. If this is not the case, then there is at 
least one partition of x into substrings such that each substring is in L. What we’ll do is try 
to build such a partition, and accept iff we can be convinced that we can’t do so. 
4 
Now, if such a partition exists, it must have an initial string x0:j for some j. This string 
cannot be in L, and so we can’t have a certificate Cx0:j for it. So we can remove from 
our considerations any string x0:j that we can prove through our certificates cannot be the 
first string in our partition. Once we’ve done so, there will be some (possibly empty) set of 
indices that we can reach from index 0 in our partition. We’ll go to the next reachable index 
down from 0 and apply the same principle – we should be able to find a next string in the 
sequence. Once we’re done this process, we’ll have a list of indices we can reach from index 0 
by using substrings that have not been proven to be in L. If this list contains the last possible 
breakpoint n, we can build a partition that contains strings not proven to be in L3, and so 
we reject. If we can’t reach index n, we know that there is no partition whose strings are all 
in L3, and so we accept. 
Note that our indices in questions 2 and 3 are for the breaks between letters, not for the letters 
themselves. That’s why they go from 0 to n, not n− 1. 
Let V ′ = “On input 〈x,C〉: 
1. If x = ε, reject. 
2. For (i, j) ∈ S: 
3. Run V on 〈xi:j , Cxi:j 〉. 
4. If it rejects, reject. 
5. Make a list [0, . . . , n] of indices. Mark index 0 as reachable, and the others as unreachable. 
6. Let i = 0. 
7. While i < n: 
8. For all j such that (i, j) /∈ S: 
9. Mark j as reachable. 
10. Let i be the next vertex marked as reachable. If no such vertex exists, set i = n. 
11. If n is marked as reachable, reject. Else, accept. 
5. Since there are at most O(n2) substrings considered, the loop from lines 2-4 runs at most 
O(n2) times. If the verifier V takes nr′ time to run, this loop takes O(nr′+2) time. Creating 
a list of indices in step 5. takes O(n) time. Finally, the loop from lines 7-10 runs at most n 
times. each iteration takes at most O(n2) time to loop through the indices in S, and another 
O(n) time to find the next reachable vertex. 
So the overall time takes is O(nmax (r′+2,3)) = O(nr′+2) time (since r′ ≥ 1 if we want V to be 
able to read the whole input). 
So L∗ ∈ co-NP. 
6. (25 marks) 
Let Root-Clique = 
{〈G = (V,E)〉 ∣∣G is an undirected graph with a clique of size at least √|V |}. 
(a) (5 marks) Show that Root-Clique ∈ NP. 
Solution: 
1) This is a language, so its decision problem is a yes/no question. 
2) A certificate c would be a set of nodes forming a clique of size at least 
√|V |. 
3) We can see that this certificate is a set of nodes in V , so |c| 6 |V |, and so this certificate 
is polynomial in the size of the input. 
5 
4) A verifier V for this language would be: 
V = “On input 〈G = {V,E}, c〉: 
1. Check that every node in c is in V , and occurs no more than once. 
2. Check that the |c| > k√|V |). 
3. For every u ∈ C: 
4. For every v 6= u ∈ C: 
5. Check that uv ∈ E. 
6. If all of these checks pass, accept, otherwise reject. 
5) If n is the number of nodes in the graph, then: 
• We can have a p that has up to n nodes, and in line 1 we will have to compare each of 
these nodes to each other (and to the nodes in V ). So line 1 takes O(n2) time. 
• Lines 2 checks the size of c, which takes O(n) time. 
• The loops run at most O(n2) times, and in a reasonable (say, adjacency matrix) repre- 
sentation, each iteration will take O(1) time. So the loop takes O(n2) time. 
• Line 6 takes O(1) time. 
• So altogether, the verifier takes O(n2) time. 
So the verifier runs in polytime. 
(b) (10 marks) Assuming that Clique is NP-complete, show that Root-Clique is NP-complete. 
Solution: 
We know from part a) that Root-Clique is in NP. So we just need to show that it is 
NP-hard. We’ll do this with a reduction from Clique. 
Let P = “On input 〈G = (V,E), k〉: 
1. Let G′ = G. 
2. Add |V |2 − |V | disconnected nodes to G′. 
3. Take the first |V | − k of these extra nodes and add fully connect them 
to form an (|V | − k)-clique. 
4. Add an edge from every node in this clique to every node from the original G. 
2. Return 〈G′〉.” 
Proof that 〈G, k〉 ∈Clique iff 〈G′〉 ∈Root-Clique: 
⇒) Suppose that G has a k-clique S. 
Let S′ be the new (|V | − k)-clique added in the construction of G′. 
Since every node in S′ is connected to every node from G, S∪S′ for an ((|V |−k)+k) = |V |- 
clique. 
Since 
√|V ′| = √|V |2 = |V |, S ∪ S′ is a root-clique of G′. 
So G′ ∈Root-Clique. 
⇐) Suppose that G′ has a root-clique S′. 
Then, since only |V | − k of the extra nodes are not disconnected, we know that at most 
|V | − k of the nodes in S′ areof this type. 
So at least k of the nodes in S′ nodes are in G. Let S = V ∩ S′. 
Since |S| > k and S is completely connected, S is a k-clique in G. 
So 〈G, k〉 ∈Clique. 
Note: another natural way to do this reduction would be to break it into two cases: one 
for when k < 
√|V |, and one for when k > √|V |. If you’ve done that, it’s fine. Both 
cases should look a little like what we’ve done above. 
6 
(c) (10 marks) Show how you could use an oracle MC for Root-Clique to find a certificate for it: 
that is, use MC to build a polytime program that, given an instance 〈G = (V,E)〉 of Root- 
Clique, will return a clique of size at least 
√|V | if such a clique exists, and will return null 
otherwise. 
Solution: 
Suppose that we had an oracle MC that could solve Root-Clique in polytime. Here is an 
algorithm to find a clique of size > 
√|V |: 
P = “On input 〈G = {V,E}〉: 
1. If |V | = 1, return V . 
2. Run MC on 〈G〉. 
3. If the answer if no, reject.” 
4. Let G′ = G. 
5. For v ∈ V : 
6. Let G′′ = G′ with the edges touching v removed. 
7. Run MC on 〈G′′〉. 
8. If it outputs yes, set G′ = G′′. 
9. Return the set of nodes in G′ that have non-zero degree. 
If G has one node, then its one node is also a root-clique, so we can safely return the entire 
vertex set. Otherwise, we can see that if G does not have a root-clique, P will return null. If 
it does have a root-clique, it will have at least one after every loop iteration. So G′ will have 
at least one root-clique when the loop terminates. If there is any node v that is not in this 
root clique, the loop would have removed in in lines 7 to 9 in the loop iteration in which v 
was considered. So the only connected nodes in G′ are those in the root-clique, and so the 
algorithm returns a root-clique, as desired. 
Suppose that MC runs in O(nk) time for some k ∈ N. Then, lines 1,3, and 4 all take O(n2) 
time (remember that we can have O(n2) edges in a graph), while line 2 takes O(nk) time. 
The loop runs one for every node in G, so it takes O(n) iterations. In these iterations, lines 6 
and 8 take O(n2) time, and line 7 takes O(nk) time. So all told, the loop takes O(n1+max (2,k)) 
time, and the other lines in the program take O(nmax (2,k)) time. So the whole program takes 
O(n1+max (2,k)) time. So this program is polytime. 
Note: it’s OK to treat MC as taking O(1) time, as long as you state what you’re doing. If 
you do, you’ll get a running time of O(n3). If you go by total input size rather than setting n 
to be the number of nodes, you’ll get a slightly different number, but shouldn’t get more than 
a quadratic difference from what we got here. 
7. (15 marks) Let Longest-Path= 
{〈G, s, t, k〉 ∣∣G is a directed graph with a path of length > k from s to t}. 
(a) (5 marks) Show that Longest-Path ∈ NP. 
Solution: 
1) This is a language, so its decision problem is a yes/no question. 
2) A certificate p for this language would be a path v1 = s, v2, . . . , v` = t from s to t of 
length > k. 
3) This certificate is a list of nodes in G, so its size is |p| 6 |V |. So the entire certificate is 
polynomial in the size of the input. 
7 
4) A verifier V for this language would be: 
V = “On input 〈G = {V,E}, s, t, k, c〉: 
1. Check that every node in p is in V , and occurs no more than once in the path. 
2. Check that the first node in p is s and that the last node is t. 
3. Check that the path has > k nodes (i.e., that the length of the path is > k). 
4. For vi, vi+1 ∈ p: 
5. Check that vvvi+1 ∈ E. 
6. If all of these checks pass, accept, otherwise reject. 
5) If n is the number of nodes in the graph, then: 
• We can have a p that has up to n nodes, and in line 1 we will have to compare each of 
these nodes to each other (and to the nodes in V ). So line 1 takes O(n2) time. 
• Lines 2 and 3 together look at each node in p a fixed number of times, so we get a time 
of O(n). 
• The loop runs at most n− 1 times, and in a reasonable (say, adjacency matrix) repre- 
sentation, each iteration will take O(1) time. So the loop takes O(n) time. 
• Line 6 takes O(1) time. 
• So altogether, the verifier takes O(n2) time. 
So the verifier runs in polytime. 
(b) (10 marks) Assuming that HAM-PATH is NP-complete, show that Longest-Path is NP- 
complete. 
Solution: 
We know from part a) that Longest-Path is in NP. So we just need to show that it is 
NP-hard. We’ll do this with a reduction from HAM-PATH. 
Let P = “On input 〈G = (V,E), s, t〉: 
1. Let k′ = |V | − 1. 
2. Return 〈G, s, t, k′〉.” 
Proof that 〈G, s, t〉 ∈HAM-PATH iff 〈G, s, t, k′〉 ∈Longest-Path: 
⇒) Suppose that G has a Hamiltonian path p from s to t. 
Then, p passes through each vertex exactly once, and so it has |V | − 1 = k′ edges. 
So G has a path of length k′ from s to t. 
So 〈G, s, t, k′〉 ∈Longest-Path. 
⇐) Suppose that G has a length-k′ path p from s to t. 
Then, since p can pass through any vertex at most once, it touches k′+ 1 vertices exactly 
once. 
Since k′ = |V | − 1, p must hit all |V | vertices. 
So p is a Hamiltonian path from s to t. 
So 〈G, s, t〉 ∈HAM-PATH. 
8. Bonus (10 marks — your mark will be rounded to the nearest multiple of 2.5) 
Suppose you had an oracle MO that would tell you in a single step whether at least half of the truth 
assignments for a boolean formula φ would return true. Use this oracle to build a polytime algorithm 
that can tell, given an input boolean formula φ, exactly how many of its truth assignments are satisfying 
assignments. 
8 
Solution: 
The first thing to notice is that, given a Boolean formula F , calling MO on F will tell us if half of 
more of the truth assignments to F accept, and calling MO on ¬F will tell us if half or less of the 
truth assignments to F are accepting. So two calls to MO will tell us whether F is satisfied by 
fewer than half of its assignments, half of its assignments, or more than half of its assignments. 
We’ll argue that copying out ¬F takes linear time, so we can do the above two checks in linear 
time. (*) 
So, given a Boolean formula φ, we’re going to use (*) to run a binary search to find the number 
of satisfying truth assignments to φ. 
Suppose that we are given a Boolean formula φ with k variables. So there are 2k possible truth 
assignments. 
Given any i, 0 6 i < 2k, we’ll denote by ψi,k the Boolean formula on k variables that reads the k 
variables as a binary number, and outputs true iff that number is less than i. 
It may not be immediately clear that this is possible, but youll see it’s not too hard: e.g., if k 
were 4 and i were 11, we’d want all numbers at most 
10 = 10102. 
It’s easy enough to build this formula: the number can start with either 0 or 1. If the number 
starts with 0 (the first bit is 0) then it’s at most 8, so we accept. If it starts with 1, then the 
second bit must be 0, otherwise it would be larger than 12. We can continue with this process 
to get 
ψ10,4 = x1 ∨ (x1 ∧ x2 ∧ (x3 ∨ (x3 ∧ x4))). 
The fromula ψ0,k would always be false, e.g., (x1 ∧ x1). With this construction, we can see 
that the number of truth assignments satisfying ψi,k is exactly i. We’ll argue that we can do 
this construction in linear time. 
Let Fi be the Boolean formula (b ∧ ψi,k) ∨ (b ∧ φ), where: 
• ψi,k is as above, but which uses k new variables — there’s no overlap with the variables in φ. 
• φ is our input. 
• b is a single Boolean variable not used in either φ or ψi,k. 
We can see that Fi will have 2k + 1 variables, and so it will have 2 
2k+1 truth assignments. If 
MO indicates that exactly half of its assignments are satisfying (as per (*)), then the number of 
satisfying truth assignments is 22k+1/2 = 22k. 
Let’s denote the number of satisfying truth assignments to φ as c. Then, the number of satisfying 
truth assignments to Fi are of one of two forms: 
• b = 1 and ψi,k is true, or 
• b = 0 and φ is false. 
• If b = 1 and ψi,k is true, there is one way b can be set (b = 1), and there are i ways that the 
variables in ψi,k can be set (this is how many numbers are between 0 and i). 
In this case we don’t care how the variables in φ are set, so they can take all 2k possible 
values. 
So there are 1× i× 2k = i2k truth assignments of this form. 
• If b = 0 and φ is true, there is one way b can be set (b = 0), and there are 2k − c ways that 
the variables in φ can be set. 
In this case we don’t care how the variables in ψi,k are set, so they can take all 2 
k possible 
values. 
So there are 1× (2k − c)× 2k = 22k − c2k truth assignments of this form. 
9 
The two types of truth assignments are disjoint — you can’t have b = 0 and b = 1 — and so the 
total number of truth assignments to Fi is 2 
2k + (i− c)2k. Now, 
• If i < c, (i− c) < 0, and so the number of sasisfying truth assignments to Fi is less than 22k 
(less than half). 
• If i > c, (i− c) > 0, and so the number of sasisfying truth assignments to Fi is more than 22k 
(more than half). 
• If i = c, (i − c) = 0, and so the number of sasisfying truth assignments to Fi is exactly 22k 
(exactly half). 
So here’s our algorithm: 
V = “On input φ: 
1. Let Found = false; Low = 0; High = 2k. 
2. While not Found: 
3. Let i = b(High + Low)/2c; Fi be as described above. 
4. Run MO on Fi and ¬Fi. 
5. If it indicates that exactly half of the truth assignments are accepting, return i. 
6. If less than half of the truth assignments are accepting, set High = i.” 
7. If more than half of the truth assignments are accepting, set Low = i+ 1.” 
Since this is just a binary search on the is, it will terminate in O(log (2k)) = O(k) loop iterations, 
and by the arguments aboce, when it does we will know that the number of truth assignments in 
φ will be exactly i. 
We can see that each loop iteration will take O(n) time, where n is the size of φ. Moreover, there 
are O(k) = O(n) loop iterations. So the whole process takes O(n2) time.