CSCC63 Assignment 2
Induction and Recursive Definitions
Due 11:59pm, March 6
Warning: Your electronic submission of a PDF to Crowdmark (through Quercus) affirms that this as-
signment is your own work and no one else’s, and is in accordance with the University of Toronto Code of
Behaviour on Academic Matters, the Code of Student Conduct, and the guidelines for avoiding plagiarism
in CSCC63. Note that using Google or any other online resource is considered plagiarism.
1. (10 marks) Let INFINITE =
{〈M〉 ∣∣ |L(M)| =∞}. Show that INFINITE is neither recognizable nor
co-recognizable by proving that ALLTM 6mINFINITE.
Solution:
Let w1, w2, . . . be an effective enumeration of Σ
∗.
Let P = “On input 〈M〉:
1. Let M0 = “On input 〈x〉:
1. For wi = w1 to w|x|:
2. Run M on wi.
3. Accept.”
2. Return 〈M0〉.”
Proof that 〈M〉 ∈ ALLTM iff 〈M0〉 ∈INFINITE:
⇒) Suppose that 〈M〉 ∈ ALLTM :
⇒ M halts on every input.
⇒ M halts on the strings w1 to wk for every k ∈ N.
⇒ M halts on the strings w1 to w|x| for every input 〈x〉 to M0.
⇒ M0 accepts every input.
⇒ |L(M0)| =∞.
⇒ 〈M0〉 ∈INFINITE.
⇐) Suppose that 〈M〉 /∈ ALLTM :
⇒ There is some wk on which M loops.
⇒ There is some input x0 to M0 such that |x0| = k. M will then loop on the input w|x0|.
⇒ For all inputs x longer than x0, M0 will eventually try to run M on wx0 in line 2, and will
loop.
⇒ |L(M0)| < |x0|Σ|x0|
(This is an upper bound on the number of strings of length x0 — the extra factor is a quick
way of dealing with the possibility of a unary alphabet.)
⇒ |L(M0)| <∞.
⇒ 〈M0〉 /∈INFINITE.
1
2. (20 marks) Let L =
{〈M〉 ∣∣∃n ∈ N,M does not return n2 when run on 〈n〉}. Determine whether L is
decidable, recognizable, co-recognizable, or neither. Prove your answer without using Rice’s theorem.
Solution:
L is neither recognizable nor co-recognizable. We have two proofs below.
The easy way:
We’ll reduce from ALLTM . Since ALLTM is neither recognizable nor co-recognizable, this
will tell us immediately the L is also neither recognizable nor co-recognizable.
Let P = “On input 〈M〉:
1. Let M0 = “On input 〈n〉:
1. Run M on 〈n〉.
2. Return n2.”
2. Return 〈M0〉.”
Proof that 〈M〉 ∈ ALLTM iff 〈M0〉 ∈ L:
⇐) Suppose that 〈M〉 /∈ ALLTM :
⇒ M halts on every input.
⇒ For every n ∈ N, M halts on 〈n〉.
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2.
⇒ 〈M0〉 /∈ L.
⇒) Suppose that 〈M〉 ∈ ALLTM :
⇒ There is some n0 on which M loops.
⇒ For this n0, M0 will loop in step 1, and so will not reach step 2.
⇒ For this n0, M0 will not output n20.
⇒ 〈M0〉 ∈ L.
The slightly longer way:
We could also just repeat the reductions to ALLTM from class, with minor modifications:
Proof that L is not recognizable (HALT 6m L):
Let P = “On input 〈M,w〉:
1. Let M0 = “On input 〈n〉:
1. Run M on 〈w〉.
2. Return n2.”
2. Return 〈M0〉.”
Proof that 〈M,w〉 ∈ HALT iff 〈M0〉 ∈ L:
⇐) Suppose that 〈M,w〉 ∈ HALT:
⇒ M halts on w.
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2.
⇒ 〈M0〉 /∈ L.
⇒) Suppose that 〈M〉 /∈ HALT:
⇒ M loops on w.
⇒ M0 will never reach step 2 for any input 〈n〉.
⇒ M0 will not output n2 for any input 〈n〉.
⇒ 〈M0〉 ∈ L.
2
Proof that L is not co-recognizable (HALT 6m L):
Let P = “On input 〈M,w〉:
1. Let M0 = “On input 〈n〉:
1. Run M on 〈w〉 for n steps.
2. If it halts loop, otherwise return n2.”
2. Return 〈M0〉.”
Proof that 〈M,w〉 ∈ HALT iff 〈M0〉 ∈ L:
⇐) Suppose that 〈M,w〉 ∈ HALT:
⇒ M loops on w.
⇒ For every n ∈ N, M does not halt on 〈w〉 in n steps.
⇒ For every n ∈ N, M0 reaches step 2 and outputs n2.
⇒ 〈M0〉 /∈ L.
⇒) Suppose that 〈M〉 /∈ HALT:
⇒ M halts on w.
⇒ There is some n0 such that M halts on w in n0 step.
⇒ For this n0, M0 will loop in step 2.
⇒ For this n0, M0 will not output n20.
⇒ 〈M0〉 ∈ L.
3. (10 marks) Suppose that, for two languages A and B, A 6m B. If A is recognizable and the mapping
reduction is surjective, prove that B is also recognizable.
Solution:
Suppose that A 6 B, and that the mapping is surjective.
Then, there is some computable function f such that ∀x, x ∈ A↔ f(x) ∈ B.
Moreover, since the mapping is surjective, we know that for every y ∈ B, there is some x ∈ A
such that f(x = y), and for every y /∈ B, there is some x /∈ A such that f(x = y)(*).
Suppose further that A is recognizable. Then we know that the elements of A can be effectively
enumerated.
Let x1, x2, . . . be an effective enumeration of A.
We can write the following recognizer for B:
Let RB = “On input 〈y〉:
1. For i = 1 to ∞:
2. If f(xi) = y, accept.”
If y ∈ B, then there is some x ∈ A such that f(x) = y, by (*). Since A is recognizable, there is
some i ∈ N such that x = xi. Since f is computable, every loop iteration in RB will eventually
halt, and so RB will eventually reach this xi and accept.
If y /∈ B, then there is no x ∈ A such that f(x) = y, and so RB will never accept.
4. (10 marks) Show the class NP is closed under the Kleene star operation.
Solution:
Suppose that we have some L ∈ NP. Then there is a polytime verifier V such that, for every
x ∈ Σ∗, there is a polysize certificate Cx that V accepts iff x ∈ L.
Note that our indices in questions 2 and 3 are for the breaks between letters, not for the letters
themselves. That’s why they go from 0 to n, not n− 1.
3
To show L∗ ∈ NP, we say:
1. This is a decision problem.
2. If x ∈ L∗, then it is either empty, or it can be broken into substrings w1, . . . , wk such that
every wi ∈ L. If it is empty, we can verify it quickly, so we just need a certificate for non-empty
x.
If x 6= ε, the certificate C is:
• A set of indices showing where to break x into the substrings w1, . . . , wk.
• A set of certificates Cw1 , . . . , Cwk for L such that Cwi shows that wi ∈ L.
3. If |x| = n, then we can break x into at most n substrings. So we need to store at most n
indices and L certificates. Since each Cwi is polynomial in size, each is of size O(|wi|r) for
some r ∈ N. So the total size of the certificate is O(nr+1), and so the entire certificate is
polysize.
4. In order to verify the certificate, we use the following algorithm:
Let V ′ = “On input 〈x,C〉:
1. If x = ε, accept.
2. Else:
3. For i ∈ [1, . . . k]:
4. Run V on 〈wi, Cwi〉.
5. If it rejects, reject.
6. Accept.”
5. As we saw in the certificate, we only have to worry about k ≤ n, and so we have O(n)
iterations of the loop. Since V is a polytime verifier, it runs in O(nr′) time for some r′ ∈ N.
So each loop iteration takes O(|wi|r′) = O(nr′) time. The other operations are all constant
time, so the entire program takes O(nr′+1) time.
So L∗ ∈ NP.
5. (10 marks) Show the class co-NP is closed under the Kleene star operation.
Solution:
Suppose that we have some L ∈co-NP. So we know that L ∈ NP, and we want to show that(
L∗
) ∈NP.
The language
(
L∗
)
is the set of x such that, for all partitions of x into substrings w1, . . . , wk, at
least one of the wi is in L.
Since L ∈ NP, there is a polytime verifier V such that, for every x ∈ Σ∗, there is a polysize
certificate Cx that V accepts iff x ∈ L.
1. This is a decision problem.
2. The certificate C will be:
• A set S of pairs (i, j) of indices of x such that 0 ≤ i < j ≤ n for every i, j.
• A set Cxi:j of certificates for every substring xi:j of x whose indices are in S.
3. The certificate C consists of pairs of indices of x, along with a certificate. Since the certificate
is polysize, there is some r ∈ N such that the size of any index pair and its associated
certificate is O(nr). There are at most O(n2) possible substrings to choose from, and so the
overall certificate size is at most O(nr+2), and so the certificate C is also polynomial in size.
4. The important thing to see here is that we’re trying to show that, for all partitions of x into
substrings w1, . . . , wk, at least one of the wi is in L. If this is not the case, then there is at
least one partition of x into substrings such that each substring is in L. What we’ll do is try
to build such a partition, and accept iff we can be convinced that we can’t do so.
4
Now, if such a partition exists, it must have an initial string x0:j for some j. This string
cannot be in L, and so we can’t have a certificate Cx0:j for it. So we can remove from
our considerations any string x0:j that we can prove through our certificates cannot be the
first string in our partition. Once we’ve done so, there will be some (possibly empty) set of
indices that we can reach from index 0 in our partition. We’ll go to the next reachable index
down from 0 and apply the same principle – we should be able to find a next string in the
sequence. Once we’re done this process, we’ll have a list of indices we can reach from index 0
by using substrings that have not been proven to be in L. If this list contains the last possible
breakpoint n, we can build a partition that contains strings not proven to be in L3, and so
we reject. If we can’t reach index n, we know that there is no partition whose strings are all
in L3, and so we accept.
Note that our indices in questions 2 and 3 are for the breaks between letters, not for the letters
themselves. That’s why they go from 0 to n, not n− 1.
Let V ′ = “On input 〈x,C〉:
1. If x = ε, reject.
2. For (i, j) ∈ S:
3. Run V on 〈xi:j , Cxi:j 〉.
4. If it rejects, reject.
5. Make a list [0, . . . , n] of indices. Mark index 0 as reachable, and the others as unreachable.
6. Let i = 0.
7. While i < n:
8. For all j such that (i, j) /∈ S:
9. Mark j as reachable.
10. Let i be the next vertex marked as reachable. If no such vertex exists, set i = n.
11. If n is marked as reachable, reject. Else, accept.
5. Since there are at most O(n2) substrings considered, the loop from lines 2-4 runs at most
O(n2) times. If the verifier V takes nr′ time to run, this loop takes O(nr′+2) time. Creating
a list of indices in step 5. takes O(n) time. Finally, the loop from lines 7-10 runs at most n
times. each iteration takes at most O(n2) time to loop through the indices in S, and another
O(n) time to find the next reachable vertex.
So the overall time takes is O(nmax (r′+2,3)) = O(nr′+2) time (since r′ ≥ 1 if we want V to be
able to read the whole input).
So L∗ ∈ co-NP.
6. (25 marks)
Let Root-Clique =
{〈G = (V,E)〉 ∣∣G is an undirected graph with a clique of size at least √|V |}.
(a) (5 marks) Show that Root-Clique ∈ NP.
Solution:
1) This is a language, so its decision problem is a yes/no question.
2) A certificate c would be a set of nodes forming a clique of size at least
√|V |.
3) We can see that this certificate is a set of nodes in V , so |c| 6 |V |, and so this certificate
is polynomial in the size of the input.
5
4) A verifier V for this language would be:
V = “On input 〈G = {V,E}, c〉:
1. Check that every node in c is in V , and occurs no more than once.
2. Check that the |c| > k√|V |).
3. For every u ∈ C:
4. For every v 6= u ∈ C:
5. Check that uv ∈ E.
6. If all of these checks pass, accept, otherwise reject.
5) If n is the number of nodes in the graph, then:
• We can have a p that has up to n nodes, and in line 1 we will have to compare each of
these nodes to each other (and to the nodes in V ). So line 1 takes O(n2) time.
• Lines 2 checks the size of c, which takes O(n) time.
• The loops run at most O(n2) times, and in a reasonable (say, adjacency matrix) repre-
sentation, each iteration will take O(1) time. So the loop takes O(n2) time.
• Line 6 takes O(1) time.
• So altogether, the verifier takes O(n2) time.
So the verifier runs in polytime.
(b) (10 marks) Assuming that Clique is NP-complete, show that Root-Clique is NP-complete.
Solution:
We know from part a) that Root-Clique is in NP. So we just need to show that it is
NP-hard. We’ll do this with a reduction from Clique.
Let P = “On input 〈G = (V,E), k〉:
1. Let G′ = G.
2. Add |V |2 − |V | disconnected nodes to G′.
3. Take the first |V | − k of these extra nodes and add fully connect them
to form an (|V | − k)-clique.
4. Add an edge from every node in this clique to every node from the original G.
2. Return 〈G′〉.”
Proof that 〈G, k〉 ∈Clique iff 〈G′〉 ∈Root-Clique:
⇒) Suppose that G has a k-clique S.
Let S′ be the new (|V | − k)-clique added in the construction of G′.
Since every node in S′ is connected to every node from G, S∪S′ for an ((|V |−k)+k) = |V |-
clique.
Since
√|V ′| = √|V |2 = |V |, S ∪ S′ is a root-clique of G′.
So G′ ∈Root-Clique.
⇐) Suppose that G′ has a root-clique S′.
Then, since only |V | − k of the extra nodes are not disconnected, we know that at most
|V | − k of the nodes in S′ areof this type.
So at least k of the nodes in S′ nodes are in G. Let S = V ∩ S′.
Since |S| > k and S is completely connected, S is a k-clique in G.
So 〈G, k〉 ∈Clique.
Note: another natural way to do this reduction would be to break it into two cases: one
for when k <
√|V |, and one for when k > √|V |. If you’ve done that, it’s fine. Both
cases should look a little like what we’ve done above.
6
(c) (10 marks) Show how you could use an oracle MC for Root-Clique to find a certificate for it:
that is, use MC to build a polytime program that, given an instance 〈G = (V,E)〉 of Root-
Clique, will return a clique of size at least
√|V | if such a clique exists, and will return null
otherwise.
Solution:
Suppose that we had an oracle MC that could solve Root-Clique in polytime. Here is an
algorithm to find a clique of size >
√|V |:
P = “On input 〈G = {V,E}〉:
1. If |V | = 1, return V .
2. Run MC on 〈G〉.
3. If the answer if no, reject.”
4. Let G′ = G.
5. For v ∈ V :
6. Let G′′ = G′ with the edges touching v removed.
7. Run MC on 〈G′′〉.
8. If it outputs yes, set G′ = G′′.
9. Return the set of nodes in G′ that have non-zero degree.
If G has one node, then its one node is also a root-clique, so we can safely return the entire
vertex set. Otherwise, we can see that if G does not have a root-clique, P will return null. If
it does have a root-clique, it will have at least one after every loop iteration. So G′ will have
at least one root-clique when the loop terminates. If there is any node v that is not in this
root clique, the loop would have removed in in lines 7 to 9 in the loop iteration in which v
was considered. So the only connected nodes in G′ are those in the root-clique, and so the
algorithm returns a root-clique, as desired.
Suppose that MC runs in O(nk) time for some k ∈ N. Then, lines 1,3, and 4 all take O(n2)
time (remember that we can have O(n2) edges in a graph), while line 2 takes O(nk) time.
The loop runs one for every node in G, so it takes O(n) iterations. In these iterations, lines 6
and 8 take O(n2) time, and line 7 takes O(nk) time. So all told, the loop takes O(n1+max (2,k))
time, and the other lines in the program take O(nmax (2,k)) time. So the whole program takes
O(n1+max (2,k)) time. So this program is polytime.
Note: it’s OK to treat MC as taking O(1) time, as long as you state what you’re doing. If
you do, you’ll get a running time of O(n3). If you go by total input size rather than setting n
to be the number of nodes, you’ll get a slightly different number, but shouldn’t get more than
a quadratic difference from what we got here.
7. (15 marks) Let Longest-Path=
{〈G, s, t, k〉 ∣∣G is a directed graph with a path of length > k from s to t}.
(a) (5 marks) Show that Longest-Path ∈ NP.
Solution:
1) This is a language, so its decision problem is a yes/no question.
2) A certificate p for this language would be a path v1 = s, v2, . . . , v` = t from s to t of
length > k.
3) This certificate is a list of nodes in G, so its size is |p| 6 |V |. So the entire certificate is
polynomial in the size of the input.
7
4) A verifier V for this language would be:
V = “On input 〈G = {V,E}, s, t, k, c〉:
1. Check that every node in p is in V , and occurs no more than once in the path.
2. Check that the first node in p is s and that the last node is t.
3. Check that the path has > k nodes (i.e., that the length of the path is > k).
4. For vi, vi+1 ∈ p:
5. Check that vvvi+1 ∈ E.
6. If all of these checks pass, accept, otherwise reject.
5) If n is the number of nodes in the graph, then:
• We can have a p that has up to n nodes, and in line 1 we will have to compare each of
these nodes to each other (and to the nodes in V ). So line 1 takes O(n2) time.
• Lines 2 and 3 together look at each node in p a fixed number of times, so we get a time
of O(n).
• The loop runs at most n− 1 times, and in a reasonable (say, adjacency matrix) repre-
sentation, each iteration will take O(1) time. So the loop takes O(n) time.
• Line 6 takes O(1) time.
• So altogether, the verifier takes O(n2) time.
So the verifier runs in polytime.
(b) (10 marks) Assuming that HAM-PATH is NP-complete, show that Longest-Path is NP-
complete.
Solution:
We know from part a) that Longest-Path is in NP. So we just need to show that it is
NP-hard. We’ll do this with a reduction from HAM-PATH.
Let P = “On input 〈G = (V,E), s, t〉:
1. Let k′ = |V | − 1.
2. Return 〈G, s, t, k′〉.”
Proof that 〈G, s, t〉 ∈HAM-PATH iff 〈G, s, t, k′〉 ∈Longest-Path:
⇒) Suppose that G has a Hamiltonian path p from s to t.
Then, p passes through each vertex exactly once, and so it has |V | − 1 = k′ edges.
So G has a path of length k′ from s to t.
So 〈G, s, t, k′〉 ∈Longest-Path.
⇐) Suppose that G has a length-k′ path p from s to t.
Then, since p can pass through any vertex at most once, it touches k′+ 1 vertices exactly
once.
Since k′ = |V | − 1, p must hit all |V | vertices.
So p is a Hamiltonian path from s to t.
So 〈G, s, t〉 ∈HAM-PATH.
8. Bonus (10 marks — your mark will be rounded to the nearest multiple of 2.5)
Suppose you had an oracle MO that would tell you in a single step whether at least half of the truth
assignments for a boolean formula φ would return true. Use this oracle to build a polytime algorithm
that can tell, given an input boolean formula φ, exactly how many of its truth assignments are satisfying
assignments.
8
Solution:
The first thing to notice is that, given a Boolean formula F , calling MO on F will tell us if half of
more of the truth assignments to F accept, and calling MO on ¬F will tell us if half or less of the
truth assignments to F are accepting. So two calls to MO will tell us whether F is satisfied by
fewer than half of its assignments, half of its assignments, or more than half of its assignments.
We’ll argue that copying out ¬F takes linear time, so we can do the above two checks in linear
time. (*)
So, given a Boolean formula φ, we’re going to use (*) to run a binary search to find the number
of satisfying truth assignments to φ.
Suppose that we are given a Boolean formula φ with k variables. So there are 2k possible truth
assignments.
Given any i, 0 6 i < 2k, we’ll denote by ψi,k the Boolean formula on k variables that reads the k
variables as a binary number, and outputs true iff that number is less than i.
It may not be immediately clear that this is possible, but youll see it’s not too hard: e.g., if k
were 4 and i were 11, we’d want all numbers at most
10 = 10102.
It’s easy enough to build this formula: the number can start with either 0 or 1. If the number
starts with 0 (the first bit is 0) then it’s at most 8, so we accept. If it starts with 1, then the
second bit must be 0, otherwise it would be larger than 12. We can continue with this process
to get
ψ10,4 = x1 ∨ (x1 ∧ x2 ∧ (x3 ∨ (x3 ∧ x4))).
The fromula ψ0,k would always be false, e.g., (x1 ∧ x1). With this construction, we can see
that the number of truth assignments satisfying ψi,k is exactly i. We’ll argue that we can do
this construction in linear time.
Let Fi be the Boolean formula (b ∧ ψi,k) ∨ (b ∧ φ), where:
• ψi,k is as above, but which uses k new variables — there’s no overlap with the variables in φ.
• φ is our input.
• b is a single Boolean variable not used in either φ or ψi,k.
We can see that Fi will have 2k + 1 variables, and so it will have 2
2k+1 truth assignments. If
MO indicates that exactly half of its assignments are satisfying (as per (*)), then the number of
satisfying truth assignments is 22k+1/2 = 22k.
Let’s denote the number of satisfying truth assignments to φ as c. Then, the number of satisfying
truth assignments to Fi are of one of two forms:
• b = 1 and ψi,k is true, or
• b = 0 and φ is false.
• If b = 1 and ψi,k is true, there is one way b can be set (b = 1), and there are i ways that the
variables in ψi,k can be set (this is how many numbers are between 0 and i).
In this case we don’t care how the variables in φ are set, so they can take all 2k possible
values.
So there are 1× i× 2k = i2k truth assignments of this form.
• If b = 0 and φ is true, there is one way b can be set (b = 0), and there are 2k − c ways that
the variables in φ can be set.
In this case we don’t care how the variables in ψi,k are set, so they can take all 2
k possible
values.
So there are 1× (2k − c)× 2k = 22k − c2k truth assignments of this form.
9
The two types of truth assignments are disjoint — you can’t have b = 0 and b = 1 — and so the
total number of truth assignments to Fi is 2
2k + (i− c)2k. Now,
• If i < c, (i− c) < 0, and so the number of sasisfying truth assignments to Fi is less than 22k
(less than half).
• If i > c, (i− c) > 0, and so the number of sasisfying truth assignments to Fi is more than 22k
(more than half).
• If i = c, (i − c) = 0, and so the number of sasisfying truth assignments to Fi is exactly 22k
(exactly half).
So here’s our algorithm:
V = “On input φ:
1. Let Found = false; Low = 0; High = 2k.
2. While not Found:
3. Let i = b(High + Low)/2c; Fi be as described above.
4. Run MO on Fi and ¬Fi.
5. If it indicates that exactly half of the truth assignments are accepting, return i.
6. If less than half of the truth assignments are accepting, set High = i.”
7. If more than half of the truth assignments are accepting, set Low = i+ 1.”
Since this is just a binary search on the is, it will terminate in O(log (2k)) = O(k) loop iterations,
and by the arguments aboce, when it does we will know that the number of truth assignments in
φ will be exactly i.
We can see that each loop iteration will take O(n) time, where n is the size of φ. Moreover, there
are O(k) = O(n) loop iterations. So the whole process takes O(n2) time.