COMP 2402 W24 Lab 5 Specifications
PQs & Graphs & The Programming Interview (& Lists & Sets & Maps!)
Prelab: due on brightspace by Friday March 22nd, 3:00pm (no lates)
Programming: due on gradescope by Wednesday March 27th, 3:00pm (24h late ok)
Postlab: due on brightspace.ca by Wednesday April 3rd, 3:00pm (no lates)
Topic focus: Lec 1 - 19
Changes from Labs 1-4
Lab 5 is structured slightly differently than Labs 1-4 as its focus is more on algorithms than on data structure implementation. While you will still need to make careful choices about which data structures to use based on how you want to store and manipulate data, the algorithms that use the data structures are non-trivial (not that previous labs were trivial! But these might “feel different” .)
Another focus of the back part of the course is on the skills needed for the “programming interview.” While there is only one lecture explicitly on the programming interview, the principles therein have been on display throughout this course. You should always be trying the problem out on examples, breaking the problems into manageable (and testable!) chunks, trying to find an exhaustive (just-get-it-done correct) solution, and improving upon that solution in increments. As someone who used to conduct programming interviews, I can say with certainty that I was looking for these problem solving skills in my candidates and would make a note when a step was missing. Moreover, these will be very important for working on the Lab 5 problems. They are very good practice for “the programming interview,” if I do say so myself (and I do 9).
Finally, most of the problems in this lab are throwbacks to problems from previous labs. This gives you an opportunity to review those problems, and if you weren’t able to get them the first time around, it gives you an opportunity and an excuse to go back to look over the debrief and sample solutions and get it the second time around.
There are still no hidden tests. In a sense, the autograder variability serves as a “hidden test” in that you sometimes have to look at your code and determine its complexity (gasp!) and decide whether it matches the desired complexity rather than relying on the occasionally flaky tests.
As an incentive to start this lab early, I will give +0.1 bonus points for each part that is completed (16/16) a week in advance (by Thursday, March 21 3pm.) See the pinned piazza post labeled “Bonus Point Opportunities” to see how bonus is computed in this course.
Lab Objectives
The labs in this course are meant to give you the opportunity to practice with the topics of this course in a way that is challenging yet also manageable. At times you may struggle and at others it may seem more straight-forward; keep trying and practicing and you will improve.
Specifically, Lab 5 aims to improve your
1. Implementation Skills
While there isn’t a specific data structure implementation on this lab, you will need to review the implementations of NAStrand (Labs 1 and 2) and PhyloTree (Lab 4), as well as AdjacencyLists (for Reconstruct).
2. Design Skills (i.e. Programming Interview Skills)
Find a “get it done” solution first, then make iterative improvements. Break your problem into modular sub-problems that can be solved, tested, and debugged individually.
3. Critical Thinking
Demonstrate a solid understanding of the pros and cons of all the data structures seen so far (including different implementations), especially the PriorityQueue, Graph, Map, and Set, and Deque. This includes considering the time- and space-complexity of various operations in different data structures.
4. Algorithmic Thinking
Apply algorithmic thinking to design efficient algorithms for a variety of problems that involve lists, sets, maps, priority queues, trees, and graphs, as well as iterative versions of recursive algorithms.
5. Error Handling
Implement appropriate error handling and boundary checks, when appropriate.
6. Testing
Determine the necessary tests to ensure your algorithms are correct and efficient.
7. Planning
Maintain or adopt good academic and programming habits through the pre-lab.
8. Reflection
Reflect on your choice of data structures and algorithms through the post-lab.
Assignment Components
Details of each component follow later in the specifications.
1. (6.7 points) Prelab (complete on brightspace)
2. (80 points) Programming portion (submit on gradescope.ca)
a. (16 points) PhyloT ree.likelyTree implementation
b. (16 points) Bond implementation
c. (16 points) DecodeA implementation
d. (16 points) DecodeB implementation
e. (16 points) Reconstruct implementation
3. (13.3 points) Postlab (complete on brightspace)
Grading Criteria & Submission Guidelines
See the Grading Criteria and Submission Guidelines of Lab 1; they are the same.
Collaboration & Academic Integrity
1. Individual work is expected. Any collaboration should be explicitly mentioned and acknowledged at the top of each file. It is okay to discuss high-level approaches with your peers, or low-level syntax-type questions, but you must construct your solution on your own (as in: you have to formulate the code of your solution on your own.) Consider the analogy of writing an essay. You might talk with a peer about the high-level concepts of your thesis, or you might ask them about grammar or even phrasing of individual sentences. But you should not be writing the essay sentence-by-sentence with someone else’s help; in the end you have to sit down and write that thing on your own.
2. Plagiarism will result in severe consequences. Ensure that all code and documentation are your own work. Do not send code to or receive code from any source except for course staff or the textbook, even if you change a thing here or there. It helps to keep the analogy of an essay in mind; it is not okay to take a paragraph from a friend and then rearrange some of the words or replace some with a thesaurus. It’s not even okay to paraphrase each sentence. It is not okay to send your essay to a peer. Similarly here, you cannot start with code that is not yours and then “make it your own” with minor edits. Automated tools for detecting plagiarism will be employed in this course.
3. The same restrictions apply to AI programmers (such as chatGPT, copilot). You can use them to help with basic syntax (e.g. “spelling” and “grammar”) or to understand broad concepts (e.g. getting feedback on a thesis) but you have to formulate your solution in code on your own (e.g. you have to write that essay yourself.)
4. Note that contract cheating sites are known, unauthorized, and regularly monitored. Some of these services employ misleading advertising practices and have a high risk of blackmail and extortion.
5. Every student should be familiar with the Carleton University student academic integrity policy. Academic integrity is upheld in this course to the best of Prof Alexa’s abilities, as it protects the students that put in the effort to work on coursework within the allowable parameters. Potential violations must be reported to the Dean of Academic Integrity. If you ever have questions about what is or is not allowable regarding academic integrity, please do not hesitate to reach out to course staff. We are happy to answer.
Copyright
Prof Alexa is the exclusive owner of copyright and intellectual property of all course materials, including the labs. You may use course materials for your own educational use. You may not reproduce or distribute course materials publicly for commercial purposes, or allow others to, without express written consent.
Workflow
In a perfect world, this is how you would complete Lab 5:
1. Attend or watch the relevant lectures that are listed in the heading of this document.
2. Read the lab objectives listed on the second page of this document. For each data structure listed there, review its important algorithms as well as the time- and space- complexity of its methods. This should give you some pros and cons for each.
3. Carefully read each problem detailed in the Programming section of this document.
a. Make sure you understand the problem.
b. Try the given examples by hand to get a better understanding of the problem.
c. Pay special attention to any special cases or edge cases.
d. Try more examples of your own devising if you need them.
e. Do not start programming yet!
f. Consider attending or watching the lab’s workshop video, posted on brightspace.
4. Once you have completed steps 1-3, you are ready to do the prelab (brightspace).
5. Complete the programming portion of the lab.
a. Take this one problem at a time. Any order should be okay.
b. Remember what you learned in Steps 1-4 as you brainstorm solutions.
c. Test locally at frequent intervals. Do not write a whole program then test it afterwards. See the section on Local Tests to help you here.
d. Submit to gradescope.ca whenever you have made good progress, but do not use gradescope.ca as your only tests. Gradescope keeps your most recent score unless you select a different submission to be active. See the section on the Gradescope Autograder to help you debug here.
e. If you’re stuck on a problem for more than 30 minutes, ask for help using the How to Get Help section. Move on to something else until help has arrived.
6. Once the late programming deadline has passed, complete the post-lab (brightspace).
a. If you did well on the programming portion, this is not meant to take too long.
b. There are resources available to help you posted on brightspace under the Lab 5
module. There is a solutions walk-through video for each part, and also a debrief document where I walk through the problem solving process and learning engagement I was hoping you would experience. You might consider looking at these before completing the prelab if you had trouble with any of the programming parts.
c. You do not have to have completed the programming parts in order to do the postlab. If you were stuck on a problem, this is an opportunity to look at the sample solution videos, to figure out what went wrong for you, and to still learn what you were meant to learn.
Coding Environment Setup
Lab 1’s Coding Environment Setup will work with Lab 5 (replace Lab1/l1 with Lab5/l5). The file structure and many of the files will be the same, with new and different files as well.
Programming Components
Programming Notes
1. For some problems, some of java’s data structures are available to you. Here is java’s official documentation for the relevant data structures:
● TreeSet
● TreeMap
● HashMap
● HashSet
● PriorityQueue
2. If you want to store elements in a java Collection but use their “reverse” sorted order, many offer a constructor that takes in a Comparator, and if you pass in Collections.reverseOrder() that will set the default Comparator to be the flip of what the default is. (e.g. any Collection of Comparables will offer this kind of constructor.)
3. The Programming Notes section of Lab 4 has information about writing your own Comparators or Comparable classes, if you decide you need them.
4. Note that you should not need recursion for any of the problems. You can/should run all tests with the -Xss144k flag (or, the smallest heap size your local machine will allow.) Sometimes you’ll pass such tests even with recursion.
5. You can turn a ch ar c into a String using “”+c or String.valueOf(c)
Practice [80 marks]
Bond (Bond and RNAStrand, Revisited) [16 marks]
Method Signature
public int bond(InputGenerator<Character> gen)
Method Behaviour & Notes
Returns the maximum number of bonds that an RNA sequence given by input sequence over A, C, G, U could form, where bonds have to satisfy the following 4 properties:
1. every character is bonded with at most one other character;
2. A bonds with U and C bonds with G, in either order (they are valid RNA pairs according to Lab 1’s isPair);
3. there are no “tight turns” in the bonds, i.e. if (i,j) are the indices of a bond then j-i>4
4. the bonds cannot “cross” , i.e. if (i,j) and (k,m) are indices of two bonds, then you cannot have i < k < j < m; see the examples below for a visual.
The maximum number of bonds in the sequence bi ,bi+1 ,...,bj-1 ,bj is given by the following recurrence:
bonds (i,j)=0 if j-i ≤ 4
bonds (i,j)=max{i≤t<j} [bonds (i,t-1)+bonds (t+1,j-1)+1] if isPair (bi ,bt ) bonds (i,j)=bonds (i,j-1) if !isPair (bi , bj )
We want to return bonds (0,n-1), where n is the length of the input sequence.
The recurrence above can be turned into a recursive method in a straightforward way; your job is to correctly implement it iteratively by unwinding the recursion (start with the base cases, i.e. short intervals of j-i then increase the interval length.)
You might consider first programming a get-it-done recursive solution, and then unwinding the recursion to an iterative solution.
Note that your only allowable import is RNAStrand; you will use (and submit) one of your RNAStrand/NAStrand implementations (from either Lab 1 or Lab 2) with the modification that it should be in package comp2402w24l5 (and you need fewer implemented methods). Once you figure out how you’re accessing your RNAStrand, you will want to make the choice that is best for your time and space complexity.
Desired Complexity
O(n3 ) time, where n is the number of bases generated.
O(n2 ) space.
Examples
(source)
Testing & Autograder
There are limited local tests in Bond.main and tests/BondTest.java; see Lab 1 for testing instructions. Submit Bond.java and your preferred RNAStrand.java and NAStrand.java to gradescope; see Lab 1 for submission instructions.
LikelyTree (PhyloTree, Revisited) [16 marks]
In this problem you will implement one more method in PhyloTree from Lab 4. In order to get this working you will first need working versions of addChild, the array-based PhyloT ree constructor, and compute Sets. If you did not get these working on Lab 4, now is your chance to look at the debrief document and solutions videos to get a working version!
Method Signature
public void likelyTree()
For this problem, you may assume each Node has a length-k String of DNABases (or is null).
(You can use the provided class field DNABases = {A,C,G,T} here.)
Method Behaviour & Notes
Initially, all the leaves will be non-null; the internal nodes may or may not be null.
Given the current tree, construct the most likely “family” tree for the given list of leaf nodes by overwriting the Strings at the internal nodes.
Do not use recursion.
Throws an IllegalArgumentException if any of the leaves are null.
Use the following algorithm to compute the most likely “family” tree for each index i:
for each node v, compute its set of possible bases for index i for each node v in pre-order (working from root downwards)
if v is the root
set v ’s index i t o be the smallest element in v ’s set
else if the v.parent ’s index i is in v ’s set
set v ’s index i t o be the same as v.parent ’s index i
else
set v ’s index i t o be the smallest element in v ’s set
Desired Complexity
O(nk2 ) time, where n is the number of nodes in the tree, and k>0 is the length of the non-null Strings in the tree.
O(1) space.
Examples
compute Sets
likelyTree [source]
Testing & Autograder
There are limited local tests in PhyloTree.main and tests/PhyloTreeTest.java; see Lab 1 for testing instructions. Submit PhyloTree.java to gradescope; see Lab 1 for submission
instructions. Note: I’ve left all the relevant tests from Lab 4 in the local tests and the autograder but you will need to add likelyTree to your Lab 4 code in order to pass the Lab 4 tests.
DecodeA [16 marks]
Method Signature
public static int decodeA(InputGenerator<Integer> gen, int k)
Method Behaviour & Notes
Returns an integer decoding of a given input sequence over positive Integers, produced as follows:
Out of the ≤k most recent distinct integers prior to the current one, consider the integer whose most recent occurrence is farthest back (i.e. its most recent index is smallest). Multiply this integer by the current index in the generated input, and add that to the output decoding.
You may assume k>0.
Note: First try to find an exhaustive (“get it done”) solution to the subproblem of computing the integer whose most recent occurrence is the farthest back. Then worry about computing the sum. Then worry about improving its time and space complexity.
Desired Complexity
O(n log d) time, where n is the number of integers generated and d=min{k, number
of distinct elements in the input}
O(d) space.
Examples
Testing & Autograder
There are limited local tests in DecodeA.main and tests/DecodeATest.java; see Lab 1 for testing instructions. Submit DecodeA.java to gradescope; see Lab 1 for submission instructions.
DecodeB [16 marks]
Method Signature
public static int decodeB(InputGenerator<Integer> gen, int k)
Method Behaviour & Notes
Returns an integer decoding of a given input sequence over positive Integers, produced as follows:
For each integer in the sequence, examine up to k integers just before it. That is, if there are less then k characters before the current character, consider the maximum number of characters in this interval. Consider the integer that occurs most frequently in this interval, multiply this frequency by the current index, and add that product to the decoding.
You may assume k>0.
I would recommend first writing an “exhaustive” (just get-it-done) solution without worrying about efficiency. Then, incrementally replace parts of your exhaustive program with the appropriate data structures that will help you solve parts of the problem faster.
Desired Complexity
O(n log k) time for all the glory and all the points. This is a challenge with some creativity. Otherwise, aim for < O(nk) for much glory and slightly fewer points. O(k) space.
Examples
Testing & Autograder
There are limited local tests in DecodeB.main and tests/DecodeBTest.java; see Lab 1 for testing instructions. Submit DecodeB.java to gradescope; see Lab 1 for submission instructions.