
APS106 2020S Final Project

 APS106 2020S Final Project Due date: April 18, 11:59 pm EDT

Crunching the Numbers on COVID-19
Preamble 
The Final Project focuses on the reading and manipulation of data on the worldwide spread of 
COVID-19. As engineers around the world, including at UofT, are developing treatments, 
building ventilators, redesigning supply chains, and repurposing production lines, students of 
APS106 will get an opportunity to access real data on COVID-19 cases, manipulate data into a 
form that can be visualized, and develop data structures that allow quick access to the data. We 
hope that this task will give you a sense of the power of coding to gather and display data and 
help in understanding important worldwide events.
Submit the Final Project via the MarkUs system by Saturday, April 18th, 11:59 pm EDT.
Each part of the project is graded based on the following criteria: 
• a “pre-project” document (called project_report.pdf) specifying an algorithm 
plan and a programming plan. As with the pre-labs, these plans should be created before 
you start coding and do not have to be exactly what you do. Marking is based on the 
extent to which you have demonstrated thought about the task and have a clear approach 
to attacking it – even if that approach is modified by what you actually do/code. See 
Details of the Pre-Project Document below.
• automated tests we run via MarkUs
• a set of tests that you create: you will submit tests for your code and accompanying 
documentation explaining what each test accomplishes. Grading is based on the 
comprehensiveness of your tests and the logic of your explanations. See A Note About 
Tests below.
• the quality of your code: clarity, simplicity, meaningful variable names, documentation, 
etc. 
Each part of this project has a different degree of difficulty; however, each is worth the same 
number of marks. See the individual “Marking Scheme” in each part below. 
An important additional difference from the labs: MarkUs will evaluate your code with tests that 
we write (just like in the labs). However, we are not providing those tests to you. We are 
providing several very simple MarkUs tests – see A Note About Tests below. It is up to you 
to create tests for your code.
The project problem comes with a package of files that includes: a data file (in csv format), a 
title page for your pre-project document, starter code (in project.py), and helper functions 
that we have written for you (in project_helper.py). Please unzip the 
project_problem.zip file. See Description of the Starter Package below.
IMPORTANT: DO NOT CHANGE ANY FILENAMES
Problem
To complete this project, you will build a program that reads in data on the geographic spread of 
COVID-19 over time and then filters the data to build data structures that can be used for 
visualization and efficient access. 
Part 1 – Parsing COVID-19 Data [15 Marks] 
In this part, you will complete the parse_covid19 function which reads a file containing data 
on the number of COVID-19 cases worldwide. The argument to the function is: filename, a 
string, the filename for a csv data file.
The csv file will have the following format:
Province_State,Country_Region,Last_Update,Confirmed,Deaths
,Afghanistan,2020-04-03,281,6
,Afghanistan,2020-04-04,299,7
Alberta,Canada,2020-04-03,969,13
Alberta,Canada,2020-04-04,1075,18
Ontario,Canada,2020-04-03,3255,67
Ontario,Canada,2020-04-04,3630,94
,Norway,2020-04-03,5370,59
,Norway,2020-04-04,5550,62
Notes:
• The file has a header row as shown above. 
• Each row of the file after the header contains the total number of confirmed cases and 
deaths for a given country and given date. There are multiple entries for the same country 
for different dates. There are also multiple entries for the same date and same country
when the data has been broken into regions inside the country. 
• Some of the rows may have an empty cell in the first column (e.g., the first row above 
starts with a comma, indicating that the first cell is empty).
• You can make no assumptions about the order of the data in the file (i.e., it is not 
necessarily alphabetical by country name nor in chronological order).
The parse_covid19 function returns a dictionary with the following format:
{country : [(date, num_cases), (date, num_cases), ...], ...}.
In terms of data types, the dictionary has the following format:
{str : [(str, int), (str, int), ...], ...}.
The keys are strings corresponding to country name and the values are a list of tuples where each 
tuple is:
• a string representing a date (in YYYY-MM-DD format). This is the Last_Update
date from the corresponding entry in the csv file.
• an integer which is the sum of the number of confirmed cases on that date for that 
country. If a country has multiple entries for a given date, those entries must be summed.
The tuples do not have to be in any particular order in the list.
Example:
{'Afghanistan': [('2020-04-03', 281), ('2020-04-04', 299)],
 'Canada': [('2020-04-03', 4224), ('2020-04-04', 4705)],
 'Norway': [('2020-04-03', 5370), ('2020-04-04', 5550)]}
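The summing rule can be illustrated on the sample rows above. The following is a minimal sketch under stated assumptions, not the required implementation: it reads the sample from an in-memory string rather than from a file, and the names SAMPLE_CSV and sum_cases_by_country are hypothetical.

```python
import csv
import io

# The sample file from Part 1, held in a string for illustration only.
SAMPLE_CSV = """Province_State,Country_Region,Last_Update,Confirmed,Deaths
,Afghanistan,2020-04-03,281,6
,Afghanistan,2020-04-04,299,7
Alberta,Canada,2020-04-03,969,13
Alberta,Canada,2020-04-04,1075,18
Ontario,Canada,2020-04-03,3255,67
Ontario,Canada,2020-04-04,3630,94
,Norway,2020-04-03,5370,59
,Norway,2020-04-04,5550,62
"""


def sum_cases_by_country(lines):
    """Return {country: [(date, total_confirmed), ...]} for an iterable of csv lines."""
    totals = {}  # {country: {date: running total of confirmed cases}}
    for row in csv.DictReader(lines):
        per_date = totals.setdefault(row["Country_Region"], {})
        date = row["Last_Update"]
        per_date[date] = per_date.get(date, 0) + int(row["Confirmed"])
    # Convert each inner dictionary into a list of (date, count) tuples.
    return {country: list(per_date.items()) for country, per_date in totals.items()}


result = sum_cases_by_country(io.StringIO(SAMPLE_CSV))
```

Here result['Canada'] is [('2020-04-03', 4224), ('2020-04-04', 4705)], because the Alberta and Ontario rows for each date are added together: 969 + 3255 = 4224 and 1075 + 3630 = 4705.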
Part 1 Marking Scheme
Component                        Value
Pre-project                      4 marks
Automated MarkUs Tests           8 marks
Student tests and explanations   1 mark
Code quality                     2 marks
Part 2 – Selecting Countries [15 Marks] 
In this part, you will complete the select_countries function which takes a list of 
countries and a dictionary like the one returned by parse_covid19 and returns another 
dictionary of the same format containing only entries for countries in the list.
The arguments to the function in the following order are:
• country_names: a list of strings. Names of some countries.
• covid19_data: a dictionary like the one returned by parse_covid19. It is not 
necessarily the case that all countries in country_names have entries in 
covid19_data. Just skip such countries. 
The select_countries function returns a dictionary with the same format as 
covid19_data.
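The expected behaviour, including the skipping of unknown countries, can be sketched as follows. This is one possible reading of the description above, not a reference solution; the function name select_countries_sketch and the country "Atlantis" are hypothetical and used only for illustration.

```python
def select_countries_sketch(country_names, covid19_data):
    """Return the entries of covid19_data whose key appears in country_names."""
    selected = {}
    for name in country_names:
        if name in covid19_data:  # names with no entry are skipped
            selected[name] = covid19_data[name]
    return selected


covid19_data = {
    "Afghanistan": [("2020-04-03", 281), ("2020-04-04", 299)],
    "Canada": [("2020-04-03", 4224), ("2020-04-04", 4705)],
    "Norway": [("2020-04-03", 5370), ("2020-04-04", 5550)],
}

# "Atlantis" has no entry in covid19_data, so it does not appear in the result.
subset = select_countries_sketch(["Norway", "Atlantis", "Canada"], covid19_data)
```

With this input, subset contains only the 'Norway' and 'Canada' entries, each with the same list of tuples as in covid19_data.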
Part 2 Marking Scheme
Component                        Value
Pre-project                      3 marks
Automated MarkUs Tests           9 marks
Student tests and explanations   1 mark
Code quality                     2 marks
SPECIAL VISUALIZATION: The output of select_countries can be passed to the 
display_growth() function that has been written for you (see project_helper.py) to 
graph the growth of COVID-19 over time in the selected countries.
Part 3 – COVID-19 Country Class [15 Marks] 
In this part, you will define a class called Covid_Country that will store all the data about 
COVID-19 cases for a specified country. 
An instance of Covid_Country will have two data attributes:
• country_name: the name of the country, a str
• daily_count: the daily number of confirmed cases for the country, a list of tuples of 
the same format as the dictionary values in Parts 1 and 2: [(date, num_cases), 
(date, num_cases), ...].
Covid_Country objects are created as follows:
>>> Norway = Covid_Country("Norway", covid19_data)
The constructor takes in a country name, as a string, and a dictionary, like the one generated by
parse_covid19. The data attribute country_name is assigned the string, and the 
daily_count attribute is set to the COVID-19 data (i.e., all daily confirmed cases) stored under 
country_name in the passed-in dictionary. Technically, daily_count will be an 
alias of the value in the dictionary (the same list object, not a copy). There are no additional or optional arguments to the 
constructor. You can assume that an entry for country_name will exist in the passed-in 
dictionary.
This class will also have a method day_count(date) that takes in a date, a string in the 
format YYYY-MM-DD, and returns the number of confirmed cases for that date for the country.
If the date does not exist for that country, the method should return None.
This is what a sample instantiation of Covid_Country and a call to the day_count()
method looks like with a dictionary created (in Part 1) from the sample csv file shown above:
>>> Norway = Covid_Country("Norway", covid19_data)
>>> print(Norway.country_name)
Norway
>>> print(Norway.daily_count)
[('2020-04-03', 5370), ('2020-04-04', 5550)]
>>> print(Norway.day_count('2020-04-04'))
5550
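The aliasing of daily_count and the None return value for a missing date can be checked with a small sketch. This is an illustrative reading of the description above, not a reference solution; the variable names is_alias and missing, and the date '2020-01-01', are assumptions made for the example.

```python
class Covid_Country:
    """Sketch of the Part 3 class, for illustration only."""

    def __init__(self, country_name, covid19_data):
        self.country_name = country_name
        # No copy is made: daily_count refers to the same list object
        # that is stored in the dictionary.
        self.daily_count = covid19_data[country_name]

    def day_count(self, date):
        """Return the confirmed cases on date, or None if date is absent."""
        for entry_date, num_cases in self.daily_count:
            if entry_date == date:
                return num_cases
        return None


covid19_data = {"Norway": [("2020-04-03", 5370), ("2020-04-04", 5550)]}
Norway = Covid_Country("Norway", covid19_data)

# True when daily_count is an alias of the dictionary value.
is_alias = Norway.daily_count is covid19_data["Norway"]

# A date with no entry for Norway in this sample.
missing = Norway.day_count("2020-01-01")
```

In this sketch is_alias is True and missing is None, so any later change to the list in covid19_data would also be visible through Norway.daily_count.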
Part 3 Marking Scheme
Component                        Value
Pre-project                      3 marks
Automated MarkUs Tests           9 marks
Student tests and explanations   1 mark
Code quality                     2 marks
Part 4 – Building a Binary Tree [15 Marks] 
In this part, you will create a class Split_Node and implement two methods: the constructor 
and build_tree. The tree will organize a list of Covid_Country objects (see Part 3) into 
a binary tree that can be quickly used to identify countries with approximately the same number 
of cases on a given date.
The Split_Node class will have the following data attributes:
• split_number: a float 
• countries: a list of Covid_Country objects 
• left: a Split_Node indicating the left child in the binary tree
• right: a Split_Node indicating the right child in the binary tree
The first method of Split_Node you will create is the constructor. There are no optional 
arguments to the constructor and the method should set all data attributes to appropriate default 
values that you choose. The constructor should enable a Split_Node object to be created with 
the following code:
>>> sp_node = Split_Node()
The second method is build_tree(country_list, date): country_list is a list
of Covid_Country objects and date is a string in YYYY-MM-DD format. You can assume 
that country_list will not be empty and that each element of country_list will have an 
entry for date.
The build_tree method does the following:
1. Calculates the mean number of cases across all countries in the list on date and assigns that 
value to split_number. This calculation must use the
Covid_Country.day_count() method.
2. Divides country_list into three lists:
a. List 1: Countries with a day_count() strictly less than 0.7 * split_number.
b. List 2: Countries with a day_count() strictly greater than 1.3 * split_number. 
c. List 3: The rest of the countries from country_list.
3. If List 1 is not empty, it should be used in a recursive call to build_tree on a new 
Split_Node assigned to the left data attribute. If List 1 is empty, nothing should be 
done with it.
4. If List 2 is not empty, it should be used in a recursive call to build_tree on a new 
Split_Node assigned to the right data attribute. If List 2 is empty, nothing should be 
done with it.
5. List 3 should be assigned to the countries data attribute of the current Split_Node.
List 3 could be empty, that is fine.
6. The return value from build_tree is None.
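Steps 1 and 2 can be worked through on the sample data from Part 1. The sketch below is a simplification made for illustration: it operates on a plain dictionary of counts instead of Covid_Country objects (the actual method must use Covid_Country.day_count()), it performs no recursion, and the function name three_way_split is hypothetical.

```python
def three_way_split(day_counts):
    """Split {country: count} around the mean, as in steps 1 and 2.

    Returns (split_number, list_1, list_2, list_3), where the lists hold
    country names.
    """
    split_number = sum(day_counts.values()) / len(day_counts)
    list_1, list_2, list_3 = [], [], []
    for country, count in day_counts.items():
        if count < 0.7 * split_number:
            list_1.append(country)  # would be passed to the left child
        elif count > 1.3 * split_number:
            list_2.append(country)  # would be passed to the right child
        else:
            list_3.append(country)  # stays in the current node
    return split_number, list_1, list_2, list_3


# Confirmed cases on 2020-04-04, taken from the Part 1 example.
counts = {"Afghanistan": 299, "Canada": 4705, "Norway": 5550}
split_number, list_1, list_2, list_3 = three_way_split(counts)
```

For these counts the mean is 3518.0, so the thresholds are about 2462.6 and 4573.4. Afghanistan falls in List 1, Canada and Norway fall in List 2, and List 3 is empty, which is the case step 5 allows for. Applying the same split to Canada and Norway alone gives a mean of 5127.5 and a range of 3589.25 to 6665.75, so both would end up in List 3 of the right child.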
Part 4 Marking Scheme
Component                        Value
Pre-project                      4 marks
Automated MarkUs Tests           8 marks
Student tests and explanations   1 mark
Code quality                     2 marks
SPECIAL VISUALIZATION: After running build_tree, a Split_Node object can be 
passed to the display_tree() function that has been written for you (see 
project_helper.py) to print out the groups of countries with similar counts. 
Details of the Pre-Project Document 
The pre-project document (like your “pre-labs”) should contain your algorithm plan and 
programming plan for each section of the project. Only one pre-project file can be submitted. 
The document must be in PDF format and typed (e.g., using Microsoft Word or other word 
processing software) – do not submit a picture of hand-written notes. Pictures of diagrams may 
be included: they must be embedded in the PDF and should have a small enough file size 
that the entire document meets the 5 MB requirement below.
Your document must start with the title page and honour statement found in 
APS106_FinalProject_TitlePage.docx in the starter package. Your name and student 
number must appear on every page of the document. See Description of the Starter Package
below.
Your entry for each part of the project should not exceed approximately 500 words. This is a 
guideline meant to reflect our expectations and is meant to stop you spending too much time on 
this part of the project. Longer plans are not necessarily better – don’t go crazy.
Appendix
As explained in the next section, the tests you submit should be written as code in the 
project.py file. However, (hint, hint) some of the tests (*cough* Part 1) will likely also 
consist of small csv files. Such files should be included (copy-pasted) in your pre-project 
document together with the written explanation of what the test is assessing. For each test, you 
should include a title (e.g., “Part 1 Test 1”), a short explanation of the test, the name of the csv 
file, and the csv file itself. The csv files should be as short as possible – only large enough to test 
the intended functionality.
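As an illustration of how small such a csv file can be, the sketch below writes a two-row test file that targets one behaviour only: summing two regions of the same country on the same date. The test title, the file name part1_test1.csv, and the helper write_test_csv are hypothetical; the file is written to a temporary directory so that nothing in the project folder is touched.

```python
import csv
import os
import tempfile

# Part 1 Test 1 (hypothetical): two regions of one country on the same date.
# Expected result from parse_covid19: {'Canada': [('2020-04-03', 4224)]}
HEADER = ["Province_State", "Country_Region", "Last_Update", "Confirmed", "Deaths"]
ROWS = [
    ["Alberta", "Canada", "2020-04-03", 969, 13],
    ["Ontario", "Canada", "2020-04-03", 3255, 67],
]


def write_test_csv(path, rows):
    """Write the header and the given rows to path in the Part 1 csv format."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows)


test_path = os.path.join(tempfile.mkdtemp(), "part1_test1.csv")
write_test_csv(test_path, ROWS)

# Read the file back so its contents can be pasted into the report.
with open(test_path, newline="") as f:
    contents = f.read()
```

The resulting file has three lines: the header, the Alberta row, and the Ontario row. The expected output is stated in the comment, since 969 + 3255 = 4224.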
The entire pre-project document must be less than 5 MB. 
MarkUs will not accept larger documents.
APS106 is not responsible for documents exceeding the 5 MB limit.
 
 
A Note About Tests
You must submit tests for each part of the project. These tests are in the form of Python code
together with specified input and expected output and a written description of what the test does 
and why you proposed it. With the exception of possible csv files (see above), the tests should 
only be in the form of Python code and comments. These tests should be written and called from 
the run_my_tests() function in the project.py file (see below). If necessary, you may 
create other test functions; however, each one should be called from run_my_tests(). 
The tests will be marked based on the extent to which they logically and comprehensively test 
the functionality of the code in each part of the project. Each part of the project specifies its 
marking scheme.
The tests do not have to automatically check if the output matches what is expected. While the 
input and expected output must be indicated in the test documentation, it is sufficient to manually 
compare the output to the expected output.
Unlike in your labs, we are not providing the MarkUs tests that will be used to evaluate your code. It is your 
responsibility to create sufficient tests to be confident in the correctness of your code.
We are, however, providing some very basic MarkUs tests. For Parts 1 and 2, the tests check:
• The submitted file contains a function with the expected name.
• The arguments to the function are correctly named and in the correct order.
For Parts 3 and 4, the tests check:
• The submitted file contains a class with the expected name.
• The class contains all expected methods.
• The arguments to the methods are correctly named and in the correct order.
These tests are not even close to sufficient for assessing if your code works. That is up to your 
tests. Our tests provide a “sanity check” to help you figure out if you are on the right track.
You will be able to run these tests in the same way you ran MarkUs tests for labs 2-9. That is, 
submit your code to MarkUs and then click on “Run Tests” under the automated testing tab.
After a few minutes, refresh the page and the results of the tests will be displayed.
 
Description of the Starter Package
Please download project_problem.zip from Quercus and unzip the file to access the 
starter package. Once unzipped, you will find a package that includes the following:
1. Data file (in csv format) 
This file contains information on the geographic spread of COVID-19 over time. This file 
(or any other csv file with the same structure as this file) will be read by
parse_covid19(). Please see Part 1 for a full description of the data file.
2. Starter code (in project.py)
This is where you will be writing your code for the final project. Please refer to the table 
below for a brief description of each component in the project.py file.
3. Helper functions that are already written for you (in project_helper.py)
This file includes two helper functions to allow you to visualize the output of your code. 
You do not need to modify this file. Please refer to the table for a description of the two 
helper functions provided.
4. The title page for your pre-project document 
(APS106_FinalProject_TitlePage.docx)
This file must be used as the first page in your pre-project document. Please fill it out 
with the required information, including your name and student number on each page.
Item                          Description
parse_covid19 function        Write this function to solve Part 1 of the project.
select_countries function     Write this function to solve Part 2 of the project.
Covid_Country class           Write this class to solve Part 3 of the project.
Split_Node class              Write this class to solve Part 4 of the project.
run_my_tests function         This function is provided to assist with testing your project. You must
                              include all your test cases in this function. See A Note About Tests
                              above for details.
run_tests flag                Set this flag to True to run your test cases in the run_my_tests
                              function. Set this flag to False if you do not want to run your test
                              cases.
run_visualization flag        As an optional activity, set this flag to True to visualize the output
                              from the select_countries function graphed over time. You will also see
                              a representation of your binary tree. To turn off the visualization, set
                              the flag to False.
if __name__ == "__main__":    You should not make any changes to the remainder of the project.py
                              file. This code is used to run your tests and output visualizations of
                              your project code.
display_growth function       This function is already written for you in project_helper.py. As an
                              optional activity, pass the output of select_countries into
                              display_growth() to view a graphical representation of the growth of
                              COVID-19 over time.
display_tree function         This function is already written for you in project_helper.py. As an
                              optional activity, pass a Split_Node object and date string into this
                              function to print your binary tree.
Table 1: Description of the starter and helper code.
Submitting Your Project 
The APS106 Final Project is due by April 18th 11:59PM (EDT) on MarkUs. The late penalty 
is –20% per 24-hour period past the deadline; any submissions considered incomplete are subject 
to the late penalty until all parts are submitted correctly. 
All students are encouraged to carefully check and then submit their projects prior to the 
deadline, with all required components. 
Please note that technical difficulty is not in itself a valid reason for late submissions: students 
are strongly encouraged to verify and submit their completed assignments early, rather than wait 
for the last minute. Issues such as overloading of the MarkUs system, exceeding the 5 MB size 
limitation, and intermittent outages are all part of the reason why we encourage students 
to submit early. 
Separately, the minimum 1-week completion time is in place not because this project 
will take a week to complete (it should NOT take the majority of students that long to finish), but 
to account for the challenges of moving this to an online system. More time 
doesn’t necessarily mean a better product: work smartly and you can do this in a few days at 
most. 
The APS106 team will be using advanced similarity-checking software on all submissions and 
will notify students as per university regulations if academic misconduct is detected. We have
issued, and will continue to issue, Academic Misconduct penalties as appropriate, especially with 
this Final Project. 
Your work for this project will be submitted as two files:
1. A Python file named project.py containing all the required functions and classes for 
each part of the project along with your tests and supporting documentation.
2. A PDF named project_report.pdf containing the pre-project document and 
appendix. The file size must be less than 5 MB. Larger files will not be accepted.
Both files should be submitted to the “Final_Project: Project” assignment on MarkUs. MarkUs 
will only accept submission of the two files listed above. After submission, you should review 
the submitted files listed under the “Submissions” tab and verify that both files correctly 
uploaded to MarkUs.
IMPORTANT: Do not change any file, function, class, or method names. Do not include any 
input() or print() statements in the submission of your project.py file.
Academic Integrity
The project is “open book” meaning that students are allowed to use a Python IDE (e.g. Wing 
101), all course material (lecture notes and videos, labs, textbook), and other offline and online 
resources with the following restrictions: 
• students are not to ask questions or otherwise consult any other person (either in the 
course or not in the course) other than the course instructors via the APS106 piazza site;
• students should not post answers to Piazza or any other bulletin board site on any topic 
that can reasonably be understood to be relevant to the final project;
• students are not to submit work not wholly created by the student;
• students should not copy code from anywhere;
• students are not to collaborate in any way – this is an individual assignment. 
Doing any of the above is an academic offense. We will be using tools to detect such offenses. 
Submission of your final assessment package constitutes agreement with the following 
statement.
In submitting this assessment, I confirm that my conduct adheres to the Code of 
Behaviour on Academic Matters. I confirm that I did not act in such a way that would 
constitute cheating, misrepresentation, or unfairness, including but not limited to, using 
unauthorized aids and assistance, impersonating another person, and committing 
plagiarism. I pledge upon my honour that I have not violated the Faculty of Applied 
Science & Engineering’s Honour Code during this assessment.
Given the difficult and unusual situation in which this final assessment is being administered, 
any academic offences will be pursued to the full extent of University regulations. Note that it is 
standard FASE policy that students who are found to have committed an academic offense are 
not allowed to subsequently drop a course. Please think about this given the CR/NCR and late 
drop policy that the FASE has adopted for the 2020S semester.