2020/5/22 COMP2011 Assignment 4: COVID-19 Outbreak Data Analysis 2.0 
https://course.cse.ust.hk/comp2011/assignments/assignment4/ 1/4 
Introduction 
This programming assignment is an extension of the COVID topic from PA3. It aims at practicing the use of classes and data I/O operations. Your tasks involve 
implementing two classes and their member functions, reading data from CSV files with dynamic allocation, doing a bit of processing on the data using class member 
functions, and writing the output to CSV files. 
Background and data 
The background and motivation of this task are the same as those of PA3. You have collected some raw data, including numbers of confirmed cases and deaths. To 
help analyze the COVID-19 pandemic, you want to know more information, like morbidity and mortality rate (deaths / total cases). 
The data is created and maintained by Johns Hopkins University Center for Systems Science and Engineering. We are only using it for educational purposes. You may 
find the original data in its online source on GitHub. In the assignment, we prepare one CSV file for you. This file is generated by merging two raw files, 
time_series_covid19_confirmed_global.csv and time_series_covid19_deaths_global.csv. It records the time series data of confirmed cases and deaths from 22 Jan 
2020 to 5 May 2020. Furthermore, we add area and population information for each country (or region). We disclaim all representations and warranties with respect 
to the data, including accuracy, fitness for use, and merchantability. 
Overview 
Workflow 
The procedure is as follows. First, read data from a CSV file. We provide two different source files for you to experiment with: pa4_data.csv is for final submission, 
while pa4_data_short.csv is only for testing. Second, process the time series data. The processing includes computing rates of confirmed cases and deaths 
normalized by area (per 1,000 km^2), rates of confirmed cases and deaths normalized by population (per 1,000,000 people), and the mortality rate for each country (or 
region). To normalize, you simply divide by the provided population or area, respectively. Third, store the results of each computed statistic in a separate CSV file. 
Please see main.cpp and the sample output for specific details. 
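As a worked example of the normalization step, the sketch below divides a raw count by the population expressed in millions, or by the area expressed in thousands of km^2. The function names ratePerMillion and ratePerThousandKm2 are hypothetical and not part of the skeleton, and reading "per 1,000,000 people" as count / (population / 1,000,000) is an assumption based on the description above.

```cpp
// Hypothetical helpers, not part of todo.h: they only illustrate the arithmetic.

// Rate per 1,000,000 people: divide by the population expressed in millions.
double ratePerMillion(double count, double population) {
    return count / (population / 1000000.0);
}

// Rate per 1,000 km^2: divide by the area expressed in thousands of km^2.
double ratePerThousandKm2(double count, double area) {
    return count / (area / 1000.0);
}
```

For instance, 500 cases among 2,000,000 people give 250 cases per million, and 30 deaths over 1,500 km^2 give 20 deaths per 1,000 km^2.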
Task 1: Read CSV 
You are given a CSV file in the following format: 
name,population,area,cases1,deaths1,cases2,deaths2,cases3,deaths3,... 
Each line records the information of one country (or region). It includes the population (people), the area (km^2), and the numbers of cases and deaths per 
day starting from January 22, 2020. You need to implement the following function to read this CSV file. 
int readcsv(Region*& region, const char* csvFileName); 
In this function, an array of Region will be constructed to store all of the information read from the CSV file. The first parameter region denotes this array, passed 
by reference. csvFileName is the path to the CSV file. The return value is the length of this array, i.e. the number of recorded countries (or regions), which will be used in 
Task 3 later. 
To complete this task, you simply have to read in each line and pass it to the readline() function of the Region class, which you will implement as described in more 
detail in Task 2 below. That function takes a line that you read from the CSV file and populates the data members of that Region instance with the information from 
the line. So, please read Task 2 before starting, to understand how that function is called. For better understanding, we provide the pseudocode: 
readcsv { 
    load the CSV file; 
    allocate for an array of Region; 
    for each line in the CSV file: 
        readline(); 
} 
You should keep all data read from the file, i.e., you do not need to remove any days that have zero cases like PA3. 
You may assume the maximum length of one line is 2048. 
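One possible way to size the Region array before parsing is to make a first pass over the file that only counts lines. The sketch below is an assumption about how this could be done, not the prescribed method; countLines and MAX_LINE are hypothetical names. It uses the stated limit of 2048 characters per line plus one extra char for the terminating null character.

```cpp
#include <fstream>

const int MAX_LINE = 2048;  // maximum line length stated in the task

// Hypothetical helper: counts the non-empty lines of a text file so that an
// array of the right size can be allocated before the lines are parsed.
// Returns 0 if the file cannot be opened.
int countLines(const char* path) {
    std::ifstream fin(path);
    if (!fin) return 0;
    char line[MAX_LINE + 1];  // +1 leaves room for the terminating '\0'
    int n = 0;
    while (fin.getline(line, MAX_LINE + 1)) {
        if (line[0] != '\0') ++n;  // skip a blank line, e.g. at the end
    }
    return n;
}
```

A second pass can then reopen the file, read each line into the same kind of buffer, and hand it to readline() of the corresponding array element.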
Task 2: Implement the Class functions 
You need to implement the functions of two classes, DayStat and Region. 
DayStat keeps a pair of cases and deaths statistics for a given day. Its definition is shown below: 
class DayStat 
{ 
private: 
    double cases, deaths; 
public: 
    DayStat(); 
    DayStat(int _cases, int _deaths); 
    DayStat(const DayStat d, int denominator); 
    double mortalityRate() const; 
    double getcases() const; 
    double getdeaths() const; 
}; 
DayStat stores the numbers (or rates) of confirmed cases and deaths of one day in its private member variables cases and deaths. It has three 
constructors. The first one is the default constructor and initializes both member variables to 0. The second one sets the member variables from the 
passed parameters. The third one takes another DayStat and a denominator as input, computes rates normalized by the given denominator, and stores them in its 
member variables. Hints: The second constructor is used when reading raw data. The third constructor is used for computing rates given raw data 
and one denominator. 
mortalityRate() computes the mortality rate (the percentage of cases that resulted in death) from the two member variables. If there are no cases, it should return 
zero directly to avoid a division by zero error. Its minimum value should be zero, and the maximum should be 100. Finally, getcases() and getdeaths() are 
used to access the values of cases and deaths, respectively, because these are private members that cannot be accessed outside the class. 
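The rule for the mortality rate can be illustrated with a small free function. This is only a sketch under the description above: mortalityPercent is a hypothetical name, and the real DayStat::mortalityRate() is a member function that takes no arguments and reads cases and deaths from the object.

```cpp
// Hypothetical free function mirroring what DayStat::mortalityRate() is
// described to do; the real member function takes no arguments.
double mortalityPercent(double cases, double deaths) {
    if (cases == 0) return 0.0;            // no cases: avoid dividing by zero
    double rate = 100.0 * deaths / cases;  // percentage of cases that died
    if (rate < 0.0) rate = 0.0;            // clamp to the documented range
    if (rate > 100.0) rate = 100.0;
    return rate;
}
```

With 200 cases and 10 deaths the result is 5 (percent), and with zero cases it is 0 regardless of the death count.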
The definition of Region is shown below. It stores the time series data of one country (or region). 
class Region 
{ 
private: 
    DayStat *raw; 
    char *name; 
    int population; 
    int area; 
    int nday; 

    DayStat *normPop; 
    DayStat *normArea; 
    double *mortality; 

public: 
    Region(); 
    ~Region(); 
    void readline(char *line); 
    void normalizeByPopulation(); 
    void normalizeByArea(); 
    void computeMortalityRate(); 
    void write(Stat stat) const; 
}; 
We use DayStat arrays to store time series data. The input "raw" data is stored in the raw dynamic array which you should create when reading from file. name, 
population, and area represent the name, the population, and the area of this country (or region). nday is the number of recorded days, i.e. the length of 
DayStat arrays. The rates normalized by population and area are stored as normPop and normArea, respectively. The array mortality stores the time series 
data of mortality. 
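Because every line has the layout name,population,area followed by one cases/deaths pair per day, the value of nday can be derived from the number of fields in a line. The helpers below are a hypothetical sketch of that arithmetic (countFields and dayCount are not part of the skeleton), and they assume that no field contains a comma of its own.

```cpp
// Hypothetical helpers; they assume no field contains a quoted comma.

// Counts the comma-separated fields in one CSV line.
int countFields(const char* line) {
    if (line[0] == '\0') return 0;
    int n = 1;
    for (const char* p = line; *p != '\0'; ++p) {
        if (*p == ',') ++n;
    }
    return n;
}

// name, population, and area take three fields; every recorded day then
// contributes one cases field and one deaths field.
int dayCount(const char* line) {
    return (countFields(line) - 3) / 2;
}
```

For the made-up line "A,100,50,3,0,5,1" there are 7 fields, so dayCount() yields 2, which would be the length to allocate for the raw array.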
Region() is the default constructor, which initializes members to 0 or nullptr where appropriate. The destructor should release the memory. The 
function readline() takes one char array as input and fills in the first five member variables. Note that you must dynamically allocate the raw 
array at this point. normalizeByPopulation() computes rates normalized by population (per 1,000,000 people) and stores them in normPop. 
normalizeByArea() computes rates normalized by area (per 1,000 km^2) and stores them in normArea. computeMortalityRate() computes mortality rates and 
stores them in mortality. Finally, the last member function write() saves time series data into CSV files (please see Task 3 before implementing it so that you 
have an idea of how it will be used). 
Please see todo.h and sample output for examples to check your understanding. 
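The constructor and destructor requirements follow a common ownership pattern: start with null pointers, allocate once the length is known, and release with delete[] in the destructor. The stand-in class below is a sketch of that pattern only; Series and its members are hypothetical and deliberately simpler than Region.

```cpp
// Stand-in class, not the assignment's Region: it only illustrates the
// nullptr-initialise / allocate-later / delete[] pattern described above.
class Series {
private:
    double* values;  // dynamic array, allocated once the length is known
    int n;
public:
    Series() : values(nullptr), n(0) {}
    ~Series() { delete[] values; }  // delete[] on nullptr is a safe no-op

    // Copying is disabled in this sketch so that two objects never own
    // (and later delete) the same array.
    Series(const Series&) = delete;
    Series& operator=(const Series&) = delete;

    // Replaces the array with `count` zero-initialised values.
    void allocate(int count) {
        delete[] values;
        values = new double[count]();
        n = count;
    }

    int size() const { return n; }
    double at(int i) const { return values[i]; }
};
```

Initialising every pointer to nullptr is what makes the destructor safe for objects whose arrays were never allocated, for example elements of a Region array that were default-constructed but not yet filled.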
Task 3: Write CSV 
Finally, you need to implement this function shown below. 
void writecsvs(const Region* region, int nRegions); 
The parameter region refers to the array of Region, each element of which contains the information of one country (or region). The parameter nRegions is the 
length of this array, i.e. the number of recorded countries (or regions). The value of nRegions comes from Task 1. 
Please note that, just like readcsv(), writecsvs() mainly implements the workflow that processes each element in the array region. As the member variables of Region 
are private, we can only access them in the member functions of Region. Therefore, the I/O operation is actually performed in write(). 
For better understanding, we provide the pseudocode: 
writecsvs { 
    for each Region in the array: 
        write(); 
} 
You need to generate 7 CSV files in total, which contain the numbers of confirmed cases (raw data), numbers of deaths (raw data), rates of confirmed cases (normalized 
by population), rates of deaths (normalized by population), rates of confirmed cases (normalized by area), rates of deaths (normalized by area), and mortality rate, 
respectively. The names of these 7 CSV files are listed in the following enum. 
enum Stat {CASESRAW, DEATHSRAW, CASESPOP, DEATHSPOP, CASESAREA, DEATHSAREA, MORTALITY}; 
Note: Please pay attention to the format of the numbers in your output. You should make sure that the numbers have 6 digits after the decimal point (e.g. 0 and 123456e-6 are 
not in the right format; they should be 0.000000 and 0.123456). Please check whether there are inf or nan values in your outputs before you submit them. 
You may call fstream::precision() to set the precision in CSV files. Please make sure the precision in your outputs is 6 (the default value). 
You may call fstream::setf(ios::fixed) to control the format of output numbers. Please avoid scientific notation in your outputs. 
When writing to CSV files, you may need to append new lines to already existing files. You can use ios::app to control the writing mode. That is, when you open the file 
in this way, ofstream fout("***.csv", ios::out|ios::app), you can write new lines to the file without modifying the existing content: the new lines are appended 
at the end of the existing content. 
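The formatting and append-mode notes can be tried out in isolation. The sketch below is illustrative only: formatFixed6 and appendLine are hypothetical helpers, and it uses <sstream> and <string> for easy inspection, which the assignment rules may not permit in todo.cpp. The same setf() and precision() calls can be made on an ofstream.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value in fixed notation with precision 6,
// e.g. 0 becomes "0.000000".
std::string formatFixed6(double value) {
    std::ostringstream out;
    out.setf(std::ios::fixed, std::ios::floatfield);  // never scientific
    out.precision(6);
    out << value;
    return out.str();
}

// Hypothetical helper: appends one line to a file, keeping earlier content.
void appendLine(const char* path, const std::string& text) {
    std::ofstream fout(path, std::ios::out | std::ios::app);
    fout << text << '\n';
}
```

Because ios::app never truncates, running the program twice keeps adding lines to the same files, which is why clearing old output files at program start (see the FAQ) is convenient.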
Skeleton 
Note: There was a typo in the previous version. In todo.cpp, line 16, the type of the second parameter should be changed from int to double. We have 
updated the link below. 
The skeleton code can be downloaded here. Unzip it, and add all of the files to your Eclipse project. The CSV file should be placed in the same folder as the source 
files. 
To reduce your workload, we have completed and provided main.cpp and todo.h. main.cpp contains the main procedure of the program, while todo.h contains the 
definitions of the two classes and two functions. Please do not modify these two files. You only need to finish the implementation of these two classes as well as 
some helper functions in todo.cpp. 
Prepared by: Yishuo ZHANG 
Notes 
You are NOT allowed to include any additional library. 
You are NOT allowed to use any global variable nor static variable like "static int x". 
You may use any library function that can be used without including additional libraries. (note: please make sure it is the case in our Windows Eclipse). 
You may create and use your own helper functions defined in todo.cpp itself. 
You can assume all input is valid and formatted exactly as described. 
No checking or validation of parameters is needed in general, unless specifically required in the description. 
For any dynamic char array, you should make it just big enough to contain all the character content as well as the terminating null character \0 at the end. 
For example, to store "Hong Kong", you should allocate a char array of size 10. 
You should NOT assume the maximum number of characters of any field in the CSV file in your implementations, as we want to test your ability to adapt to 
using variable-sized dynamic arrays, although for file I/O simplicity we did assume the maximum line length is 2048 in the readcsv function. 
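The "just big enough" rule for dynamic char arrays can be met by measuring the field first and allocating afterwards. The helper below is a minimal sketch with a hypothetical name, copyFirstField; it uses no library functions and assumes the field ends at the first comma or at the end of the line.

```cpp
// Hypothetical helper: copies the text before the first comma into a new
// array of exactly length + 1 chars. The caller must delete[] the result.
char* copyFirstField(const char* line) {
    int len = 0;
    while (line[len] != '\0' && line[len] != ',') ++len;  // measure first
    char* field = new char[len + 1];                      // +1 for '\0'
    for (int i = 0; i < len; ++i) field[i] = line[i];
    field[len] = '\0';
    return field;
}
```

For a line that starts with "Hong Kong," the measured length is 9, so the allocated array has size 10, matching the example in the notes.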
Sample output 
You may download/read the generated console output and csv files. The zipped file contains 7 CSV files. You may want to use some difference checker such as 
this to compare your output with the sample output since the sample output is pretty long. 
We have also prepared a simpler test case for you to check your program. You may download the input CSV file pa4_data_short.csv, put that in your project 
folder, and uncomment a line in main.cpp to test using it. It should produce this sample console output and this output.zip. 
Please note that the sample output, naturally, does not show all possible cases. It is part of the assessment for you to design your own test cases to test your 
program. Be reminded to remove any debugging messages that you might have added before submitting your code. 
Please note that there is a blank line at the end of each CSV file. 
Data Visualization 
This part is just for fun. :) 
Now that you have the processed data in your output csv files, you may copy its content (you may read the content in the csv file in Eclipse or a notepad 
software), and paste it below to visualize the data after all the hard work. Have fun! 
Submission 
Deadline 
23:59:00, May 24 (Sun), 2020 
Covid Trend Visualizer 
Paste any of your seven output CSV files into the text field below and click update to visualize your own results. To avoid displaying a cluttered graph, you may want to 
only copy the lines for a few specific regions. As an example, below are lines for Mainland China, Italy, Spain, and United States. 
Note that this visualizer is just for you to explore your results and it is optional. No robust error checking on the CSV data has been implemented. Make sure the data is 
in the correct format with one region per line and with no extra lines. In case of a data input error when you click the update button, reload the page before trying 
again. 
Once the animation is complete, you may move your mouse over the large number label to move back and forth in the timeline. 
[Interactive chart: horizontal axis "day" (ticks 5 to 120), vertical axis ticks from 10 to 2M, curves labelled US, Spain, Italy, and China.] 
China,548.000000,643.000000,920.000000,1406.000000,2075.000000,2877.000000,5509.000000,6087.000000,8141.000000,9802.000000,11891.000000,16630.000000,19716.000000,23707.000000,27440.000000,30587.000000,34110.000000,36814.000000,39829.000000,42354.000000,44386.000000,44759.000000,59895.000000,66358.000000,68413.000000,70513.000000,72434.000000,74211.000000,74619.000000,75077.000000,75550.000000,77001.000000,77022.000000,77241.000000,77754.000000,78166.000000,78600.000000,78928.000000,79356.000000,79932.000000,80136.000000,80261.000000,80386.000000,80537.000000,80690.000000,80770.000000,80823.000000,80860.000000,80887.000000,80921.000000,80932.000000,80945.000000,80977.000000,81003.000000,81033.000000,81058.000000,81102.000000,81156.000000,81250.000000,81305.000000,81435.000000,81498.000000,81591.000000,81661.000000,81782.000000,81897.000000,81999.000000,82122.000000,82198.000000,82279.000000,82361.000000,82432.000000,82511.000000,82543.000000,82602.000000,82665.000000,82718.000000,82809.000000,82883.000000,82941.000000,83014.000000,83134.000000,83213.000000,83306.000000,83356.000000,83403.000000,83760.000000,83787.000000,83805.000000,83817.000000,83853.000000,83868.000000,83884.000000,
Canvas Submission 
1. Submit only the file "todo.cpp" through CANVAS to "Assignment 4". The filename has to be exactly the same and you should NOT zip/compress 
the file. Canvas may append a number to the filename of the file you submit, e.g., todo-1.cpp. It is OK as long as you have named your file 
todo.cpp when you submit it. 
2. Make sure your source file can be successfully compiled. If we cannot even compile your source file, your work will not be graded. Therefore, you should at 
least put in dummy implementations to the parts that you cannot finish so that there will be no compilation error. 
3. Make sure you actually upload the correct version of your source files - we only grade what you upload. Some students in the past submitted an 
empty file or a wrong file or an exe file which is worth zero mark. So you should download and double-check the file you submit. 
4. You may submit your file multiple times, but only the latest version will be graded. 
5. Submit early to avoid any last-minute problem. Only canvas submissions will be accepted. 
Compilation Requirement 
It is required that your submissions can be compiled and run successfully in our official Windows Eclipse which can be downloaded from the "Using Eclipse at 
home (Windows)" section here. You need to unzip the zip file first, and then run the Eclipse program extracted from it. Do not just double-click the zip file. If you 
have used other IDE/compiler/OS (including macOS Eclipse) to work out your solution, you should test your program in the aforementioned official environment 
before submission. This version of Eclipse is also installed on our lab machines. 
If you have no access to a standard Windows machine, you may remotely control a Windows machine in the HKUST virtual barn. Choose the "Programming Software" 
server, and you will find an Eclipse shortcut on the desktop. This is a newer version of Eclipse with a slightly different interface. However, its compiler has the same 
version as ours and can be used to verify if your program can be compiled by us for grading. In particular, to create a new C++ project, you can do so from "File" 
menu -> "New" -> "Project..." -> "C++ project" -> Type a project name -> Choose MinGW compiler -> "Finish". Also, to build the project, save your files, then use 
the Hammer and Run buttons, circled in the following screenshot (NOT the ones on the left): 
Late Submission Policy 
There will be a penalty of -1 point (out of a maximum 100 points) for every minute you are late. For instance, since the deadline of assignment 3 is 23:59:00 on 
May 12, if you submit your solution at 1:00:00 on May 13, there will be a penalty of -61 points for your assignment. However, the lowest grade you may get from an 
assignment is zero: any negative score after the deduction due to late penalty will be reset to zero. In addition, all submissions will be subject to our plagiarism 
detection software, and you will get the cheating penalty as described in the Honor Code if you cheat and get caught. 
Checking for memory leak 
When we grade your work, we will use valgrind to check for memory leaks in your program. Some marks will be deducted if a memory leak is detected. 
Checking for memory leaks yourself is optional. Please refer to the same section on the PA3 webpage. 
FAQ 
Q: My code doesn't work / there is an error, here is the code, can you help me to fix it? 
A: As the assignment is a major course assessment, to be fair, you are supposed to work on it on your own and we should not finish the tasks for you. We 
might provide some very general hints to you, but we shall not fix the problem or debug for you. 
Q: Do I need to clear already existing CSV files at the start of my program? 
A: For the convenience of yourself and us, you are recommended to do it. But it doesn't matter to the grade. 
Q: I get some numbers whose last digits are slightly different from the sample results. How can I handle this? 
A: This may be caused by a different compilation environment. Please run your program on the virtual barn. 