Assignment 1: Web Scraping & Data Analysis
Oct 07, 2023
In this assignment, you should work with data from http://quotes.toscrape.com/
(Online Quotes)
Quotes to Scrape is a digital haven for those seeking a daily dose of inspiration, wit, and
wisdom. This website is a treasure trove of thought-provoking and memorable quotes from
a wide range of authors, thinkers, and personalities. Whether you're looking for words of
wisdom to brighten your day or seeking quotes to add depth to your writing, Quotes to
Scrape has you covered.
At Quotes to Scrape, you'll find an extensive collection of quotes that cover a multitude of
topics, from love and life to success and motivation. Each quote is accompanied by the
name of the author, allowing you to explore the perspectives of various individuals who
have contributed to the world of literature and philosophy. In this project, we will use the
requests library and the regular expressions covered earlier to scrape all quotes from the
provided website.
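The approach named above (requests plus regular expressions) could be sketched as follows. This is a minimal sketch, not a reference solution: the HTML sample, the class names used in the patterns, the "next" pagination link, and the helper names parse_quotes and scrape_all are all assumptions about the page markup. Inspect the real page source and adjust the patterns before relying on them.

```python
import html
import re

# Hypothetical sample modelled on the ASSUMED page markup; the real site may differ.
SAMPLE_HTML = '''
<div class="quote">
    <span class="text" itemprop="text">It&#39;s a sample quote.</span>
    <span>by <small class="author" itemprop="author">Jane Doe</small>
    <a href="/author/Jane-Doe">(about)</a>
    </span>
    <div class="tags">
        <a class="tag" href="/tag/life/page/1/">life</a>
        <a class="tag" href="/tag/humor/page/1/">humor</a>
    </div>
</div>
'''

# Assumed structure: each quote is a div whose last child is the tags div.
QUOTE_BLOCK = re.compile(r'<div class="quote".*?</div>\s*</div>', re.S)
TEXT = re.compile(r'<span class="text"[^>]*>(.*?)</span>', re.S)
AUTHOR = re.compile(r'<small class="author"[^>]*>(.*?)</small>', re.S)
ABOUT = re.compile(r'<a href="(/author/[^"]+)"')
TAG = re.compile(r'<a class="tag"[^>]*>(.*?)</a>', re.S)
NEXT = re.compile(r'<li class="next">\s*<a href="([^"]+)"')


def parse_quotes(page_html):
    """Return one dict per quote block found in a page of HTML."""
    rows = []
    for block in QUOTE_BLOCK.findall(page_html):
        text, author = TEXT.search(block), AUTHOR.search(block)
        if not (text and author):
            continue  # skip blocks that do not match the assumed markup
        about = ABOUT.search(block)
        rows.append({
            "quote": html.unescape(text.group(1)).strip(),
            "author": html.unescape(author.group(1)).strip(),
            "author_url": about.group(1) if about else "",
            "tags": [html.unescape(t).strip() for t in TAG.findall(block)],
        })
    return rows


def scrape_all(base_url="http://quotes.toscrape.com"):
    """Follow the assumed 'next' links until no further page is found."""
    import requests  # third-party; imported here so parse_quotes works offline

    path, rows = "/", []
    while path:
        response = requests.get(base_url + path, timeout=10)
        response.raise_for_status()
        rows.extend(parse_quotes(response.text))
        nxt = NEXT.search(response.text)
        path = nxt.group(1) if nxt else None
    return rows


print(parse_quotes(SAMPLE_HTML))
```

Keeping the parsing in its own function means the patterns can be checked against a saved page before any network request is made.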
Task 1. You are required to scrape all 100 quotes from the website and save the result
to ‘your_name+id.csv’. This file should contain the following columns: (40
marks)
Quotes Content 5 marks
Tags 5 marks
Author 5 marks
Birthday of the author 5 marks
Country of the author 5 marks
Description of the author 5 marks
Quality of the code (10 marks)
You are encouraged to explore data with more properties if needed.
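The author columns listed above (birthday, country, description) are not part of the quote itself, so they would presumably come from each author's own page. The sketch below shows one way to extract them with regular expressions and write rows with the required columns. The sample HTML, the class names, the column header names, the rule that the country is the text after the last comma of the birthplace, and the helpers parse_author and write_csv are all assumptions for illustration.

```python
import csv
import html
import io
import re

# Hypothetical sample modelled on the ASSUMED author-page markup.
SAMPLE_AUTHOR_HTML = '''
<div class="author-details">
    <h3 class="author-title">Jane Doe</h3>
    <p><strong>Born:</strong> <span class="author-born-date">January 01, 1900</span>
    <span class="author-born-location">in Springfield, Exampleland</span></p>
    <div class="author-description">
        Jane Doe was a hypothetical writer.
    </div>
</div>
'''

BORN_DATE = re.compile(r'<span class="author-born-date">(.*?)</span>', re.S)
BORN_PLACE = re.compile(r'<span class="author-born-location">(.*?)</span>', re.S)
DESCRIPTION = re.compile(r'<div class="author-description">(.*?)</div>', re.S)

# Assumed header names for the columns the task asks for.
COLUMNS = ["Quote", "Tags", "Author", "Birthday", "Country", "Description"]


def parse_author(page_html):
    """Extract birthday, country and description from an author page."""
    def first(pattern):
        match = pattern.search(page_html)
        return html.unescape(match.group(1)).strip() if match else ""

    # Drop the leading "in " and keep the part after the last comma.
    place = re.sub(r"^in\s+", "", first(BORN_PLACE))
    country = place.rsplit(",", 1)[-1].strip()
    return {
        "Birthday": first(BORN_DATE),
        "Country": country,
        "Description": first(DESCRIPTION),
    }


def write_csv(rows, fileobj):
    """Write rows (dicts keyed by COLUMNS) to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


info = parse_author(SAMPLE_AUTHOR_HTML)
row = {"Quote": "A sample quote.", "Tags": "life, humor", "Author": "Jane Doe", **info}
buffer = io.StringIO()  # stands in for the real output file
write_csv([row], buffer)
print(buffer.getvalue())
```

For the real submission, the same function could be called with a file opened as open("your_name+id.csv", "w", newline="", encoding="utf-8"). Caching the parsed details per author avoids fetching the same author page once per quote.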
Task 2. You are required to perform a data analysis on the scraped data. What do you think is
interesting about this data? Tell a story about something interesting you have discovered by
looking at the data. (60 marks)
For example, which quotes might you read? Does the type of tag affect which quotes you might
read? Who is your favorite author?
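As one possible starting point for questions like these, tag and author frequencies can be computed before any figure is drawn. The rows below are hypothetical placeholders, not scraped data, and the sketch assumes the Tags column holds a comma-separated string; the helper names tag_counts and quotes_per_author are illustrative.

```python
from collections import Counter

# Hypothetical rows in the shape of the CSV; NOT real scraped data.
rows = [
    {"Author": "Jane Doe", "Tags": "life, humor"},
    {"Author": "Jane Doe", "Tags": "life"},
    {"Author": "John Roe", "Tags": "love, life"},
]


def tag_counts(rows):
    """Count how often each tag occurs across all quotes."""
    counts = Counter()
    for row in rows:
        # Assumption: tags are stored as one comma-separated string.
        counts.update(t.strip() for t in row["Tags"].split(",") if t.strip())
    return counts


def quotes_per_author(rows):
    """Count how many quotes each author contributed."""
    return Counter(row["Author"] for row in rows)


print(tag_counts(rows).most_common(1))         # [('life', 3)]
print(quotes_per_author(rows).most_common(1))  # [('Jane Doe', 2)]
```

Counts like these can then feed a bar or pie chart, or be compared across author country or birth year to support a hypothesis.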
Note: This is an open-topic project. You are required to propose a novel topic and
support your hypotheses (viewpoints) with data analysis and illustrative figures.
The report and the runnable code (web scraping + data analysis) should be submitted as a
Jupyter Notebook file.
Submission Checklist:
Yes/No Items
Jupyter Notebook code
your_name+id.csv
Marking Guidelines
Marking Criteria
Idea (5 marks)
    Presents a novel idea.
    Clearly demonstrates your viewpoints.
    Demonstrates good understanding of the topic.
Discussion (30 marks)
    Provides convincing arguments for your viewpoints.
    Backs up arguments with appropriate data analysis results.
    Visualizes data analysis results using more than 5 figures.
Organization (20 marks)
    Uses figures to support the ideas discussed in the report.
    Figures are of good quality and informative.
    Uses sub-titles and/or clear topic sentences.
    Uses multiple visualization methods (line, bar, pie chart, etc.).
Writing Style (5 marks)
    Concise writing style.
    Strong scientific writing without grammatical errors.