College of Arts, Technology and Environment
ACADEMIC YEAR 2023/24
Assessment Brief
Submission and feedback dates
Submission deadline: Before 14:00 on 18/01/2024
This is an individual assessment task eligible for a 48 hour late submission window.
Marks and Feedback due on: 14/02/2024
N.B. all times are 24-hour clock, current local time (at time of submission) in the UK
Submission details:
Module title and code: UFCFLR-15-M Data Management Fundamentals
Assessment type: Database Design and Implementation Task
Assessment title: Modelling & Mapping Bristol Air Quality Data
Assessment weighting: 50% of total module mark
Size or length of assessment: N/A
Module learning outcomes assessed by this task:
Main Learning Goals & Outcomes (from the Module Specification)
o Understand and use the relational model to structure data for efficient and effective storage and retrieval.
o Design, develop and validate a range of data models and schemas.
o Understand, evaluate and apply a range of data query and manipulation languages and frameworks.
Additional Learning Outcomes (from the Module Specification)
o Constructing and reverse-engineering entity relationship models.
o Understanding and applying data normalisation.
o NoSQL data formats and understanding how they differ from RDBMS.
o Learn and use the MARKDOWN markup syntax.
Assignment background & context
Measuring Air Quality
Levels of various air borne pollutants such as Nitrogen Monoxide (NO), Nitrogen Dioxide (NO2) and particulate matter (also called particle pollution) are all major contributors to the measure of overall air quality.
For instance, NO2 is measured using micrograms in each cubic metre of air (㎍/m3). A microgram (㎍) is one millionth of a gram. A concentration of 1 ㎍/m3 means that one cubic metre of air contains one microgram of pollutant.
To protect our health, the UK Government sets two air quality objectives for NO2 in their Air Quality Strategy
1. The hourly objective, which is the concentration of NO2 in the air, averaged over a period of one hour.
2. The annual objective, which is the concentration of NO2 in the air, averaged over a period of a year.
The following table shows the colour encoding and the levels for Objective 1 above, the mean hourly ratio, adopted in the UK.
Index | Band | ㎍/m³
1 | Low | 0-67
2 | Low | 68-134
3 | Low | 135-200
4 | Moderate | 201-267
5 | Moderate | 268-334
6 | Moderate | 335-400
7 | High | 401-467
8 | High | 468-534
9 | High | 535-600
10 | Very High | 601 or more
Further details of colour encodings and health warnings can be found at the DEFRA Site.
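The banding in the table above can be expressed as a simple lookup. The following is a minimal Python sketch, not part of the required submission. The names `no2_index`, `UPPER_BOUNDS` and `BANDS` are illustrative, and rounding a reading to the nearest whole number before the lookup is an assumption, since the table is stated in integers.

```python
from bisect import bisect_left

# Upper bound (in µg/m3) of index bands 1-9, copied from the table above;
# anything above 600 falls into index 10.
UPPER_BOUNDS = [67, 134, 200, 267, 334, 400, 467, 534, 600]
BANDS = ["Low"] * 3 + ["Moderate"] * 3 + ["High"] * 3 + ["Very High"]


def no2_index(concentration):
    """Return (index, band) for an hourly mean NO2 concentration in µg/m3."""
    if concentration < 0:
        raise ValueError("concentration cannot be negative")
    # Assumption: round to a whole number first, as the table uses integers.
    position = bisect_left(UPPER_BOUNDS, round(concentration))
    return position + 1, BANDS[position]


print(no2_index(42))   # (1, 'Low')
print(no2_index(201))  # (4, 'Moderate')
print(no2_index(650))  # (10, 'Very High')
```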
The Input Data
The following ZIP file provides data ranging from 1993 to 22 October 2023 taken from 19 monitoring stations in and around Bristol.
Download & save the data file: Air_Quality_Continous.zip (23.2 Mb)
Create a directory (folder) called “data” on your working machine and unzip the file there to Air_Quality_Continuous.csv (112 Mb).
Monitors may suffer downtime and may become defunct, so the data isn’t always complete for all stations.
Shown here are the first 8 lines of the file (cropped):
Note the following:
There are 19 stations (monitors):
188 => 'AURN Bristol Centre', 51.4572041156,-2.58564914143
203 => 'Brislington Depot', 51.4417471802,-2.55995583224
206 => 'Rupert Street', 51.4554331987,-2.59626237324
209 => 'IKEA M32', 51.4752847609,-2.56207998299
213 => 'Old Market', 51.4560189999,-2.58348949026
215 => 'Parson Street School', 51.432675707,-2.60495665673
228 => 'Temple Meads Station', 51.4488837041,-2.58447776241
270 => 'Wells Road', 51.4278638883,-2.56374153315
271 => 'Trailer Portway P&R', 51.4899934596,-2.68877856929
375 => 'Newfoundland Road Police Station', 51.4606738207,-2.58225341824
395 => "Shiner's Garage", 51.4577930324,-2.56271419977
452 => 'AURN St Pauls', 51.4628294172,-2.58454081635
447 => 'Bath Road', 51.4425372726,-2.57137536073
459 => 'Cheltenham Road \ Station Road', 51.4689385901,-2.5927241667
463 => 'Fishponds Road', 51.4780449714,-2.53523027459
481 => 'CREATE Centre Roof', 51.447213417,-2.62247405516
500 => 'Temple Way', 51.4579497132,-2.5839890903
501 => 'Colston Avenue', 51.4552693827,-2.59664882855
672 => 'Marlborough Street', 51.4591419717,-2.59543271836
These monitors are spread across the four City of Bristol constituencies represented by the following Members of Parliament (MPs):
o Bristol East - Kerry McCarthy (MP);
o Bristol North West - Darren Jones (MP);
o Bristol South - Karin Smyth (MP); &
o Bristol West - Thangam Debbonaire (MP).
Each line represents one reading from a specific detector. Detectors take one reading every hour. If you examine the file using a programming editor (Notepad++ can handle the job), you can see that the first row gives headers and there are another 1603492 (1.60 million+) rows (lines). There are 19 data items (columns) per line.
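The row and column counts quoted above can also be checked programmatically. The following is a minimal Python sketch, shown on a tiny made-up sample rather than the real file. The function name `summarise_csv` is illustrative, and the comma delimiter is an assumption: inspect the real file and pass a different `delimiter` if needed.

```python
import csv
import io


def summarise_csv(handle, delimiter=","):
    """Return (header, data_row_count, set_of_row_widths) for an open CSV file."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)  # the first row holds the column names
    row_count = 0
    widths = set()
    for row in reader:
        row_count += 1
        widths.add(len(row))  # more than one width signals ragged rows
    return header, row_count, widths


# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "Date Time,SiteID,NO2\n"
    "2015-01-01 00:00:00,203,31.5\n"
    "2015-01-01 01:00:00,203,\n"
)
header, rows, widths = summarise_csv(sample)
print(len(header), rows, widths)  # 3 2 {3}
```

To run it against the real data, open data/Air_Quality_Continuous.csv with `open(path, newline="")` and pass the handle in. According to this brief, the result should be 19 columns and 1603492 data rows.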
The schema for data (what each field represents) is given below:
measure | desc | unit
Date Time | Date and time of measurement | datetime
SiteID | Site ID for the station | integer
NOx | Concentration of oxides of nitrogen | ㎍/m3
NO2 | Concentration of nitrogen dioxide | ㎍/m3
NO | Concentration of nitric oxide | ㎍/m3
PM10 | Concentration of particulate matter <10 micron diameter | ㎍/m3
O3 | Concentration of ozone | ㎍/m3
Temperature | Air temperature | °C
ObjectID | Object (?) | Integer
ObjectID2 | Object (?) | Integer
NVPM10 | Concentration of non-volatile particulate matter <10 micron diameter | ㎍/m3
VPM10 | Concentration of volatile particulate matter <10 micron diameter | ㎎/m3
NVPM2.5 | Concentration of non-volatile particulate matter <2.5 micron diameter | ㎍/m3
PM2.5 | Concentration of particulate matter <2.5 micron diameter | ㎍/m3
VPM2.5 | Concentration of volatile particulate matter <2.5 micron diameter | ㎍/m3
CO | Concentration of carbon monoxide | ㎎/m3
RH | Relative Humidity | %
Pressure | Air Pressure | mbar
SO2 | Concentration of sulphur dioxide | ㎍/m3
Completing your assessment
What am I required to do on this assessment?
This is an individual assessment task requiring you to design, implement and populate a relational DB (MySQL) using open data (pollution levels in Bristol).
You are then required to design and run several SQL queries against the extracted (cropped) data set.
Additionally, you are required to produce a report (in markdown format) describing the research undertaken, a prototype implementation (using a small sample of the dataset) and at least one example query in the NoSQL database of your choice. This report should also discuss the use cases and justification of using de-normalised (NoSQL) data models in contrast to normalised (relational) data models.
Finally, you should produce a short report (less than 600 words and again in markdown format) explaining the overall process undertaken, any issues and resolutions and the learning outcomes you have achieved.
Your submission should consist of a single ZIP file dmf-assign.zip containing all files and the two reports as specified in this brief.
Where should I start?
This assignment consists of seven tasks. This is the task breakdown:
Task 1: Organize and model the data (10 marks):
Group the detectors by constituency and design a normalised Entity Relationship (ER) model which models all the data items.
Note that this model should be a "no loss" model - that is, with the required entities holding all the attributes from all the derived entities.
All relationships should be clearly defined and enumerated.
Submission file: An ER diagram pollution-er.png.
Task 2: Forward engineer the ER model to a MySQL database (10 marks):
Using MySQL Workbench and/or PhpMyAdmin, create the required tables and fields to hold the data. All primary and foreign key attributes should be defined, and all fields should have the appropriate (required) data type.
Submission file: A download of a SQL file as pollution.sql showing all table and attribute definitions.
Task 3: Crop and cleanse the data (10 + 6 marks):
i) Crop the dataset to hold only the data from 1st January 2015 onwards (5 marks);
ii) Cleanse the cropped dataset to ensure that all dates fall between 1st January 2015 and 22nd October 2023. (5 marks)
An extra 6 marks are available if you can accomplish the above two tasks using PYTHON code.
Submission file/s: A ZIP file cropped.zip holding the cropped and cleansed data. Additionally and possibly, a PYTHON script called cropped.py that accomplishes the above tasks.
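As a starting point for the optional PYTHON route, the date filtering described in Task 3 can be sketched as follows. This is a hedged illustration on a made-up sample, not a model answer. The function name `crop_rows`, the timestamp format in `DATE_FORMAT`, and treating 22nd October 2023 as inclusive up to the end of the day are all assumptions that you should check against the real file.

```python
import csv
import io
from datetime import datetime

START = datetime(2015, 1, 1)
END = datetime(2023, 10, 22, 23, 59, 59)  # assumption: end date is inclusive
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumption: adjust to the real file


def crop_rows(reader, date_column="Date Time"):
    """Yield only rows whose timestamp parses and lies within START..END."""
    for row in reader:
        try:
            stamp = datetime.strptime(row[date_column], DATE_FORMAT)
        except (ValueError, TypeError):
            continue  # missing or malformed timestamp: drop the row
        if START <= stamp <= END:
            yield row


# Made-up sample rows for illustration only.
sample = io.StringIO(
    "Date Time,SiteID,NO2\n"
    "2014-12-31 23:00:00,203,40.1\n"
    "2015-01-01 00:00:00,203,31.5\n"
    "not a date,203,12.0\n"
    "2023-10-22 23:00:00,452,18.2\n"
    "2024-01-01 00:00:00,452,9.9\n"
)
kept = list(crop_rows(csv.DictReader(sample)))
print([row["Date Time"] for row in kept])
```

Only the two rows dated 2015-01-01 00:00:00 and 2023-10-22 23:00:00 survive. Because the function is a generator, it can stream the full file row by row into a `csv.DictWriter` without loading everything into memory.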
Task 4: Populate the MySQL database tables with the extracted/reduced dataset created in the previous task (10 + 6 marks):
Use phpMyAdmin’s “import CSV” feature or MySQL's “LOAD DATA INFILE” statement to import the cropped & cleansed dataset into the MySQL tables implemented in Task 2 (10 marks).
You can make use of the following guides:
- Import CSV file data into MySQL table with phpMyAdmin;
- Import CSV File Into MySQL Table.
An extra 6 marks are available if you can accomplish the above data mapping task using PYTHON code.
Submission file/s: A screen capture readings.png showing the first 12 records of the main readings file.
Additionally and possibly, a PYTHON script called import.py that accomplishes the above task.
Task 5: Design, write and run SQL queries (12 marks):
Write and implement (test run) the following three SQL queries:
i) Return the date/time, station name and the highest recorded value of nitrogen oxide (NOx) found in the dataset for the year 2022. (4 marks)
ii) Return the mean values of PM2.5 (particulate matter <2.5 micron diameter) & VPM2.5 (volatile particulate matter <2.5 micron diameter) by each station for the year 2022 for readings taken on or near 08:00 hours (peak traffic intensity). (4 marks)
iii) Extend the previous query to show these values for all stations for all the data. (4 marks)
Submission files: Code listing of the three SQL queries query-a.sql, query-b.sql & query-c.sql.
Task 6: Model, implement and query a selected NoSQL database. (24 marks)
Model the data for a specific monitor (station) to a NoSQL data model (key-value, xml, timeseries or graph) to implement the selected database type/product & pipe or import a small sample of the data. You should also implement an example query in your selected database and show the output (screen capture).
You can select from any of the eight databases listed below but if you want, you can also select one not currently on the list (after consultation with the tutor).
Submission file: A report (in markdown format) nosql.md that is less than 1200 words.
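To illustrate what a de-normalised model for a single station might look like, the sketch below nests readings inside one document, in the style of a document or key-value store. It is only one possible shape. The function name `to_document`, the field names and the reading values are all hypothetical; only the station ID, name and coordinates come from the station list in this brief.

```python
import json


def to_document(station, readings):
    """Nest one station's readings inside a single de-normalised document."""
    return {
        "_id": station["site_id"],
        "name": station["name"],
        "location": {"lat": station["lat"], "lon": station["lon"]},
        "readings": [
            # Drop empty measurements so sparse readings stay compact.
            {key: value for key, value in reading.items() if value is not None}
            for reading in readings
        ],
    }


station = {"site_id": 203, "name": "Brislington Depot",
           "lat": 51.4417471802, "lon": -2.55995583224}
readings = [  # made-up values for illustration only
    {"datetime": "2022-01-01T08:00:00", "NO2": 31.5, "PM2.5": None},
    {"datetime": "2022-01-01T09:00:00", "NO2": 28.0, "PM2.5": 7.2},
]
document = to_document(station, readings)
print(json.dumps(document, indent=2))
```

In contrast to the relational model, the station details are stored alongside the readings rather than referenced through a foreign key. That trade-off is the kind of point the report is expected to discuss.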
Task 7: Reflective Report. (12 marks)
A short report in Markdown format (less than 800 words) reflecting on the assignment tasks, the problems encountered, and the solutions found.
You should also briefly outline the Learning Outcomes you have managed to achieve in undertaking this Assignment.
Submission file: A report (in markdown format) named report.md.
What do I need to do to pass?
The pass mark is 50%.
How do I achieve high marks in this assessment?
We are looking for a well-constructed design transformed into a complete and valid implementation. No PYTHON coding is required to achieve a first-class mark (up to 88%) but if you do want to attempt the PYTHON coding tasks, you can gain an extra 12%. The SQL queries should be functional and return the required results. A first-class attempt will also include two well-constructed reports. The NoSQL task should import a small sample of the dataset and implement at least one query showing the output. This report should outline the design and implementation and include a brief discussion of a normalised (relational) model contrasting it to a de-normalised (NoSQL) model. The final report should reflect on the tasks undertaken, the problems encountered, and the solutions found. You should use UWE Harvard referencing if you reference any external resources.
How does the learning and teaching relate to the assessment?
The lectures and particularly the workshops will guide you on each of the design and implementation tasks. All teaching will be completed before the assignment is due for submission.
What additional resources may help me complete this assessment?
You will find relevant material in the lectures and worksheets. You can also make use of LinkedIn Learning for hands on lessons and practice.
What do I do if I am concerned about completing this assessment?
UWE Bristol offer a range of Assessment Support Options that you can explore through this link, and both Academic Support and Wellbeing Support are available.
For further information, please see the Academic Survival Guide.
How do I avoid an Assessment Offence on this module? 2
Use the support above if you feel unable to submit your own work for this module.
Avoid collusion and explain things in your own words (not those of a machine).
Marks and Feedback
Your assessment will be marked according to the following marking criteria.
You can use these to evaluate your own work before you submit.
Criterion <50% 50-59% 60-69% ≧70%
Task 1: Organize and model the data (10%)
<50%: Limited and incorrect model that does not capture all the required entities and attributes. Relationships are incorrect. No proper naming convention adopted.
50-59%: Adequate model with some minor errors. All entities and attributes are captured. Relationships are as required.
60-69%: A valid and correct model capturing all required entities, attributes and relationships. All attributes are properly named with their required data types.
≧70%: Optimal model adopting a consistent naming convention. All entities, attributes (with the required data types) and relationships are captured. Relationships are labelled and correctly enumerated.
Task 2: Forward Engineer the ER model to MySQL (10%)
<50%: Database lacks all required fields and may have missing keys. Relationships are not properly implemented using foreign keys as required.
50-59%: All data has been mapped with the required keys and relationships. There may be minor errors.
60-69%: A good implementation including the required keys and relationships. Data types may not be optimal and have minor anomalies.
≧70%: A complete and valid mapping of the ER model with well named fields and data types. Required relationships are complete and correct.
Task 3: Crop and cleanse the data (10% + 6%)
<50%: Not all data is cropped and cleaned as required.
50-59%: Data is adequately cleaned overall but may have some minor anomalies (e.g., missed rows).
60-69%: All data is cropped and cleaned as required.
≧70%: A complete cleansing and cropping attempt with all data complete with no missing columns or records. An attempt has been made at the PYTHON code even if not complete.
Task 4: Populate the MySQL database tables (10% + 6%)
<50%: Not all data is mapped to the database as required.
50-59%: All data has been mapped but may be inconsistent in places due to an inadequate model.
60-69%: All data is mapped to the required tables and all keys are implemented. No missing data and all relationships are realized using foreign keys.
≧70%: All data is accurately mapped to the required tables and all keys are implemented. No missing data and all relationships are realized using foreign keys. An attempt has been made at the PYTHON code even if not complete.
Task 5: SQL queries
<50%: Queries are not functional and/or contain errors. Some effort apparent.
50-59%: All queries are included in the submission as required. Queries are functional. Queries return the expected output.
60-69%: SQL queries are commented and functionally complete returning the expected output.
≧70%: SQL queries include comments, are optimized, and work as required. Queries and output (screen captures) are included in the submission.
Task 6: NoSQL implementation and report
<50%: A sub-optimal design or implementation. Report lacks sufficient discussion and reflection.
50-59%: A reasonable report with an adequate data model. Implementation may have some flaws and the discussion may lack the required detail.
60-69%: A complete data model and NoSQL implementation. Some discussion of normalisation / de-normalisation in their context.
≧70%: A complete and accurate NoSQL implementation with an excellent model and discussion. One or more queries have been implemented showing evidenced output.
Task 7: Reflective report
<50%: Report lacks sufficient detail and reflection.
50-59%: An adequate report with some discussion of the problems encountered and solutions implemented.
60-69%: A good report with adequate discussion of problems and solutions. Some discussion of learning outcomes.
≧70%: An excellent and complete report with detailed discussion of problems, solutions and the learning outcomes achieved.
1. In line with UWE Bristol’s Assessment Content Limit Policy (formerly the Word Count Policy), word count includes all text, including (but not limited to): the main body of text (including headings), all citations (both in and out of brackets), text boxes, tables and graphs, figures and diagrams, quotes, lists.
2. UWE Bristol’s Assessment Offences Policy requires that you submit work that is entirely your own and reflects your own learning, so it is important to:
Ensure you reference all sources used, using the UWE Harvard system and the guidance available on UWE’s Study Skills referencing pages.
Avoid copying and pasting any work into this assessment, including your own previous assessments, work from other students or internet sources
Develop your own style, arguments and wording, so avoid copying sources and changing individual words but keeping, essentially, the same sentences and/or structures from other sources
Never give your work to others who may copy it
If an individual assessment, develop your own work and preparation, and do not allow anyone to amend your work (including proof-readers, who may highlight issues but not edit the work).
When submitting your work, you will be required to confirm that the work is your own, and text-matching software and other methods are routinely used to check submissions against other submissions to the university and internet sources. Details of what constitutes plagiarism and how to avoid it can be found on UWE’s Study Skills pages about avoiding plagiarism.