
GGR376

Assignment 2: Regression

44 Marks

Regression: Modelling the relationship between a response (or dependent variable) and one or more explanatory variables (or independent variables). Linear regression is a linear approach to modelling that relationship.
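As a minimal illustration of the definition above, the sketch below fits a linear model to simulated data. The variable names (median_income, house_price) and the coefficients used to simulate them are invented for illustration and are not part of the assignment data.

```r
# Simulate a response that depends linearly on one explanatory variable.
set.seed(376)
n <- 100
median_income <- runif(n, min = 30, max = 120)                 # hypothetical, in $1000s
house_price   <- 150 + 3 * median_income + rnorm(n, sd = 20)   # hypothetical, in $1000s

example_data <- data.frame(house_price, median_income)

# Fit house_price = b0 + b1 * median_income + error by ordinary least squares.
example_model <- lm(house_price ~ median_income, data = example_data)

# Estimated intercept (b0) and slope (b1).
print(coef(example_model))
```

The estimated slope should land close to the value of 3 used in the simulation, which is what "modelling the relationship" means in practice.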

Before completing the assignment, review the example R markdown tutorial and the videos.

NOTE: Join the spatial data at the beginning of your workflow; doing the join at the end causes issues.

Research Problem:

Produce an explanatory regression model for the variation in housing costs by census tract in the City of Hamilton, Ontario, Canada.

Data:

Hamilton Census Tract boundaries, which include the average house price and the unique identifier, CTUID.

You can access the data with the following command and URL:

library(rgdal)
rgdal::readOGR("https://raw.githubusercontent.com/gisUTM/GGR376/master/Lab_1/houseValues.geojson")

You will need to obtain 10 potential explanatory variables from the 2016 Census data, available from CHASS: http://dc2.chass.utoronto.ca.myaccess.library.utoronto.ca/census/
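Once the census table is downloaded, it has to be joined to the tract boundaries on CTUID. The sketch below shows one way to do that on plain data frames. The column names and values are hypothetical, and it assumes the key is stored as text such as "5370001.01" in one table but was read as a number in the other, which is a common source of failed matches.

```r
# Stand-in for the attribute table of the boundary layer (hypothetical values).
tract_attributes <- data.frame(
  CTUID = c("5370001.01", "5370001.02", "5370002.00"),
  house_value = c(310000, 295000, 420000),
  stringsAsFactors = FALSE
)

# Stand-in for the census table, with the key read as a number.
census_variables <- data.frame(
  CTUID = c(5370001.01, 5370001.02, 5370002.00, 5370003.00),
  median_income = c(61000, 58000, 83000, 70000)
)

# A numeric key drops trailing zeros (5370002.00 becomes 5370002), so format
# it back to the two-decimal text form before joining.
census_variables$CTUID <- sprintf("%.2f", census_variables$CTUID)

# Left join: keep every tract in the boundary table, matched on CTUID.
tracts_joined <- merge(tract_attributes, census_variables,
                       by = "CTUID", all.x = TRUE)

# Count tracts that found no match; zero is the goal.
print(sum(is.na(tracts_joined$median_income)))
```

Checking for unmatched rows immediately after the join makes a key mismatch visible before any model is fitted.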

Assignment Format:

The assignment submission will be composed of three files.

1. An R script of your code produced during the project, with the .R file extension.

2. A CSV file of the additional input data you utilized in your model (one table).

3. Answers to the questions listed below in a PDF file.

All three files must be submitted online.

Assignment Requirements:

• Ensure all procedures from the lab tutorial are replicated in your work.
• Fit and test 10 linear regression models.
  o Example model names: model_1, model_2, etc.
  o All models should remain in the code.
  o Rename your final model to final_model.
• The final model must meet all assumptions, with one possible exception:
  o Independent errors, due to spatial autocorrelation.
    ▪ Validate the independent errors assumption in your model with spatial autoregressive modelling.
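The naming requirements above might look like the following in a script. This is a sketch on simulated data with only two candidate models; every column name is hypothetical, and the helper overall_p_value is introduced here for illustration rather than taken from the lab tutorial.

```r
# Simulated tracts; every column name here is hypothetical.
set.seed(1)
n <- 120
tracts <- data.frame(
  pct_university = runif(n, 5, 60),
  median_income  = runif(n, 30, 120)
)
tracts$log_house_value <- 11 + 0.01 * tracts$pct_university +
  0.005 * tracts$median_income + rnorm(n, sd = 0.15)

# Keep every candidate model in the script, numbered in the order fitted.
model_1 <- lm(log_house_value ~ pct_university, data = tracts)
model_2 <- lm(log_house_value ~ pct_university + median_income, data = tracts)

# Overall F-test p-value, computed from the F statistic stored by summary().
overall_p_value <- function(model) {
  f <- summary(model)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

candidate_models <- list(model_1 = model_1, model_2 = model_2)
model_comparison <- data.frame(
  model     = names(candidate_models),
  r_squared = sapply(candidate_models, function(m) summary(m)$r.squared),
  p_value   = sapply(candidate_models, overall_p_value),
  row.names = NULL
)
print(model_comparison)

# Copy the chosen model to its own name instead of overwriting model_2.
final_model <- model_2

# The four standard diagnostic plots, arranged on one 2 x 2 page.
old_par <- par(mfrow = c(2, 2))
plot(final_model)
par(old_par)
```

Copying rather than renaming keeps model_2 in the workspace, so all fitted models remain visible to a reader of the script.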

GRADING

R Script: 10 Marks

The script you submit should be fully reproducible, which means the TA should be able to run your script without modification. The only allowable modification would be the file path for the CSV file of your additional input variables. Review the R Script grading scale below.
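One way to keep the file path as the only line that needs editing is to define it once at the top of the script. The sketch below assumes a hypothetical file name, census_variables.csv, and writes a tiny stand-in file to a temporary folder only so that the example runs on its own.

```r
# For this sketch only: write a tiny stand-in CSV so the code runs anywhere.
# In a real script the file would already exist and only the path is set here.
census_csv_path <- file.path(tempdir(), "census_variables.csv")  # hypothetical name
write.csv(
  data.frame(CTUID = c("5370001.01", "5370002.00"),
             median_income = c(61000, 83000)),
  census_csv_path, row.names = FALSE
)

# Fail early with a readable message if the path is wrong.
if (!file.exists(census_csv_path)) {
  stop("CSV not found; edit census_csv_path at the top of the script: ",
       census_csv_path)
}

# Read the key column as text so it is not converted to a number on import.
census_variables <- read.csv(census_csv_path,
                             colClasses = c(CTUID = "character"))
str(census_variables)
```

Every later step then refers to census_variables, never to the path itself.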

Your R script should follow this general structure:

1. Data Munging:
   a. Reading Data
   b. Merging Data
2. Graphical Analysis Pre-Check
3. Data Transformations
4. Correlation Assessment
5. Model Fitting and model assumption assessment (10 models)
   a. If one assumption is violated, you can move on to the next model.
      i. There is no need to test the remaining assumptions in that case.
6. Spatial Autocorrelation Assessment
7. Spatial Autoregressive Modelling
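Steps 2 to 4 of the structure above can be sketched as follows. The data are simulated, all variable names are hypothetical, and the 0.8 correlation cutoff is an assumed value for illustration, not one set by the assignment.

```r
# Simulated, right-skewed house values; all names and numbers are hypothetical.
set.seed(2)
n <- 150
median_income <- runif(n, 30, 120)
tracts <- data.frame(
  house_value   = exp(12 + 0.008 * median_income + rnorm(n, sd = 0.3)),
  median_income = median_income,
  mean_income   = 1.1 * median_income + rnorm(n, sd = 2),  # nearly redundant
  pct_renters   = runif(n, 10, 70)
)

# Step 2, graphical pre-check: raw and log-transformed response side by side.
old_par <- par(mfrow = c(1, 2))
hist(tracts$house_value, main = "Raw", xlab = "House value ($)")
# Step 3, transformation: the natural log pulls in the long right tail.
tracts$log_house_value <- log(tracts$house_value)
hist(tracts$log_house_value, main = "Log-transformed",
     xlab = "log(house value)")
par(old_par)

# Step 4, correlation assessment among the candidate explanatory variables.
explanatory <- tracts[c("median_income", "mean_income", "pct_renters")]
correlation_matrix <- cor(explanatory)

# List each pair once (upper triangle) whose |r| exceeds the cutoff.
cutoff <- 0.8  # assumed cutoff, not a value set by the assignment
high <- which(abs(correlation_matrix) > cutoff & upper.tri(correlation_matrix),
              arr.ind = TRUE)
high_pairs <- data.frame(
  variable_1 = rownames(correlation_matrix)[high[, "row"]],
  variable_2 = colnames(correlation_matrix)[high[, "col"]],
  r          = correlation_matrix[high]
)
print(high_pairs)
```

Pairs flagged here are candidates for keeping only one of the two variables in any single model, since strongly correlated explanatory variables make coefficients hard to interpret.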

R Script Grading:

10 / 10: The code is properly documented with comments and detailed variable names. No issues are present in the code. A person versed in R should be able to read through the code in one attempt.

9 / 10: The code is well documented. A single error, inconsistency, poor variable name, or lapse in documentation is present. A reviewer may need to make a single check of previous code to interpret it.

8 / 10: The code is documented. A couple of errors, inconsistencies, poor variable names, or lapses in documentation are present. A reviewer may need to make multiple checks of previous code to interpret it.

7 / 10: The code is documented. A few errors, inconsistencies, poor variable names, or lapses in documentation are present. A reviewer needs to make multiple checks of previous code to interpret it but can understand all sections of the code.

6 / 10: The code is partially documented. Errors, inconsistencies, and poor variable names are present. A reviewer needs to make multiple checks of previous code to interpret it and may not completely understand all sections of the code.

5 / 10: The code is sparsely documented. Many errors, inconsistencies, and poor variable names are present. A reviewer needs to make multiple checks of previous code to interpret it and does not completely understand all sections of the code.

4 or below: Many inconsistencies are present in the code. It could not be reproduced by another researcher without many questions directed to the original author.

Missing assignment requirements in the code will also reduce your mark.

• Too few linear models in the code (-1 for each missing model)

• The final model is not renamed final_model (-1)

• Model Assumptions not tested (-1 for each assumption)

• Moran’s I not tested correctly (-2)

• Code will not run when tested (-3)

• Other errors will be penalized as appropriate.

To achieve a mark above 8, you will likely need to rewrite your code for clarity after you have worked through the assignment.

CSV File: 2 Marks

The CSV file should contain all the variables that you obtained from the Census for testing in your model. It must contain 10 variables.
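A quick check along these lines can confirm that the table holds ten variables before it is exported, and it produces the minimum, maximum, and mean of each one at the same time. The sketch uses a simulated table; the names var_01 to var_10 are placeholders, and it assumes CTUID is the only non-variable column.

```r
# Stand-in for the census table; var_01 ... var_10 are placeholder names.
set.seed(3)
n_tracts <- 5
census_variables <- data.frame(
  CTUID = sprintf("537%04d.00", 1:n_tracts),
  matrix(runif(n_tracts * 10, 0, 100), nrow = n_tracts,
         dimnames = list(NULL, sprintf("var_%02d", 1:10)))
)

# Everything except the join key counts as a candidate variable.
variable_names <- setdiff(names(census_variables), "CTUID")
stopifnot(length(variable_names) == 10)

# One row per variable: range, mean, and number of missing values.
candidates <- census_variables[variable_names]
summary_table <- data.frame(
  variable  = variable_names,
  min       = sapply(candidates, min, na.rm = TRUE),
  max       = sapply(candidates, max, na.rm = TRUE),
  mean      = sapply(candidates, mean, na.rm = TRUE),
  n_missing = sapply(candidates, function(x) sum(is.na(x))),
  row.names = NULL
)
print(summary_table)
```

Reporting the count of missing values alongside the summary statistics shows at a glance whether any tract lacks data for a variable.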

Questions (32 Marks)

All figures must include a figure caption.

1. Complete the following table. (1 Mark)

Variable Name in CSV File | Min | Max | Mean | Variable Description

2. Complete the following table. (2 Marks)

Variable Name in CSV File | Reason why you selected the variable

3. Produce a publication-quality histogram of the dependent variable (transformed, if you applied a transformation). (3 Marks)

4. Write 50 words on why you did or did not transform your dependent variable based on the assumptions of the linear regression model. (2 Marks)

5. Describe in 200 words your process of model fitting. Address the selection of variables, how you decided to remove or add variables, and the way you assessed each assumption. (4 Marks)

6. Complete the following table. (2 Marks)

Model Name | R² | p < 0.05 (Y/N) | Assumption(s) violated, or all assumptions met?

7. For your final linear regression model, produce a figure from the 4 plots generated by plot(linear_model). (2 Marks)

8. Produce a publication-quality figure of residuals vs fitted values for your final linear regression model. (3 Marks)

9. Calculate Moran’s I for your residuals. Report, in 50 words, your value of Moran’s I and how you interpret it. (3 Marks)

10. Write 150 words interpreting your final linear regression model. (4 Marks)

11. Would you require a spatial autoregressive model? Explain how you would have chosen the model to use. (3 Marks)

12. Produce a map of a spatial autoregressive model’s residuals. (3 Marks)
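To make the Moran’s I calculation in question 9 concrete, the sketch below computes it by hand in base R for the residuals of a model fitted to a simulated 6 x 6 grid. The grid, the rook-contiguity weights, and the helper morans_i are all assumptions made for illustration. In practice a package function such as moran.test() from spdep is the usual tool, and the method shown in the lab tutorial should take precedence.

```r
# Simulated example on a 6 x 6 grid of "tracts"; all values are hypothetical.
set.seed(4)
side <- 6
grid <- expand.grid(col = 1:side, row = 1:side)
n <- nrow(grid)

# Rook contiguity: cells sharing an edge are neighbours (Manhattan distance 1).
manhattan <- as.matrix(dist(grid, method = "manhattan"))
weights <- (manhattan == 1) * 1
weights <- weights / rowSums(weights)   # row-standardise: each row sums to 1

# A response with a west-east trend that the model below deliberately omits,
# so the residuals inherit spatial structure.
grid$x <- rnorm(n)
grid$y <- 2 * grid$x + 0.8 * grid$col + rnorm(n, sd = 0.5)
fit <- lm(y ~ x, data = grid)

# Moran's I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2),
# where z are the centred values and S0 is the sum of all weights.
morans_i <- function(values, w) {
  z <- values - mean(values)
  (length(z) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

observed_i <- morans_i(residuals(fit), weights)
expected_i <- -1 / (n - 1)   # expectation under no spatial autocorrelation

# Permutation test: shuffle residuals across cells to build a null distribution.
permuted_i <- replicate(999, morans_i(sample(residuals(fit)), weights))
p_value <- (sum(permuted_i >= observed_i) + 1) / (999 + 1)

print(c(observed = observed_i, expected = expected_i, p_value = p_value))
```

An observed value well above the expected value, with a small permutation p-value, indicates positively autocorrelated residuals, which is the situation in which the independent errors assumption fails and a spatial autoregressive model becomes relevant.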
