SSCI 599 – Spatial Topic: Spatial Econometrics Project 2

USC Spatial Sciences Institute © 2020 1

SSCI 599 Project 2 – Exploratory Spatial Data Analysis & Multiple Linear Regression

Prepared by An-Min Wu, PhD, Lecturer of Spatial Sciences, University of Southern California

Due Date: Monday, October 26, 11:59 pm Pacific Time

Submit Project 2 as a Word document into the corresponding assignment link on Blackboard

Value 7% of the course grade

Penalty for late delivery: a 2-point deduction for submissions up to 4 days late; no points will be given for submissions more than 4 days late.

The purpose of this project is for you to apply the concepts and skills learned in class to the datasets that you are interested in exploring in spatial economics. Since you have already done some preliminary research in Project 1, I hope that data comes in handy as you dive into the analysis in this project.

In this project, you will import data of your own interest in spatial economics into R and conduct exploratory data analysis, exploratory spatial data analysis (including kernel density estimation, spatial weights and global spatial autocorrelation) and multiple linear regression.

Before starting the spatial data analysis, read through this entire document. Next, go through the hands-on R practices that we did in Weeks 6-7 if you have not done so, so that you are familiar with the libraries, functions and arguments required in R to complete this project.

Learning Objectives

• To identify available spatial datasets for investigating the spatial economic topic area of your interest

• To use kernel density estimation, global Moran’s I, and the Moran scatterplot

• To conduct multiple linear regression, including assessing, before and after fitting the model, whether the dataset is fit for use in regression analysis

• To interpret the outputs of kernel density estimation, spatial autocorrelation and multiple linear regression

Assignment Description

This project develops your topic of interest through practical exercises in spatial and statistical analysis in R. To complete it, follow the instructions below:

1. From your chosen spatial economic research topic and variables in Project 1, identify spatial datasets of the variable(s) for investigation in spatial analysis and import the data into R. Start by focusing on the main variable that you are interested in. Consider the spatial extent and unit of analysis so that the dataset is a manageable size (e.g. more than 50 but no more than 500 units). Sometimes your spatial location data (e.g. county boundaries) and attributes (e.g. employment rates) come from different sources and need to be joined before use. Do the pre-processing in Excel or ArcGIS as needed (we will cover how to do that in R soon).

To import shapefiles, use readOGR( ) in the rgdal package. Use ??readOGR to open the help file in RStudio. If your data is not projected, you will have to retrieve the geographic coordinates from the polygons and then use the spTransform method in the rgdal library.

If your non-spatial attribute table contains latitude and longitude, you can use read.csv( ) or read.table( ) to import the non-spatial data first, then make your data spatial by creating a Spatial* object (see the Week 4 handout for how to promote the data to a spatial object).

For any questions about importing data or the remainder of the project, I suggest that you search online resources (e.g. https://rdocumentation.org) and post your questions or issues on the Discussion Forum on Blackboard.
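As a rough illustration of the shapefile import described in step 1, the sketch below wraps readOGR( ) and spTransform in a small helper. The helper name import_units, the folder and layer names in the usage comment, and the EPSG code (3310, California Albers) are all placeholders chosen for illustration; substitute the ones that match your own data.

```r
# Sketch only: the helper name, file locations, and EPSG code are placeholders.
# Requires the rgdal and sp packages named in step 1.
import_units <- function(dsn, layer, target_crs = "+init=epsg:3310") {
  # Read the shapefile "<dsn>/<layer>.shp" into a Spatial*DataFrame
  units <- rgdal::readOGR(dsn = dsn, layer = layer)

  # is.projected() returns FALSE for geographic (longitude/latitude) data.
  # If the CRS is missing it returns NA, and the data is left untouched.
  if (identical(sp::is.projected(units), FALSE)) {
    units <- sp::spTransform(units, sp::CRS(target_crs))
  }
  units
}

# Hypothetical usage, assuming data/ca_counties.shp exists:
# counties <- import_units("data", "ca_counties")
# summary(counties)
```

Checking is.projected( ) first means that data already in a projected coordinate system is returned unchanged.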

2. Explore the distribution of the imported data first by conducting exploratory data analysis (EDA) in R. Before running any statistical or spatial analysis, you should always examine your data. Run descriptive statistics (use an R function that reports at least: sample size, minimum, mean, median, maximum, standard deviation) and make a scatterplot, a histogram, and a boxplot for your main variable(s). Doing all of the EDA for one variable is sufficient, but more is fine (e.g. running EDA for both variables when you want to know the association between the two). Consider a transformation if the data are not normally distributed, and show the distribution after transformation to check its normality.
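The EDA in step 2 could look something like the following sketch. It uses only base R and a simulated, right-skewed variable called income as a stand-in for your own data; the helper describe_var is a made-up name, and the log transformation is just one of several transformations you might try.

```r
set.seed(599)
# Simulated, right-skewed stand-in for a main variable (e.g. median income)
income <- rlnorm(120, meanlog = 11, sdlog = 0.5)

# Helper (hypothetical name) returning the statistics requested in step 2
describe_var <- function(x) {
  x <- x[!is.na(x)]
  c(n = length(x), min = min(x), mean = mean(x), median = median(x),
    max = max(x), sd = sd(x))
}
round(describe_var(income), 2)

# Histogram, boxplot, and an index scatterplot of the raw values
par(mfrow = c(1, 3))
hist(income, main = "Histogram", xlab = "income")
boxplot(income, main = "Boxplot", ylab = "income")
plot(income, main = "Scatterplot", xlab = "observation index", ylab = "income")
par(mfrow = c(1, 1))

# Right-skewed data: try a log transformation and re-check normality
log_income <- log(income)
shapiro.test(income)      # a small p-value suggests departure from normality
shapiro.test(log_income)  # compare with the result after transformation
qqnorm(log_income)
qqline(log_income)
```

If you are comparing two variables, plot(x, y) gives the usual two-variable scatterplot in place of the index plot used here.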

3. Explore the imported spatial data by conducting exploratory spatial data analysis (ESDA) of your main variable(s) of interest, including kernel density estimation and building a spatial weights matrix, followed by global Moran’s I and a Moran scatterplot.
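For the kernel density part of step 3, one possible approach is sketched below. It assumes kde2d( ) from the MASS package, which may differ from the function used in the class practices, and it runs on simulated point coordinates rather than real data. The default bandwidth is used here; you may want to experiment with the h argument.

```r
library(MASS)  # provides kde2d(); MASS ships with standard R installations

set.seed(599)
# Simulated projected coordinates (metres) for 200 point events in two clusters
x <- c(rnorm(120, mean = 2000, sd = 300), rnorm(80, mean = 4000, sd = 500))
y <- c(rnorm(120, mean = 3000, sd = 300), rnorm(80, mean = 1500, sd = 500))

# 2-D kernel density estimate on a 100 x 100 grid
# (the bandwidth defaults to a normal reference rule for each axis)
dens <- kde2d(x, y, n = 100)

# Map the density surface with contours and the original points on top
image(dens, xlab = "Easting (m)", ylab = "Northing (m)",
      main = "Kernel density estimate")
contour(dens, add = TRUE)
points(x, y, pch = 16, cex = 0.4)
```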

4. Run a standard linear regression to investigate the association between the variables in your topic of interest using the lm( ) function. The number of independent variables can vary, but make sure that your final model contains only explanatory (a.k.a. independent or predictor) variables whose partial coefficients are statistically significant.
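The regression workflow in step 4 might be sketched as follows. The data are simulated (y depends on x1 and x2, while x3 is pure noise by construction), so the variable names and the decision to drop x3 are illustrative only; with your own data, decide which predictors to keep from the summary( ) output.

```r
set.seed(599)
n <- 100
# Simulated data: y depends on x1 and x2; x3 is unrelated noise
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 2 + 1.5 * dat$x1 - 0.8 * dat$x2 + rnorm(n)

# Pre-assessment: pairwise correlations flag possible multicollinearity
round(cor(dat[, c("x1", "x2", "x3")]), 2)

# Full model: inspect the partial coefficients and their p-values
full_model <- lm(y ~ x1 + x2 + x3, data = dat)
summary(full_model)

# Refit without x3 if its coefficient is not significant in the summary above
final_model <- update(full_model, . ~ . - x3)
summary(final_model)

# Post-assessment: residual diagnostics for the final model
par(mfrow = c(2, 2))
plot(final_model)
par(mfrow = c(1, 1))
shapiro.test(residuals(final_model))
```

Using update( ) keeps the same data and response, so the only difference between the two models is the dropped predictor.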

5. Write a report that includes the following items:

• Introduction: A brief description of your spatial economic topic of interest, the variables you selected (including unit of analysis and spatial extent), and the sources of the data (include the organization from which you obtained the data and its URL if available).

• EDA: R code, the resulting tables/plots, and a short paragraph describing the distribution (i.e. central tendency and dispersion) of the data and whether or not you performed a transformation.

• ESDA: R code, the resulting displays, and 1-2 descriptive paragraphs that interpret the results. Your results should consist of the KDE map, neighbor list object details, a visualization of your spatial weights object, Moran’s I results, and the Moran scatterplot. Whether you run Moran’s I using the Monte Carlo approach is your choice. Describe what each of these results tells you about your data.

• Standard linear regression: R code, the results, and a paragraph that interprets the results.

• Reflection: A short paragraph (fewer than 200 words) reflecting on your experience working on this project. What did you find easy? What did you find challenging? What questions do you still have after completing the project? What adjustments, either to the data or to the operations, might you consider to improve your experience?

Deliverables

Submit a project report with the components requested above as a Word document. Include a cover page that contains at least the class number (SSCI 599), semester (Fall 2020), project number/title and your name. Save your Project 2 report as Project2_[YourLastName].docx and submit it via the appropriate assignment link in Blackboard.

Additional Resources I: Data Hubs

If you have a hard time finding appropriate datasets, consider the following sources and adopt the datasets mentioned here for use in your project.

1. City of Los Angeles GeoHub: https://geohub.lacity.org. Datasets you may consider using include, but are not limited to, the Los Angeles index of displacement pressure and traffic collision or traffic accident data.

2. COVID-19 GIS Hub: https://coronavirus-resources.esri.com. If you are interested in understanding the impact of COVID-19 on the social and economic aspects of our lives, you might find this data hub useful. Additionally, as I want you to make a story map for the final presentation that combines the analysis and information from all of your projects this semester, you might also check out how Esri uses its ArcGIS Story Map to tell the story of its work on COVID-19 (https://esri.com/about/newsroom/blog/gis-toachieve-equitable-speedy-vaccine-distribution).

3. The U.S. Census’s American Community Survey 5-year Data: https://census.gov/data/developers/data-sets/acs-5year.html. The Census Bureau not only offers spatial data (TIGER/Line data), but also includes various socio-economic and demographic factors, surveyed every year at various census administrative levels, that you can download for use.

4. IPUMS: https://ipums.org. As a part of the Institute for Social Research and Data Innovation at the University of Minnesota, IPUMS provides census and survey data from the U.S. and around the world. IPUMS integrates census-type data to make it easy to study and research. For your information, you may also want to check the ‘ABOUT’ tab if you are looking for data analysis employment in the near future.

Additional Resources II: Creating a neighbor list object for point data

Assume that we want to explore a dataset that contains three columns: latitude, longitude and the average math score of schools in one district. We can import this data (.csv), transform it into a spatial object (sp), and assign it the WGS84 datum.
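A minimal sketch of these three steps is shown below. The column names and the five rows of values are made up for illustration, and the inline text stands in for reading your own .csv file with read.csv( ).

```r
library(sp)

# Stand-in for read.csv("your_file.csv"); the rows below are made-up values
csv_text <- "latitude,longitude,math_score
34.05,-118.25,71.2
34.10,-118.30,65.4
34.02,-118.40,80.1
33.95,-118.20,58.9
34.15,-118.15,74.6"
schools <- read.csv(text = csv_text)

# Promote the data frame to a SpatialPointsDataFrame
# (longitude is the x coordinate, latitude is the y coordinate)
coordinates(schools) <- ~ longitude + latitude

# Declare the coordinates as geographic with the WGS84 datum
proj4string(schools) <- CRS("+proj=longlat +datum=WGS84")

summary(schools)
```

Note that this only declares the coordinate system that the data is already in; it does not reproject the points.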

To create an object that describes the neighbor relationships among the points, here we use the k nearest neighbor (knn) method.
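One way to do this is with knearneigh( ) in the spdep package, sketched here on 30 simulated school locations. The choice of k = 4 and the object names are assumptions for illustration; choose k to suit your own data.

```r
library(spdep)

set.seed(599)
# Simulated longitude/latitude for 30 schools (made-up locations)
coords <- cbind(longitude = runif(30, -118.5, -118.1),
                latitude  = runif(30, 33.9, 34.2))

# Find each point's k = 4 nearest neighbors; longlat = TRUE measures
# great-circle distance because the coordinates are geographic
school_knn <- knearneigh(coords, k = 4, longlat = TRUE)

class(school_knn)    # "knn"
head(school_knn$nn)  # row i holds the indices of point i's 4 nearest neighbors
```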

The resulting neighbor object is of class knn. We can now convert the knn object into the more generic neighbor class nb.
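The conversion can be done with knn2nb( ) in spdep. The sketch below rebuilds the same kind of simulated coordinates so that it runs on its own; the object names and k = 4 are again illustrative.

```r
library(spdep)

set.seed(599)
# Simulated longitude/latitude for 30 schools (made-up locations)
coords <- cbind(longitude = runif(30, -118.5, -118.1),
                latitude  = runif(30, 33.9, 34.2))

# k nearest neighbors, then conversion to the generic nb class
school_nb <- knn2nb(knearneigh(coords, k = 4, longlat = TRUE))

class(school_nb)            # "nb"
summary(school_nb)          # neighbor list details for the report
is.symmetric.nb(school_nb)  # k-nearest-neighbor links are often asymmetric

# Visualize the links between neighboring points
plot(school_nb, coords)
```

The summary( ) output and the plot correspond to the neighbor list details and the visualization requested in the ESDA part of the report.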

Now you can convert the nb object to a listw object using nb2listw.
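A sketch of this last conversion, followed by the global Moran’s I and Moran scatterplot from step 3, is given below. It assumes row-standardized weights (style = "W") and uses simulated math scores with a built-in west-to-east trend, so the numbers it produces say nothing about real schools.

```r
library(spdep)

set.seed(599)
# Simulated longitude/latitude for 30 schools (made-up locations)
coords <- cbind(longitude = runif(30, -118.5, -118.1),
                latitude  = runif(30, 33.9, 34.2))
# Made-up scores with a west-to-east trend so that neighbors tend to be similar
math_score <- 70 + 40 * (coords[, "longitude"] + 118.3) + rnorm(30, sd = 3)

# Neighbor list (k = 4) converted to a spatial weights list
school_nb <- knn2nb(knearneigh(coords, k = 4, longlat = TRUE))
school_lw <- nb2listw(school_nb, style = "W")  # "W" = row-standardized weights

# Global Moran's I under the analytical (randomization) assumption
moran_result <- moran.test(math_score, school_lw)
moran_result

# Optional Monte Carlo version:
# moran.mc(math_score, school_lw, nsim = 999)

# Moran scatterplot: each value against the average of its neighbors
moran.plot(math_score, school_lw,
           xlab = "Math score", ylab = "Spatially lagged math score")
```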