MUSA 650 Homework 2: Supervised Land Use Classification with Google Earth Engine
In this assignment, you will use Google Earth Engine via Python to implement multi-class land cover classification. You will hand-label Landsat 8 satellite images which you will then use to train a random forest model. Along the way, you will consider practical remote sensing issues like cloud cover, class imbalances, and feature selection.
Given that hand-labeling data can be time-consuming, you are encouraged to work in pairs or groups of three to share the workload. You may collaborate on generating the hand-labeled data, provided you submit separate assignment files. If you choose to do this, you should all use the same ROI, of course.
You are responsible for figuring out the code independently and may refer to tutorials, code examples, or use AI support, but please cite all sources.
In particular, we encourage you to consult the official Python Google Earth Engine geemap package, the online course Spatial Thoughts, and the Google Earth Engine Tutorials book.
Submit a single Jupyter Notebook containing code, narrative text, visualizations, and answers to each question. Please also upload your classification results as a GeoTIFF and your accuracy assessment as a CSV file. Open a pull request from your fork of this repository to the main repository for submission.
1. Setup
For this assignment, you will define the region of interest (ROI) of your choice. We recommend picking an urban area large enough that you will have a sufficient sample size but not so large that it will take an excessively long time to process.
You'll also use Landsat 8 satellite imagery from USGS for this assignment. Choose images from 2023, filtering for images with minimal cloud cover.
2. Data Collection and Feature Engineering
2.1 Collecting and Labeling Training Data
Using the interactive geemap intereface or another approach (e.g., QGIS, ArcGIS, a GeoJSON file, etc.), create at least 100 samples (points or polygons) for each of the following four classes: urban, bare, water, and vegetation. (Again, we encourage you to work in pairs or groups of three to generate these hand labels.) Use visual cues and manual inspection to ensure that the samples are accurate. Assign each class a unique label (e.g., 0 for urban, 1 for bare, 2 for water, and 3 for vegetation) and merge the labeled samples into a single dataset. You are free to propose any labels you like, as long as 1) you include at least 4 classes, and 2) you justify why they are appropriate for a remote sensing task (for example, including a label for ice cream shops wouldn't make sense, because those can't be detected from aerial imagery).
2.2 Feature Engineering.
For possible use in the model, calculate and add the following spectral indices:
● NDVI (Normalized Difference Vegetation Index)
● NDBI (Normalized Difference Built-up Index)
● MNDWI (Modified Normalized Difference Water Index)
Additionally, add elevation and slope data from a DEM. Normalize all image bands to a 0 to 1 scale for consistent model input.
For bonus points, consider adding kernel filters (e.g., edge detection, smoothing) to see if they improve model performance.
3. Model Training and Evaluation
3.1 Model Training
Split your data into a training dataset (70%) and a validation dataset (30%). Train and evaluate a random forest model using the training set with all engineered features.
After training, analyze variable importance scores to justify each feature's inclusion. Identify which features are most influential in the classification. Report the final features that you keep in your model.
3.2 Accuracy Assessment
Use the trained model to classify the Landsat 8 image, creating a land cover classification map with classes for urban, bare, water, and vegetation (or whatever classes you have chosen).
Using the validation data, generate a confusion matrix and calculate the overall accuracy, precision, and recall. Which classes were confused most often with each other? Why do you think this was?
Visually compare your landcover data for your ROI with the corresponding landcover data from the European Space Agency. Do your classifications agree? If not, do you notice any patterns in the types of landcover where they differ, or any particular features in the imagery that are hard for your model to recognize (e.g., sand, water, or asphalt)?
Export the classified image as a GeoTIFF and the confusion matrix and accuracy metrics to a CSV file for documentation.
4. Reflection Questions
What limitations did you run into when completing this assignment? What might you do differently if you repeated it, or what might you change if you had more time and/or resources?
What was the impact of feature engineering? Which layers most contributed to the model? Did you expect this? Why or why not?
Did you find it difficult to create the training data by hand? Did you notice any issues with class imbalance? If so, how might you resolve this in the future (hint: consider a different sampling technique).
Did your model perform. better on one class than another? Why? Can you think of a reason that this might be good or bad depending on the context?