Service Sector Summary
SCM330: SS24
2025-02-03
Assignment Introduction
You will develop a data-driven summary of a service sector. The deliverables are a written report and a presentation during the last week of class (or finals week). We may also have mini-updates throughout the course as the schedule allows.
The purpose of this assignment is twofold:
(1) To give you practice scraping internet data and expose you to the inherent challenges of analyzing raw data; most of the other data you are given for cases is already clean.
(2) To expose the class to the differences between service sectors.
To accomplish this, you will summarize a service sector of your choice using (primarily) data from the Bureau of Labor Statistics (BLS). I will provide a survey for you to indicate your preferences as to which service sector you’d like to study. I will give preference to those who have been, are, or will be employed in a specific service sector (below the subsector).
The BLS classifies services into the following supersectors and sectors as shown here and listed below (note that you should choose a subsector with a three-digit NAICS code, e.g., Truck Transportation: NAICS 484):
• Trade, Transportation, and Utilities
– Wholesale Trade (NAICS 42)
– Retail Trade (NAICS 44-45)
– Transportation and Warehousing (NAICS 48-49)
– Utilities (NAICS 22)
• Information
– Information (NAICS 51)
• Financial Activities
– Finance and Insurance (NAICS 52)
– Real Estate and Rental and Leasing (NAICS 53)
• Professional and Business Services
– Professional, Scientific, and Technical Services (NAICS 54)
– Management of Companies and Enterprises (NAICS 55)
– Administrative and Support and Waste Management and Remediation Services (NAICS 56)
• Education and Health Services
– Educational Services (NAICS 61)
– Health Care and Social Assistance (NAICS 62)
• Leisure and Hospitality
– Arts, Entertainment, and Recreation (NAICS 71)
– Accommodation and Food Services (NAICS 72)
• Other Services (except Public Administration)
– Other Services (except Public Administration) (NAICS 81)
Report Instructions
Your analysis report and presentation must be reproducible. That means the work must be completed in code (not Excel) and you must access the BLS data through the API (see instructions below). This can be done on DataCamp DataLab or through an interactive notebook (R Markdown, Quarto, Jupyter Notebooks, etc.). I will run your code and build your analysis from scratch.
The minimum requirements — completing all of which will earn an undergraduate a “B” — include:
• Identify the primary customer inputs for businesses in this sector and classify them according to the UST
• Download and analyse employment and wage data for your subsector
• Run and interpret at least 1 regression, including assessment of assumptions
• Make at least 1 custom visualization
• Summarize some key recent news or trends in your subsector.
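As a rough illustration of the regression requirement, the sketch below fits a linear trend and runs a few common assumption checks. The data are simulated stand-ins (not BLS data), and the names `sim`, `month_index`, and `employment` are hypothetical; your own analysis would use the series you download.

```r
set.seed(330)

# Simulated stand-in data: a linear trend plus noise (NOT real BLS data).
n <- 120
sim <- data.frame(month_index = seq_len(n))
sim$employment <- 5000 + 4 * sim$month_index + rnorm(n, sd = 25)

# Fit a simple linear regression of employment on time.
fit <- lm(employment ~ month_index, data = sim)
print(summary(fit))

res <- residuals(fit)

# Normality of residuals (Shapiro-Wilk test).
print(shapiro.test(res))

# Independence: lag-1 autocorrelation of the residuals.
# Real time series often violate this assumption.
lag1_autocorr <- acf(res, plot = FALSE)$acf[2]
print(lag1_autocorr)

# For visual checks (linearity, constant variance, Q-Q), run: plot(fit)
```

These checks are a starting point rather than a complete assessment; which assumptions matter most depends on the model you choose.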
If you are a graduate student, you have an additional task to achieve a “B”.
• If you are an MSBA student, you must apply an analytical method not discussed in this class. You will explain this method to the class in your presentation.
• If you are an MBA student, you must profile at least 1 company in the sector including a summary of their competitive strategy.
All students desiring an “A” should bring in additional context using extra data. There are many different datasets on the BLS website alone, not to mention other government data. News articles often cite government data in the captions of figures if you need inspiration. You also have free access to Statistica and many other data sources because Lehigh pays for them (note: you probably need to be on Lehigh WiFi or the VPN to gain full access). Other open-source data APIs can be found on CRAN/Task Views/OfficialStatistics. As an example, if studying the Accommodation and Food Services sector, you could use data from Statistica to analyse the change in customer satisfaction index scores of Starbucks.
TRAC Fellows
You are fortunate to have access to TRAC Fellows for this assignment. Since this is a semester-long project with some open-ended requirements, you can take it in many different directions. The TRAC Fellows are trained to help you with this precise kind of assignment. Please check the Syllabus for more details about the TRAC program, what the fellows will and will not do, and how to get the most out of your interactions.
Example BLS Data Pull
I’m going to look at the employment in the Other Services (except Public Administration) sector. To get this data, I first need to install the package. Documentation about the package can be found here.
install.packages('blsR')
I then need to load the package (I’m also loading the tidyverse for analysis purposes).
library(blsR)
library(tidyverse)
Take a look at the instructions for the main function we’ll use to query the BLS API by running the following command:
?get_series_table
You’ll notice we need to provide a few pieces of information to the function: the series ID, the API key, and, although optional, the start and end years. Let’s start by setting our API key for the session. You will need to get your own API key by following this link.
# Manually set API key for session
bls_set_key("enter_your_key_as_a_character")
One thing to note is that the BLS API limits the number of years within a time series to 20 years. If you try more than 20 years, you’ll get an error.
So, if the time series we’re interested in is longer than 20 years, we can build our own function to iteratively query the API in 20-year increments by breaking our time series into chunks of start and end years. We can then compile a cumulative data frame using a function from the purrr package, which is part of the tidyverse.
# Create function to iterate the API query
bls_query <- function(start, end) {
  # Uses the global variable `series` defined in the next step
  df <- get_series_table(
    series_id = series,
    api_key = bls_get_key(),
    start_year = start,
    end_year = end
  ) |>
    arrange(year, period)
  return(df)
}
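Because `bls_query()` reads `series` from the global environment, it has to be redefined or the global changed to pull a different series. A minimal sketch of a variant that takes the series ID as an argument is shown here; the name `bls_query_series` is hypothetical and not part of blsR, and the function only calls `get_series_table()` and `bls_get_key()` as used above.

```r
# Hypothetical variant of bls_query(): the series ID is passed in as an
# argument instead of being read from a global variable.
# Calling it requires the blsR and dplyr packages plus a valid API key.
bls_query_series <- function(series_id, start, end) {
  blsR::get_series_table(
    series_id  = series_id,
    api_key    = blsR::bls_get_key(),
    start_year = start,
    end_year   = end
  ) |>
    dplyr::arrange(year, period)
}

# Example call (not run here because it needs network access and a key):
# purrr::map2_dfr(start_years, end_years, bls_query_series,
#                 series_id = "CES8000000001")
```

Passing the ID explicitly makes it easier to reuse the same function for several series in one report.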
Now that we’ve specified the query function, here’s how to get employment data from the Other Services (except Public Administration) sector for the years 1939 to 2023:
# Time series information
series <- 'CES8000000001'
time_series_start <- 1939
time_series_end <- 2023
# Total time series
timeseries <- seq(time_series_start, time_series_end, by = 1)
# Split the time series into increments of 20 years
increments <- split(timeseries, ceiling(seq_along(timeseries) / 20))
# Set start and end year for the increments
start_years <- sapply(increments, head, 1); print(start_years)
##    1    2    3    4    5
## 1939 1959 1979 1999 2019
end_years <- sapply(increments, tail, 1); print(end_years)
##    1    2    3    4    5
## 1958 1978 1998 2018 2023
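As a cross-check on the increments printed above, the same boundaries can be computed arithmetically. This is only a sketch; the names `first_year`, `last_year`, `max_span`, and `chunks` are hypothetical, and the 20-year span is taken from the limit described earlier.

```r
# Cross-check the 20-year increments using arithmetic instead of split().
first_year <- 1939
last_year  <- 2023
max_span   <- 20  # years per request, per the limit described in the text

# Each chunk starts 20 years after the previous one ...
chunk_starts <- seq(first_year, last_year, by = max_span)
# ... and ends 19 years later, capped at the final year of interest.
chunk_ends <- pmin(chunk_starts + max_span - 1, last_year)

chunks <- data.frame(start = chunk_starts, end = chunk_ends)
print(chunks)

# Number of API requests needed to cover the full range.
n_requests <- nrow(chunks)
```

Each row of `chunks` covers at most 20 years, so the row count is the number of API calls the full pull will make.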
# Compile a data frame for the full time series
OtherServices_employment_seasonallyAdj <-
  map2_dfr(start_years, end_years, bls_query)
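For plotting or regression, it usually helps to turn the `year` and `period` columns into a proper date. The sketch below does this on a toy data frame; the `value` column name and the "M01" to "M12" period format are assumptions about what the API returns, the numbers are placeholders rather than BLS data, and `add_month_date` is a hypothetical helper.

```r
# Toy data frame standing in for the API result.
# Column names and period codes are assumptions; values are placeholders.
toy <- data.frame(
  year   = c(2023, 2023, 2023, 2023),
  period = c("M01", "M02", "M03", "M13"),
  value  = c(100, 101, 102, 101)
)

# Hypothetical helper: keep monthly rows and add a first-of-month date.
add_month_date <- function(df) {
  # Keep only periods M01..M12; other codes (e.g. M13) are dropped.
  monthly <- df[grepl("^M(0[1-9]|1[0-2])$", df$period), , drop = FALSE]
  month <- as.integer(sub("^M", "", monthly$period))
  monthly$date <- as.Date(
    sprintf("%d-%02d-01", as.integer(monthly$year), month)
  )
  monthly
}

toy_dated <- add_month_date(toy)
print(toy_dated)
```

Applied to the compiled data frame, the resulting `date` column can go straight onto the x-axis of a ggplot2 time series chart.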
To find the same employment data for the Other Services (except Public Administration) sector on the BLS website, I start at the BLS website for service sectors, click on the Other Services (except Public Administration) sector link, and then click the “Back Data” button next to Employment, all employees (seasonally adjusted).
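The series ID used in the example, CES8000000001, encodes these same choices. A minimal sketch of splitting a Current Employment Statistics ID into its parts follows; `parse_ces_id` is a hypothetical helper, and the position layout is my reading of the BLS series ID format, so check it against the BLS documentation before relying on it for other surveys.

```r
# Hypothetical helper: split a 13-character CES series ID into its parts.
# Assumed layout: prefix (1-2), seasonal code (3), supersector (4-5),
# industry (6-11), data type (12-13).
parse_ces_id <- function(id) {
  stopifnot(is.character(id), length(id) == 1, nchar(id) == 13)
  list(
    prefix      = substr(id, 1, 2),   # "CE" = Current Employment Statistics
    seasonal    = substr(id, 3, 3),   # "S" = seasonally adjusted
    supersector = substr(id, 4, 5),   # "80" = Other Services
    industry    = substr(id, 6, 11),  # "000000" = whole supersector
    data_type   = substr(id, 12, 13)  # "01" = all employees
  )
}

parts <- parse_ces_id("CES8000000001")
str(parts)
```

Reading the ID this way makes it easier to confirm that the series you query matches the table you found on the website.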