Exploratory Data Analysis


Because this project covers many different time series, not every visualization type was produced for every series.

Please see the Data Visualization page for the basic time series plots.

Unemployment

Based on the basic time series plot of unemployment (replicated above, from the Data Visualization page), I used an additive model to decompose the unemployment data. The series should not contain seasonality, because the BLS reports it as seasonally adjusted. I do think the data has genuine cycles, roughly a decade long, that match the business cycle. There certainly isn't a consistent linear trend (i.e., generally up or generally down) over the course of the data.

It is interesting that the decomposition assigned the COVID-19 spike partly to the trend component and partly to the random (remainder) component shown in the bottom-most panel of the figure.
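To illustrate why a one-off spike can end up in more than one component, here is a minimal sketch of a classical additive decomposition in plain Python. It is not the code behind the figure: the function names, the 12-month period, and the flat toy series with a single spike are all assumptions made for illustration.

```python
from statistics import mean


def centered_moving_average(values, period):
    """Centered moving average; even periods use half weights on the two end points."""
    n = len(values)
    half = period // 2
    trend = [None] * n  # the ends stay undefined because the window does not fit
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def additive_decompose(values, period=12):
    """Split values into trend + seasonal + remainder (classical additive model)."""
    trend = centered_moving_average(values, period)
    detrended = [v - tr if tr is not None else None for v, tr in zip(values, trend)]
    # Average the detrended values at each position within the cycle.
    seasonal_means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == pos]
        seasonal_means.append(mean(vals))
    offset = mean(seasonal_means)  # center the seasonal component on zero
    seasonal = [seasonal_means[i % period] - offset for i in range(len(values))]
    remainder = [
        v - tr - s if tr is not None else None
        for v, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Hypothetical series: flat at 5.0 with a single one-month spike (not real BLS data).
series = [5.0] * 60
series[30] = 15.0

trend, seasonal, remainder = additive_decompose(series, period=12)
print("trend at spike:    ", round(trend[30], 3))
print("seasonal at spike: ", round(seasonal[30], 3))
print("remainder at spike:", round(remainder[30], 3))
```

In this toy case the moving-average trend rises around the spike because the spike sits inside every nearby window, and whatever the trend and seasonal terms do not absorb is left in the remainder. That is the same kind of split described above for the COVID-19 spike.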

Consumer Sentiment

Consumer sentiment did not show an obvious seasonality in its basic time series plot, and it definitely didn't look stationary.

This figure shows lag plots of the series against itself at lags of different numbers of months. There is clearly a strong relationship at the early lags, which weakens with each passing month.

The absence of a clear relationship at the 12-month lag suggests there is no annual seasonality in the data.
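The idea behind reading a lag plot can be sketched numerically as the correlation between the series and a copy of itself shifted by k months. The example below is a rough illustration only: `lag_correlation` is a hypothetical helper, and both series are synthetic stand-ins, not the consumer sentiment data.

```python
import math
import random


def lag_correlation(values, lag):
    """Pearson correlation between x[t] and x[t - lag]."""
    x = values[lag:]
    y = values[:-lag]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic monthly series with a built-in 12-month cycle.
seasonal = [math.sin(2 * math.pi * t / 12) for t in range(120)]

# Synthetic persistent series with no annual cycle (AR(1)-style, seeded for repeatability).
rng = random.Random(0)
persistent = [0.0]
for _ in range(239):
    persistent.append(0.9 * persistent[-1] + rng.gauss(0, 1))

for lag in (1, 6, 12):
    print(
        lag,
        round(lag_correlation(seasonal, lag), 3),
        round(lag_correlation(persistent, lag), 3),
    )
```

A series with a true annual pattern lines up with itself again at lag 12, while a merely persistent series just decays as the lag grows. The second behavior is what the consumer sentiment lag plots appear to show.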

Moving Average Smoothing on Consumer Sentiment

The figure below shows consumer sentiment with moving-average smoothing (in red) for 3-month (quarterly), 6-month (half-yearly), 12-month (full-year), and 24-month (two-year) windows. The 3-month and 6-month smoothing still show a lot of volatility. The 12-month smoothing is much smoother and makes changes in consumer sentiment easier to understand. The 24-month smoothing isn't much different from the 12-month in its general shape.
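As a rough sketch of the window comparison, the snippet below applies a simple trailing moving average at the same four window lengths and reports a crude roughness measure (the mean absolute month-to-month change). Whether the figure used a trailing or a centered average is not stated, so the trailing form is an assumption, as are the helper names and the synthetic series.

```python
import math
import random


def moving_average(values, window):
    """Trailing simple moving average; returns len(values) - window + 1 points."""
    running = sum(values[:window])
    out = [running / window]
    for t in range(window, len(values)):
        running += values[t] - values[t - window]  # slide the window by one step
        out.append(running / window)
    return out


def roughness(values):
    """Mean absolute change between consecutive points."""
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return sum(diffs) / len(diffs)


# Synthetic stand-in for a noisy monthly index (not the real sentiment data).
rng = random.Random(42)
sentiment = [
    85 + 10 * math.sin(2 * math.pi * t / 96) + rng.gauss(0, 3)
    for t in range(240)
]

for window in (3, 6, 12, 24):
    smoothed = moving_average(sentiment, window)
    print(window, len(smoothed), round(roughness(smoothed), 3))
```

Longer windows average away more noise, but they also shorten the smoothed series and flatten real turning points, which is the trade-off visible when comparing the 12-month and 24-month panels.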

US Households on Food Stamps (SNAP)

Data on food stamps is remarkably smooth month to month. This shows up in the ACF plot, shown here, as high autocorrelation that persists across a full four years of lags. All of these autocorrelations are outside the confidence interval, which strongly indicates the series is not stationary. Note: I did not remove the 2019 error/outlier data point when producing these plots and tests.

The image below is the partial ACF graph of the food stamps. The first two lags are well outside the confidence interval, but the partial autocorrelation drops sharply after that and fades to zero.
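For reference, the quantity plotted in an ACF chart can be sketched as follows. This is a minimal illustration, not the code that produced the figure: the helper name and the smooth synthetic series are assumptions, and the band is the usual approximate 95% white-noise band of plus or minus 1.96 divided by the square root of n.

```python
import math


def acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Synthetic, very smooth monthly series (a slow ramp), standing in for SNAP-like data.
snap_like = [20.0 + 0.1 * t for t in range(240)]

r = acf(snap_like, 48)  # four years of monthly lags
band = 1.96 / math.sqrt(len(snap_like))  # approximate 95% band under white noise
outside = [k for k in range(1, 49) if abs(r[k]) > band]
print("lags outside the band:", len(outside), "of 48; band =", round(band, 3))
```

A smooth, slowly moving series keeps its autocorrelation above the band for many lags, which is the pattern described for the food stamp data.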
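One common way to obtain partial autocorrelations from ordinary autocorrelations is the Durbin-Levinson recursion, sketched below. I am not claiming this is how the plotted values were computed. The function name is hypothetical, and the input is the theoretical ACF of an illustrative AR(2) process with assumed coefficients, chosen because its PACF has exactly two nonzero lags.

```python
def pacf_from_acf(r):
    """Partial autocorrelations for lags 0..len(r)-1 via the Durbin-Levinson recursion."""
    pac = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, len(r)):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the AR(k) coefficients, then append the new last coefficient.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        pac.append(phi_kk)
    return pac


# Theoretical ACF of an illustrative AR(2) process (coefficients are assumptions).
phi1, phi2 = 1.2, -0.3
r = [1.0, phi1 / (1 - phi2)]
for k in range(2, 9):
    r.append(phi1 * r[k - 1] + phi2 * r[k - 2])

pac = pacf_from_acf(r)
print([round(p, 4) for p in pac])
```

In this example only lags 1 and 2 carry partial autocorrelation and everything after is zero up to rounding, which resembles the "two large lags, then nothing" shape of the food stamp PACF.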

Augmented Dickey-Fuller Test on Food Stamps

I ran an ADF test on the food stamp data, and the resulting p-value was 0.9428. With a p-value this high, the test fails to reject the null hypothesis of a unit root, which is consistent with the data not being stationary.
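To show the mechanics behind this kind of test, here is a simplified sketch of the plain (non-augmented) Dickey-Fuller regression with an intercept. It is not the ADF routine used for the result above: it omits the lagged-difference terms, returns only a t-statistic rather than a p-value, and compares it with -2.86, the approximate large-sample 5% critical value for the intercept-only case. The function name and both synthetic series are assumptions.

```python
import math
import random


def dickey_fuller_t(values):
    """t-statistic on gamma in: diff(y)[t] = alpha + gamma * y[t-1] + error."""
    y_lag = values[:-1]
    dy = [b - a for a, b in zip(values[:-1], values[1:])]
    n = len(dy)
    mx, my = sum(y_lag) / n, sum(dy) / n
    sxx = sum((x - mx) ** 2 for x in y_lag)
    sxy = sum((x - mx) * (d - my) for x, d in zip(y_lag, dy))
    gamma = sxy / sxx
    alpha = my - gamma * mx
    resid = [d - alpha - gamma * x for x, d in zip(y_lag, dy)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance
    return gamma / math.sqrt(s2 / sxx)


# Synthetic examples: a random walk (has a unit root) and white noise (stationary).
rng = random.Random(1)
random_walk = [0.0]
for _ in range(299):
    random_walk.append(random_walk[-1] + rng.gauss(0, 1))
white_noise = [rng.gauss(0, 1) for _ in range(300)]

CRITICAL_5PCT = -2.86  # approximate asymptotic value, intercept-only case
for name, series in (("random walk", random_walk), ("white noise", white_noise)):
    t_stat = dickey_fuller_t(series)
    verdict = "reject unit root" if t_stat < CRITICAL_5PCT else "fail to reject"
    print(name, round(t_stat, 2), verdict)
```

The statistic has to be compared against Dickey-Fuller critical values rather than the usual t table. In practice a library routine such as `adfuller` in statsmodels handles the lag selection and the p-value.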

Google Trends on Economic Topics

For space, I am only including visuals for the Google Trends series on 'cheap gas', though I have Google Trends data on other topics. In this section, I include the original data, the differenced (detrended) data, and the ACF plots of both.

This first plot shows the original data along with its first- and second-order differences. The first-order differenced series still contains some very large spikes.

This next figure shows the ACF of the original Google Trends data for 'cheap gas', along with the ACF plots of the first- and second-differenced data.

The first ACF shows that the original data clearly isn't stationary, but the next two rows suggest that both the first- and second-differenced series are stationary.
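The differencing used in this section can be sketched in a few lines. The helper name and the toy quadratic series are assumptions for illustration. The point is only that each round of differencing removes one order of polynomial trend.

```python
def difference(values, order=1):
    """Apply first differencing `order` times: each pass returns x[t] - x[t-1]."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out


# Toy series with a quadratic trend (not the Google Trends data).
trending = [0.5 * t * t + 3.0 * t + 10.0 for t in range(8)]

first = difference(trending, order=1)   # still trending upward, now linearly
second = difference(trending, order=2)  # constant, so the trend is gone

print(first)   # [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
print(second)  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

Each pass shortens the series by one observation, so a second difference is only worth taking if the first difference still shows a trend or non-stationary behavior.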

Retail Gas Prices

Here is a decomposition of the retail gas prices in the United States. The trend shows prices rising in the 2000s. There is also a lot more 'error' or randomness around the middle of the time series.

Here is the ACF of the gas prices, showing a high correlation between values and the preceding two years of data points. This suggests it is not stationary.

The image below is the partial ACF graph of the gas prices. As with the food stamp data above, the first and second lags are outside the confidence interval, and the later lags fade away to essentially zero.

Augmented Dickey-Fuller Test on Gas Prices

I ran an ADF test on the gas price data, and the resulting p-value was 0.6393. The test therefore fails to reject the null hypothesis of a unit root, which is consistent with the data not being stationary.

Moving Average Smoothing on Gas Prices

The figure below shows the gas prices with moving-average smoothing (in red) for 3-month (quarterly), 6-month (half yearly), 12-month (full year), and 24-month (two year) smoothing. The 12-month smoothing looks like perhaps it best captures the 'trend' in the data; the 24-month seems like it is losing really valuable information, like the spike in prices in 2008.