Time series analysis of ShotSpotter data across seven police districts, 2014–2017
As part of a project conducted in partnership with TheLab@DC, part of the City Administrator’s Office of Budget and Performance, I conducted a time series analysis of gunshot data in Washington, D.C. from 2014–2017. Two parallel time series models were used to determine chronological trends and predict gunshots for the subsequent 12 months across the seven police districts.
Research conducted by the Brookings Institution has shown that gun-related crime is selectively underreported and that ShotSpotter data provides an alternative, independent measurement tool.
ShotSpotter is an integrated acoustic surveillance system that uses sensors to isolate and triangulate the sound of a gunshot and alert the authorities to its location.
The distribution of the police districts is shown below.
This data consists of gunshot reports only and does not incorporate any information on the outcomes of the gunshots.
The data spanned 51 months of daily observations of the type and location of gunshots, with a final dataset of 10,227 daily observations. The latitude and longitude of each gunshot report were overlaid on the police boundary map to allocate incidents to the six of the seven police districts that had ShotSpotter data. The ShotSpotter data contained a large number of false positives in 2014–2015, with fireworks often being mistaken for gunshots around the holiday periods.
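The overlay of report coordinates onto district boundaries can be illustrated with a ray-casting point-in-polygon test. This is a minimal sketch and not the method used in the project: the district names, the rectangular toy polygons, and the function names are all hypothetical, and a real workflow would more likely load the official boundary file with a GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Return True if (lon, lat) lies inside polygon, a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


def assign_district(lon, lat, districts):
    """Return the name of the first district containing the point, else None."""
    for name, polygon in districts.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical toy boundaries (NOT real district shapes)
DISTRICTS = {
    "District A": [(-77.05, 38.90), (-77.00, 38.90), (-77.00, 38.95), (-77.05, 38.95)],
    "District B": [(-77.00, 38.90), (-76.95, 38.90), (-76.95, 38.95), (-77.00, 38.95)],
}

print(assign_district(-77.02, 38.92, DISTRICTS))  # District A
print(assign_district(-76.90, 38.92, DISTRICTS))  # None
```

Reports that fall outside every polygon come back as None, which makes it easy to count and inspect unallocated points before modeling.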
Examination of the data showed that multiple and single gunshot reports moved similarly over time, which warranted the creation of a total gunshot variable.
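A total gunshot variable of this kind could be built by counting every report per day and district, regardless of type. The sketch below assumes a hypothetical record layout of (date, district, type); the actual column names in the ShotSpotter extract are not given here.

```python
from collections import defaultdict
from datetime import date


def daily_totals(records):
    """Count all gunshot reports per (day, district), ignoring the report type."""
    totals = defaultdict(int)
    for day, district, _shot_type in records:
        # Single and multiple gunshot reports both add one to the total.
        totals[(day, district)] += 1
    return dict(totals)


# Hypothetical example records
records = [
    (date(2016, 7, 4), "District A", "single"),
    (date(2016, 7, 4), "District A", "multiple"),
    (date(2016, 7, 4), "District B", "single"),
    (date(2016, 7, 5), "District A", "multiple"),
]
totals = daily_totals(records)
print(totals[(date(2016, 7, 4), "District A")])  # 2
```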
Autoregression and stationarity
Time series datasets include time as an extra dimension compared to other datasets. Autoregression assumes that past values of a time series affect its future values.
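In standard textbook notation, an autoregressive model of order p writes the value at time t as a linear function of the previous p values plus a noise term. The symbols below are the conventional ones and are not taken from the project itself.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t
```

Here c is a constant, the coefficients \phi_i weight each lagged value, and \varepsilon_t is the error at time t.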
In a stationary time series, volatility is expected to return to the long-run equilibrium after accounting for external shocks. In a non-stationary series, external shocks can become part of the system, diluting the signal and causing spurious correlation. An Augmented Dickey-Fuller test was conducted to check the stationarity of the time series data, which showed insignificant p-values at the 0.05 threshold.
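To show what a unit-root test computes, here is a minimal sketch of the plain (non-augmented) Dickey-Fuller statistic with a constant term. It is a simplification, not the Augmented test used in the analysis: it omits the lagged difference terms, and the simulated series and function name are hypothetical. The null hypothesis of the test is that the series has a unit root (is non-stationary); a statistic below the critical value rejects that null. In practice, `statsmodels.tsa.stattools.adfuller` provides the augmented version together with a p-value.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of rho in the regression  diff_t = alpha + rho * y_{t-1} + e_t."""
    x = series[:-1]                                        # lagged level y_{t-1}
    d = [b - a for a, b in zip(series[:-1], series[1:])]   # first difference
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    rho = sxd / sxx                                        # OLS slope
    alpha = mean_d - rho * mean_x                          # OLS intercept
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, d))
    se_rho = math.sqrt(rss / (n - 2) / sxx)                # standard error of the slope
    return rho / se_rho


CRITICAL_5PCT = -2.86  # asymptotic 5% critical value, constant-only case

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(500)]

# Simulated stationary series: AR(1) with coefficient 0.5
stationary = [0.0]
for e in noise:
    stationary.append(0.5 * stationary[-1] + e)

# Simulated non-stationary series: random walk built from the same shocks
walk = [0.0]
for e in noise:
    walk.append(walk[-1] + e)

print(dickey_fuller_stat(stationary) < CRITICAL_5PCT)  # True: unit root rejected
print(dickey_fuller_stat(walk))                        # typically well above the AR(1) value
```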
Autoregressive models use variations of multiple regression to transform time-lagged inputs into predictions for future time periods. Two models were deployed independently to validate each other’s predictions and analysis.
Autoregressive Integrated Moving Average (ARIMA) is a robust multiple regression model that captures the autocorrelation factor, the change in value, and the behavior of the error term across time periods. The model also reduces a non-stationary series to a stationary series using a sequence of differences.
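The differencing step (the "Integrated" part of ARIMA) can be sketched in a few lines. The function name and the toy series are illustrative assumptions; the order of differencing used in the project is not stated.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass replaces y_t with y_t - y_{t-1}."""
    result = list(series)
    for _ in range(d):
        result = [b - a for a, b in zip(result[:-1], result[1:])]
    return result


# A series with a linear trend becomes constant after one difference.
trend = [3 * t + 10 for t in range(6)]   # 10, 13, 16, 19, 22, 25
print(difference(trend, d=1))            # [3, 3, 3, 3, 3]

# A quadratic trend needs two differences.
print(difference([1, 3, 6, 10, 15], d=2))  # [1, 1, 1]
```

Each pass shortens the series by one observation, which is why heavily differenced models lose a little data at the start of the sample.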
FB Prophet is a non-linear additive model developed by Facebook that models a logistic growth curve trend and generates uncertainty intervals. It also models yearly and weekly seasonal components using Fourier series and dummy variables, respectively.
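The two kinds of seasonal inputs described above can be sketched as plain feature builders. This is an illustration of the idea, not Prophet's internal code: the function names, the 365.25-day period, and the Fourier order are assumptions chosen for the example.

```python
import math


def fourier_features(t_days, period, order):
    """Return [sin, cos] pairs for harmonics 1..order of a seasonal cycle.

    t_days: time in days since an arbitrary origin
    period: cycle length in days (365.25 is assumed here for a yearly cycle)
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def weekday_dummies(weekday):
    """One-hot encode a weekday index (0 = Monday ... 6 = Sunday)."""
    return [1 if weekday == k else 0 for k in range(7)]


print(len(fourier_features(10, 365.25, 3)))  # 6 features: 3 sin/cos pairs
print(weekday_dummies(5))                    # [0, 0, 0, 0, 0, 1, 0]
```

A regression on these columns fits a smooth yearly curve through the sine and cosine terms and a separate level for each day of the week through the dummies.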
The Autocorrelation Function (ACF) determines the specific seasonality of the trends. Here, first-order autocorrelation is prevalent in the data, indicating that the presence of gunshots can lead to other firearm incidents within one to two days of the original incident.
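The sample autocorrelation at a given lag can be computed directly from its definition. The sketch below uses a hypothetical alternating series purely to make the arithmetic easy to check; it is not project data.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of series at the given lag (lag 0 returns 1.0)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical series that flips sign every step:
# strongly negative at lag 1, strongly positive at lag 2.
alternating = [1, -1] * 50
print(round(autocorrelation(alternating, 1), 2))  # -0.99
print(round(autocorrelation(alternating, 2), 2))  # 0.98
```

Applied to daily gunshot counts, a large positive value at lags 1 and 2 would correspond to the one-to-two-day carry-over described above.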
Gunshots are fairly evenly distributed across the year, with some increase apparent in the April and November periods.
Both gunshot prediction models anticipated a reduction in overall gunshots in most areas within the city limits, except Police Districts 6 & 7.
Increases in gunshots were reported in July and the December-January period, again showing that the misclassification of fireworks as gunshots is an ongoing issue.
Seven-day seasonality analysis shows an uptick in gunshots starting on the weekends, again mirroring national trends.
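A simple way to inspect a seven-day pattern is to average the daily counts by day of the week. This is a minimal sketch with made-up counts, not the seasonal decomposition produced by the models; the function name and the toy numbers are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def mean_by_weekday(daily_counts):
    """Average count per weekday name, given a {date: count} mapping."""
    sums = defaultdict(float)
    days = defaultdict(int)
    for day, count in daily_counts.items():
        name = WEEKDAY_NAMES[day.weekday()]  # date.weekday(): Monday is 0
        sums[name] += count
        days[name] += 1
    return {name: sums[name] / days[name] for name in WEEKDAY_NAMES if days[name]}


# Hypothetical two weeks of counts starting Monday 2017-01-02:
# 2 reports on weekdays, 5 on Saturdays and Sundays.
start = date(2017, 1, 2)
daily_counts = {}
for offset in range(14):
    day = start + timedelta(days=offset)
    daily_counts[day] = 5 if day.weekday() >= 5 else 2

profile = mean_by_weekday(daily_counts)
print(profile["Mon"], profile["Sat"])  # 2.0 5.0
```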
Good News: Gunshot incidents are trending down across most of the city
Across the whole Washington, D.C. area, the rate of gunshot incidents is trending down and is expected to fall in 2019. This is consistent with ShotSpotter research, which ranked the city in the top ten cities with the largest drops in gunfire in 2017.
Bad News: Police Districts 6 & 7 are not expected to show any reduction in gunshots
The Sixth District and the Seventh District cover portions of the Northeast (east of the Anacostia River) and Southeast quadrants of the city and have a mix of housing categories, including a number of large public housing developments. These two areas have pockets of economic hardship and a higher concentration of urban challenges.
Recommendations and next steps
The results suggest that gunshot incidents may be reduced by an increased police presence during the weekends and, in areas of gun violence, in the days following an incident.
Police Districts 6 & 7 can be targeted with extra policing resources to reduce the incidence of firearm violence, especially if resources become available due to the expected reduction in gun crime in other areas.
2018 has seen an increase in gun-related crime, although the model predicts a decrease in gunshots. This may be explained by selective clustering of gunshots and gun-related homicides in Police Districts 6 & 7, or by external shocks. The gunfire may also be less indiscriminate and more effective, resulting in an increase in gun-related violence with a constant number of gunshots.
There is an opportunity for a time series analysis of gun-related violence and for testing the relationship between the two phenomena. Also, research on the effect of real estate features on gun crime and gunshots would add value to urban planning simulations.
Acknowledgements: This is a more focused examination of gunshot data specifically, derived from a larger study on predicting gunshot incidents using 311 data and real estate values. I would like to thank my teammates, Brian Collins, Priya Kakkar and Kihoon Sohn, for their contributions to data collection and analysis.
Police District 2 did not have any ShotSpotter data available.