Chicago Crime Rate Forecasting using FbProphet

Eashan Kaushik
Jul 27 · 5 min read

This blog post analyzes crime in Chicago from 2001 to 2016 and forecasts crime statistics for 2017 from the historical trend. Chicago is one of the largest cities in the US, and the city’s overall crime rate, especially the violent crime rate, is higher than the US average. The dataset was obtained from the Chicago Police Department’s CLEAR (Citizen Law Enforcement Analysis and Reporting) system and can be downloaded here. It contains 7,941,282 entries covering 16 years, with the following columns:

  1. ID: Unique identifier for the crime incident.
  2. Case Number: Records Division Number assigned to the incident.
  3. Date: Date when the incident occurred.
  4. Block: Block address of the incident.
  5. IUCR: The Illinois Uniform Crime Reporting code.
  6. Primary Type: The primary description of the crime according to the IUCR code.
  7. Description: Secondary description of the incident.
  8. Location Description: Description of the location where the incident occurred.
  9. Arrest: True if an arrest was made, False otherwise.
  10. Domestic: True if the incident was domestic-related, False otherwise.
  11. Beat: The smallest police geographic area; each beat has a dedicated police beat car.
  12. District: District where the incident occurred.
  13. Ward: The City Council district where the incident occurred.
  14. Community Area: Chicago has 77 community areas.
  15. FBI Code: Crime classification according to FBI’s National Incident-Based Reporting System (NIBRS).
  16. X Coordinate: The x coordinate of the incident.
  17. Y Coordinate: The y coordinate of the incident.
  18. Year: Year the incident occurred.
  19. Updated On: Date and time the record was last updated.
  20. Latitude: The latitude of the incident.
  21. Longitude: The longitude of the incident.
  22. Location: The location of the incident.
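
A minimal sketch of loading these records with pandas. It assumes the CSV export uses the column names listed above and timestamps shaped like `01/01/2016 11:00:00 PM`; both are assumptions about the file. The two sample rows are made up so the example runs on its own, and in practice the `StringIO` buffer would be replaced by the path to the downloaded file.

```python
import io
import pandas as pd

# Two made-up rows standing in for the real CSV export (hypothetical values).
sample_csv = io.StringIO(
    "ID,Case Number,Date,Primary Type,Arrest,Domestic,District\n"
    "1,HX000001,01/01/2016 11:00:00 PM,THEFT,False,False,8\n"
    "2,HX000002,01/02/2016 09:30:00 AM,BATTERY,True,True,11\n"
)

crimes = pd.read_csv(sample_csv)

# Assumed timestamp layout: month/day/year with a 12-hour clock.
crimes["Date"] = pd.to_datetime(crimes["Date"], format="%m/%d/%Y %I:%M:%S %p")

# Index by timestamp so the records can later be resampled by day, month, or year.
crimes = crimes.set_index("Date").sort_index()
print(crimes[["Primary Type", "Arrest", "Domestic"]])
```

Parsing `Date` into real timestamps up front is what makes the per-year and per-day counts used later in the post straightforward.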

Note: This study is strictly for educational purposes and is meant to help individuals learn to analyze data. For this reason, I have used data from 2001 to 2016 rather than more recent data.

FbProphet

Facebook Prophet is a time series forecasting model that is, at its core, an additive regression model. It works well with data that has strong seasonal effects and several seasons of history. Prophet is also robust to outliers, missing data, and dramatic changes in the time series. Compared with other forecasting models, Prophet offers two main advantages: it makes producing a reasonable, accurate forecast much more straightforward, and its forecasts can be customized in ways that are intuitive to non-experts. The four main components of FbProphet are:

  • A piecewise linear or logistic growth curve trend. Prophet automatically detects changes in trends by selecting changepoints from the data.
  • A yearly seasonal component modeled using Fourier series.
  • A weekly seasonal component using dummy variables.
  • A user-provided list of important holidays.
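
To make the additive structure concrete, here is a rough sketch of how a piecewise linear trend and a Fourier-series seasonal term add up to a prediction. This is an illustration of the idea, not Prophet's actual implementation: the function names and every parameter value are hypothetical, and the weekly and holiday components are left out for brevity.

```python
import math

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend g(t): base slope k and offset m, with the slope adjusted by
    deltas[j] once t has passed changepoints[j]."""
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            offset -= s * delta  # keeps the line continuous at the changepoint
    return slope * t + offset

def fourier_seasonality(t, coefficients, period=365.25):
    """Seasonal term s(t): a truncated Fourier series, with one (sin, cos)
    coefficient pair per harmonic."""
    total = 0.0
    for n, (a, b) in enumerate(coefficients, start=1):
        angle = 2.0 * math.pi * n * t / period
        total += a * math.sin(angle) + b * math.cos(angle)
    return total

def additive_model(t, k, m, changepoints, deltas, coefficients):
    """Prediction = trend + yearly seasonality (other components omitted)."""
    trend = piecewise_linear_trend(t, k, m, changepoints, deltas)
    return trend + fourier_seasonality(t, coefficients)

# Hypothetical parameters: the trend rises until day 1000, then falls.
params = dict(
    k=0.5, m=1000.0,
    changepoints=[1000.0], deltas=[-0.8],
    coefficients=[(40.0, -15.0), (5.0, 2.0)],
)
for day in (0, 500, 1000, 1500):
    print(day, round(additive_model(day, **params), 2))
```

In a fitted model, the slope changes and Fourier coefficients are estimated from the data rather than set by hand.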

Data Analysis

As you can see in the above graph, the number of crimes increased drastically from 2006 and then started decreasing in 2011. Since 2011, the number of crimes has followed a negative trend. At this point, I propose the hypothesis that crime in Chicago will continue to decrease in 2017-2018.

The Chicago police department has been successful in decreasing the total number of crimes. The blue bars show the crimes which did not lead to an arrest and the orange bars show crimes that led to an arrest.
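
A chart like this can be built from a year-by-arrest count table. The sketch below is one way to compute it, assuming the `Year` and `Arrest` columns described earlier; the five records are made up for illustration.

```python
import pandas as pd

# Made-up records (hypothetical values) with the two columns the chart needs.
records = pd.DataFrame({
    "Year":   [2015, 2015, 2015, 2016, 2016],
    "Arrest": [True, False, False, True, False],
})

# Rows: year. Columns: Arrest == False, then Arrest == True. Cells: incident counts.
per_year = pd.crosstab(records["Year"], records["Arrest"])
per_year.columns = ["No arrest", "Arrest"]
print(per_year)

# per_year.plot(kind="bar", stacked=True) would draw the bars (requires matplotlib).
```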

The highest crime rate is seen in district 8, and the lowest in district 20. (Districts 13, 21, 23, and 31 show zero crime, which might be because of incomplete data.)

The graph shows that Chicago streets are the most unsafe, with the number of crimes on the streets being more than double the number of crimes on sidewalks.

In the following graph, you can see that theft is the most common crime in Chicago. Only 10% of thefts lead to an arrest; the rest remain unsolved.
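
An arrest share per crime type can be computed along these lines. This is a sketch on made-up records, assuming the `Primary Type` and `Arrest` columns described earlier; the numbers it prints say nothing about the real data.

```python
import pandas as pd

# Made-up records (hypothetical values), one row per incident.
records = pd.DataFrame({
    "Primary Type": ["THEFT", "THEFT", "THEFT", "THEFT", "NARCOTICS", "NARCOTICS"],
    "Arrest":       [True, False, False, False, True, True],
})

# The mean of a True/False column is the fraction of True values,
# so it gives the share of incidents that led to an arrest.
summary = records.groupby("Primary Type")["Arrest"].agg(
    incidents="count", arrest_share="mean"
)
summary = summary.sort_values("incidents", ascending=False)
print(summary)
```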

The orange markers show the crimes that led to an arrest, and the blue markers show the crimes that did not. The culprit is apprehended in only a small portion of the total crimes.

This plot shows whether the crime was domestic or not: orange markers show domestic-related crimes and blue markers show non-domestic crimes.

Model Fit

Install FbProphet by following these steps:

1. conda create -n fbp python=3.8.8
2. conda activate fbp
4. conda install numpy cython -c conda-forge
5. conda install matplotlib scipy pandas -c conda-forge
6. conda install pystan -c conda-forge
7. conda install -c conda-forge fbprophet

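Fit the model on a dataframe containing each date and the number of crimes on that date. Prophet expects the date column to be named ds and the value column to be named y, so rename the columns and fit the model using the following commands: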

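from fbprophet import Prophet

# Prophet expects the date column to be named 'ds' and the target column 'y'
df = df_prophet.rename(columns={'Date': 'ds', 'Crime_count': 'y'})

prophet = Prophet()
prophet.fit(df)

# Extend the timeline 365 days past the last observation and predict over it
forecast = prophet.predict(prophet.make_future_dataframe(periods=365))
figure = prophet.plot(forecast, xlabel='Years', ylabel='Crime Rate')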
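
The post does not show how `df_prophet` is built. One plausible way, sketched here on made-up timestamps, is to count incidents per calendar day; the `ID` values and dates are hypothetical, and only the `Date` and `Crime_count` column names come from the rename step above.

```python
import pandas as pd

# Made-up incidents (hypothetical values); one row per reported crime.
incidents = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Date": pd.to_datetime([
        "2016-01-01 08:15", "2016-01-01 23:00", "2016-01-02 12:30",
        "2016-01-04 01:45", "2016-01-04 17:20", "2016-01-04 22:05",
    ]),
})

# Count incidents per calendar day; days without incidents get a count of 0.
daily = incidents.set_index("Date").sort_index()["ID"].resample("D").count()

# Turn the daily counts into a two-column frame: Date and Crime_count.
df_prophet = daily.rename("Crime_count").reset_index()
print(df_prophet)
```

Resampling rather than a plain group-by keeps empty days in the series as explicit zeros instead of dropping them.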

Results

The graph shows a negative trend in 2017–18. This is in line with our initial hypothesis that crime in Chicago would continue to decrease.

Furthermore, FbProphet can plot the individual components of the forecast:

figure = prophet.plot_components(forecast)

The graph shows the highest crime rate from March 1st to May 1st and the lowest crime rate from November 1st to March 1st.
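
A seasonal pattern read off the component plot can be cross-checked against the raw counts by averaging them per calendar month. The sketch below assumes a frame with the `ds` and `y` columns used for fitting; the series is synthetic, with an arbitrary peak chosen only so the example runs on its own, and it says nothing about the real data.

```python
import math
import pandas as pd

# Synthetic daily counts over two years (hypothetical values, arbitrary peak).
days = pd.date_range("2015-01-01", "2016-12-31", freq="D")
counts = [
    700 + 100 * math.cos(2 * math.pi * (d.dayofyear - 105) / 365.25)
    for d in days
]
df = pd.DataFrame({"ds": days, "y": counts})

# Average daily count for each calendar month (1 = January ... 12 = December).
monthly_mean = df.groupby(df["ds"].dt.month)["y"].mean()
print(monthly_mean.round(1))
print("highest month:", monthly_mean.idxmax(), "lowest month:", monthly_mean.idxmin())
```

If the months that stand out here disagree with the yearly component plot, that is a sign to revisit either the aggregation or the model settings.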

Analytics Vidhya
