Visualizations

Samarth Goenka
Published in Data Divas
Dec 4, 2017

Our visualizations aim to show how accurate our model is in different areas of San Francisco, along with the rough locations of simulated crimes forecast by the model. Once we have optimized our machine learning model, we will calculate the proportion of correctly forecast crimes in each census tract and display these proportions on a choropleth map (see this article for a discussion of what constitutes heatmaps vs. choropleth maps). Once this is working, the same code should extend easily to the proportions of true positives, false negatives, etc. in each census tract.
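As a rough sketch of the per-tract calculation, the snippet below tallies a confusion matrix for each census tract and derives the proportion of actual crimes that were forecast. The record layout `(tract_id, predicted, actual)` and the function names are assumptions made for illustration, not our final pipeline, and "correctly forecast" is read here as true positives divided by all actual crimes.

```python
from collections import Counter, defaultdict


def confusion_by_tract(records):
    """Tally tp/fp/fn/tn per tract.

    `records` is a hypothetical iterable of (tract_id, predicted, actual)
    tuples, where predicted and actual are 0/1 flags for one tract-hour.
    """
    counts = defaultdict(Counter)
    for tract_id, predicted, actual in records:
        if predicted and actual:
            key = "tp"
        elif predicted:
            key = "fp"
        elif actual:
            key = "fn"
        else:
            key = "tn"
        counts[tract_id][key] += 1
    return counts


def proportion_forecast(counts):
    """Share of actual crimes that were forecast, per tract.

    Returns None for a tract with no actual crimes, since the
    proportion is undefined there.
    """
    result = {}
    for tract_id, c in counts.items():
        actual_crimes = c["tp"] + c["fn"]
        result[tract_id] = c["tp"] / actual_crimes if actual_crimes else None
    return result


# Toy data: tract "A" has two actual crimes, one of which was forecast.
records = [("A", 1, 1), ("A", 0, 1), ("A", 1, 0), ("B", 0, 0)]
counts = confusion_by_tract(records)
proportions = proportion_forecast(counts)
```

The resulting dictionary maps each tract to a single number, which is the shape a choropleth needs. Because the counts keep all four cells, the same tallies can feed maps of false negatives or false positives.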

We also thought it would be interesting to use our model to simulate future crimes. To do this, the user inputs a date between September 2003 and May 2015 (the temporal scope of our crime data). Our simulator then uses our ML model to predict the probability of a crime occurring in each census tract in the next hour, and uses a random number generator to determine whether a crime occurs in this simulated version of reality. The user also inputs the number of future periods to simulate, and we keep a running total of the number of simulated crimes in each census tract as well as the number of crimes that actually occurred. Based on this simulation, we plan to create:

  • Two dot maps, one showing the number of simulated crimes in each census tract and the other showing the number of actual crimes in each census tract
  • A choropleth map where the color scheme corresponds to the difference between forecast and actual crimes in that tract (as a rough error metric)
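The simulation loop and the rough error metric described above could look something like the following. This is a minimal sketch under stated assumptions: `predict_proba` is a hypothetical stand-in for our ML model that returns a `{tract_id: probability}` dictionary for a given period, and the tract IDs and counts in the example are made up.

```python
import random
from collections import Counter


def simulate_crimes(predict_proba, n_periods, seed=None):
    """Run the simulation for n_periods and return crimes per tract.

    `predict_proba(period)` is a hypothetical callable wrapping the model;
    it returns {tract_id: probability of a crime in that period}.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    totals = Counter()
    for period in range(n_periods):
        for tract_id, p in predict_proba(period).items():
            # A crime "occurs" when a uniform draw falls below p.
            if rng.random() < p:
                totals[tract_id] += 1
    return totals


def forecast_error(simulated, actual):
    """Simulated minus actual crime counts for every tract seen in either."""
    tracts = set(simulated) | set(actual)
    return {t: simulated.get(t, 0) - actual.get(t, 0) for t in tracts}


# Stand-in model: tract "A" always has a crime, tract "B" never does.
def fake_model(period):
    return {"A": 1.0, "B": 0.0}


simulated = simulate_crimes(fake_model, n_periods=3, seed=0)
errors = forecast_error(simulated, {"A": 1, "B": 2})
```

The `simulated` totals would drive the dot maps, and `errors` gives one signed value per tract for the choropleth: positive where the simulation overshoots, negative where it undershoots.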

We’re using the following two links to guide our design of the visualizations outlined above, using Tableau and Python respectively:

http://sensitivecities.com/so-youd-like-to-make-a-map-using-python-EN.html#.Wi3sNFQ-eCQ

Actual visualizations coming soon, stay tuned.
