Data and Place: School and Crime in Chicago

Beckett Dumo, Hampton Iverson, Adriana Rivera, Tahira Zafar

Mar 16, 2024
Photo by Tim Gouw on Unsplash

Abstract

Data can offer a perspective on a place that other methods can’t. We set out to collect and analyze data on a location and visualize its characteristics in a dashboard. Using Chicago as our location, we compared how safe students perceive their surroundings to be, as reported in surveys, with how safe those surroundings actually are, as reflected in crime data. To do this, we compared the school safety scores derived from the annual surveys to the number of crimes in each area. School safety scores are calculated from numerous factors, including incident reports, environment and support scores, and engagement scores. Geographical data lets people connect the data to the place spatially, so we put this data on a map of Chicago. We found that students' perceptions of safety generally tracked reported crime in their neighborhoods, although the correlation was not especially strong.

Our Final Design

Final Data Dashboard

Stage One: Map

We decided early on that Chicago could be interesting to investigate and visualize. To begin our design process, we looked through Chicago’s open data initiative for cohesive datasets. We noticed an extremely detailed crime dataset with thousands of rows and multiple location-based columns, and we began looking for datasets whose trends could be paired with crime data to create an intriguing narrative. School datasets were a recurring theme throughout the open data initiative, so we decided to look for a school dataset from the same year as our crime data. We scoured multiple datasets for columns with related, spatial data. We settled on a dataset containing all the Chicago school-level performance data used to calculate the CPS report card for each school. Its metrics included many columns from the five essential surveys that Chicago students take, along with analytics of academic performance. Survey scores were missing for several schools, so we opted to drop all rows with null values and skip malformed lines.
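The cleaning step could look roughly like the sketch below. It assumes pandas, and the inline CSV and its column names (`school_id`, `community_area`, `safety_score`) are invented stand-ins for the real CPS file. `on_bad_lines="skip"` (pandas 1.3 or later) skips rows with the wrong number of fields, and `dropna` removes rows with missing values.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CPS report card file; the real columns differ.
RAW_CSV = """school_id,community_area,safety_score
1,LOOP,62
2,AUSTIN,
3,UPTOWN,48,unexpected_extra_field
4,AUSTIN,35
"""


def load_clean_schools(source):
    """Read the school data, skipping malformed lines and null rows."""
    # Rows with too many fields ("bad lines") are skipped instead of raising.
    schools = pd.read_csv(source, on_bad_lines="skip")
    # Drop every row that still has a missing value, e.g. no survey score.
    return schools.dropna().reset_index(drop=True)


clean = load_clean_schools(io.StringIO(RAW_CSV))
print(clean)
```

Dropping rows is the simplest option, but it also removes those schools from the map, so it is worth checking how many rows are lost before and after cleaning.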

Stage Two: Sketch

Using the knowledge gained from our data exploration, we began our ideation process by sketching and theorizing approaches to a data dashboard. Using the elements we found during our mapping process, we decided that a story about crime and schools would be interesting. We used these questions to guide our story as we created higher-fidelity visualizations:

  • What regions of Chicago have more crime?
  • What crimes are the most prominent?
  • Do safety scores correlate with other survey scores?

We developed an initial sketch of what this may look like as a choropleth map:

Initial Sketch

In addition, we developed a prototype of what a whole dashboard would look like:

Full Data Dashboard Sketch

Stage Three: Decide

After sketching, we began converting our low-fidelity sketches into Altair prototypes. When we first started deciding how to visualize our datasets, one of the first hurdles was working out what interesting story they could tell and how well that story would translate into a visualization. To help guide our final designs, we sought out reviewers who could point out problems with our visualizations that we might be too biased to see. We interviewed these reviewers about our dashboard design, asking questions to help improve our initial designs. We used our first Altair prototype to learn more about how well our visualization communicated our story.

Interviews

Using this choropleth along with contextual information on the dataset, we asked reviewers to assess the effectiveness of the map and its strengths and weaknesses. Along with critiques of the choropleth, we asked about the best way to support the main map with secondary charts and annotations.

Early Altair Choropleth that we used for Interviewing

Name One: Cole Thoni

Cole commented that overlaying dots on the map worked well, and he thought the color scheme did a good job of differentiating the regions. However, he disliked how the dots themselves looked and felt they needed to be more colorful. Although the graph was easy to understand, he said a legend would go a long way in assisting comprehension. He remarked that he didn’t know much about graphs but thought some charts comparing the survey scores to each other could be interesting.

Name Two: Nico Rodriguez

Nico liked the story we were conveying and thought the idea was intriguing. After looking at the choropleth map, he didn’t feel it was too busy or confusing. He said it was simple enough to get the main information but felt it would be beneficial to have another chart or visualization that showed more of the correlation in numeric terms, maybe through a line or scatter plot. This led us to create a scatter plot showing the correlation between perceived safety (school safety scores) and actual safety (number of crimes).

Name Three: Tony Anderson

Tony liked the calming color gradient but thought the dots did not stand out enough, especially the smaller ones on darker backgrounds. He also pointed out that the dots on the east side of the map with no corresponding blue region were confusing and distracting. He noted that, without someone there to explain it, the lack of context would make the map hard to understand.

Name Four: Julie Simpson

Julie was confused by the visualization's lack of annotations and context. She said it was visually appealing but lacked the context needed to actually convey information to the viewer. After we explained the general idea of the visualization to her, she said it was an interesting concept, but the current graph did not fully convey it.

Stage Four: Prototype

Our Prototype Data Dashboard

In the prototyping phase, we added more information but kept it fairly plain. We felt that a simple aesthetic with a central color scheme would help the charts flow as one cohesive dashboard. We added additional charts to give more context to the main chart. We also changed the colors from the original so that the crime information stood out more from the school information, and we physically arranged the charts to give the viewer a visual flow to follow.

Demo Day Tests and Feedback

Demo Day gave us valuable feedback that we took into account for our final designs. The feedback generally praised the visual appeal of our dashboard and visualizations. People liked the color scheme and how the charts fit together to tell a story. The main suggestions focused on improving the clarity of our dashboard by adding text for context, omitting irrelevant graphs, and making the alternative scoring bar charts easier to read. There was also some overall confusion about what school safety scores mean, how they relate to the perceived safety of a neighborhood, and what the secondary map on the right side of the dashboard represents.

Stage Five: Final Design

After several iterations of design and feedback, we landed on this as our final design:

We added more context to our dashboard through annotations to clarify what information was important for the viewer to understand. We originally had a diverging color scheme for the choropleth map but changed it to a sequential color scheme to indicate differences between safety scores more clearly. Sequential color schemes are more suited for ordered data that has a range from one extreme to another. Since we have a range of safety scores going from low to high, we thought this scheme would be better for our data.

Our supplementary charts are all based on feedback we received. We added a scatter plot to show the correlation between safety scores and crime more clearly. An attendee at our 1-Slide pitch day also noted that seeing the most prevalent types of crime would be interesting, which led to a bar chart showing Chicago's top 7 crime types. There are many more types of crime, but we decided to show only the top 7 because research indicates that 7 to 12 bars is the ideal number and that any more would confuse the viewer.
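Picking the top 7 crime types can be sketched as a simple count-and-truncate step. The example assumes pandas and a hypothetical `primary_type` column; the counts are invented for illustration and are not the real Chicago figures.

```python
import pandas as pd

# Invented counts per crime type, expanded into one row per reported crime.
EXAMPLE_COUNTS = {
    "THEFT": 9, "BATTERY": 8, "NARCOTICS": 7, "CRIMINAL DAMAGE": 6,
    "BURGLARY": 5, "ASSAULT": 4, "ROBBERY": 3, "ARSON": 2, "STALKING": 1,
}
crimes = pd.DataFrame({
    "primary_type": [t for t, n in EXAMPLE_COUNTS.items() for _ in range(n)]
})


def top_crime_types(crimes, n=7):
    """Return the n most frequent crime types with their counts."""
    # value_counts sorts from most to least frequent; head keeps the first n.
    return crimes["primary_type"].value_counts().head(n)


top7 = top_crime_types(crimes)
print(top7)
```

The resulting series can be turned into a data frame with `reset_index()` and passed straight to a bar chart.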

We also changed the title of the bar charts on the right because the original title (“Alternative Scoring by Region”) led to some confusion. Changing it to “School Rating Metrics by Region” provides more information and context for those charts. Adding letter labels to the bottom of the bar chart also makes the map below more clearly linked to the charts.

Takeaways

Our overall takeaway was that the correlation between the average safety score per neighborhood and the number of reported crimes in 2011 was not especially strong, as seen in the scatter plot in our final design. However, the data points on the map show that the neighborhoods with lighter coloring (lower safety scores) generally have more reported crimes. The smaller neighborhoods with low safety scores skew the data somewhat: they are presumably less populated, so fewer crimes are committed there. When the data is split into regions, Regions B and E have the highest safety scores in the School Rating Metrics bar charts. They also have the highest instruction and environment scores, which could indicate a relationship between these variables. Across Chicago, theft and battery were the most common crimes.
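One way to put a number on the relationship in the scatter plot is a Pearson correlation between the average safety score and the crime count per neighborhood. The sketch below assumes pandas; the column names and values are invented, and this is not necessarily how the dashboard's figures were produced.

```python
import pandas as pd

# Invented example data: one row per school and one row per reported crime.
schools = pd.DataFrame({
    "community_area": ["A1", "A1", "A2", "A3", "A3", "A4"],
    "safety_score": [30, 40, 55, 60, 70, 80],
})
crimes = pd.DataFrame({
    "community_area": ["A1"] * 5 + ["A2"] * 4 + ["A3"] * 2 + ["A4"] * 3
})


def safety_vs_crime(schools, crimes):
    """Build one row per neighborhood: average safety score and crime count."""
    avg_safety = (
        schools.groupby("community_area")["safety_score"].mean().rename("avg_safety")
    )
    crime_count = crimes.groupby("community_area").size().rename("crime_count")
    # Align on neighborhood; areas with no reported crimes get a count of 0.
    return pd.concat([avg_safety, crime_count], axis=1).fillna({"crime_count": 0})


table = safety_vs_crime(schools, crimes)
# Series.corr defaults to Pearson; a negative value means safer-rated areas had fewer crimes.
r = table["avg_safety"].corr(table["crime_count"])
print(table)
print(round(r, 2))
```

Because raw counts favor larger, more populated neighborhoods, dividing `crime_count` by population (if available) before correlating would address the skew mentioned above.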

Resources

“HCL-Based Color Palettes.” Colorspace, colorspace.r-forge.r-project.org/articles/hcl_palettes.html#references. Accessed 15 Mar. 2024.

Frost, Adam. “Rule 17: Not Too Many Bars.” AddTwo, 5 Sept. 2023, www.addtwodigital.com/add-two-blog/2021/6/16/rule-17-not-too-many-bars.
