DATA STORIES | GEOSPATIAL ANALYTICS | KNIME ANALYTICS PLATFORM

How to Analyze Criminal Patterns in Baltimore (2010–2016) Using KNIME Analytics Platform

Helping law enforcement and policymakers make data-driven decisions to improve public safety

Abdullahjafri
Low Code for Data Science

--

Introduction

Baltimore, Maryland, is a major city known for its rich history and vibrant cultural scene. However, like many urban areas, it faces challenges related to crime. Understanding criminal patterns is crucial for enhancing public safety and effectively allocating resources. This article demonstrates how KNIME Analytics Platform can analyze crime data from Baltimore between 2010 and 2016, revealing key insights and patterns.

KNIME, a powerful open-source tool, enables users to perform complex data analyses using a low-code approach. This makes advanced data science techniques accessible to a wider audience, including those without extensive programming skills. By leveraging KNIME, we can efficiently process and visualize large datasets to uncover valuable insights.

Objective

The primary objective of this analysis is to identify and understand the patterns of criminal activity in Baltimore over a six-year period. By leveraging KNIME’s powerful data processing and visualization tools, we aim to uncover trends, hotspots, and other valuable insights that can aid law enforcement and policymakers in making data-driven decisions to improve public safety.

Dashboard & Workflow Overview

You can download the workflow and the report from the KNIME Community Hub.

Data Import

We begin by importing the crime dataset into KNIME using the “CSV Reader” node. This node allows us to easily load and manage large datasets.
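For readers more comfortable with code, the same import step can be sketched in pandas. The column names below come from the article; the sample rows are made up purely for illustration.

```python
import io
import pandas as pd

# A tiny stand-in for the Baltimore crime CSV; column names follow the
# article, the rows are hypothetical.
csv_text = """CrimeDate,CrimeTime,Description,Weapon,Location 1
01/01/2011,13:30:00,LARCENY,,"(39.2904, -76.6122)"
01/02/2011,08:15:00,AGG. ASSAULT,KNIFE,"(39.2974, -76.5929)"
"""

# KNIME's "CSV Reader" node corresponds roughly to pandas.read_csv.
crimes = pd.read_csv(io.StringIO(csv_text))
```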

Data Cleaning

Data cleaning is essential for accurate analysis. We use a metanode (a collection of multiple nodes) to split and manipulate columns, filter out unwanted data, add geometric location points, and convert data types, ensuring everything is in the right format for our analysis.

1. First, we split the “Location 1” column into the “Location1_Arr[0]” and “Location1_Arr[1]” columns, enabling us to perform geospatial analysis on our data.

2. Next, we remove special characters using the “String Manipulation” node.

3. We then typecast the “CrimeDate” column from string to date using the “String to Date&Time” node.

4. Just as we did with “CrimeDate”, we change the type of the “CrimeTime” column from string to time using the “String to Date&Time” node.

5. Next, we change the data type of the “Location1_Arr[0]” and “Location1_Arr[1]” columns from string to number using the “String to Number” node.

6. We rename “Location1_Arr[0]” to “Latitude” and “Location1_Arr[1]” to “Longitude” using the “Column Renamer” node.

7. We replace missing values in the “Weapon” column with “Unknown” using the “Column Expression” node, and apply the same approach to the “CrimeTime” column.

8. We will now filter out unnecessary columns.

9. We remove rows with missing values, using the “Latitude” column as the basis.

10. We then clean up the messy values in the “Inside/Outside” column.

11. To run geospatial analysis on our data, we transform the “Latitude” and “Longitude” columns into a “geometry” column using the “Lat/Lon to Geometry” node.

With these steps, our data is ready for in-depth analysis.
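The cleaning steps above can be sketched in pandas as a rough low-code-to-code translation. The column names are taken from the article; the sample rows, the exact special characters stripped, and the “Inside/Outside” label mapping are assumptions for illustration (KNIME’s “Lat/Lon to Geometry” node produces a true geometry column, approximated here by plain coordinate pairs).

```python
import io
import pandas as pd

# Hypothetical sample rows; real data comes from the Baltimore crime CSV.
csv_text = """CrimeDate,CrimeTime,Weapon,Inside/Outside,Location 1
01/01/2011,13:30:00,,I,"(39.2904, -76.6122)"
01/02/2011,08:15:00,KNIFE,Outside,"(39.2974, -76.5929)"
01/03/2011,22:05:00,FIREARM,O,
"""
crimes = pd.read_csv(io.StringIO(csv_text))

# 1. Split "Location 1" into two parts.
loc = crimes["Location 1"].str.split(",", expand=True)
crimes["Location1_Arr[0]"], crimes["Location1_Arr[1]"] = loc[0], loc[1]

# 2. Strip special characters ("String Manipulation" node).
for col in ("Location1_Arr[0]", "Location1_Arr[1]"):
    crimes[col] = crimes[col].str.strip(" ()")

# 3-4. Typecast date and time columns ("String to Date&Time" node).
crimes["CrimeDate"] = pd.to_datetime(crimes["CrimeDate"], format="%m/%d/%Y")
crimes["CrimeTime"] = pd.to_datetime(crimes["CrimeTime"], format="%H:%M:%S").dt.time

# 5. Coordinates from string to number ("String to Number" node).
for col in ("Location1_Arr[0]", "Location1_Arr[1]"):
    crimes[col] = pd.to_numeric(crimes[col])

# 6. Rename the coordinate columns ("Column Renamer" node).
crimes = crimes.rename(columns={"Location1_Arr[0]": "Latitude",
                                "Location1_Arr[1]": "Longitude"})

# 7. Fill missing weapons ("Column Expression" node).
crimes["Weapon"] = crimes["Weapon"].fillna("Unknown")

# 9. Drop rows without coordinates, using "Latitude" as the basis.
crimes = crimes.dropna(subset=["Latitude"])

# 10. Normalise the messy "Inside/Outside" labels (assumed mapping).
crimes["Inside/Outside"] = crimes["Inside/Outside"].replace(
    {"I": "Inside", "O": "Outside"})

# 11. Build a geometry-like (lon, lat) pair; KNIME's "Lat/Lon to
# Geometry" node yields a proper geometry column instead.
crimes["geometry"] = list(zip(crimes["Longitude"], crimes["Latitude"]))
```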

Data Analysis and Visualization

Report Creation

To create a report, we use the “Report Template Creator” node, which lets us set the report’s page size and orientation (A4 paper size).

Dashboard Creation

We combine all visualizations into a dashboard using a component, which lets us add text, graphics, buttons, and input fields. A “Value Selection” node collects the user input (the year and the district name), a “Refresh Button Widget” submits the response, and a “Text View” node displays the insights obtained from the analysis as the report’s text.

To conduct a thorough analysis, we:

1. Extract time elements: Extract the day, date, month, and year to obtain more insightful data.

2. Convert types: Convert the year column from number to string using the “Number to String” node.

3. Filter data: Filter the data using the user-supplied inputs.
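The three steps above can be sketched in pandas. The “Year”/“Month”/“Day” column names and the selected widget values are assumptions used only for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for the cleaned crime table.
crimes = pd.DataFrame({
    "CrimeDate": pd.to_datetime(["2011-08-05", "2013-02-14", "2011-07-01"]),
    "District": ["NORTHEASTERN", "CENTRAL", "NORTHEASTERN"],
})

# 1. Extract time elements from the date column.
crimes["Year"] = crimes["CrimeDate"].dt.year
crimes["Month"] = crimes["CrimeDate"].dt.month_name()
crimes["Day"] = crimes["CrimeDate"].dt.day_name()

# 2. Year as string, so it can match the selection widget's value
#    (the "Number to String" node in KNIME).
crimes["Year"] = crimes["Year"].astype(str)

# 3. Filter on the user-selected year and district (assumed inputs).
selected_year, selected_district = "2011", "NORTHEASTERN"
subset = crimes[(crimes["Year"] == selected_year)
                & (crimes["District"] == selected_district)]
```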

We use the “GroupBy” node to extract insights such as “Weapon used,” “Top 20 most dangerous neighbourhoods,” and “Most recorded criminal activities by district.” Visualizations like bar charts, pie charts, and geospatial views help us understand the data better. Once all the nodes have been combined, we select the Open Layout Editor button in the top header to arrange the insights on our dashboard as needed.
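The “GroupBy” aggregations can be approximated in pandas as follows; the sample counts are invented for illustration only.

```python
import pandas as pd

# Hypothetical slice of the cleaned crime table.
crimes = pd.DataFrame({
    "Weapon": ["FIREARM", "KNIFE", "FIREARM", "Unknown"],
    "District": ["NORTHEASTERN", "NORTHEASTERN", "CENTRAL", "CENTRAL"],
})

# KNIME "GroupBy" node: count incidents per weapon, most frequent first.
weapon_counts = (crimes.groupby("Weapon").size()
                 .sort_values(ascending=False))

# Count per district, which would feed the bar- and pie-chart views.
district_counts = crimes["District"].value_counts()
```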

This will show a view similar to the image below.

Findings

Here are the insights extracted from our data:

1. Our findings reveal that most criminal activities occurred in 2011, followed by 2013 and 2012, while the fewest crimes were reported in 2016.

2. Among the months, August had the highest number of crimes reported with 26,560 incidents, followed closely by July with 26,473 incidents. February had the lowest number of reported crimes, with just 17,974 incidents.

3. Most of the crimes occurred between Baltimore Penn Station and Johns Hopkins Hospital.

4. In terms of days of the week, Friday had the highest number of crimes, while Sunday had the lowest.

5. Downtown was identified as the most dangerous neighbourhood in the city, whereas Oliver, with 2,400 reported crimes, was the safest.

6. Regarding districts, the Northeastern district had the worst record, with 44,832 crimes reported throughout the years. In contrast, Gay Street was the safest.

7. The most frequently used criminal code was 4E, followed by 6D.

8. Additionally, our data shows that approximately 49% of the reported incidents occurred inside.

Report Export

Once the dashboard or report is created, we save it in PDF format using the “Report PDF Writer” node and in HTML form using the “Report HTML Writer” node. This allows us to share the report with colleagues and other stakeholders.
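As a code-level analogue of the HTML export, a summary table can be written out as an HTML fragment with pandas; the table contents below are partly invented for illustration (only the Northeastern figure comes from the findings above).

```python
import pandas as pd

# Hypothetical summary table produced by the analysis; the CENTRAL
# figure is a placeholder, not a real count.
summary = pd.DataFrame({
    "District": ["NORTHEASTERN", "CENTRAL"],
    "Crimes": [44832, 12345],
})

# KNIME's "Report HTML Writer" renders the full dashboard; a rough
# stand-in is exporting the summary table as HTML for sharing.
html = summary.to_html(index=False)
with open("crime_report.html", "w") as f:
    f.write(html)
```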

Conclusion

This analysis demonstrates how the KNIME Analytics Platform can be used to gain valuable insights into criminal patterns in Baltimore. By understanding these patterns, law enforcement and policymakers can make data-driven decisions to improve public safety. Future work could include expanding the analysis to other cities or incorporating additional data sources for a more comprehensive view.

Thank you for reading this article. If you have any questions or suggestions, please feel free to contact me. If you enjoyed this post, please share it with your KNIME buddies.

Thank you, Roberto Cadili and @LowCodeforDataScience, for giving me the opportunity to share my blog.
