Better Crime Discovery
Visual analysis for understanding crime.
By Sagar Chand and Tanishq Kaushik
Has someone told you that crime in your locality is increasing? If so, how can you be certain? And how can you know whether all kinds of crime are increasing or only specific ones? Crime in a city spans many offenses across many categories. Understanding and exploring crimes within those categories can be difficult, especially when you do not know what you are looking for. If crime is indeed rising, how do you know whether it would affect you, your family, your business, or your visiting family and friends?
Crime data is easy to access today. For example, you can find it on the FBI website or on various other platforms. These platforms offer features similar to the FBI’s Crime Data Explorer, which presents multiple visualizations that you can filter. However, these platforms and the FBI’s website only allow you to filter the data on one feature and offer little customizability for viewing other aspects of the data. This limits the information that a member of the general public can extract from those visualizations.
Aside from the preset visualizations on these websites, the general public also has easy access to the underlying crime data. However, there is a major issue: the data is generally available only in raw form, which most people cannot use because they lack data analysis skills. So while the data is out there for anyone to access, extracting insights from it is a problem of its own. An individual cannot view the specific data they are interested in or the crime statistics they would like to be aware of. What, then, can we do to help the general public not only know which information and statistics they should be interested in but also explore the crime data and extract insights based on their own queries?
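To make "filtering on more than one feature" concrete, here is a minimal Python sketch. It is not the code behind our dashboards. The column names (`zip_code`, `category`, `year`) and the records themselves are invented for illustration, and a real crime data set will be structured differently.

```python
# Toy records invented for illustration; real crime data will have
# different columns and values.
RECORDS = [
    {"zip_code": "00001", "category": "Theft", "year": 2021},
    {"zip_code": "00001", "category": "Assault", "year": 2021},
    {"zip_code": "00001", "category": "Theft", "year": 2022},
    {"zip_code": "00002", "category": "Theft", "year": 2022},
]


def filter_records(records, **criteria):
    """Return the records whose fields match every keyword criterion."""
    return [
        row
        for row in records
        if all(row.get(field) == value for field, value in criteria.items())
    ]


# Filter on two features at once: location and crime category.
thefts_in_00001 = filter_records(RECORDS, zip_code="00001", category="Theft")
print(len(thefts_in_00001))  # prints 2
```

Each extra keyword argument narrows the result further, which is the kind of combined filtering that single-feature visualizations do not offer.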
For our project, we initially created multiple AWS QuickSight dashboards that let the general public easily explore specific data. However, this did not solve the overall problem. Even with dashboards for exploring the data, users are likely to get overwhelmed and then discouraged. We first discovered this through user evaluations of our prototype: although we provided the ability to select specific data to filter and explore, users were overwhelmed by the options, explored only a little, and then gave up because there was so much to explore and no starting point. After further research, we found that while we as data scientists are accustomed to exploring data and extracting insights, the general public is more interested in specific scenarios pertaining to their own situation, in which they can use the dashboards to extract information and insights about trends.
Therefore, after learning what the general public wants, we broadened our scope to include dashboards for the use cases we developed. We created three use case scenarios, each aimed at a different group of individuals. The first revolves around a family with children moving to a new zip code and exploring crime data that matches their concerns, such as school shootings, kidnapping, car theft (so that they know where to park), and various other types of crime. The second revolves around a business owner deciding to open a new shop in a specific zip code or location. It surfaces the crimes that a business owner should be aware of, such as shoplifting, breaking and entering, etc. The last revolves around a tourist traveling to a new city or location. Here we provide the specific crimes that a tourist would be interested in, as well as a map that they can explore to see which streets are safe for walking or parking and which are too dangerous.
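One way to think about a use case is as a saved filter: a named set of crime types applied to a chosen zip code. The sketch below illustrates that idea in Python. It is an assumption about how such presets could be modeled, not a description of how the dashboards are built. The `USE_CASES` mapping, the field names, and the sample incidents are all hypothetical; the tourist scenario is left out because the text above does not name its crime types.

```python
from collections import Counter

# Hypothetical presets: each use case highlights a set of crime types.
# The crime types come from the scenarios above; names are illustrative.
USE_CASES = {
    "family": {"school shooting", "kidnapping", "car theft"},
    "business": {"shoplifting", "breaking and entering"},
}

# Invented toy incidents, not real data.
INCIDENTS = [
    {"zip_code": "00001", "crime_type": "car theft"},
    {"zip_code": "00001", "crime_type": "shoplifting"},
    {"zip_code": "00002", "crime_type": "kidnapping"},
    {"zip_code": "00001", "crime_type": "car theft"},
]


def counts_for_use_case(incidents, use_case, zip_code):
    """Count incidents in one zip code, limited to a use case's crime types."""
    relevant = USE_CASES[use_case]
    return Counter(
        row["crime_type"]
        for row in incidents
        if row["zip_code"] == zip_code and row["crime_type"] in relevant
    )


print(counts_for_use_case(INCIDENTS, "family", "00001"))
```

A preset like this gives the user a starting point: they pick a scenario and a location instead of choosing from every available column.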
Through another round of user evaluations, we found that use cases help users by giving them examples of how to filter the data based on their own needs. However, there were still limitations: users did not fully understand the columns they could filter on, because the structure of the crime categories can be difficult to understand. Based on this, we altered our design to provide comments next to each visualization in the dashboards, explaining which features of the platform users can use and what insights they can gather. However, this did not feel like enough, as it did not bridge the gap in understanding the data structure.
To bridge this gap, we altered our project again to provide contextual information about what the main columns in the data set mean and what they are useful for. We also explained the breakdown of the crime categories and how the levels of the category hierarchy relate to one another. In addition, we added the use cases to a Google Colaboratory notebook, where we can discuss in depth the findings that the use cases illustrate and how they could be used. Adding these features to our design process and dashboards allowed us to bridge the gap in contextual understanding of the data structure. However, this did not resolve every shortcoming, as there was still some confusion regarding the dashboards.
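A crime category hierarchy can be pictured as broad categories that each contain specific offenses, with counts for the offenses rolling up into their parent. The Python sketch below illustrates that relationship. The two-level `HIERARCHY` and the counts are hypothetical and chosen only for illustration; the actual data set may group and name its categories differently.

```python
# Hypothetical two-level hierarchy: broad category -> specific offenses.
# The real data set's categories may be grouped and named differently.
HIERARCHY = {
    "Property crime": ["car theft", "shoplifting", "breaking and entering"],
    "Violent crime": ["kidnapping"],
}


def parent_of(offense):
    """Return the broad category that contains the given offense."""
    for parent, children in HIERARCHY.items():
        if offense in children:
            return parent
    return "Uncategorized"


def roll_up(offense_counts):
    """Sum offense-level counts into totals per broad category."""
    totals = {}
    for offense, count in offense_counts.items():
        parent = parent_of(offense)
        totals[parent] = totals.get(parent, 0) + count
    return totals


# Invented counts, for illustration only.
print(roll_up({"car theft": 3, "shoplifting": 2, "kidnapping": 1}))
# prints {'Property crime': 5, 'Violent crime': 1}
```

Seeing the hierarchy this way explains why a filter on a broad category returns more incidents than a filter on any one of its offenses.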
During our final user evaluation, we discovered that although we had provided comments next to each visualization in the dashboards, users were not taking advantage of the platform's functionality to better explore the data. We had assumed that a document providing the context and use cases would enable users to explore the data easily; we were wrong. We also discovered that even after reading and understanding the use cases, users did not feel that their information need had been fulfilled. Based on this feedback, we added a tutorial dashboard whose comments next to the visualizations focus only on explaining the platform's functionality, so that users can easily filter the data.
Finally, in the hope that users would see the difference in their understanding of crime data before and after using our product, we decided to show the exploration dashboard first so that they can experience their initial lack of understanding. Users are then recommended to view the Google Colaboratory notebook to get a good understanding of the structure of the data and to familiarize themselves with the use cases. Next, they explore the tutorial dashboard to learn how to use the platform to filter and explore the data, and finally they look at the use case dashboards we created. After these steps, when users view the exploration dashboard again, they can notice how lost they felt when they first looked at the exploration page compared with how confident they feel after the notebook, tutorial, and use cases. This would then show that they were able to bridge the gap in understanding and can efficiently explore the data they want insight on.
Based on the findings we made during the development stage of the project, we altered our design process to fulfill our goal: providing a platform and dashboards that not only let users filter data based on their preferences but also bridge the gap in their understanding of crime data, so that they can explore the data independently, without help from data specialists.