Semester Project: A Circle back to DC’s Parking Enforcement

Wadi Ahmed
INST414: Data Science Techniques
May 15, 2024

In my first post, I shared some insights about D.C.’s parking enforcement. Here, I finish the job.

Really, I feel like some of these signs are made to be confusing…

At the beginning of this class, my first post looked at D.C. and the patterns and impacts I could find in its data. I wanted to come back to parking enforcement to see whether lessons from this class could uncover patterns that help drivers avoid tickets (especially drivers like me!).

Background and Stakeholders

D.C.’s ticketing system only grows more egregious, with one speed camera racking up more than $3.8 million in revenue in the first half of 2023. As the city looks for more revenue, stricter enforcement is a natural consequence. With more than $300 million in tickets in the hands of Maryland and Virginia drivers, the D.C. government has recently started cracking down on offenders in the District. Most recently, it has placed cameras on bus lanes to catch people who drive or park in them, as part of the Clean Roads Initiative meant to improve bus travel times. These fines go up to $100, and under D.C. rules they can quickly double.

One disclaimer before going further: the goal of this post is not to help anyone evade the law. If you break the rules, consequences will follow. However, these laws don’t reflect some realities of the District: parking is already scarce, and delivery drivers, rideshare drivers, and people like them risk tickets that cut into their income. So the main question I want to answer for these drivers is: can we predict when officials enforce parking rules, and if so, when are those times? The answer can help drivers pick the best times to travel into the District, often the off-hours when traffic isn’t as bad. It could also reduce the number of cars in the District at a given time (yes, for slightly unethical reasons).

Dataset

The ideal dataset is the one I had already collected, which contains important information such as:

  • Agency Information: the agencies responsible for issuing tickets
  • Latitude and Longitude: geospatial data is crucial for understanding how parking tickets are distributed across the D.C. area
  • Temporal Data: time-related attributes (date, time) were essential for finding the times with the least enforcement

All of this data came from Open Data DC, a government website that publishes D.C. data for planning, records, and reports to the public. To clean the data, I removed all N/A values and then removed all non-D.C. agencies (federal police, private parking, etc.), since the focus was on local D.C. government. The time was then converted from an integer format to a datetime stamp as well as Unix time (to make time calculations easier for K-means).
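The cleaning steps above can be sketched in a few lines of plain Python. This is only an illustration, not the code from the project: the field names (issue_date, issue_time, agency_code), the HHMM reading of the integer time, the use of UTC, and the sample rows (including agency code 99) are all assumptions made for the example.

```python
from datetime import datetime, timezone

# Agency codes listed in this post as local D.C. government.
DC_AGENCY_CODES = {1, 2, 3, 4, 5, 6, 7, 10, 13, 14, 15, 21, 25, 32}


def clean_tickets(rows):
    """Drop incomplete or non-D.C. rows and add datetime, Unix time, and hour."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # remove rows with missing (N/A) values
        if row["agency_code"] not in DC_AGENCY_CODES:
            continue  # keep local D.C. agencies only
        # Assumed encoding: the integer 1435 means 14:35.
        hour, minute = divmod(row["issue_time"], 100)
        day = datetime.strptime(row["issue_date"], "%Y-%m-%d")
        # Assumption: timestamps are treated as UTC to keep the sketch simple.
        issued_at = day.replace(hour=hour, minute=minute, tzinfo=timezone.utc)
        cleaned.append({**row, "issued_at": issued_at,
                        "unix_time": int(issued_at.timestamp()), "hour": hour})
    return cleaned


# Hypothetical rows, made up for this example.
sample = [
    {"issue_date": "2024-03-01", "issue_time": 1435, "agency_code": 15},
    {"issue_date": "2024-03-01", "issue_time": 905, "agency_code": 99},  # not a D.C. code
    {"issue_date": "2024-03-02", "issue_time": None, "agency_code": 1},  # missing time
]
tickets = clean_tickets(sample)
print(tickets)
```

Only the first sample row survives: the second has an agency code outside the D.C. list and the third is missing its time.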

Methods

This extension of the project relied heavily on data analysis and machine learning methods, most notably K-means clustering, to understand patterns in parking enforcement activity across D.C.

K-means clustering is an unsupervised machine learning algorithm that partitions a dataset into groups based on similarity in features. It aims to minimize the within-cluster sum of squares. I chose K-means for its strengths in analyzing enforcement activity, namely:

  • Pattern Identification: K-means clustering allowed me to identify patterns in parking enforcement, namely by grouping together agencies based on the time of day they are most active.
  • Efficiency: K-means clustering is computationally efficient and can handle large datasets such as this one.
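To make the "within-cluster sum of squares" idea concrete, here is a minimal sketch of the quantity K-means tries to make small. The function name wcss and the toy hours are made up for illustration; a library such as scikit-learn reports the same quantity for a fitted model as inertia_.

```python
def wcss(values, labels):
    """Within-cluster sum of squares for 1-D values and their cluster labels."""
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    total = 0.0
    for members in clusters.values():
        centroid = sum(members) / len(members)  # the cluster mean
        total += sum((value - centroid) ** 2 for value in members)
    return total


# Toy ticket hours: three in the early morning, three in the afternoon.
hours = [1, 2, 3, 13, 14, 15]
sensible_grouping = [0, 0, 0, 1, 1, 1]  # early hours together, afternoon together
mixed_grouping = [0, 1, 0, 1, 0, 1]     # hours scattered across both clusters

print(wcss(hours, sensible_grouping))
print(wcss(hours, mixed_grouping))
```

The sensible grouping scores far lower than the mixed one, which is exactly the preference K-means encodes when it assigns tickets to clusters.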

In this project, K-means clustering allowed for identification of agencies with similar ticketing patterns, namely agencies that are active during specific periods of the day.

By leveraging K-means clustering, the project helped uncover insights into the temporal behavior of parking enforcement agencies in the D.C. area, which could potentially inform decision-making for individuals trying to keep their wallets safe.

Analysis

The first step was to clean the data as described above, removing any unneeded variables that might confuse me later in the implementation. Next, I extracted the hour from each timestamp to get a flat range of times throughout the day. Before running K-means, I plotted an elbow curve to find the best number of clusters, which came out to 4. I then mapped these 4 clusters to 4 time-of-day categories: early morning (12 AM-6 AM), morning (6 AM-12 PM), afternoon (12 PM-6 PM), and night (6 PM-12 AM). These were then compared against the numerical agency codes corresponding to D.C. government, which are:

1 - Metropolitan Police Department (District 1 - Capital Area, SW DC)
2 - Metropolitan Police Department (District 2 - NW DC)
3 - Metropolitan Police Department (District 3 - Center of NW and NE DC)
4 - Metropolitan Police Department (District 4 - NE DC)
5 - Metropolitan Police Department (District 5 - NE DC)
6 - Metropolitan Police Department (District 6 - SE DC)
7 - Metropolitan Police Department (District 7 - SE DC)
10 - Office of the Chief of Police of MPD
13 - District Department of Transportation
14 - MPD Property Division
15 - D.C Department of Public Works
21 - MPD Reserve Corps
25 - MPD Special Operation of Division and Traffic
32 - MPD Special Inspection Division
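The elbow step described before the agency list can be sketched as follows. This is a simplified stand-in rather than the project's code: kmeans_1d is a tiny hand-written version of the algorithm with a deterministic starting point, time_of_day is a hypothetical helper for the four categories, and the toy hours are invented. The real analysis would more likely use a library implementation on the full dataset.

```python
def kmeans_1d(values, k, max_iters=50):
    """Tiny 1-D Lloyd's algorithm. Returns (centroids, within-cluster sum of squares)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the k initial centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: (v - centroids[j]) ** 2)
            groups[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break  # assignments have stopped changing
        centroids = updated
    wcss = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, wcss


def time_of_day(hour):
    """Map an hour (0-23) to the four categories used in this post."""
    return ["early morning", "morning", "afternoon", "night"][hour // 6]


# Made-up ticket hours bunched around 2 AM, 9 AM, 2 PM, and 9 PM.
toy_hours = [1, 2, 3, 8, 9, 10, 13, 14, 15, 20, 21, 22]
elbow = {k: kmeans_1d(toy_hours, k)[1] for k in range(1, 6)}
for k, score in elbow.items():
    print(k, score)
```

On this toy data the score drops sharply up to k = 4 and barely improves at k = 5, which is the kind of "elbow" that suggests 4 clusters. Real ticket data will be noisier, so the bend is usually read off a plot rather than computed.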

With this in place, I ran K-means to group these numbers together, and these were the results:

Cluster Legend:

Cluster 1 - Green (10,599 tickets) - Between 12:00 PM - 4:00 PM
Cluster 2 - Red (55,745 tickets) - Between 6:00 AM - 11:30 AM
Cluster 3 - Black (35,871 tickets) - Between 7:00 PM - 12:00 AM
Cluster 4 - Blue (1,581 tickets) - Between 12:00 AM - 5:00 AM

K-Cluster Analysis between Hour of Day and Agency Code

This gave me a picture of when enforcement takes place on an average day in the District. At first sight, one notable thing about this graph is that the vast majority of enforcement is done by the Metropolitan Police Department, which usually just involves a ticket, whereas the Department of Public Works or DDOT may also tow. Using the longitude and latitude features, I then created 8 clusters to see how these agencies’ tickets are distributed by location.

The color scheme to interpret the chart:

Cluster 0 (23,834 tickets) - Red
Cluster 1 (10,730 tickets) - Blue
Cluster 2 (5,970 tickets) - Green
Cluster 3 (4,896 tickets) - Purple
Cluster 4 (23,084 tickets) - Orange
Cluster 5 (11,438 tickets) - Dark Red
Cluster 6 (20,447 tickets) - Black
Cluster 7 (2,783 tickets) - Beige

Clustered Area of Tickets based off of location

The clusters fit quite comfortably with the D.C. wards. From this graph and the number of entries per cluster, I was able to deduce the trend in how many tickets each agency hands out. I evaluated these clusters by the number of tickets in each of them.

What does this mean?

In conducting this rudimentary analysis, I was able to find some trends and patterns in how ticketing works in an average month in D.C. One big disclaimer: these are only estimates based on an average month of D.C. parking enforcement, and they do not guarantee that you will not get a ticket.

Looking at time of day, the most notable agency ticketing around the clock is MPD, whose patrols run throughout the city. The Department of Public Works and the District Department of Transportation operate similarly, though not with the same frequency. Most tickets are given out between 6 AM-12 PM and 7 PM-12 AM, when there’s a massive rush-hour exodus out of the city as well as people heading out of town. Per agency, the picture is vaguer; however, DPW and DDOT operate more in the mornings until 2 PM and again later at night, most likely to carry out towing operations when there is less traffic. The majority of the outliers that cross over come from different MPD departments (Special Operations, etc.) that ticket based on opportunity rather than active patrolling.

For location, Clusters 5 and 6, both located in downtown D.C., together account for the largest share of tickets, followed by the Northeast and Northwest divisions of D.C. in Cluster 4 and Cluster 0, respectively. The Northeast corridor coming down Route 1 and Rhode Island Ave. (just down the street from UMD), as well as Southeast D.C., have the fewest tickets. These areas account for only about 7.4% of tickets, compared to 22.24% for Cluster 4 in NE D.C., which gives a rough sense of your relative chance of being ticketed there.
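As a rough check on those percentages, the arithmetic can be sketched as below. Two things here are assumptions rather than stated facts: that the denominator is 103,796 (the total across the four time-of-day clusters, which reproduces the quoted figures), and that the low-ticket areas correspond to Clusters 3 and 7, the two smallest location clusters.

```python
# Ticket counts per location cluster, taken from the legend above.
cluster_counts = {0: 23834, 1: 10730, 2: 5970, 3: 4896,
                  4: 23084, 5: 11438, 6: 20447, 7: 2783}

# Assumed denominator: total tickets across the four time-of-day clusters.
total_tickets = 10599 + 55745 + 35871 + 1581


def share(count, total):
    """Percentage of all tickets, rounded to two decimals."""
    return round(100 * count / total, 2)


busiest_share = share(cluster_counts[4], total_tickets)
# Assumption: the quietest areas are Clusters 3 and 7 combined.
quietest_share = share(cluster_counts[3] + cluster_counts[7], total_tickets)

print(busiest_share, quietest_share)
```

These shares describe where tickets were issued, not the probability that any one parked car gets ticketed, so they are best read as relative risk between areas.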

Limitations and Conclusions

This analysis centered on identifying trends to minimize the likelihood of receiving a parking ticket in Washington, D.C. Using K-means clustering, we dug into parking enforcement data published by D.C. to discern which agencies issue the most tickets, when those agencies operate, and which areas have the highest probability of ticketing. As part of preprocessing, we converted integer time values to a standardized Unix format, which made precise temporal analysis possible. This transformation helped identify peak ticketing times and hotspots across the city, providing useful insights for the avoidance strategies of people such as DoorDash and Uber drivers.

However, it’s crucial to acknowledge the limitations of our project. These include potential biases in the data, such as underrepresentation of certain demographics or neighborhoods, which could limit how well our findings generalize. Moreover, ethical considerations around data privacy and fairness, as well as the legal issues that come with helping drivers avoid tickets, might make this analysis less appealing to some. These issues must be carefully addressed to keep outcomes equitable and the data as robust as it needs to be. Despite these limitations, our analysis offers valuable insights for stakeholders seeking to improve parking management policies and reduce the risk of parking violations in Washington, D.C.

GitHub Link for Code: https://github.com/CaptFalc/INST414-Final-Project
