It’s Data, Man: OPI², in retrospect

A summer intern’s reflection on Waze CCP, Power BI, and the future of civic governance

It’s hard to say what exactly I expected out of my summer internship at Louisville Metro’s Office of Performance Improvement and Innovation (OPI², for short). Prior to the summer, if someone asked me what my job would involve, I’d answer with some variation on “analyzing internal performance data and finding innovative ways to integrate technology into city services.” (My dad had a shorthand way to say this: “consulting.”)

It turns out that I, along with co-interns Devika, Kellen, and Parker, would focus on the second “I” in OPI², joining the Innovation half of the office. In retrospect, however, the summer was about much more than innovation. Working under Ed Blayney, Innovation Project Manager, we got a deep dive into some things (Waze data), a crash-course through other things (city government), and a new perspective on how we can achieve a future of data-driven civic governance.

First, the project I focused on: analyzing Waze traffic data in order to evaluate the impact of signal retiming projects and roadway reconfiguration projects.

Phew. Just like OPI²’s full name, that was a mouthful. Let’s backtrack.

Different Waze to analyze the data

Note: See the first part of this story where we do free traffic studies

Two years ago, Louisville Metro (the combined city-county government of Louisville and Jefferson County) joined the Waze Connected Citizens Program, giving Louisville realtime traffic data in exchange for information on planned road closures. Louisville Metro subsequently built a code processor to log this realtime data in an internal Microsoft SQL Server database. My job was to analyze all the historical data that had been logged in order to determine how traffic signal retiming projects were impacting congestion patterns along several high-traffic corridors.

My first assignment was Outer Loop, a suburban corridor that has seen increased traffic in recent years, in part due to Louisville’s sprawling population growth in the suburbs. In response, in January 2017, Brandon Shelley, a traffic engineer in the Public Works department, created a coordinated timing plan for eight signals between Interstate 65 and the Jefferson Mall. Using R and RStudio, I analyzed the jams data before and after the retiming was conducted, breaking down the data by direction of travel, time of day, and intersection at which the jam occurred.

Using R and RStudio, I wrote a script that compares year-over-year change in traffic counts using a table and a line chart. Filters for travel direction, time period, intersection, and month allowed me to drill down into the data in different ways.
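For readers curious what a year-over-year comparison like this looks like in code, here is a minimal sketch. It is written in Python rather than R, it is not the script I actually used, and the function names and sample dates are made up purely for illustration.

```python
from collections import Counter
from datetime import date


def monthly_counts(jam_dates):
    """Count jams per (year, month) pair."""
    return Counter((d.year, d.month) for d in jam_dates)


def year_over_year_change(counts, year, month):
    """Percent change versus the same month one year earlier.

    Returns None when the earlier month has no jams to compare against.
    """
    previous = counts.get((year - 1, month), 0)
    current = counts.get((year, month), 0)
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


# Invented sample data: 40 jams in March 2016, 30 jams in March 2017.
sample_jams = [date(2016, 3, 1)] * 40 + [date(2017, 3, 1)] * 30
counts = monthly_counts(sample_jams)
march_change = year_over_year_change(counts, 2017, 3)
print(march_change)
```

With this invented sample, March 2017 shows a 25 percent drop relative to March 2016. Filtering by direction or intersection would simply mean narrowing the list of jams before counting.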

After conducting the analysis, I sat down with Brandon to review my findings. I thought it’d be a quick data review, but no: instead, it turned into several sessions of asking questions about the outliers and digging deeper into different ways to filter the data. One of the important conclusions we came to was that even though overall traffic jams didn’t seem to have conclusively decreased, they certainly did along the first half of the corridor, before hitting the Jefferson Mall (and that’s where it matters).

I began to apply the method to other corridors, such as Bardstown Road, which seemed easy enough: pull data for the corridor, plug it into R, and run my script, filtering by time period, direction of travel, and intersection as needed. But I also had to consider what would happen after I left Louisville. In speaking with my co-workers at OPI², I learned that only one of them knew R, and nobody actively used R in their day-to-day jobs. So I turned to Microsoft Power BI, a business intelligence and data visualization tool similar to Tableau or Metabase, because it was already being used by OPI² and was being adopted across Louisville Metro. I was admittedly reluctant to commit to Power BI (after all, it’s a proprietary format and a large, resource-intensive application, unlike lean, open-source R), but amidst an organization-wide shift to Power BI, I decided that it would be the right way for the team to carry my work forward.

Updating the SQL query used to pull the jams data. I took the existing query and made it easier to tell what code is doing what (by adding a subquery) and added a case statement to sort jams by intersection, among other improvements.
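To give a sense of the subquery-plus-case-statement pattern, here is a rough sketch. It is not the real query: it runs against an in-memory SQLite database from Python instead of Microsoft SQL Server, and the table name, column names, segment labels, and longitude cutoffs are all hypothetical.

```python
import sqlite3

# Hypothetical stand-in for the jams table; the real schema is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jams (id INTEGER, street TEXT, lon REAL)")
conn.executemany(
    "INSERT INTO jams VALUES (?, ?, ?)",
    [
        (1, "Outer Loop", -85.700),
        (2, "Outer Loop", -85.680),
        (3, "Outer Loop", -85.675),
        (4, "Other Road", -85.690),
    ],
)

# The inner query labels each jam with a CASE statement;
# the outer query only has to count jams per label.
QUERY = """
SELECT intersection, COUNT(*) AS jam_count
FROM (
    SELECT id,
           CASE
               WHEN lon < -85.69 THEN 'West segment'
               WHEN lon < -85.67 THEN 'East segment'
               ELSE 'Unassigned'
           END AS intersection
    FROM jams
    WHERE street = 'Outer Loop'
) AS labeled
GROUP BY intersection
ORDER BY intersection
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Keeping the labeling logic in the subquery is what makes the query easier to read: each layer does one job, and the cutoffs live in a single place.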

Power BI, M, DAX, and SQL: Languages galore

Recreating the features of my R script in Power BI presented its own set of challenges: while Power BI makes it incredibly simple to create basic visualizations (bar graphs, line graphs, and even maps), I quickly hit a ceiling on what I could do without consulting Microsoft’s support docs and Stack Exchange. The biggest difficulty came in replicating the intersection analysis, which I only cracked after consulting Mary Hampton, Data Scientist at the Office of Civic Innovation.

In the end, I significantly expanded the template I started with, breaking the jams template out into five pages:

  • Jam analysis, which breaks down jam counts by month and now provides year-over-year percentage change for each month
  • Jam level breakdown, which shows how the severity of jams changes over time
  • Time period breakdown, which shows where jams are occurring during different time periods and allows slicing up the data in chunks of 1 hour, 30 minutes, 15 minutes, 5 minutes, or continuously. Sample use case: let’s say we run a coordinated plan for the AM Peak starting at 6:30 AM, but we see a lot of jams building up starting at 6:00 AM or 6:15 AM. We would then try to run the coordinated plans 15 or 30 minutes earlier.
  • Geographic breakdown, which shows how jam counts and jam levels at specific intersections change over time. Sample use case: let’s say there’s an unexplained spike in jams during May 2018. Using this page, we can see whether this spike is due to increased jams at specific intersections, from which we can try to pinpoint a one-time event, like a construction project at that area of the corridor or a bad loop, that explains the outlier.
  • Before-and-after comparison, which provides a side-by-side view of jam counts for two time periods (such as before and after a retiming)

Each page also includes filters for direction of travel, time group, and intersection, giving us more ways to drill down into the data.
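The time period breakdown boils down to rounding each jam’s timestamp down to the start of its bin and counting jams per bin. Here is a minimal Python sketch of that idea; the real template does this inside Power BI, and the function names and sample timestamps below are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def floor_to_bin(ts, minutes):
    """Round a timestamp down to the start of its N-minute bin."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    start = (minutes_since_midnight // minutes) * minutes
    return ts.replace(hour=start // 60, minute=start % 60, second=0, microsecond=0)


def jams_per_bin(timestamps, minutes):
    """Count jams by time-of-day bin, labeled HH:MM."""
    return Counter(floor_to_bin(ts, minutes).strftime("%H:%M") for ts in timestamps)


# Invented sample: four jams on one morning.
sample = [
    datetime(2018, 5, 1, 6, 5),
    datetime(2018, 5, 1, 6, 20),
    datetime(2018, 5, 1, 6, 29),
    datetime(2018, 5, 1, 6, 45),
]
by_15 = jams_per_bin(sample, 15)
by_30 = jams_per_bin(sample, 30)
print(sorted(by_15.items()))
print(sorted(by_30.items()))
```

In this sample, the 30-minute view lumps three jams into the 6:00 bin, while the 15-minute view shows that two of them actually start after 6:15, which is the kind of detail that matters when deciding how much earlier to start a coordinated plan.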

The new “Time Explorer” page I created. The left panel allows us to filter by street, direction of travel, day of week, and intersection. The right panel shows a few things: first, that most jams start at around 6:30am; second, that most jams die out by around 8:30am; and third, that from 6:30am to 8:00am, most jams occur northbound before hitting the Gene Snyder freeway.

Along the way, with the new Power BI and SQL skills I had gained, I also created two new templates: one that allows us to quickly see when we first started tracking data for a specific corridor (Waze requires us to mark a corridor as “tracked” before we can pull travel speed data), and one that acts as a wrapper for an open dataset that contains coordinates of all signalized intersections in Jefferson County.


Creating a culture of data governance

Outside of just my Waze project, one part of the OPI² experience that really sets it apart is that we get a broad look at different parts of city government. Interspersed within our daily work, we met with leaders at the city’s Sustainability and Globalization departments, seeing first-hand how the city is responding to the challenges facing cities across the world. We sat in on LouieStat forums for Louisville Forward (the city’s economic development agency), Emergency Services, and Information Technology, witnessing how OPI²’s work was helping to transform performance evaluation across the city and deliver more efficient services. And, with the help of Ed’s connections, I also got to attend Advanced Planning meetings, getting a glimpse of what it’s like to be an urban planner (a career path I’m considering).

One thing that kept coming up, however, was data. My job, of course, involved getting dirty with traffic data from Waze, with the goal of proving the efficacy of Brandon’s signal retiming projects and convincing policymakers to give the traffic department more funding. But Globalization also justified the importance of the office’s outreach work by citing statistics on the growing immigrant population in Louisville. Louisville Forward gave concrete statistics on the economic impact of their business incentives. And departments from IT to EMS reported metrics on overtime pay and paid time off, in order to highlight employee efficiency and productivity.

Petting a wallaby at the Louisville Zoo while on our private behind-the-scenes tour! Data governance extends to every agency of Louisville Metro, even the zoo.

All across Metro, from the spacious offices of OPI² to the backseats of ambulances to the exhibits of the Zoo (we got to pet wallabies!), I was witnessing the fruit of efforts to create a culture of data. Mayor Fischer, armed with his business background, plays a big part in that movement. But so do the endless supporting cogs that are the day-to-day manifestations of data governance: the aptly-named Data Governance group, which brings together leaders from across Metro to teach them data tools to use in their own departments, and the Waze Connected Citizens Program, a pioneer of public-private data-sharing efforts, are two that come to mind.

I’ve realized that creating this data-driven culture isn’t a simple card to check off on Trello, nor is it a simple project you can manage and finish. Instead, it takes time and dedication to set up LouieStat forums, to write those SQL queries, to be persistent in pushing for open data, to teach people like Brandon to adopt tools like our Power BI template. Perhaps most of all, it requires buy-in from those in power to actually practice data- and evidence-based decision making.

Data governance is an ongoing process, a generation-long journey. Louisville has been on that journey for eight years and will continue to be on that journey for many more. And I’m proud to have shared in that journey — even for just one summer.