Express lane toll prices as an indicator of highway congestion.

Madison John
9 min read · Nov 4, 2019
Photo by Mikechie Esparagoza from Pexels

Introduction

I have lived in the great city of Austin, Texas for over fifteen years. As for so many commuters across the world, traffic here is a daily annoyance that many of us simply cannot avoid. So I took great interest when the city announced improvements to Mopac Expressway, one of the two main highways serving the city.

Express Lane

In October of 2017, the city of Austin added a variably-priced express lane to Mopac Expressway. This allows drivers to bypass congestion in the regular lanes by paying a toll. The toll prices increase with congestion in the express lane to discourage further use.

In lieu of actual congestion data, can real-time toll prices be used to make reasonable conclusions regarding traffic patterns?

More specifically:

  • Do the toll prices reflect personal observations of daily congestion?
  • Do the toll prices show price patterns across weeks or months?
  • Are there atypical days and are they one-off or systematic?

Data Set

The data in this study was sourced from Kaggle, having been compiled from the Central Texas Regional Mobility Authority. The data consists of toll prices for each entry and exit point combination along Mopac Expressway from January to September of 2018. Toll prices were recorded at 30-minute intervals starting at midnight, from Monday to Sunday.

The toll points for all analysis in this study will be limited to those that encompass the entire length of the express lane. For future studies, the remaining toll points could be included for different analyses.

TxTag vs. Pay-By-Mail

Let’s see what this data looks like out of the box and plot the prices throughout the time interval.

Figure 1: Prices vs. Date and Time, January-September 2018

At a minimum, we observe that the prices in the two charts vary widely with time, indicating that congestion rates do not hold steady. Beyond this, we require manipulation of the data and additional charts to make further conclusions.

As indicated in the two charts above, there are two price groups in the data set: the blue chart on the left shows the TxTag rate, and the green chart on the right shows the pay-by-mail rate. Visual inspection suggests the two price groups follow the same pattern and may be multiples of one another. This is confirmed by plotting the prices against one another in a scatter chart.

Figure 2: Pay-By-Mail Prices vs. TxTag Prices

Since there is a linear relationship between the TxTag prices and the Pay-By-Mail prices, further analysis in this study will be focused on TxTag prices and will be referred to as ‘rate’ in the charts.
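
A minimal sketch of how such a linear relationship could be checked, using an ordinary least-squares fit written in plain Python. The price pairs and the 1.5x markup below are made-up illustrative values, not figures from the data set, and `fit_line` is a hypothetical helper rather than the code used for this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Illustrative prices only; the 1.5x markup is an assumption for the demo.
txtag = [0.5, 1.0, 2.5, 5.0, 8.0]
pay_by_mail = [1.5 * p for p in txtag]

slope, intercept = fit_line(txtag, pay_by_mail)
# Residuals near zero mean the points sit on the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(txtag, pay_by_mail)]
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"largest residual={max(abs(r) for r in residuals):.6f}")
```

If the residuals stay close to zero on the real data, one rate is effectively a fixed transformation of the other, so analyzing either one tells the same story.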

Price Distribution

First, we need to understand more about the price range and its distribution. We accomplish this with histograms, as shown below.

Figure 3: TxTag Prices Distribution

Observations

  • Prices are heavily concentrated in the lower price range, indicating low congestion most of the time.
  • Filtering out prices $0.50 and below produces a much clearer view of the price distribution during medium to heavy congestion.

The $0.50 threshold was chosen based on personal observations. Even in light to no traffic, unless the toll was completely disabled, prices for the toll points chosen did not dip below $0.50. Further analysis will make use of the $0.50 price filter to draw conclusions and will be indicated in the charts.
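
The filter and the histogram binning can be sketched in a few lines of plain Python. The sample prices are illustrative only, and `filter_peaks`, `histogram`, and the $1.00 bin width are hypothetical choices for this sketch, not necessarily what the analysis scripts use.

```python
from collections import Counter

PRICE_FLOOR = 0.50  # threshold described above

def filter_peaks(prices, floor=PRICE_FLOOR):
    """Keep only prices strictly above the floor."""
    return [p for p in prices if p > floor]

def histogram(prices, bin_width=1.0):
    """Count prices per bin; each key is the lower edge of its bin."""
    counts = Counter(int(p // bin_width) * bin_width for p in prices)
    return dict(sorted(counts.items()))

# Illustrative prices only, not taken from the data set.
sample = [0.0, 0.50, 0.50, 0.50, 0.75, 1.20, 3.40, 3.90, 8.10]
print(histogram(sample))                # dominated by the lowest bin
print(histogram(filter_peaks(sample)))  # the floor prices are gone
```

Note the strict comparison: a price of exactly $0.50 is dropped, which matches "prices $0.50 and below" being filtered out.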

Additionally, the northbound and southbound directions will be analyzed separately to observe differences in prices between the directions.

Do the toll prices reflect personal observations of daily congestion?

Single Day

After driving on highways for nearly twenty years, I expect price peaks during the morning and evening rush hours, although their exact timing can vary from city to city. Let’s take a look at the data for a single day in Austin, Texas.

Wednesday, April 11, 2018:

  • Non-holiday weekday
  • 1+ week away from holidays
  • Middle of work week
Figure 4: TxTag Prices for Wednesday, April 11, 2018

Observations

  • There are price peaks during the morning and early evening hours.
  • Surprisingly, there is no morning price peak for the Northbound direction.
  • For the majority of the time, prices are below $1.00, indicating lower congestion in the express lanes.
  • During peak times, prices exceed $8.00, indicating much higher congestion.

Aside from the lack of a price peak during the morning in the Northbound direction, these observations align with my expectations for highway congestion in general and Mopac Expressway congestion specifically. However, this is only one day. Do these observations hold for all days?

All Days

Figure 5: Northbound TxTag Prices for January-September 2018
Figure 6: Southbound TxTag Prices for January-September 2018

Observations

  • The price peaks observed for April 11th are present at similar times of day throughout January-September 2018.
  • There are many outliers above and below the medians for each time slot, indicating atypical traffic congestion.

The box plot data shows that the peaks originally shown by the line chart for April 11th (Figure 4) are characteristic of the majority of days in the data set. The outliers on either side of the median, however, are interesting and worth investigating to determine if they are one-time or systematic.
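
The statistics behind each box can be sketched as follows: group the prices by time slot across all days, then compute the quartiles of each group. This is a minimal illustration with made-up prices. `slot_quartiles` is a hypothetical helper, and the "inclusive" quantile method is an assumption, so the values may differ slightly from what a plotting library draws.

```python
from collections import defaultdict
from statistics import quantiles

def slot_quartiles(records):
    """records: iterable of (time_slot, price) pairs across many days.

    Returns {time_slot: (q1, median, q3)}. Each slot needs at least
    two prices for the quartiles to be defined.
    """
    by_slot = defaultdict(list)
    for slot, price in records:
        by_slot[slot].append(price)
    return {
        slot: tuple(quantiles(prices, n=4, method="inclusive"))
        for slot, prices in sorted(by_slot.items())
    }

# Illustrative records only: five days of prices for two time slots.
sample = [("03:00", 0.5)] * 5 + [("08:30", p) for p in (5.0, 9.0, 7.0, 6.0, 8.0)]
for slot, (q1, median, q3) in slot_quartiles(sample).items():
    print(slot, q1, median, q3)
```

A slot whose quartiles all sit at the floor price corresponds to a flat box, while a rush-hour slot shows a tall box with a much higher median.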

Do the toll prices show price patterns across weeks or months?

To better visualize the price behavior over the weeks and months, the data has been transformed into heat maps, with days and months plotted against the time of day and a color scale showing the averaged prices.
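
A minimal sketch of the averaging step behind such a heat map: each cell is the mean price for one (day of week, time of day) pair. The timestamps and prices are illustrative only, and `weekday_time_means` is a hypothetical helper. Days are numbered with Monday as 0, so Saturday and Sunday are days 5 and 6.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def weekday_time_means(records):
    """records: iterable of (datetime, price) pairs.

    Returns {(weekday, "HH:MM"): mean price}, with Monday as weekday 0.
    """
    cells = defaultdict(list)
    for ts, price in records:
        cells[(ts.weekday(), ts.strftime("%H:%M"))].append(price)
    return {key: mean(vals) for key, vals in sorted(cells.items())}

# Illustrative prices only: two Mondays and one Saturday at 17:30.
sample = [
    (datetime(2018, 4, 9, 17, 30), 6.0),
    (datetime(2018, 4, 16, 17, 30), 8.0),
    (datetime(2018, 4, 14, 17, 30), 0.5),
]
print(weekday_time_means(sample))
```

Swapping `ts.weekday()` for `ts.month` would give the per-month version of the same grid.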

Prices throughout the week, by day

Figure 7: Northbound TxTag Prices Heat Map, averaged across each day
Figure 8: Southbound TxTag Prices Heat Map, averaged across each day

Observations

  • Saturday and Sunday (days 5 and 6) do not have price peaks as weekdays have.
  • From Monday to Friday, the evening rush hour price peak starts progressively earlier each day.

Though it is not surprising that the prices are much lower during the weekend, it is interesting to see the congestion start earlier as the days go by during the work week. While this may not be necessarily true for every week, the averaged data suggest an overall trend for the time interval. One possible explanation for this observation is that workers are leaving work earlier and earlier as the weekend approaches.

Prices throughout the year, by month

Figure 9: Northbound TxTag Prices Heat Map, averaged across each month
Figure 10: Southbound TxTag Prices Heat Map, averaged across each month

Observations

  • January (month 1) has the lowest averaged prices for both Northbound and Southbound directions.
  • Evening peak price periods are shifted later in the day by 30 minutes to an hour for September.

Lower prices in January are likely explained by the tail end of the winter holidays (Christmas, New Year’s). Workers and students are often on vacation, either staying home and off the roads or traveling out of town. The shift in congestion to later in the day for September, however, is unexpected. Initial research did not produce any definitive answers, so this is worth future investigation.

Are there atypical days and are they one-off or systematic?

In the previous section, averaged prices were plotted in order to better visualize trends over days and months. However, it is also important to analyze the extreme values to check if their recurrence can be predicted.

IQR

For this analysis, we use the interquartile range (IQR) to identify outliers. Quartiles divide the sorted data into four equal parts, and the IQR, a measure of variability, is the distance between the first quartile (Q1) and the third quartile (Q3).

The components of the IQR, labeled in the diagram below, are visualized as box and whisker plots, which have already been utilized in this study (see Figures 5 and 6).

Figure 11: Sample Boxplot, IQR components labeled

Weekday Outliers, max prices per day

Taking the maximum price per date, computing Q1, Q3, and the lower and upper outlier limits, and finally applying those limits to the data set results in the following:

Figure 12: IQR for Max Prices Per Day
Figure 13: Low Outliers for Max Prices Per Day
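
The steps above can be sketched in plain Python. This is a minimal illustration rather than the analysis script itself: the 1.5 multiplier is the conventional box plot rule and is assumed here, the "inclusive" quantile method is also an assumption, the helper names are hypothetical, and the records are made-up values with placeholder day labels.

```python
from statistics import quantiles

def daily_max(records):
    """records: iterable of (day, price) pairs. Returns {day: max price}."""
    best = {}
    for day, price in records:
        best[day] = max(price, best.get(day, price))
    return best

def iqr_limits(values, k=1.5):
    """Lower and upper outlier limits: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outlier_days(by_day, k=1.5):
    """Split days into those below the lower limit and above the upper."""
    low, high = iqr_limits(list(by_day.values()), k)
    lows = sorted(d for d, v in by_day.items() if v < low)
    highs = sorted(d for d, v in by_day.items() if v > high)
    return lows, highs

# Illustrative records only (placeholder labels, invented prices).
sample = [
    ("holiday", 0.5), ("holiday", 0.5),
    ("workday-1", 0.5), ("workday-1", 8.0),
    ("workday-2", 8.0), ("workday-3", 8.5), ("workday-4", 8.5),
    ("workday-5", 9.0), ("workday-6", 9.0), ("workday-7", 9.5),
    ("workday-8", 9.5),
]
maxima = daily_max(sample)
print(iqr_limits(list(maxima.values())))
print(outlier_days(maxima))
```

In this toy example only the day that never leaves the floor price falls below the lower limit, and no day exceeds the upper limit.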

Dates with max prices of $0.50 indicate no price peaks for the entire day, and these coincide mostly with federal holidays, during which there are fewer vehicles on the road as commuters stay home to celebrate or rest:

  • 2018–01–01: New Year’s Day
  • 2018–01–16: 1 day after Martin Luther King Jr. Day
  • 2018–05–28: Memorial Day
  • 2018–07–04: Fourth of July / Independence Day
  • 2018–09–03: Labor Day

Dates with max prices greater than $0.50 indicate price peaks during the day, but the congestion is low relative to the population of data points. The dates in the list are either adjacent to holidays and popular long weekends or during a school break. One date, June 26, has no obvious explanation and requires further research.

  • 2018–01–02 / 2018–01–03: vacationers gradually returning to work after winter holidays
  • 2018–01–15: Martin Luther King Jr. Day, celebrants stay home or attend events in the city
  • 2018–02–19: Presidents’ Day, not universally observed, but popular long weekend
  • 2018–03–15: Spring Break, SxSW Festival (student population on vacation but slightly offset by influx of festival attendees)
  • 2018–03–30: Good Friday, popular long weekend
  • 2018–04–03: 2 days after Easter Weekend, popular long weekend
  • 2018–06–26: requires further research

Weekday Outliers, single time block

Note that the calculations in the previous section produced only low outliers. We can also find outliers for a given time block, specifically during known peak times such as 08:30 Southbound, which has both high and low outliers as shown in Figure 6 previously.

Repeating the calculations for this data subset, we get a low limit of $2.985, a high limit of $9.425, and the outlier dates for 08:30 Southbound below:

Figure 14: Low and High Outliers for 08:30 Southbound
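
If the limits follow the conventional box plot rule of Q1 - 1.5 * IQR and Q3 + 1.5 * IQR (an assumption, since the multiplier is not stated here), they can be inverted to recover the implied quartiles. A minimal sketch, with `quartiles_from_limits` as a hypothetical helper:

```python
def quartiles_from_limits(low, high, k=1.5):
    """Invert low = Q1 - k*IQR and high = Q3 + k*IQR.

    Subtracting the two equations gives high - low = (1 + 2k) * IQR.
    """
    iqr = (high - low) / (1 + 2 * k)
    q1 = low + k * iqr
    q3 = high - k * iqr
    return q1, q3, iqr

# Limits reported for 08:30 Southbound; k=1.5 is assumed, not confirmed.
q1, q3, iqr = quartiles_from_limits(2.985, 9.425)
print(f"Q1={q1:.2f} Q3={q3:.2f} IQR={iqr:.2f}")
```

Under that assumption, the middle half of the 08:30 Southbound prices would lie between roughly $5.40 and $7.01, a spread of about $1.61.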

In determining the low extreme values, prices $0.50 and below were excluded to focus on non-holiday outliers. Even so, the impact of holidays can still be observed in the resulting outlier dates.

low outliers

  • 2018–01–02 to 2018–01–05: vacationers gradually returning to work after winter holidays
  • 2018–02–07: inclement weather, workers advised to stay off roads / work from home
  • 2018–02–19: Presidents’ Day, not universally observed, but popular long weekend
  • 2018–03–12 / 2018–03–13: Spring Break, SxSW Festival
  • 2018–05–25: Friday before Memorial Day weekend, popular long weekend
  • 2018–07–05: day after Fourth of July / Independence Day

high outliers

  • No national holidays, special events, or influx of visitors were found for these dates.
  • Austin traffic data yielded reports of crashes and lane closures during these dates and times.

Outliers in this study are worth the scrutiny as they provide insight into which holidays are celebrated, which special events are scheduled, and vacation trends in the workforce and student population, among others. While only holidays and recurring events can be predicted, one-off occurrences such as automotive accidents or inclement weather can still prove valuable if they correlate with outliers in toll prices.

Conclusions & Future Research

Based on the analysis in this study, the toll prices appear to be a useful proxy for actual traffic data. Patterns in the data aligned with personal observations regarding peak traffic hours, while outliers mapped to real-world events such as holidays, vacation travel, and car accidents.

Without mapping the toll prices to real-time congestion data, however, we cannot be sure how accurately the prices align with traffic conditions or if there is bias in the algorithms that raise or lower prices in response to congestion.

I propose future research in which this mapping is executed and analyzed. If the study produces enough confidence in the alignment between congestion and toll prices, then I propose using the toll prices along with machine learning algorithms to attempt prediction of future traffic conditions.

Note: Analysis code can be found on GitHub.
