Exploring and Visualizing Variation in Accessibility Measures

Published in Conveyal, Nov 16, 2016

Here at Conveyal, we use accessibility indicators — measures of how many jobs or other opportunities can be reached within a given travel time — in much of our analysis work. Our Scenario Editor tool allows creating public transportation scenarios and quickly seeing the impacts of these scenarios on the accessibility from different locations throughout a metropolitan region. However, when working with public transportation, there is no one number for accessibility. The number of jobs you can reach within, say, an hour, varies depending on when you leave your home. You might just catch — or just miss — a transit vehicle, or there might be a service that does not run all the time.

We address this problem by calculating the travel times and accessibility for every minute one could depart during the peak period (e.g. 7:00, 7:01…). We can analyze any period that is of interest, not just the peak. When there are routes that do not yet have a timetable (because they are still being planned), we generate a large number of random schedules based on the planned frequency of the route, and evaluate each of them individually. This produces a huge amount of data—the travel time to every point in a metropolitan region via transit, for every minute of departure and randomized schedule during the morning peak.
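
The sampling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production code: the function names `random_departures` and `wait_time`, the 15-minute headway, the 7:00 to 9:00 window, and the 100 draws are all hypothetical. Each draw shifts a frequency-based route by a random offset, and every departure minute in the window is evaluated against every draw.

```python
import random

def random_departures(headway_min, window_start, window_end, rng):
    """One hypothetical timetable: a vehicle every `headway_min` minutes,
    shifted by a random offset drawn between 0 and one headway."""
    offset = rng.uniform(0, headway_min)
    times = []
    t = window_start + offset
    # Run one headway past the window so late departures still find a vehicle.
    while t < window_end + headway_min:
        times.append(t)
        t += headway_min
    return times

def wait_time(depart_minute, vehicle_times):
    """Minutes spent waiting for the first vehicle at or after `depart_minute`."""
    return min(t for t in vehicle_times if t >= depart_minute) - depart_minute

rng = random.Random(42)          # fixed seed so the sketch is repeatable
window = range(7 * 60, 9 * 60)   # every departure minute from 7:00 to 8:59
n_draws = 100                    # number of randomized schedules (assumed)

waits = []
for _ in range(n_draws):
    schedule = random_departures(15, 7 * 60, 9 * 60, rng)
    for minute in window:
        waits.append(wait_time(minute, schedule))

print(len(waits))  # one sample per (schedule, departure minute) pair
```

In a real analysis the wait at the first stop is only one component; each sample would feed a full routing search that yields travel times to every destination.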

Previously, we computed the best case, worst case and average accessibility as a way to summarize this barrage of data. Of course, in dense urban areas with high-frequency transit, there is minimal variation in travel times over the time window, and the best case, worst case, and average are very similar. Further from the core, where transit services are infrequent and the exact moment of departure can determine whether you manage to walk out of your door and catch a bus or sit at the stop for an hour, there is much more spread between the best and worst cases.

In a previous post, we illustrated this with an example comparing accessibility in urban downtown Seattle to suburban Woodinville, Washington, USA. On the left we have the graph of the number of jobs accessible within various travel times by public transportation from downtown Seattle, while on the right we have the same graph for Woodinville. The blue lines are the average accessibility as a function of travel time, while the gray area represents the area between the best case scenario and the worst case scenario during that time window. In the Seattle case, the variation is small; the transit service is fairly reliable, and you can generally reach about the same number of jobs regardless of when you leave. In Woodinville, by contrast, there is tremendous variation between the best and worst case scenarios, because you may just miss a bus and add an hour to your trip.

Jobs reachable by transit within various travel times, for Seattle (left) and Woodinville (right)

This is very interesting, but these three numbers (best case, worst case, and average) hide a much richer picture of the accessibility provided by the transit system. Instead of representing the variation in accessibility through a few summary statistics, the latest version of our analysis platform displays this variation as a full statistical distribution, allowing much richer explorations of the data. For instance, one could ask what the 85th percentile accessibility is, or the probability that one transport scenario will yield better accessibility than another for any combination of departure times.
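
As a rough sketch of the difference between the three-number summary and a percentile query, consider the Python below. The sample values are made up for illustration, and `percentile` is a hypothetical helper using the simple nearest-rank definition (the smallest value with at least p% of samples at or below it); other percentile conventions would give slightly different answers.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of
    the samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical jobs reachable within 60 minutes, one value per departure minute.
jobs = [610_000, 700_000, 705_000, 698_000, 790_000,
        702_000, 695_000, 710_000, 640_000, 700_000]

# The three summary statistics used previously.
summary = {
    "worst": min(jobs),
    "average": statistics.mean(jobs),
    "best": max(jobs),
}

# A question the summary cannot answer directly.
p85 = percentile(jobs, 85)
```

With the full set of samples retained, any percentile can be read off after the fact, whereas the best, worst, and average have to be chosen in advance.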

With all of this data, however, we need a visually coherent way to present it. We’ve created a visualization we’re calling an accessibility “spectrogram,” inspired by the visualization of the same name used in audio processing. An accessibility spectrogram is shown below for that same location in downtown Seattle.

Job accessibility spectrogram from downtown Seattle

The first difference you will notice relative to the earlier plots is that there are no lines on this plot. Rather, at each minute of travel time, you see the complete distribution of accessibility, binned into chunks and colored according to probability, with darker colors indicating higher probability. For instance, for a 60 minute trip (the location of the vertical line in this image), we see a high probability (dark blue/black) of being able to access 700,000 jobs, and a somewhat lower probability (light blue) of being able to access only 600,000 jobs. It is also rare (light blue) to have access to more than 800,000 jobs within 60 minutes. That is to say, if you leave from downtown Seattle at any time during the morning peak, you will most likely be able to reach the locations of 700,000 jobs within 60 minutes, but it is possible you will be able to reach 800,000, or only be able to reach 600,000. The legend on the right shows which probability each color represents.
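
The data behind such a plot can be sketched as one binned distribution per travel time. The Python below is an illustrative assumption about the layout, not our actual implementation: `spectrogram`, `curves`, and `bin_size` are hypothetical names, and the three tiny sample curves are invented.

```python
from collections import Counter

def spectrogram(curves, bin_size):
    """curves: one list per departure sample, where curves[s][t] is the
    number of jobs reachable within t minutes for that sample.
    Returns one dict per travel time t, mapping each bin's lower edge
    to the share of samples that fall in that bin."""
    n_samples = len(curves)
    n_minutes = len(curves[0])
    columns = []
    for t in range(n_minutes):
        counts = Counter((curve[t] // bin_size) * bin_size for curve in curves)
        columns.append({edge: c / n_samples for edge, c in sorted(counts.items())})
    return columns

# Three hypothetical departure samples, travel times of 0 to 3 minutes.
curves = [
    [0, 100, 250, 400],
    [0, 120, 260, 400],
    [0,  40, 180, 400],
]
columns = spectrogram(curves, bin_size=100)
```

Each entry of `columns` is one vertical strip of the plot: the keys are positions on the jobs axis and the values are the probabilities that the color scale encodes.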

If we’re only interested in trips of a particular duration, we can take a vertical “slice” of this spectrogram and present the data in a much more familiar form, a histogram.

Histogram of potential job access within 60 minutes from downtown Seattle

Here, we see that within a travel time of 60 minutes in the morning peak, there are between 700,000 and 950,000 jobs accessible, depending on your exact moment of departure, with a sharp peak at 800,000. The height of the bars represents the same thing as the colors in the previous plot: taller bars indicate outcomes that are more likely.

We can create these same visualizations for the suburban location mentioned above, Woodinville:

Job accessibility spectrogram from Woodinville

As before, the variation in Woodinville is much larger. At very low and very high travel times, the variation is small, either because you can’t get to very much except by walking, or because all of the job centers in the region can be reached within a given amount of time, even if you just miss the bus and have a long wait. We can again display a vertical “slice” of the spectrogram at a 60-minute travel time.

Histogram of number of jobs accessible within a 60-minute transit commute from Woodinville

This particular location is illustrative of why the average is not always an informative metric. The distribution depicted in the histogram does not have a clear representative average; there are several separate peaks and a generally asymmetric, non-normal distribution. If we were to represent this location with an average accessibility, we would be hiding valuable information.
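
A small worked example makes the point concrete. The numbers below are invented for illustration: half of the departures catch the bus and half just miss it, so the samples form two separate peaks.

```python
import statistics

# Hypothetical jobs reachable within 60 minutes for ten departure minutes:
# half the departures catch the bus, half just miss it.
jobs = [50_000] * 5 + [250_000] * 5

mean_jobs = statistics.mean(jobs)

# Share of departures whose accessibility is within 20% of the mean.
near_mean = sum(abs(j - mean_jobs) <= 0.2 * mean_jobs for j in jobs) / len(jobs)
```

Here the mean is 150,000 jobs, yet no departure in the sample comes within 20% of that figure: the average describes an experience that nobody actually has.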

This additional information allows a much more nuanced picture of accessibility from a particular point. Suppose for a moment that the two locations depicted above — Seattle and Woodinville — have the same average accessibility. Even though they are identical by that measure, clearly the user experience is much more consistent and reliable in Seattle than in Woodinville; there is much less variation in the experienced accessibility in Seattle.

One obvious application of the additional information presented by the spectrogram is performing probabilistic comparisons of transit scenarios. Rather than just looking at whether the average accessibility is higher under a proposed transit system than under the current one, we can look at the probability that a rider will experience higher accessibility. For example, if the average is higher but 40% of the time the new system underperforms relative to the existing system, it’s questionable how valuable the change is. This is particularly true when evaluating a new system using randomized timetables; depending on how exactly the final timetable is written, a system that shows an increase in average accessibility could actually produce a worse customer experience.
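
One way such a comparison might be computed is sketched below in Python. It assumes the samples from the two scenarios are compared over every combination of draws rather than paired minute by minute; `prob_improvement` is a hypothetical helper, and the values (in thousands of jobs) are invented so that the proposed scenario has the higher average but underperforms 40% of the time.

```python
from itertools import product
import statistics

def prob_improvement(baseline, proposed):
    """Share of (baseline, proposed) sample pairs in which the proposed
    scenario gives strictly higher accessibility."""
    pairs = list(product(baseline, proposed))
    return sum(p > b for b, p in pairs) / len(pairs)

# Hypothetical accessibility samples, in thousands of jobs.
baseline = [100, 100, 100, 100, 100]   # steady access today
proposed = [60, 60, 150, 150, 200]     # higher on average, less consistent

mean_gain = statistics.mean(proposed) - statistics.mean(baseline)
p_better = prob_improvement(baseline, proposed)
```

In this invented case the average rises by 24 thousand jobs, but the proposed scenario is better in only 60% of the comparisons.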

Written by Matthew Wigginton Conway