Numbers, numbers, numbers…

This news article from a while back made me laugh:

Well, if not for the odd hour-long train disruption or so and perhaps the odd hundred thousand pissed off commuters, I would guess they’re on point. But let’s get behind the numbers to understand it. I’m no statistician, so please bear with me.

I know I said last week that I expected to continue my blog series on the DTL, but this just seemed too important to miss, following recent happenings. So today, instead of talking boring technical details about the DTL, I’ll try to analyze the recent news and provide some additional context as far as my limited knowledge allows.

Firstly, as the article states, more trains have been added to the North-East and Circle lines (25 to 43 on the NEL and 40 to 64 on the CCL), bringing waiting times down rather noticeably. This is because the operators can now run a more intensive service since they have more trains. Wait, that’s an obvious point, I probably don’t need to say it.

Secondly, this is from July 27’s Straits Times:

This is LTA’s Rail Reliability Report for the first half of the year 2017. While I have my own suspicions (don’t get me wrong, I’m not here to accuse LTA or the operators of “cooking the books”, so to speak), it does help to illustrate that yes, the situation is getting better. Especially if you live along the North-East or Circle lines.

First, let’s define some terms. “Mean distance between failures” (MDBF) is the total distance travelled by the entire fleet on a particular train line, divided by the number of failures over the same period. In other words, it’s how far the fleet as a whole runs, on average, before something happens. Note that I say “entire fleet”. Let’s get this down to an easier number.

Let’s say we have Line X. Line X is 10 kilometers long, and its trains make 200 round trips a day. So, the total distance travelled by all trains on Line X in a single day would be 200 × 10 × 2 = 4,000 km a day.

Say the fleet of Line X has an MDBF of 36,000 km. 36,000 divided by 4,000 gets you 9. This means that on average, a failure (of whatever sort, be it rolling stock, signaling, power, platform doors, or anything, really) happens on Line X every 9 days. I believe this is what the regular commuter wants to hear.
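For those who prefer to see the arithmetic spelled out, here is a minimal Python sketch of the Line X example. The function names are my own, and it assumes every train runs the full length of the line on every trip, which real timetables don't always do.

```python
def daily_fleet_km(line_length_km: float, round_trips_per_day: int) -> float:
    """Distance covered by all trains on the line in one day.

    Assumes every round trip covers the full line once in each direction.
    """
    return line_length_km * 2 * round_trips_per_day


def days_between_failures(mdbf_km: float, daily_km: float) -> float:
    """Convert a fleet-wide MDBF into an average number of days between failures."""
    return mdbf_km / daily_km


# Line X from the example: 10 km long, 200 round trips a day, MDBF of 36,000 km.
line_x_daily_km = daily_fleet_km(10, 200)
print(line_x_daily_km)                                  # 4000
print(days_between_failures(36_000, line_x_daily_km))   # 9.0
```

The point of the second function is that an MDBF figure only becomes a "how often" figure once you know how far the fleet travels in a day.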

Thus, as we’ve seen, in order to understand these statistics, we need to put all these numbers in context. In my opinion, LTA’s infosets are missing two things — the overall distance travelled, and the 12-month rolling average.

Compare this MDBF data from the New York Subway:

The numbers themselves aren’t good, but the way it’s presented is. One can see very clearly how much the fleet travelled in a survey period, how many delays happened, and the final MDBF number, thus putting the provided statistic in a very clear context.

What do the numbers mean?

Even without the contextual information mentioned above, we can pretty much guess the relative performance of a line against another.

For example, assuming that both lines have the same “regularity” of incidents (say, one every three days), the CCL, as a longer line whose trains clock more kilometres between incidents, should have a higher MDBF rating than the DTL, which is shorter (at least until October 21). The fact that the CCL and DTL have very close MDBF numbers in this sample suggests that the CCL, from a commuter’s perspective, breaks down more often.
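To see why line length matters, here is a rough sketch with two made-up lines. The daily distances below are purely hypothetical figures chosen for easy arithmetic. They are not the real CCL or DTL numbers, which the published reports don't give us.

```python
def mdbf_km(daily_fleet_km: float, interval_days: float) -> float:
    """MDBF implied by a given daily fleet distance and days between failures."""
    return daily_fleet_km * interval_days


def days_between_failures(mdbf: float, daily_fleet_km: float) -> float:
    """Average days between failures implied by an MDBF figure."""
    return mdbf / daily_fleet_km


# Hypothetical daily fleet distances (NOT real CCL/DTL data).
LONG_LINE_DAILY_KM = 8_000
SHORT_LINE_DAILY_KM = 4_000

# Case 1: both lines fail once every three days.
print(mdbf_km(LONG_LINE_DAILY_KM, 3))    # 24000
print(mdbf_km(SHORT_LINE_DAILY_KM, 3))   # 12000

# Case 2: both lines report the same MDBF of 24,000 km.
print(days_between_failures(24_000, LONG_LINE_DAILY_KM))   # 3.0
print(days_between_failures(24_000, SHORT_LINE_DAILY_KM))  # 6.0
```

In the second case the two lines look identical on paper, yet commuters on the longer line see a failure twice as often.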

Let’s stop to also look at LTA’s most recent release for the first half of 2017 (we’ll look only at the NEL and CCL, to illustrate as examples):

  • 1Q 2017: NEL 973,000, CCL 452,000
  • 1H 2017: NEL 978,000, CCL 518,000
  • 2Q 2017 (by estimation assuming the 1H2017 rating includes only 1Q and 2Q 2017, don’t quote me on this): NEL 983,000, CCL 584,000

As you can see, the NEL managed to maintain its MDBF rating between 1Q and 2Q 2017, achieving an average of 978,000 between the two quarters. The CCL, on the other hand, actually got more reliable in the second quarter, increasing from 452,000 to 584,000, and getting an average of 518,000 between the two quarters.
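The 2Q estimate above can be reproduced with a few lines of Python. This is only a sketch of my own back-of-envelope method: it assumes the 1H figure is a plain average of the 1Q and 2Q figures. Strictly speaking, a pooled MDBF would be total distance divided by total failures, so treat the result as an approximation.

```python
def estimate_q2(q1_mdbf: float, h1_mdbf: float) -> float:
    """Back out a 2Q figure, assuming 1H is the simple mean of 1Q and 2Q."""
    return 2 * h1_mdbf - q1_mdbf


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100 * (new - old) / old


# Figures from LTA's release, as quoted in the list above.
nel_q2 = estimate_q2(973_000, 978_000)
ccl_q2 = estimate_q2(452_000, 518_000)

print(nel_q2)  # 983000
print(ccl_q2)  # 584000
print(round(pct_change(973_000, nel_q2), 1))  # 1.0
print(round(pct_change(452_000, ccl_q2), 1))  # 29.2
```

Under that assumption, the NEL moved by about one percent between quarters, while the CCL improved by nearly thirty.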

So what does this mean? While the NEL has managed to hold steady, reliability of equipment has actually improved on the CCL, believe it or not. Again, as I often say on this blog, criticism where it’s due, but credit also where it should be given.

“Ignorance is strength”

The problem with calculating this way, though, is that only one part of the story is being told. We’re not being lied to, but we’re also not privy to all the facts.

Statistically, a large 4-hour power cut impacting the entire line would matter as much as the doors failing to close at one station and causing a five-minute delay. It’s the former that’s more easily felt and makes us commuters pissed off, while not so much the latter. Why?
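A tiny sketch makes the point. The two incident logs and the 36,000 km distance below are invented for illustration, and the sketch assumes each incident counts as exactly one failure regardless of how long it lasts.

```python
def mdbf_km(distance_km: float, incident_minutes: list[float]) -> float:
    """MDBF only looks at how many incidents there were, not how long they lasted."""
    return distance_km / len(incident_minutes)


def total_delay_minutes(incident_minutes: list[float]) -> float:
    """Total disruption time, which is closer to what commuters actually feel."""
    return sum(incident_minutes)


# Two hypothetical periods over the same 36,000 km of running.
power_cut = [240]   # one 4-hour line-wide power cut
door_fault = [5]    # one five-minute door fault at a single station

print(mdbf_km(36_000, power_cut), mdbf_km(36_000, door_fault))        # 36000.0 36000.0
print(total_delay_minutes(power_cut), total_delay_minutes(door_fault))  # 240 5
```

Same MDBF, very different mornings.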

Reliability is a big word, not only encompassing how well the machines are working, but also how well they’re being put to work. In my opinion, what matters more to the commuting public would be:

  • on-time performance: what percentage of trains are able to reach their terminals on time.
  • excess wait time: how much longer one has to wait for a train than the timetable promises.
  • And of course, general system uptime as a percentage of total hours expected in operation.
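To make these three metrics concrete, here is a rough Python sketch. All the sample data is made up, the two-minute on-time threshold is my own assumption, and I’m taking excess wait time to mean the average wait implied by the actual gaps between trains minus the average wait implied by the scheduled gaps, assuming passengers turn up at random. Real agencies each have their own precise definitions.

```python
def on_time_performance(delays_min: list[float], threshold_min: float = 2.0) -> float:
    """Percentage of trips arriving at the terminal within the threshold."""
    on_time = sum(1 for d in delays_min if d <= threshold_min)
    return 100 * on_time / len(delays_min)


def average_wait(headways_min: list[float]) -> float:
    """Average wait for passengers arriving at random, given gaps between trains."""
    return sum(h * h for h in headways_min) / (2 * sum(headways_min))


def excess_wait_time(actual: list[float], scheduled: list[float]) -> float:
    """Extra average wait caused by uneven gaps, relative to the timetable."""
    return average_wait(actual) - average_wait(scheduled)


def uptime_pct(hours_run: float, hours_scheduled: float) -> float:
    """Share of scheduled operating hours in which service actually ran."""
    return 100 * hours_run / hours_scheduled


# Hypothetical sample data for one day on one line.
terminal_delays = [0, 1, 0, 6, 0, 0, 3, 0, 0, 0]   # minutes late per trip
scheduled_gaps = [5, 5, 5, 5]                       # minutes between trains
actual_gaps = [5, 10, 2, 3]

print(on_time_performance(terminal_delays))                      # 80.0
print(round(excess_wait_time(actual_gaps, scheduled_gaps), 2))   # 0.95
print(uptime_pct(18, 20))                                        # 90.0
```

Notice that the same four trains ran in both the scheduled and actual cases. The excess wait comes purely from bunching, which is something MDBF can never show.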

These are metrics LTA are already very familiar with, since they’re used to evaluate the performance of bus operators under the new Bus Contracting Model. So why not extend this to the MRT as well?

The New York MTA, which runs the subway and a few other commuter railways in the greater New York area, uses at least a combination of all three metrics to evaluate how its railways are performing; you can see an example here. So does Network Rail, which manages the UK’s national railways in a system not unlike ours, here.

The Mean Distance between Failures metric only tells you how well the maintenance engineers are working. It does not evaluate how well the service controllers are doing their jobs, or the train drivers, or the rest of the entire team that work to get the service running. And I think, to be fair, the system must be evaluated as a whole, by seeing how often the trains are running on time.

“Ownself check ownself”

Yes, I know that’s a very tired phrase that we’re all accustomed to hearing, but it’s pretty apt here. LTA are using a cherry-picked metric, favourable to their narrative, to state that they’re doing their job and they’re doing it well. For that, as I have said, we must give credit, but there are other sides to the story that the public can see no remedy for.

The operators themselves are also members of various industry groups with very stringent benchmarking metrics, against which all members of the group, including names like the MTR and the Taipei Metro, are regularly evaluated. And it is the operator that also catches shit from the commuting public when things go wrong, while LTA as the regulator plays a role out of the public eye, leaving the operators to sort out the problems while claiming credit for successes made. The real victims here, in my opinion, are the operators, who are dealt bad hands by the authorities and have to make the most of it.

So perhaps the operators could, independently, publish their own side of the story. They can show us the comprehensive assessments made by these independent industry groups, which I am fairly sure provide a far better perspective of the quality of service than LTA’s released metrics show. Such independent industry groups would also be familiar with the renewal efforts pursued by other systems worldwide, and can likewise give us a grade on how well we are performing in that regard relative to other systems.

I’m sure many of us have fallen victim to the “grass is greener” syndrome; I myself have. But of course, the London Underground’s Jubilee Line upgrade was hilariously poorly managed, for one, and with the constant managerial issues plaguing the New York Subway since the time of Nixon, I’m sure they’d probably think better of our system compared to theirs.

Of course, while one could say that sheer numbers by themselves can easily act as a political smokescreen while the shitshow goes on, I think that publishing them is still the best way to show concrete progress being made. And these numbers must also be accompanied by visible improvement, naturally. The public do not want to be blinded by numbers; they want to see the trains run on time.

Without a doubt, I believe that everyone understands the economic disruptions that happen when public transport stops being reliable. People are late for work, they miss interviews, maybe even high-value business opportunities. And when this personally affects the man on the street, he loses confidence in the system.

“But you don’t know anything!”

Of course I don’t know anything, as was recently pointed out by the decision makers. I’m not even part of the press, I’m just a blogger on the Internet, and a very opinionated one at that. As I’ve said before, I don’t think that it’s possible for one to just bring over technical know-how from overseas, and I feel it’s important for us to also learn from the best practices of other agencies.

Likewise, I think it’s important that the train operating companies take an all-in, holistic approach towards improving system reliability. Not only must the machines work, and work well, it’s also important that the people who run the system also know what they’re doing.

Only then will it be possible to reduce the grief felt by the commuter, who must account for an unwanted element of surprise when making his journeys. And only then, once the commuter can see the trains running on time and as expected, will he be able to regain trust in the company. Trust is a fickle thing, easily lost but difficult to gain, and that’s what I know. People are asking questions, and those questions must be answered.

Producing reports and statistics does not mean much to an irate, displeased public when incidents still happen near-daily, but we have to start somewhere. It’s still better than empty words that attempt to pacify the situation without actually doing anything.

A quote that I’ve heard somewhere, probably from Minister Khaw himself, is that “just because we can be content with mediocrity, doesn’t mean we can’t strive for excellence”. I know the work of the transport operators is difficult, and I applaud them for trying to make the most out of a difficult situation (as I’ve underlined in previous posts), but as anyone knows, it can take a “push” to spur the entire ecosystem to work harder.

And through this blog, I hope to be able to give that “push”.

Comments welcome.