From the Red Line

The power of data

As people use apps and other online tools, the LTA needs to catch up.

Apart from breaking (albeit briefly) when I updated my phone to Android 12, the LTA MyTransport app is decent: not great, not terrible, though it could definitely use a lot of improvement, especially with the crashes.

But that’s not the main point. The main issue is that the data that matters isn’t really provided to many others or in the places that matter, which introduces complications when it comes to using other apps to plan a journey.

On conjuration

Google Maps, for one, largely appears to draw up its own schedules. Based on a quick survey, it’s quite clear that the travel times proposed by Google Maps don’t match the actual display boards or what is displayed on the SMRTConnect app.

What this means is that Google Maps’ assumptions become faulty when people use it to look up directions on public transport. It may end up assuming trains and buses are slower or faster than they actually are, which could result in recommending a train ride where a bus might get there faster (yes, such scenarios exist) or vice versa.

I smell a rat

In the above image you can see a side by side comparison of Google Maps live arrival times compared to what is shown by SMRT’s train timings website. That latter website is the same information source powering the SMRTConnect app and other websites and apps that may draw information from it — such as, perhaps, Moovit. As you can see, on the NSL at Yew Tee, they already don’t really match. Granted, Google Maps shows “scheduled” trains, but if they don’t have the schedule to begin with, they can’t really be accurate.

In this case, it’s not that the data is unusable. And since trains mostly run at most six minutes apart in normal service, even if you do miss a train because of mismatched data, you can probably wait for the next one without too much disruption to your routine.

It gets worse

As we now see, on the Circle Line and Thomson-East Coast Line, neither SMRT nor Google Maps can guarantee the same arrival timings as shown on in-station displays, which means it may not even tally when it comes to route planning. Most notably, SMRT’s portal only shows “scheduled arrival time”, which means delays and other incidents along the line cannot be accounted for. Live timings may be among the best, but with a certain degree of reliability, scheduled timings may be acceptable. Never look a gift horse in the mouth.

SMRT’s efforts to pull this data out of its IT systems as a user experience benefit should be commended, though. SBS Transit doesn’t even care, and this information isn’t available for LRT lines either. This means that Google is forced to make assumptions about the type of transit service provided on the lines without live arrival timings, resulting in the generally lower quality of information seen here. Feel free to repeat these experiments if you’re still not convinced.

Well, if you use Citymapper or Google Maps on desktop, they just say “every X minutes”.

It’s been done

They’ve actually gotten this right on buses. The existing Bus Arrival API available on LTA DataMall is a highly comprehensive system that even gets you the live location of a given bus, so long as that bus is within a few stops of the bus stop being queried. That last system limitation may be acceptable if the goal is to manage system load, but that might be a limitation of LTA’s own data scraping methods.
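
As a rough illustration of what the Bus Arrival data offers, here is a minimal sketch that reads arrival times and bus positions out of a response. The payload is invented for illustration, and the field names (`Services`, `NextBus`, `EstimatedArrival`, `Latitude`, `Longitude`) follow my recollection of the DataMall documentation, so check them against the current API reference before relying on them. Treating a zero or empty coordinate as "no location reported" is also an assumption.

```python
from datetime import datetime

# Illustrative payload only: values are invented, and the field names are
# assumed to match the DataMall Bus Arrival response.
SAMPLE_RESPONSE = {
    "BusStopCode": "83139",
    "Services": [
        {
            "ServiceNo": "15",
            "NextBus": {
                "EstimatedArrival": "2021-11-20T08:05:00+08:00",
                "Latitude": "1.3154",
                "Longitude": "103.9059",
            },
            "NextBus2": {
                "EstimatedArrival": "2021-11-20T08:17:00+08:00",
                "Latitude": "0",
                "Longitude": "0",
            },
        }
    ],
}


def parse_position(bus):
    """Return (lat, lon), or None when no usable location is reported."""
    try:
        lat = float(bus.get("Latitude") or 0)
        lon = float(bus.get("Longitude") or 0)
    except ValueError:
        return None
    if lat == 0 and lon == 0:
        # Assumption: zero coordinates mean the bus is too far away to be located.
        return None
    return (lat, lon)


def summarise_arrivals(payload, now):
    """List (service number, minutes until arrival, position) for each upcoming bus."""
    rows = []
    for service in payload.get("Services", []):
        for key in ("NextBus", "NextBus2", "NextBus3"):
            bus = service.get(key) or {}
            eta = bus.get("EstimatedArrival")
            if not eta:
                continue  # no bus reported in this slot
            minutes = (datetime.fromisoformat(eta) - now).total_seconds() / 60
            rows.append((service["ServiceNo"], round(minutes, 1), parse_position(bus)))
    return rows


now = datetime.fromisoformat("2021-11-20T08:00:00+08:00")
for row in summarise_arrivals(SAMPLE_RESPONSE, now):
    print(row)
```

In this made-up payload only the nearer bus carries a position, which mirrors the "within a few stops" limitation described above.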

But there might be alternatives. If they considered using GTFS, it might be possible to expand the functionality of the bus trackers by showing the location of buses on a per-bus basis instead of a per-bus-stop basis. Another feature of GTFS could be used to properly model the impact of interchanges, so that the real shortest route can be suggested instead of the crude pathfinding algorithms used within the MyTransport app.
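
To illustrate the interchange point: GTFS has a `transfers.txt` file whose `min_transfer_time` field gives the seconds needed to change between two stops. Below is a minimal sketch, with entirely hypothetical stop IDs and timings, of how including that penalty can change the route chosen by a plain Dijkstra search. It is not a description of how MyTransport or any real journey planner works.

```python
import heapq

# Hypothetical in-vehicle ride times in seconds between platform-level stops.
RIDES = {
    ("A_red", "B_red"): 300,
    ("B_blue", "C_blue"): 300,
    ("A_red", "C_red"): 900,  # slower, but no interchange needed
}

# Hypothetical rows in the spirit of GTFS transfers.txt:
# (from_stop_id, to_stop_id) -> min_transfer_time in seconds.
TRANSFERS_IGNORED = {("B_red", "B_blue"): 0}
TRANSFERS_MODELLED = {("B_red", "B_blue"): 420}


def shortest_route(rides, transfers, start, goals):
    """Dijkstra over ride and transfer edges; returns (seconds, path) or None."""
    graph = {}
    for (a, b), secs in list(rides.items()) + list(transfers.items()):
        graph.setdefault(a, []).append((b, secs))

    best = {start: 0}
    previous = {}
    queue = [(0, start)]
    while queue:
        elapsed, stop = heapq.heappop(queue)
        if elapsed > best[stop]:
            continue  # stale queue entry, a faster way here was already found
        if stop in goals:
            path = [stop]
            while path[-1] in previous:
                path.append(previous[path[-1]])
            return elapsed, path[::-1]
        for nxt, secs in graph.get(stop, []):
            total = elapsed + secs
            if total < best.get(nxt, float("inf")):
                best[nxt] = total
                previous[nxt] = stop
                heapq.heappush(queue, (total, nxt))
    return None


goals = {"C_blue", "C_red"}
print(shortest_route(RIDES, TRANSFERS_IGNORED, "A_red", goals))
print(shortest_route(RIDES, TRANSFERS_MODELLED, "A_red", goals))
```

With the transfer treated as free, the search prefers changing lines at B; once the seven-minute walk is counted, the direct ride wins.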

And of course, lest I state the obvious, the presence of train arrival screens on platforms means the data exists in the first place. Furthermore, an initiative to build out passenger load measurement systems was rolled out on the DTL and may be expanded to other lines; we will see how it goes. Still, apart from the NSEWL example mentioned earlier, much of this information is kept in a closed loop within the railway systems. And since the railway systems are provided by the LTA to the operators under NRFF, the LTA is the one who will have to incorporate such data exchange mechanisms into the system requirements when undertaking renewal projects or building new lines.

Plus, they control DataMall, and already have the LTOC as a central clearinghouse for public transport crisis management; the information can be funneled through LTOC as a “peacetime” role. The alternative is that you could be GovTech and tape Raspberry Pis to a cabinet (or something) inside the train, and use them to detect Wi-Fi signals inside stations. Then again, why do that when all the data can simply be extracted from the signalling systems, since all of it has to be made available for operations control to do their job anyway? If cybersecurity is a concern, SMRT has already led the way and they clearly know what to do.

Quality over quantity?

Unlike MTR, though, we do make such train timing information available at exits and concourse/transfer areas, whilst MTR only has them at platform level. We can go even further than them, and with the requisite data sharing mechanisms in place, transfer information such as arrival times of next trains on other lines can also be shown on the in-train LCD information panels or something. This can happen apart from the obvious such as officially-declared disruption information, so those without app access can still benefit.

The issue with this, though, is that the increasing distance between exits, concourses, transfers, and platforms within newer stations might produce significant travel time on foot between the entry point and the platform itself. The presence of such signs may cause someone to run down escalators or similar in the hopes they might make it in time to the platform to catch a departing train. Most of us may have been guilty of this.

The last concern is of course someone repeating my experiment and finding “disappearing trains”. This is more common on the TEL: just several days ago I saw the screens saying “10 mins” and “19 minutes”. Since TEL trains run 9 minutes apart off peak, where is the first train, which should have been about 1 minute away? I personally guess that this could be a behavioral tweak to discourage people from running down escalators. There’s a low chance that you might be able to run fast enough to get that train that’s arriving in 1 minute without breaking something along the way (what, I leave to the common imagination). But of course, with the platform-level data available on smartphones and such, it might defeat the point of such behavioral tweaks.
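
The arithmetic behind the “disappearing train” can be written down as a small sketch. It assumes trains are evenly spaced at the stated headway, which is a simplification; the function name is my own.

```python
def implied_hidden_arrivals(displayed_minutes, headway):
    """Arrivals earlier than the first displayed one, assuming even headways."""
    hidden = []
    t = min(displayed_minutes) - headway
    while t >= 0:
        hidden.append(t)
        t -= headway
    return sorted(hidden)


# Screens showed 10 and 19 minutes, with trains 9 minutes apart off peak.
print(implied_hidden_arrivals([10, 19], headway=9))  # [1]
```

A first displayed arrival longer than the headway is the giveaway: a train about 1 minute away should exist but was not shown.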

Multimodal integration can also be improved. New bus waiting time displays have been introduced at 300 bus stops. It could be possible to put these displays inside MRT stations as well, where they could include more information than just bus arrival times. For example, they could show the nearest bus stop from which a bus service can be boarded, and during disruptions they could switch to showing alternatives for common routes, including buses and alternate MRT lines. This might be a good replacement for the many signs and information panels which may not be updated very diligently.

International examples

So what are other people doing?

RapidKL, lousy as they are in actual operations, is signing data exchange deals with various “Mobility-as-a-Service” providers in order to provide their public data exchange feeds. This could be just an intern converting RapidKL’s own data feeds to something the providers can understand, and this may also be what is already done for buses here. Likewise, MTR may have done so (or MTR Service Update, at least).

Taipei Metro and Japanese operators go a step further. Their mobile apps include features that show you which station a train has passed, and the rough arrival time of the train at a given stop, including the impact of delays compared against the publicly posted schedule. Things like layout maps that help you navigate their stations are also included in their apps. And that’s even considering that Taipei in general does not have complex stations like we do.

The gold standard may be Washington DC, at least from a technical perspective. The Washington DC Metro system offers live train positions based on track circuit occupancies, amongst other information, making trains easy to track. This makes it easy to develop applications like DCMetroHero, the data source behind IsMetroOnFire. Of course, the fact that the latter website has to exist says more about how poorly their system is run, but the statistics offered by the former may paint a far better picture of efforts to improve rail reliability.
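
As a hedged sketch of what track-circuit-level data makes possible, the snippet below flags trains that have sat on the same circuit for a long time. The payload is invented, the field names (`TrainPositions`, `TrainId`, `LineCode`, `CircuitId`, `SecondsAtLocation`) are my recollection of the WMATA train positions feed and should be verified, and the five-minute threshold is an arbitrary assumption.

```python
# Illustrative payload only: values are invented, field names are assumed.
SAMPLE_POSITIONS = {
    "TrainPositions": [
        {"TrainId": "100", "LineCode": "RD", "CircuitId": 1234, "SecondsAtLocation": 15},
        {"TrainId": "101", "LineCode": "RD", "CircuitId": 1250, "SecondsAtLocation": 420},
        {"TrainId": "102", "LineCode": "BL", "CircuitId": 2001, "SecondsAtLocation": 40},
    ]
}


def held_trains(payload, threshold_seconds=300):
    """Group trains by line if they have not moved for at least threshold_seconds."""
    held = {}
    for train in payload.get("TrainPositions", []):
        if train.get("SecondsAtLocation", 0) >= threshold_seconds:
            held.setdefault(train.get("LineCode"), []).append(train["TrainId"])
    return held


print(held_trains(SAMPLE_POSITIONS))
```

A third-party tracker could build delay statistics on top of a check like this, without needing anything from the operator beyond the raw feed.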

For us, such information might be better in showing the big picture of improved rail reliability, and not abstracting everything down to the hard-to-understand number that is the Mean Kilometres Between Failure metric. Results matter, and if people see that the trains and buses arrive exactly when they should, they may be more positively inclined towards using public transit.

The long story short is this: for public transport to remain competitive, the data needs to be more accessible, especially as other platform companies make themselves visible on the common open markets (Google Maps has sponsored links from Grab, come on). But as pointed out by the LTA themselves, accessibility does not come solely from an API and other apps; user endpoints also have to be provided for those who cannot or might not want to work their phones to get the information.

With accessible data, people can then make a more informed choice on using public transport, compared to just calling a car. And while we’re on that, maybe some effort to actually keep the lift maintenance feature of the MyTransport app updated with what actually happens on the ground might be good, but we’ll see.

A blog on transport issues in the Garden City of Singapore. You can say that I love controversy. Posts can get technical! Abuse of comments may be blocked. Subscribe to Telegram for updates:
