Analysing Bus Arrival Times — Findings

Nov 20, 2017

I use bus apps to check bus arrival times so that I can walk quickly to the bus stop and catch the arriving bus. Most of the time I make it, but occasionally one of 2 things happens:

I rush to the bus stop and see the bus leaving — the ‘just missed the bus’ phenomenon
I wait at the bus stop for the next couple of minutes seeing the ‘bus arriving’ message — the ‘bus arriving’ phenomenon

Curious, I analysed the bus arrival times available from the Application Programming Interface (API) at the LTA DataMall to look for clues. I found possible reasons behind both phenomena, and something more.

Setup

Let’s start with an illustration using 2 bus stops and a simple formula.

[1] Journey time = Arrival at #2 − Arrival at #1

Now, let’s look at the actual data captured on 12 Nov 2017 for bus service 83 between 2 stops. I was able to ascertain the traffic conditions (i.e. smooth or heavy) through visual confirmation.

Smooth Traffic

Let’s analyse timings at stop #1:

At 8:48:04, the bus is predicted to reach at 8:48:37.
At 8:49:03, the bus is predicted to reach at 8:48:14. Technically, this timing is backdated (it is earlier than the time of the query) and should be classified as a report rather than a prediction.
At 8:50:03, the next bus is predicted to reach at 9:00:54.

Interestingly, through my research, I discovered that a backdated arrival time quite accurately* reports the time a bus is actually at a stop (i.e. actual arrival time). Furthermore, the latest backdated arrival time is often the most accurate.

*90% of the time, backdated arrival times are within 60s of the actual arrival times

In other words, backdated arrival times are good estimates but are not timely; it does not help to know at 8:49:03 that the bus was at the stop at 8:48:14. Ideally, a bus app should disregard such ‘backdated’ timings. If not, you will see the ‘just missed the bus’ phenomenon.
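The distinction between a prediction and a backdated timing can be sketched in a few lines of Python. This is only an illustration using the three stop #1 readings quoted above; the function names and the (query time, arrival time) pair layout are my own assumptions, not the DataMall response format.

```python
from datetime import datetime

FMT = "%H:%M:%S"  # clock times on the same day (assumption)

def is_backdated(observed_at, arrival):
    """A timing is backdated when the arrival is not later than the query time."""
    return datetime.strptime(arrival, FMT) <= datetime.strptime(observed_at, FMT)

# (query time, arrival time) pairs for stop #1, smooth traffic
readings = [
    ("8:48:04", "8:48:37"),
    ("8:49:03", "8:48:14"),
    ("8:50:03", "9:00:54"),
]

for observed_at, arrival in readings:
    kind = "backdated" if is_backdated(observed_at, arrival) else "predicted"
    print(observed_at, arrival, kind)

# What a bus app would ideally keep: predictions only
upcoming = [r for r in readings if not is_backdated(*r)]
```

Only the 8:49:03 reading is flagged as backdated, which is the one that would produce the ‘just missed the bus’ effect if shown to a commuter.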

Extending the same principles to bus stop #2, you can see a similar pattern.

With this knowledge, we can break down [1] further,

[1a] Predicted journey time = predicted arrival at #2 − predicted arrival at #1

[1b] Estimated journey time = backdated arrival at #2 − backdated arrival at #1

Remember, backdated times are more accurate so [1b] is more accurate than [1a].

From [1a], we can calculate that the predicted journey time is a flat 127s. I suppose this is determined by LTA using the arsenal of tools available under the Intelligent Transport System.

From [1b], we can determine the estimated journey time as only 89s (08:50:06 minus 08:48:37).
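The subtraction above is easy to check in code. A minimal sketch, assuming both times fall on the same day; `seconds_between` is a helper name I made up for this illustration.

```python
from datetime import datetime

def seconds_between(start, end, fmt="%H:%M:%S"):
    """Whole seconds from start to end, both clock times on the same day."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

estimated = seconds_between("08:48:37", "08:50:06")  # the two times quoted for [1b]
predicted = 127                                      # the flat value from [1a]

print(estimated)              # 89
print(predicted - estimated)  # 38
```

In smooth traffic, then, the prediction overshoots the estimated journey time by 38s.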


Heavy Traffic

Let’s highlight backdated timings at stop #1:

At 15:01:05, bus is reported to have arrived at 15:00:16.
At 15:02:04, bus is reported to have arrived at 15:00:41 (i.e. best backdated arrival time).

Now look at the timings at stop #2. Firstly, the predicted journey time between the two stops has increased slightly, from 127s to 146s. LTA probably received some feedback from its sources about the heavy traffic and added a buffer. Secondly, the arrival times at #2 are pushed back repeatedly, which gives us the dreaded ‘bus arriving’ phenomenon.

Again, note these backdated timings at #2:

At 15:03:04, the bus is reported to have arrived at 15:02:53.
At 15:04:04, the bus is reported to have arrived at 15:03:40.
At 15:05:05, the bus is reported to have arrived at 15:04:18.
At 15:06:05, the bus is reported to have arrived at 15:05:19 (i.e. best backdated arrival time).

Hence from [1b], the bus took 278s (15:05:19 minus 15:00:41) to travel between the same stops during heavy traffic. Once again, the predicted journey time [1a] of 146s is well off, this time by 132s.
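Putting the pieces together, here is a rough sketch of [1b] on the heavy-traffic readings: take the latest backdated timing at each stop and subtract. The helper name and data layout are assumptions for illustration, and the sketch assumes all readings at a stop belong to the same bus.

```python
from datetime import datetime

FMT = "%H:%M:%S"

def best_backdated(readings):
    """Arrival time from the most recently observed backdated reading."""
    parsed = [(datetime.strptime(obs, FMT), datetime.strptime(arr, FMT))
              for obs, arr in readings]
    backdated = [(obs, arr) for obs, arr in parsed if arr <= obs]
    latest = max(backdated, key=lambda pair: pair[0])  # latest query time
    return latest[1]

# (query time, reported arrival) pairs from the heavy-traffic example
stop1 = [("15:01:05", "15:00:16"), ("15:02:04", "15:00:41")]
stop2 = [("15:03:04", "15:02:53"), ("15:04:04", "15:03:40"),
         ("15:05:05", "15:04:18"), ("15:06:05", "15:05:19")]

estimated = int((best_backdated(stop2) - best_backdated(stop1)).total_seconds())
predicted = 146

print(estimated)              # 278
print(estimated - predicted)  # 132
```

The same-bus assumption is exactly what breaks down when buses bunch, so a real implementation would need to separate readings per bus first.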


Constraint

When buses of the same service arrive in close proximity (i.e. bus bunching), arrival times are affected and interpreting [1b] becomes a challenge.

Conclusion

While backdated arrival times might not be suitable for bus apps, I found them to be good estimates of when buses actually arrive at stops and, in turn, the preferred inputs for calculating estimated journey times.

In fact, we can use them to infer the traffic conditions along the specific stretches of road where buses ply. This is the premise of Traffico (https://www.maventechnologies.com.sg/traffico). In the next article, I will discuss the development of this tool.
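To make the inference idea concrete, here is a rough sketch that compares the estimated journey time [1b] against the predicted one [1a] for the two cases in this article. It is not how Traffico works; the ratio approach, the function names and the 1.5 threshold are all hypothetical choices for illustration.

```python
def traffic_ratio(estimated_s, predicted_s):
    """Estimated journey time [1b] as a multiple of the predicted one [1a]."""
    return estimated_s / predicted_s

def classify(estimated_s, predicted_s, heavy_threshold=1.5):
    """Label a stretch 'heavy' when the bus took much longer than predicted.

    heavy_threshold is a made-up cut-off, not a calibrated value.
    """
    if traffic_ratio(estimated_s, predicted_s) >= heavy_threshold:
        return "heavy"
    return "smooth"

# Journey times (in seconds) from the two examples in this article
print(classify(89, 127))   # smooth
print(classify(278, 146))  # heavy
```

With only two data points the threshold is arbitrary; the point is just that the gap between [1b] and [1a] carries a usable signal about road conditions.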