A race to the platform

Tim Huckle
Sep 5, 2018

London Euston is a stressful railway station at the best of times. It has eighteen platforms, and trains on the route I frequently catch (Virgin West Coast to Manchester) have been known to depart from any one of them.

I typically take the train up to Manchester on a weekday evening and the first off-peak trains are normally crowded. In fact, most trains departing from Euston on this route are busy enough that knowing which platform to head for before hundreds of others do is the difference between being able to choose a seat and not having one at all. Reserving a seat isn’t an option in my case, as I use open-return tickets for flexibility.

There is no pattern to which train leaves from which platform. Seasoned travellers have no real advantage, although an increasing number rely on feverishly refreshing the screens in apps such as ‘Live Trains’ in an attempt to gain information before the station boards update and the announcements are made… at which point the scrum begins.

The app junkies normally ‘win’. Over time I’ve witnessed an ever-increasing number of travellers forming orderly queues at the platform gate before the masses start their jog for a seat.

I’d estimate the 19:00hrs departure regularly has fifty or more people like me who are ahead of the scrum thereby guaranteeing themselves an unreserved seat in the coach and a half typically available to us. We can be observed continually looking down at our screens 25 minutes or so before the next scheduled departure. 20 minutes prior we should know, 19 minutes prior everyone else knows. It’s that tight, seconds count.

Many would compare our mentality to the ‘7am beach towel on a sun-lounger crowd’. Perhaps.

But being sent to the wrong platform by an app on my most recent trip got me thinking.

On that journey up north I decided to see if I could gain an advantage through technology and a bit of rapid development. In the two hours and twenty minutes I sat at my hard-fought-for table seat, I cobbled something together:-

1. All apps use the same data

It’s widely known that the National Rail timetable and real-time schedule data is centralised. A couple of minutes on Google confirmed this: a system called Darwin, delivered by the Rail Delivery Group, was what I was looking for.

2. Sign up on https://datafeeds.nationalrail.co.uk/

I filled out a form outlining my intended use (personal travel information), chose the Real-Time Data Feeds as my subscription type, and within a minute I had my access confirmed — all free of charge.

3. Read FAQs and developer docs

My proof-of-concept requirements were pretty simple, so I skimmed over the docs and data format types for 15 minutes or so.

4. Spin up an EC2 instance in AWS

I didn’t have anything running on my personal AWS account that was readily accessible, so I spun up a micro instance and installed the PHP 7 CLI. From this point on I was pretty much editing in Vim via SSH through a 4G EE wifi dongle. It dropped out quite a bit but was manageable.

5. Connect and explore timetable data (FTP)

The basis of all real-time data is of course the scheduled timetable. This is published daily in XML format on an FTP service, from which I found the necessary pattern to identify the ONLY route I’m interested in.

<Journey rid="201809046770589" uid="C70589" trainId="1H73" ssd="2018-09-04" toc="VT" trainCat="XX">
<OR tpl="EUSTON" act="TB" ptd="18:40" wtd="18:40" />
<PP tpl="CMDNSTH" wtp="18:42:30" />
<PP tpl="CMDNJN" wtp="18:43" />
<PP tpl="WLSDWLJ" wtp="18:46" />
<PP tpl="WMBY" plat="3" wtp="18:47:30" />
… lots of intermediary station and signal point information…until:-
<DT tpl="MNCRPIC" act="TF" plat="5" pta="20:52" wta="20:52" />
</Journey>

With the timetable data I extracted the ‘uid’ and planned departure times (‘ptd’) of all the trains leaving Euston (EUSTON) and arriving at Manchester (MNCRPIC).
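The extraction step can be sketched roughly as follows. This is not the script from the train, just a minimal illustration using PHP's SimpleXML. The function name `findDepartures`, the `PportTimetable` root element and the second (Birmingham) journey are assumptions made up for the example; only the first journey comes from the excerpt above.

```php
<?php
// Hypothetical helper (not the original script): map each uid to its planned
// departure time for journeys that start at $from and terminate at $to.
function findDepartures(string $timetableXml, string $from, string $to): array
{
    $doc = simplexml_load_string($timetableXml);
    if ($doc === false) {
        return [];
    }

    $departures = [];
    foreach ($doc->Journey as $journey) {
        // OR is the origin, DT the destination; braces avoid the reserved word "OR".
        $origin = $journey->{'OR'};
        $destination = $journey->{'DT'};
        if ((string) $origin['tpl'] === $from && (string) $destination['tpl'] === $to) {
            $departures[(string) $journey['uid']] = (string) $origin['ptd'];
        }
    }

    asort($departures, SORT_STRING); // earliest "HH:MM" first
    return $departures;
}

// Trimmed sample: the root element name and the second journey are made up.
$sample = <<<XML
<PportTimetable>
  <Journey rid="201809046770589" uid="C70589" trainId="1H73" ssd="2018-09-04" toc="VT">
    <OR tpl="EUSTON" act="TB" ptd="18:40" wtd="18:40" />
    <DT tpl="MNCRPIC" act="TF" plat="5" pta="20:52" wta="20:52" />
  </Journey>
  <Journey rid="201809040000001" uid="X00001" trainId="9X99" ssd="2018-09-04" toc="VT">
    <OR tpl="EUSTON" act="TB" ptd="18:43" wtd="18:43" />
    <DT tpl="BHAMNWS" act="TF" pta="20:05" wta="20:05" />
  </Journey>
</PportTimetable>
XML;

$departures = findDepartures($sample, 'EUSTON', 'MNCRPIC');
print_r($departures);
```

Only the Manchester service survives the filter, so the rest of the file can be discarded straight away.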

It seems the uids are the same from one day to the next, I assume until such time as the operator changes the timetable, but I had enough for a proof of concept. If it works out, I figured I’ll store what I need in a database and reload the feed daily just to be sure.

6. Connect and review the real-time feed

With the uids of the forthcoming trains I’m likely to be interested in, I switched attention to the push-based real-time data feed, ‘Darwin’. This feed requires a STOMP client, of which there are several.

I chose https://pecl.php.net/package/stomp and installed via the PECL package manager.

The connection to the message broker uses the account information provided in the account / feeds section on https://datafeeds.nationalrail.co.uk/, the same page as the FTP info. The connection initialisation will look something like:-

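// National Rail Darwin STOMP handler
$server = "tcp://datafeeds.nationalrail.co.uk:61613";
$user = "username";
$password = "password";
$channel = "your queue name";

// The object-oriented constructor throws a StompException on failure
try {
    $con = new Stomp($server, $user, $password);
} catch (StompException $e) {
    die('Connection failed: ' . $e->getMessage());
}

$con->subscribe("/queue/" . $channel);
while (true) {
    // hasFrame() waits up to the read timeout for the next message
    if ($con->hasFrame()) {
        $msg = $con->readFrame();
        // whatever we want to do with each message here
    }
}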

In my case I figured I could simply start up this feed consumer as and when I want to be informed about train/platform information shortly before I arrive at Euston or at the station itself. I’ll write a private web front-end to this if the POC works out.

The Darwin queue holds the last 5 minutes of data for me if the feed consumer disconnects; but I don’t particularly care about persisting a connection throughout the day — I have no interest in the data when I’m not planning to travel this route.

7. Process the incoming messages

The feed pushes hundreds, possibly thousands, of messages a minute. My need is simple, so I can throw nearly all of it away.

My observer process starts by retrieving the uids for the 3 forthcoming trains departing 20 minutes or more from datetime(now). I’m not interested in knowing about trains departing sooner because everyone will have the same information already and it’s likely the train is already boarding. Looking ahead for the next 3 trains allows me to ignore one or two should the nearby pub seem more interesting that evening.
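As a rough sketch of that selection step (an illustration only, not the original code): the helper name `upcomingTrains`, its parameters and the `X0000n` uids are hypothetical, and for simplicity it assumes every departure falls on the same calendar day as "now", so services after midnight are ignored.

```php
<?php
// Hypothetical helper: pick the next $count departures that leave at least
// $leadMinutes after $now. $departures maps uid => planned departure "HH:MM".
function upcomingTrains(array $departures, DateTimeImmutable $now, int $leadMinutes = 20, int $count = 3): array
{
    $cutoff = $now->modify("+{$leadMinutes} minutes");

    $candidates = [];
    foreach ($departures as $uid => $ptd) {
        [$hours, $minutes] = array_map('intval', explode(':', $ptd));
        // Assumption: the departure is on the same day as $now.
        $departure = $now->setTime($hours, $minutes);
        if ($departure >= $cutoff) {
            $candidates[$uid] = $ptd;
        }
    }

    asort($candidates, SORT_STRING); // earliest first
    return array_slice($candidates, 0, $count, true);
}

$now = new DateTimeImmutable('2018-09-04 18:35:00', new DateTimeZone('Europe/London'));
$watched = upcomingTrains([
    'C70589' => '18:40', // too soon: everyone already knows the platform
    'C70476' => '19:00',
    'X00001' => '19:20', // made-up uids for illustration
    'X00002' => '19:40',
    'X00003' => '20:00',
], $now);
print_r($watched);
```

The 18:40 is dropped because it leaves within the 20-minute window, and the 20:00 is dropped because only three trains are watched at a time.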

Each feed item comes through as an array and I process them in a while loop. The message body of each is gzip compressed, so I uncompress it with:-

$messagebody = gzdecode($msg->body);

The uncompressed $messagebody contains the information I need, e.g.

<?xml version="1.0" encoding="UTF-8"?>
<Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-09-04T18:43:46.9270755+01:00" version="12.0">
<uR requestID="0000000000005193" requestSource="at08" updateOrigin="CIS">
<TS rid="201809046770476" ssd="2018-09-04" uid="C70476">
<ns3:Location ptd="19:00" tpl="EUSTON" wtd="19:00">
<ns3:dep et="19:00" src="TD"/>
<ns3:plat conf="true" platsrc="A">13</ns3:plat>
</ns3:Location>
</TS>
</uR>
</Pport>

In this case the 19:00hrs is confirmed to be leaving from platform 13.

I can therefore do a simple string compare on the body of the message to see if it contains one of the train uid codes.

if (strpos($messagebody, $trainid) !== false) {
    // do something to alert here
}
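With three trains being watched, the same check has to run once per uid. A minimal sketch of how that might look is below; the function name `matchWatchedUid` is hypothetical, and matching on `uid="…"` rather than the bare code is my own tweak to reduce accidental matches. The sample message is simulated with `gzencode`, so the zlib extension is assumed to be available.

```php
<?php
// Hypothetical helper: return the first watched uid found in a
// gzip-compressed message body, or null when none of them appear.
function matchWatchedUid(string $compressedBody, array $watchedUids): ?string
{
    $body = @gzdecode($compressedBody);
    if ($body === false) {
        return null; // not valid gzip data, so skip the message
    }

    foreach ($watchedUids as $uid) {
        // Match the whole attribute so the code is not picked up elsewhere.
        if (strpos($body, 'uid="' . $uid . '"') !== false) {
            return $uid;
        }
    }
    return null;
}

// Simulate a feed message by compressing a trimmed copy of the sample message.
$sampleBody = gzencode('<TS rid="201809046770476" ssd="2018-09-04" uid="C70476"></TS>');
var_dump(matchWatchedUid($sampleBody, ['C70589', 'C70476']));
```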

It’s without doubt not the fastest way. I can probably create a custom queue at National Rail with just my departing station or train operator, but I’ll look into that later. A purist would point out that using PHP, uncompressing every message body (perhaps unnecessarily, had I spent more time understanding the message headers) and running a string compare on a low-spec server instance are unlikely to yield the best possible results; but by this point the train is passing through Stoke.

Next I need to ignore messages that apply to the train uids I’m observing but aren’t relevant to my need: for example, platform confirmations and planned arrival times for stations along the way, service delays and so on. They’re out of scope for my need, and other apps handle them fine if I’m interested.
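One way to do that filtering, sketched here with PHP's DOM and XPath classes rather than string matching, is to keep only a `Location` for the station of interest that actually carries a platform. This is an illustration under assumptions: the function name `extractPlatform` and the returned array shape are made up, and the two namespace URIs are copied from the sample message, so they would need changing if the feed's schema version differs.

```php
<?php
// Namespace URIs as they appear in the sample message (may differ by schema version).
const NS_PPORT = 'http://www.thalesgroup.com/rtti/PushPort/v12';
const NS_FORECASTS = 'http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2';

// Hypothetical helper: return the platform update for $tiploc, or null when
// the message carries no platform information for that location.
function extractPlatform(string $messageBody, string $tiploc): ?array
{
    if ($messageBody === '') {
        return null;
    }
    $doc = new DOMDocument();
    if (!@$doc->loadXML($messageBody)) {
        return null; // not well-formed XML
    }

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('pp', NS_PPORT);
    $xpath->registerNamespace('fc', NS_FORECASTS);

    // Only locations matching the TIPLOC that also have a <plat> child.
    $query = sprintf('//pp:TS/fc:Location[@tpl="%s"][fc:plat]', $tiploc);
    foreach ($xpath->query($query) as $location) {
        $plat = $xpath->query('fc:plat', $location)->item(0);
        return [
            'uid' => $location->parentNode->getAttribute('uid'),
            'departure' => $location->getAttribute('ptd'),
            'platform' => trim($plat->textContent),
            'confirmed' => $plat->getAttribute('conf') === 'true',
        ];
    }
    return null;
}

$message = <<<XML
<?xml version="1.0" encoding="UTF-8"?><Pport xmlns="http://www.thalesgroup.com/rtti/PushPort/v12" xmlns:ns3="http://www.thalesgroup.com/rtti/PushPort/Forecasts/v2" ts="2018-09-04T18:43:46.9270755+01:00" version="12.0"><uR requestID="0000000000005193" requestSource="at08" updateOrigin="CIS"><TS rid="201809046770476" ssd="2018-09-04" uid="C70476"><ns3:Location ptd="19:00" tpl="EUSTON" wtd="19:00"><ns3:dep et="19:00" src="TD"/><ns3:plat conf="true" platsrc="A">13</ns3:plat></ns3:Location></TS></uR></Pport>
XML;

$update = extractPlatform($message, 'EUSTON');
print_r($update);
```

Updates for stations along the way simply return null, because their `tpl` does not match.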

Finally I add some logic to deal with the fact that platform announcements can be confirmed or unconfirmed. I’ll receive both, but unconfirmed platform information isn’t supposed to be shown to the public: its purpose in this data feed is internal and operational, and it often changes, so I need to be aware of this. I’ll perhaps do some more work here to understand the reliability of this ‘internal’ information for this route specifically, but Virgin operate many trains out of Euston and they tend to change things up quite often. Apps such as Live Trains show this data; whether they’re supposed to, I’m not clear.
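The exact logic isn't spelled out here, so the following is only one plausible shape for it: remember the last platform and confirmation status seen for each uid, and raise an alert only when either changes. The class name `PlatformTracker` and its method are hypothetical.

```php
<?php
// Hypothetical tracker: remembers the last platform state per uid so that
// repeated identical updates do not trigger repeated alerts.
final class PlatformTracker
{
    /** @var array<string, string> last seen "platform|status" keyed by uid */
    private $lastSeen = [];

    public function shouldAlert(string $uid, string $platform, bool $confirmed): bool
    {
        $state = $platform . '|' . ($confirmed ? 'confirmed' : 'unconfirmed');
        if (($this->lastSeen[$uid] ?? null) === $state) {
            return false; // nothing new to report
        }
        $this->lastSeen[$uid] = $state;
        return true;
    }
}

$tracker = new PlatformTracker();
var_dump($tracker->shouldAlert('C70476', '13', false)); // first sighting, unconfirmed
var_dump($tracker->shouldAlert('C70476', '13', false)); // duplicate update
var_dump($tracker->shouldAlert('C70476', '13', true));  // same platform, now confirmed
```

Treating a change of confirmation status as news means an early unconfirmed hint is followed by a second message once the platform is confirmed or altered.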

8. Send an alert

Having got some information to send a few milliseconds after it appears on the feed (I echo out the server’s system time and compare it with the message transmission time on the feed), I choose Clickatell’s API to send it to my phone via SMS. It’s a service I’m familiar with. There are others that are just as easy to integrate with, such as Twilio, but I can cut and paste the class from a previous piece of work.

Sending my message is therefore as simple as:-

$clickatell = new Clickatell();
$response = $clickatell->send('0797xxxxxxx', $text);

Where $text is a string built from concatenating and formatting the departure time, platform and confirmation status.
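A minimal sketch of how such a string might be assembled is below. The function name `buildAlertText` and the exact wording of the message are assumptions for illustration, not the text the prototype actually sends.

```php
<?php
// Hypothetical formatter: combine departure time, platform and confirmation
// status into a short SMS-friendly string.
function buildAlertText(string $departure, string $platform, bool $confirmed): string
{
    $status = $confirmed ? 'confirmed' : 'unconfirmed, may change';
    return sprintf('%s Euston to Manchester: platform %s (%s)', $departure, $platform, $status);
}

$text = buildAlertText('19:00', '13', true);
echo $text, PHP_EOL;
```

Keeping the status in the message makes it obvious at a glance whether it is safe to start walking or better to wait for confirmation.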

SMS may not be the fastest delivery mechanism in some scenarios versus Apple Push Notification Service (APNS); but it’ll save me having to look at my phone screen all the time — one of the requirements.

By the end of my journey the prototype is done and I’ve tested it against a train departing back in London before I pack up.

9. Test

I’ve tested over the last few days.

Aim: Receive platform information for forthcoming departures (Euston to Manchester) faster and more conveniently than:-

Result:

From a convenience point of view the prototype is much better. End to end I’m receiving a text within 2 seconds of the information appearing on the real-time feed, most of that being the SMS delivery time. Text messages also show on my Apple Watch as soon as they reach my phone, so this wins in terms of creating a more relaxed way of getting the platform information fast.

From a pure speed point of view it’s marginal. I already knew, like all the app users, that the station departure boards were slower by 30 seconds or more when compared to any of the popular apps. Against the three app/online comparisons it seems to be a few seconds or so faster; however this assumes I’m going to continually refresh the app/web page — something I’m trying to avoid.

I’m consistently several minutes ahead of the notification feature of the National Rail Enquiries app, if its message is received at all. That could be down to my phone on O2, or to delays with their third-party APNS gateway provider if they’re using one. Other apps, such as my banking one, send me notifications fine.

I’m happy it was a couple of hours well spent. My theory holds true: the big guys have more data to process because their apps do more, they have more alerts to queue and push, and their services carry more latency overall as a result. Across the ten sample departures I tested, I received the information sooner than or at the same time as the leading apps, without having to look at anything until the alert arrived.

I’ll tidy it up one evening soon and add an on/off method for the observer process via a private URL then move the code to an existing AWS instance I’m already spending on. No additional cost other than the few pence for the SMS messages.

To any fellow London to Manchester commuters: I’ll see you in the queue…towards the front without breaking a sweat I hope.