Here’s how data science has changed the way we cover parliamentary votes

Interactive look-up produced for this vote on 29 March 2019

Since last October, the BBC’s data journalism team has been producing interactive look-ups after important parliamentary votes on Brexit. In this post we’ll explain how we built the data science engine behind these stories, and what we learned about parliamentary procedure along the way.

Why do we need data science to cover votes in Parliament?

It’s useful for journalists and the public to know how MPs vote in Parliament, especially on important issues like Brexit. Until recently the fastest and most reliable source of this data was the Commons Votes app, developed by the Parliamentary Digital Service. Political journalists at the BBC would transfer results from the app into a spreadsheet by hand, which definitely isn’t the best use of their time.

Screenshots taken from the Commons Votes iOS app

That’s why we built a fully automated program which downloads new vote results as soon as they’re released, checks them for errors, and saves them in a format that plugs straight into our interactive look-ups and is easy for journalists to interpret. Journalists are automatically alerted when new results are available. So far, this has helped us publish 17 ‘How did my MP vote?’ look-ups and charts just minutes after landmark votes on Brexit. These have also been used on the BBC News channel as a live fact-checking tool during interviews with MPs.

One of our look-ups being used live on BBC News during an interview with Mark Field MP

Finding the data

The first challenge in building our engine was finding a fast and reliable data source.

We used a technique called packet sniffing to find out where the Commons Votes app pulls its data from. Packet sniffing is, sadly, a lot less edgy and exciting than it sounds; we ‘listened’ to the traffic going to and from a phone running the app, and noticed it was requesting data from a place called the Commons Votes Services API.
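
For the curious, here’s roughly what the ‘listening’ step can look like. This sketch assumes an intercepting proxy such as mitmproxy sits between the phone and the internet (the choice of tool is ours for illustration, not a statement of what we actually used), and it simply prints any request the app makes to a parliament.uk address.

```python
# log_parliament_requests.py -- a minimal mitmproxy addon for the 'listening' step.
# Run with:  mitmdump -s log_parliament_requests.py
# then point the phone's proxy settings at the machine running it.
from mitmproxy import http


def request(flow: http.HTTPFlow) -> None:
    # Print any request heading to a parliament.uk domain -- this is how an
    # endpoint like the Commons Votes API shows up in the traffic.
    if flow.request.pretty_host.endswith("parliament.uk"):
        print(flow.request.method, flow.request.pretty_url)
```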

An API is a way for organisations to make their data publicly available to other programs. The Commons Votes API returns results as JSON, a standardised format that’s easy for developers and data scientists to work with.
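
As a concrete example, here’s a rough Python sketch of asking the Commons Votes API for recent division results. The endpoint path and field names follow the general shape of the public API, but treat them as illustrative rather than a guaranteed contract; check the API documentation before relying on them.

```python
# A rough sketch of requesting recent division results as JSON.
# The endpoint path and field names are illustrative -- consult the
# Commons Votes API documentation for the exact structure.
import requests

BASE_URL = "https://commonsvotes-api.parliament.uk/data"

response = requests.get(f"{BASE_URL}/divisions.json/search", timeout=10)
response.raise_for_status()

for division in response.json()[:5]:
    # Each division is a JSON object with an ID, a date, a title and the counts.
    print(division.get("DivisionId"), division.get("Date"), division.get("Title"))
```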

The Parliamentary Digital Service were happy for us to use their API, and gave us a lot of useful advice on how it works and how to interpret the data responsibly. Since we started work on this project, they’ve also created a useful website where vote results can be downloaded as spreadsheets.

The Commons Votes API

What does our program do?

Our program takes raw JSON from the Commons Votes API and produces one CSV file for each vote. Extra information for each MP is pulled from the Parliamentary Digital Service Members’ Data Platform, including their constituency code and party.

A clean CSV file produced for this vote on 25 March 2019
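
To give a flavour of that step, here’s a simplified sketch of flattening one division’s JSON into a CSV with a row per MP. The field names mirror the general shape of the API response, and `constituency_lookup` stands in for the extra information pulled from the Members’ Data Platform; neither is a faithful copy of our production code.

```python
# A simplified sketch: flatten one division's JSON into a CSV with a row per MP.
# Field names are illustrative of the API's shape; `constituency_lookup` stands
# in for the extra data pulled from the Members' Data Platform.
import csv


def write_division_csv(division: dict, constituency_lookup: dict, path: str) -> None:
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["member_id", "name", "party", "constituency", "vote"])
        for vote, members in (("aye", division.get("Ayes", [])),
                              ("no", division.get("Noes", []))):
            for member in members:
                member_id = member.get("MemberId")
                extra = constituency_lookup.get(member_id, {})
                writer.writerow([
                    member_id,
                    member.get("Name"),
                    member.get("Party"),
                    extra.get("constituency", ""),
                    vote,
                ])
```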

Our code checks the Commons Votes API for new results every 60 seconds, to make sure we never miss a vote and can publish look-ups as soon as possible. The code is deployed in the cloud, so it can run 24/7 without interruptions.
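
The polling logic itself doesn’t have to be complicated. A bare-bones version, with the fetching and processing passed in as functions (hypothetical names, not our actual code), might look something like this:

```python
# A bare-bones polling loop. `fetch_ids` and `process` are placeholders for the
# real fetching and processing steps -- the names are illustrative only.
import time
from typing import Callable, Iterable


def poll_for_new_divisions(fetch_ids: Callable[[], Iterable[int]],
                           process: Callable[[int], None],
                           poll_interval: int = 60) -> None:
    seen: set = set()
    while True:
        try:
            for division_id in fetch_ids():
                if division_id not in seen:
                    process(division_id)  # e.g. build the CSV, run checks, alert
                    seen.add(division_id)
        except Exception as exc:
            # A transient API error shouldn't bring down a loop that runs 24/7.
            print(f"Polling error: {exc}")
        time.sleep(poll_interval)
```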

Journalists are automatically alerted on Slack, a messaging platform, as soon as a new set of results is out.

Journalists receive alerts on Slack when new results are out
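
The alert itself can be as simple as a message posted to a channel. This sketch uses a Slack incoming webhook, which is one common way to do it; it’s an example of the technique rather than a copy of our production bot.

```python
# A sketch of posting an alert via a Slack incoming webhook -- one common way
# to send such a message, not a copy of our production bot.
import requests


def send_slack_alert(webhook_url: str, division_title: str, csv_path: str) -> None:
    message = {"text": f"New division result: {division_title}\nClean CSV: {csv_path}"}
    response = requests.post(webhook_url, json=message, timeout=10)
    response.raise_for_status()
```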

Our data journalists use the data from the pipeline, together with our R charting package bbplot, to produce graphics for our live page on the website, showing who voted along party lines and who rebelled.

A chart from this vote on 9 April 2019

We also built a postcode look-up so that people could find out how their MP voted without needing to know the name of their parliamentary constituency or their MP.

Screenshot from our interactive look-up for this vote on 29 March 2019
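
As an illustration of the idea, a postcode can be resolved to a constituency with an open service such as postcodes.io. Our look-up may well do this differently, so read this as a sketch of the technique rather than a description of our code.

```python
# A sketch of resolving a postcode to a parliamentary constituency using the
# open postcodes.io API -- an illustration of the technique, not a description
# of how the BBC look-up actually works.
import requests


def constituency_for_postcode(postcode: str) -> str:
    url = f"https://api.postcodes.io/postcodes/{postcode.replace(' ', '')}"
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()["result"]["parliamentary_constituency"]


# Example: constituency_for_postcode("SW1A 0AA") returns the constituency that
# covers the Palace of Westminster.
```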

Handling anomalies

Our program detects unusual results or mistakes in the data, and flags them to our journalists through the Slack bot. To build this feature, we needed to learn quite a lot about parliamentary procedure.

MPs vote with their feet and divide into two rooms, or lobbies, depending on whether they’re voting for a motion or against it. In each lobby, votes are counted in two different ways: clerks compile a list of MPs’ names, and tellers take a headcount. The tellers’ counts are read out in Parliament after a vote, and are used as the official result.

Sometimes one or both of these counts are wrong.

Our program checks the data to make sure the tellers’ and clerks’ counts match up for each lobby. When they don’t, journalists are alerted to the discrepancy.
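
In code, the check is a straightforward comparison between the tellers’ announced totals and the number of names the clerks recorded in each lobby. A simplified sketch, using the same illustrative field names as earlier:

```python
# A simplified consistency check: do the tellers' announced totals match the
# number of names the clerks recorded in each lobby? Field names are
# illustrative of the API's shape.
def check_counts(division: dict) -> list:
    problems = []
    for lobby, names_key, count_key in (("aye", "Ayes", "AyeCount"),
                                        ("no", "Noes", "NoCount")):
        clerk_count = len(division.get(names_key, []))
        teller_count = division.get(count_key, 0)
        if clerk_count != teller_count:
            problems.append(f"{lobby} lobby: tellers announced {teller_count}, "
                            f"clerks recorded {clerk_count} names")
    return problems
```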

Sometimes MPs vote both for and against a motion.

This might sound strange, but it happens fairly often. An MP might vote ‘aye’ and ‘no’ if they want to show they were present for a vote but didn’t support either side. They might also do this to correct a mistake after accidentally voting in the wrong lobby. In some cases, an apparent double vote is simply a data entry error. Either way, the program flags up MPs who appear to have voted both ways so our team can check the results.
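
Detecting this is a simple set intersection: any MP whose ID appears in both the ‘aye’ and ‘no’ lists gets flagged. A minimal sketch, with the same illustrative field names as before:

```python
# A minimal sketch of flagging MPs recorded in both lobbies.
def mps_voting_both_ways(division: dict) -> set:
    aye_ids = {m.get("MemberId") for m in division.get("Ayes", [])}
    no_ids = {m.get("MemberId") for m in division.get("Noes", [])}
    return aye_ids & no_ids  # IDs recorded as voting both 'aye' and 'no'
```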

The program detects unusual voting behaviours and possible errors

Some MPs don’t vote.

The Speaker and Deputy Speakers never vote unless a tie-breaker is needed. Tellers are MPs too, and don’t vote because they’re involved with the counting process. It wouldn’t be accurate or fair to report these people as ‘absent’ from a vote without explaining why, so contextual information on their positions is added to the spreadsheet and displayed in the final look-up.

The Speaker, Deputy Speakers, and Tellers are flagged in the ‘position’ column for this vote on 25 March 2019
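
Filling in that column is essentially a mapping from role to explanatory note. Here’s a sketch; the role sets passed in are hypothetical stand-ins for data we’d get from the Members’ Data Platform and from the tellers named in the division itself.

```python
# A sketch of the 'position' note added for MPs who don't vote for procedural
# reasons. The role sets passed in are hypothetical stand-ins for data from the
# Members' Data Platform and the tellers named in the division data.
def position_note(member_id: int, speaker_ids: set, deputy_speaker_ids: set,
                  teller_ids: set) -> str:
    if member_id in speaker_ids:
        return "Speaker (only votes to break a tie)"
    if member_id in deputy_speaker_ids:
        return "Deputy Speaker (does not normally vote)"
    if member_id in teller_ids:
        return "Teller (counts the votes rather than casting one)"
    return ""
```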

What have we learned?

In any data science project, it’s important to understand the context around your data and the real-life processes it represents. This turned out to be especially true with political data.

Figuring out how to report and visualise vote results when, for example, an MP votes in both lobbies or the tellers’ count is wrong, involved quite a bit of discussion and debate. We worked closely with political journalists and the Parliamentary Digital Service to understand the ins and outs of parliamentary procedure, and make sure our code could handle these unusual situations.

In this vote on 25 March 2019, we added a footnote to explain that one MP had voted ‘aye’ and ‘no’

We used several tools and technologies for the first time, including the Slack API and command-line tools like tmux and vim. These have come in useful for several projects since, including our Twitter bot for EU election coverage and our gender pay gap calculator.

What’s next?

Through this project, we’re building up a detailed picture of every MP’s voting record. This opens up lots of possibilities for in-depth analysis of how MPs vote across different issues, how that’s changed over time, which MPs have similar voting patterns, and more.
