Explaining… the Single Transferable Vote

I like explaining things so today I’m going to explain the Single Transferable Vote.

The basic idea of the Single Transferable Vote (STV) is that each person should contribute towards the election of one member of parliament. STV is a form of proportional representation: by ensuring that each voter contributes towards the election of someone (and only one person), it keeps the seats allocated in proportion to the votes cast.

The way it does this is as follows:

Each person votes in a constituency. The constituency will elect more than one person, and for that reason most political parties will stand more than one candidate. But as a voter you only get one vote; in other words, you only get to contribute towards the election of one candidate. Your vote is transferable, though: you give an order of preference, and that order of preference is used to move your vote about to where it is needed most.

Here’s how it works in detail:

Firstly, you need to divide your country into constituencies. If you were running an election for the UK Parliament you could simply turn Britain into one big constituency for 650 vacancies, and doing so would ensure the most proportionate outcome. However, it would also lead to an incredibly long and complicated ballot paper and counting process, and there is no real need for this. You can achieve largely proportionate outcomes with much smaller constituencies. Also by making constituencies smaller you can ensure that, as with the First Past the Post system the UK uses at the moment, there is a local link between MPs and the seats they represent.

It’s a trade off, and if you are interested in the detail, I discuss the trade off further in footnote vi. Personally I think 5–8 vacancies per seat (and thus seats 5 to 8 times larger than they are right now) is about the right balance, but the Irish Dáil (parliament) seems to do quite well on 3–5.

Say you’d split up your country into seats with five vacancies. Each political party can stand five candidates in each seat. They can stand fewer if they wish[i] and in addition independents can stand on their own. Voters then rank each candidate in order of preference until they are indifferent[ii]. If you were, for example, a Labour supporter you would probably list the 5 Labour candidates in positions 1–5, but there’s nothing to stop you mixing and matching.[iii]

After the election the first preferences are all counted up. If, at this point or any subsequent point, a candidate has enough votes that their election is guaranteed[iv] then they are declared elected.

We then start moving “wasted votes” around so that they are no longer wasted. There are two kinds of wasted votes: votes for a candidate who already has enough votes (and so your additional vote isn’t adding anything) and votes for a candidate who is trailing in last place.

We start by reallocating all the surplus votes. A candidate that has more votes than they need keeps the amount that they need (this is their quota, discussed in footnote iv), and then the rest are redistributed[v]. So, for example, if a candidate had received 100 first preference votes but only needed 90 to win, then 10 votes would be taken off her pile, and reallocated[vi].

Once redistribution has been done from the top, it is then done from the bottom. This is done in the same way it is done in the Alternative Vote system: the candidate in last place is eliminated and their votes are redistributed.

Redistributed votes go to the next highest preference who is still in the race. In other words: a redistributed vote would then go to that voter’s second preference, unless they had also already been elected[vii] or eliminated, in which case it would go to their third preference, and so on.
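To make that rule concrete, here is a minimal sketch in Python. The function name, the candidate names and the idea of storing a ballot as a list are all my own assumptions for illustration, not part of any official counting rules.

```python
def next_preference(ballot, continuing):
    """Return the highest-ranked candidate on `ballot` who is still in the race.

    `ballot` is a list of candidate names, most preferred first.
    `continuing` is the set of candidates neither elected nor eliminated.
    Returns None when the ballot has run out of usable preferences.
    """
    for candidate in ballot:
        if candidate in continuing:
            return candidate
    return None


ballot = ["Asha", "Bryn", "Cleo", "Dev"]
# Asha has been eliminated and Bryn is already elected, so the vote skips to Cleo.
print(next_preference(ballot, continuing={"Cleo", "Dev"}))  # Cleo
```

A ballot that returns None here is what footnote ii calls an "exhausted" vote.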

This keeps on happening through a series of “rounds”. In each round either spare votes are reallocated from elected candidates that no longer need them, or, if there are no spares left to move around, the last place candidate is eliminated. This goes on until all the vacancies are filled or all the votes are exhausted. Unless someone’s messed up, those two things should happen at exactly the same time.
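For anyone who finds code easier to follow than prose, the rounds described above can be sketched roughly as below. This is a simplified illustration rather than any real jurisdiction's rules: it assumes a fixed Droop quota, fractional (Gregory-style) surplus transfers as discussed in footnote v, and alphabetical tie-breaking, and the name `count_stv` and the sample candidates are made up.

```python
from fractions import Fraction


def count_stv(ballots, seats):
    """Count ranked ballots (lists of names, best first) and return the winners."""
    quota = len(ballots) // (seats + 1) + 1          # Droop quota, kept fixed here
    continuing = {name for ballot in ballots for name in ballot}
    piles = {name: [] for name in continuing}        # name -> [(ballot, weight), ...]
    for ballot in ballots:
        if ballot:
            piles[ballot[0]].append((ballot, Fraction(1)))
    elected = []

    def total(name):
        return sum(weight for _, weight in piles[name])

    def transfer(items, share):
        # Pass `share` of each ballot's weight to its next continuing preference.
        for ballot, weight in items:
            target = next((n for n in ballot if n in continuing), None)
            if target is not None:                   # otherwise the vote is exhausted
                piles[target].append((ballot, weight * share))

    while len(elected) < seats:
        if len(continuing) <= seats - len(elected):  # everyone left fills a seat
            elected += sorted(sorted(continuing), key=total, reverse=True)
            break
        winners = [n for n in sorted(continuing) if total(n) >= quota]
        if winners:
            # Round type 1: declare winners elected, then move their spare votes.
            winners.sort(key=total, reverse=True)
            continuing -= set(winners)
            elected += winners
            for name in winners:
                items, votes = piles[name], total(name)
                share = (votes - quota) / votes      # fraction of each vote that moves on
                piles[name] = [(b, w * (1 - share)) for b, w in items]
                transfer(items, share)
        else:
            # Round type 2: no spares to move, so eliminate the last-placed candidate.
            loser = min(sorted(continuing), key=total)
            continuing.remove(loser)
            items, piles[loser] = piles[loser], []
            transfer(items, Fraction(1))
    return elected


ballots = [["Ann", "Bob"]] * 6 + [["Bob"]] * 2 + [["Cat"]] * 3
print(count_stv(ballots, seats=2))  # ['Ann', 'Bob']
```

In the example Cat has more first preferences than Bob, but Ann's surplus of two votes flows to Bob and lifts him to the quota of four.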

And that is STV.

[i] Although in game theory terms this is a bad idea as it increases the risk of votes which could otherwise have counted for their party going on to count for another party instead.

[ii] Or at least that is what you do in any sane STV system. Australia had an utterly insane voting system from 1949 to 2013 that didn’t allow you to be indifferent, but did allow you to hand the job of filling in your ballot paper over to a political party, which (as there were often over 100 candidates per vacancy) around 95% of people generally chose to do. This led to absurd backroom horse trading and game theorising between political parties over minor preferences with predictably stupid results.

The “until you are indifferent” criterion does mean that there are some people in the STV system whose vote does not count towards the final result. If you only give a few preferences and they are all for candidates that are not viable, then there is a chance that your vote will be “exhausted”, in other words none of your candidates will be in the race at the point where you run out of preferences. In those circumstances the result will be the same as if you had never voted. However, this usually only happens to a small percentage of people who only give a small number of preferences and/or exclusively give preferences to fringe candidates. In contrast, the votes of the vast majority of people suffer this fate under the First Past the Post system.

[iii] To my mind one of the strengths of STV is it rewards candidates with cross-party appeal.

[iv] How many votes is that? Well that number is called a quota, and where you set the quota is not so much a question of maths as it is philosophy.

There are two main quotas: Droop and Hare.

The Hare quota is very simple: number of (viable)* votes / number of vacancies. The Hare quota is based on the idea of a mandate. Under the Hare quota if there are four vacancies then the role of each representative is to represent one quarter of the electorate and so they need one quarter of the vote in order to be elected. Under the Hare quota every voter who expresses enough preferences is able to say that one particular representative is “their” representative: that they helped elect them.

The Droop quota is (number of votes / (number of vacancies +1)) +1. The Droop quota is based around the idea that elections are competitions. Once you have reached the Droop quota you are unbeatable: if there are four vacancies then the Droop quota is one vote more than 20%; and if you have reached one vote more than 20% then you are without question one of the top four candidates, because it is mathematically impossible for there to be four candidates with more votes than you. And, so the logic goes, if you are in the top four you should win one of the four seats. Under the Droop quota you can’t quite say that every voter who expresses enough preferences contributed towards the election of “their” representative, but they can at least say they were part of the decision, and that their vote at the very least helped select between the final two candidates for the final vacancy.
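Here are the two main quotas as a quick Python sketch. The function names are mine, and rounding the Droop division down before adding one is an assumption on my part (it is the usual convention, but the formula above doesn't spell it out).

```python
def hare_quota(votes, vacancies):
    # One "share" of the electorate per representative.
    return votes / vacancies


def droop_quota(votes, vacancies):
    # Smallest whole number of votes that only `vacancies` candidates can all reach.
    return votes // (vacancies + 1) + 1


print(hare_quota(1000, 4))   # 250.0
print(droop_quota(1000, 4))  # 201
```

With 1,000 votes and four vacancies, Hare asks for a full quarter of the vote while Droop asks for one vote more than a fifth.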

While these differences are philosophical, they do have knock on practical effects. As the Hare quota is bigger, it is harder to reach; this means that usually more lower preferences will need to be counted before a candidate is elected, and that tends to help smaller parties. As Droop is smaller, the opposite is true, and that doesn’t just help larger parties, but also ensures that the counting process is concluded more rapidly. For this reason, rather than any particular reason of philosophy, virtually all hand-counted STV systems in use today use the Droop quota.

There are a couple of other rarely used quotas. The Hagenbach-Bischoff quota is number of votes / (number of vacancies +1). It is always one vote smaller than the Droop quota, so for a four vacancy election it is a flat 20%. It came about because under the Droop quota it is hypothetically possible for a political party to get slightly more than 50% of the votes, but slightly less than 50% of the seats. If you are having an election where the whole parliament is elected using one big constituency, or if this pattern were exactly repeated across multiple constituencies, that could lead to a party being denied a majority in circumstances where majority-obsessed Anglo Saxon psephologists (vote studiers) feel that a majority should rightfully be theirs. And so they invented the Hagenbach-Bischoff quota which will usually prevent this from happening.

However, as the Hagenbach-Bischoff quota is smaller than the Droop quota it is potentially possible for this quota to result in more candidates winning than there are actually vacancies. As discussed, for a four vacancy election the Hagenbach-Bischoff quota is 20%. But if five candidates get 20% of the vote each then all five would “win” under these rules. Therefore, special rules need to be put into place for what to do in these circumstances. Defenders of the Hagenbach-Bischoff quota would say that this only happens in circumstances which are mathematically equivalent to a dead heat, and that you’d need special rules for breaking a dead heat anyway.

Although incredibly rare, it is possible to create an election in which even the Hagenbach-Bischoff quota doesn’t ensure that a party with 50% of the votes gets 50% of the seats. The only way to totally guarantee that is with the Imperiali quota, which is number of votes / (number of vacancies +2). So for a four vacancy election the Imperiali quota is 16.666%. However, as the Imperiali quota is a lot smaller than the Droop quota, it will much more frequently result in too many people getting elected, and so you’d need to use your special rules much more often. Ecuador has a system (not STV) which as part of its rules effectively says “use the Imperiali quota, if that doesn’t work do it again using Droop.”
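The "too many winners" problem is easy to check with a few lines of Python. This is only an illustrative sketch: the function names are invented, and I use exact fractions so that the 16.666% case doesn't get muddied by rounding.

```python
from fractions import Fraction


def hagenbach_bischoff_quota(votes, vacancies):
    return Fraction(votes, vacancies + 1)


def imperiali_quota(votes, vacancies):
    return Fraction(votes, vacancies + 2)


def max_candidates_reaching(quota, votes):
    # The most candidates who could all hold at least `quota` votes at once.
    return votes // quota


votes, vacancies = 1000, 4
print(max_candidates_reaching(hagenbach_bischoff_quota(votes, vacancies), votes))  # 5
print(max_candidates_reaching(imperiali_quota(votes, vacancies), votes))           # 6
print(max_candidates_reaching(201, votes))  # 4, using the Droop quota of 201
```

So for four vacancies, Hagenbach-Bischoff can in principle "elect" five people and Imperiali six, whereas Droop can never elect more than four.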

*viable as in valid and not “exhausted”, in other words the votes still contain preferences for candidates still in the race. For this reason quotas do slightly decrease in size during the course of the counting process. This can mean that later on in the process some extra votes will need to be redistributed according to the rules we go on to discuss.

[v] How do you decide which votes to redistribute? The simplest method is to pick the excess ballots out of the pile randomly. So, for example, if a candidate had received 100 first preference votes but only needed 90 to win, you would pull 10 votes out of her pile at random and redistribute them. This method is known as Hare, Hare-Clark, or Cincinnati, depending on the method chosen to randomly select the ballots.
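As a rough sketch of what random selection looks like, here is some Python. It doesn't follow the specific Hare, Hare-Clark or Cincinnati procedures; it simply draws the surplus uniformly at random, and the function name, candidate names and the 60/40 split of second preferences are all invented for the example.

```python
import random


def random_surplus(pile, quota, rng):
    """Split `pile` into (kept, transferred) by drawing the surplus at random."""
    surplus = max(len(pile) - quota, 0)
    chosen = set(rng.sample(range(len(pile)), surplus))
    kept = [b for i, b in enumerate(pile) if i not in chosen]
    transferred = [b for i, b in enumerate(pile) if i in chosen]
    return kept, transferred


# 100 first preferences for Ann: 60 list Bob second, 40 list Cat second.
pile = [["Ann", "Bob"]] * 60 + [["Ann", "Cat"]] * 40

for seed in (1, 2, 3):
    kept, transferred = random_surplus(pile, quota=90, rng=random.Random(seed))
    to_bob = sum(1 for ballot in transferred if ballot[1] == "Bob")
    print(f"seed {seed}: {to_bob} to Bob, {len(transferred) - to_bob} to Cat")
```

Ten ballots always move, but how they split between Bob and Cat depends on the draw, which is exactly the repeatability problem.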

For a big enough election random redistribution can be surprisingly fair and proportionate, but in a close election it isn’t sufficiently accurate, and obviously people have an aversion to there being any random element to an election. Furthermore, this randomness means that if you repeat a count you might not get the exact same result, and repeatability is an important quality for helping to rectify mistakes and detect fraud. For this reason psephologists have invented a fairer, but more complicated, system called Gregory’s method.

Under Gregory’s a vote can be split up into fractions of a vote. Then a small proportion of each vote can be reallocated. So, for example, if a candidate had received 100 first preference votes but only needed 90 to win, then each of her votes is split into tenths. Then 9/10ths of each vote is kept with her — adding up to 90 overall — and 1/10th of each vote is reallocated.
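The arithmetic is simple enough to sketch in Python. This is a bare-bones illustration under my own assumptions: the function name is made up, ballots are lists of names, and every second preference is assumed to still be in the race.

```python
from collections import Counter
from fractions import Fraction


def gregory_transfer(pile, quota):
    """Return the transfer value and the fractional votes each next preference receives."""
    total = len(pile)
    transfer_value = Fraction(total - quota, total)   # surplus shared across every ballot
    received = Counter()
    for ballot in pile:
        if len(ballot) > 1:
            received[ballot[1]] += transfer_value
    return transfer_value, received


# 100 first preferences for Ann: 60 list Bob second, 40 list Cat second.
pile = [["Ann", "Bob"]] * 60 + [["Ann", "Cat"]] * 40
value, received = gregory_transfer(pile, quota=90)
print(value)             # 1/10
print(received["Bob"])   # 6
print(received["Cat"])   # 4
```

Unlike a random draw, the ten surplus votes always split six to Bob and four to Cat, in exact proportion to the second preferences in the pile.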

You can do a Gregory count by hand, but the larger and more complicated the election gets, the harder the maths gets, and the more tempting it is to use a computer. Most systems for hand counting votes under STV (for example Scottish, or ERS97) use something based on Gregory’s but with some simplifications (usually rounding) and sometimes using some random selection to make short cuts, to strike a balance between simplicity and accuracy.

Then there’s the Wright System which I think is brilliant because it takes its inspiration from musical chairs. Not only does it eliminate randomness but it even discourages (although doesn’t totally stop) “Woodall freeriding” which is something that I talk about later on. Wright counts can also be done by hand, and the maths is easy. The only downside is it takes bloody ages. Under the Wright system every time a candidate is elected you announce that they have been elected, set aside their quota’s worth of votes, and then completely restart the count again, from the very beginning, only this time there is one less vacancy, one fewer candidate and fewer votes to consider.

[vi] What you tend to find is that because many people will cast all their higher preferences for candidates of the same political party, these surpluses tend to initially mostly move votes around between the candidates within a party. This is important as it ensures that the overall number of seats a party gets tends to be in line with their overall level of support. If there are ten vacancies in a seat and Labour candidates get around 30% of the vote, then generally speaking by the end Labour will win three of the ten seats. While individual Labour candidates might have too many votes or too few to begin with, the votes tend to get shuffled around between the Labour candidates until three are chosen.

Exactly how proportionate seat allocations are (in other words the extent to which the number of seats a party receives is in line with the number of first preferences that party’s candidates got) is largely, as I mentioned, down to how big the seats are (although also down to how blindly loyal your supporters are). If you made Britain into one big constituency, then a result under STV would be pretty much just as proportional as a result under any other form of “pure” Proportional Representation, such as d’Hondt. If you go for lots of smaller constituencies then there is a bit of a trade off in proportionality. But, again, Ireland shows you can have really quite proportionate overall outcomes with relatively small (3–5 vacancy) seat sizes. If one party gets slightly lucky in one seat (say winning 2 seats out of 5 with only 30% of the vote) then more-often-than-not the exact opposite will have happened two seats over, and it tends to even out.

[vii] Some particularly bright mathematicians pointed out that this is actually slightly unfair. To explain why we need to use an example. Supposing you have an election where you have four candidates: A, B, C and D. You need 100 votes to win, and 100 people (call them the Beverlys) vote for candidate B first preference and candidate D second preference. Candidate B is elected and those 100 votes all stay put in the pile for candidate B. Now suppose a bunch of people, call them the Lawrences, vote for Candidate A first preference, Candidate B second, and Candidate C third. It’s not enough for Candidate A to win and at some point Candidate A is eliminated. The Lawrences’ votes would then go to Candidate B, except Candidate B has already been elected, so their votes skip over to Candidate C.

But wait a minute, what about the Beverlys? Both groups of people wanted their vote to be used towards Candidate B, and after that, the Beverlys felt just as strongly about Candidate D as the Lawrences did about Candidate C. Yet just because they came late to the party, the Lawrences get to have their say between C and D whereas the Beverlys’ votes are all locked away under B, their punishment for getting there too early. Further down the line, C could beat D for the final position even if there were more support for D overall, because that support was locked up in the same way the Beverlys’ votes were.

It’s not just that this effect is unfair, it’s also that, by not being fair, it can create perverse incentives to vote strangely. You might, for example, choose to put popular candidates lower down your ballot than your true preference, in the hope that by the time your vote gets to them it would no longer be needed, and so your opinions can continue to have influence beyond that point. This is called “Woodall free riding” and if too many people do it, it can really mess up an election, not to mention destroy the whole point of STV, which is that you honestly rank your candidates in order of preference, and are not disincentivised from doing so.

The only way to avoid this is to ensure that when you reallocate preferences you don’t skip over candidates who have already been elected. That will mean that these candidates will then once again have too many votes, and so further votes will then have to be reallocated from their pile.

You can see how this would quickly become very complicated. Supposing you have two candidates: X and Y. Both candidates are popular and have enough votes to be elected. Both candidates are also from the same party, and a lot of the people who gave their first preference to X gave their second preference to Y and vice versa. Now suppose X is allocated some more votes. They now have too many, so some of their votes are reallocated. Most of their second preferences are for Y so these reallocated votes mostly go to Y. Now Y has too many votes and so some of those are reallocated. But most of their second preferences are for X so these votes mostly go back to X. And now X has too many and so some of X’s need to get reallocated, mostly back to Y. And so on. And so on. Forever.

Although this might seem like an insurmountable problem, for a mathematician it isn’t. A mathematician looks at that and thinks “yes well that’s just summing an infinite series of convergent fractions” and this is something that we’ve been able to do since the early 19th century. But you can see how for any but the most trivial of elections this becomes something that it is impractical to do if you are counting the votes by hand.
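To see why the ping-pong between X and Y settles down rather than running away, here is a toy Python sketch. It is emphatically not Meek’s or Warren’s method: it assumes X and Y both sit exactly on the quota, that a fixed share of each surplus bounces to the other candidate every time, and the function name and the 80% shares are invented for the example.

```python
def ping_pong_total(extra, x_to_y, y_to_x, rounds=50):
    """Total votes X ends up passing on when surpluses bounce between X and Y."""
    total_from_x = 0.0
    surplus_at_x = extra
    for _ in range(rounds):
        total_from_x += surplus_at_x
        surplus_at_y = surplus_at_x * x_to_y    # share of X's surplus landing on Y
        surplus_at_x = surplus_at_y * y_to_x    # share of Y's surplus bouncing back
    return total_from_x


extra, x_to_y, y_to_x = 10, 0.8, 0.8
looped = ping_pong_total(extra, x_to_y, y_to_x)
closed_form = extra / (1 - x_to_y * y_to_x)     # sum of the geometric series
print(abs(looped - closed_form) < 1e-6)         # True
```

Each bounce is smaller than the last (here by a factor of 0.64), so the total has a finite limit that can be written down directly instead of looping forever.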

But it is not impractical for a computer, and indeed there are two methods which, using a computer, are able to deal with the heavy maths involved in these constant micro-readjustments: Meek’s method and Warren’s method.

The two methods are virtually identical, and will produce the same result in almost every case, but if you want to learn the differences, and hear a mathematician argue the merits of each, this paper does exactly that. The maths is pretty complicated, but as best I can work out the gist is that Warren’s starts with the principle that the most important thing is that nobody’s preference should ever — even in the most unusual of circumstances and even by the most minute of amounts — have the effect of damaging a higher preference, and so it is very conservative when it comes to deciding when to reallocate preferences. Meek’s on the other hand starts with the principle that for each reallocation all votes should be treated equally, regardless of the path they took to get to that point, and so it is a little more gung-ho about moving votes about.
