Efficient Majority Vote Algorithm

I promised cool and fast O(n) algorithms, so here we go. This algorithm, the MJRTY majority vote algorithm, is what first made me want to make this blog about cool O(n) algorithms. So, let's jump into how cool it is..

Let’s set the stage

In this problem, you are given a list of votes from an election, and the goal is to determine if there is a majority, and if there is, what it is.

Sounds simple enough. For simplicity, let's say that the votes are all positive integers that correspond to some candidate in the election. For now, let’s also assume that there is some majority element in the list. Now, the solution just needs to determine the integer that has the majority in the list of numbers.

To be clear, later we will see that this assumption and simplification makes the solution easier to discuss but can easily be “un-assumed” in an actual implementation.

The naive approach

How about the simple search for a maximum element? One approach is simply iterating over every vote and count the number of times it appears elsewhere in the list, keeping track of the maximum as we go.

# votes = list of integer votesmajority = None
majority_count = 0
for vote in votes:
vote_count = 0
for other_vote in votes:
if vote == other_vote:
vote_count += 1
if vote_count > majority_count:
majority = vote
majority_count = vote_count
return majority_vote

Now, I should pause to say that this algorithm most likely has a worse time complexity than the approach that anyone would come up with. That said, it is a simple application of a typical code pattern to solve the problem. At this point, we should probably move on, considering it has O(n^2) time complexity and I’m here to talk about O(n) algorithms.

Hash tables to the rescue

This is a very appropriate sub-heading considering how we are going to take the O(n^2) approach and turn it into an O(n) algorithm. After typing that line, I realized just how frequently I will probably type that line in an O(n) algorithm blog. Hash tables are amazing tools and appear in more O(n) algorithms than I can think of right now.. they probably deserve their own post. Hmm, next time?

Anyways, this is the approach that probably came to mind when the initial problem was stated. A new approach involves maintaining a hash table mapping vote options as keys to a counter that stores the number of times we have seen each option as values. We then just iterate through each vote in the votes list updating our hash table. When we’re done, we find the max element from the hash table.

# votes = list of integer votesvotes_table = {} # New hash tablefor vote in votes:
if vote in votes_table: # Check if key in hash table
votes_table[vote] += 1 # Increment counter
votes_table[vote] = 1 # Create counter for vote
# Find max counter in hash table
return max(votes_table, key=votes_table.get)

Note that we used some useful Python to clean up the second half of our implementation, but we could have used the Python Counter object to clean up the first half too.

This is really great! We were able to take a problem from O(n^2) and show that it can be done in O(n) time. While we were able to reduce the time complexity, we increased the space complexity to O(n) in the worst case. For reference, in our previous implementation we only needed a constant number of variables, but now we store a hash table that in the worst case has as many element as there were votes.

Still, in many cases, we want to optimize for time given that f(x) = x^2 grows a lot faster than f(x) = x. So we are probably willing to use a little more space to solve the problem. But still, can we do better?

The MJRTY majority vote algorithm

This algorithm was proposed by Robert Boyer and J Moore in 1991 and it sometimes called the Boyer-Moore majority vote algorithm. The original paper and acompanying website are linked here: https://www.cs.utexas.edu/~moore/best-ideas/mjrty/.

Screenshot of interactive example of MJRTY on Moore’s website

For a blast from the (very recent) past, checkout the Fortran implementation of the algorithm included in the paper.

If you were clever and manipulated part of the url for the MJRTY algorithm website you may have seen some other cool O(n) algorithms. In time I hope to write about all of these.

Enough delay, let’s get into the algorithm. First, we will need a few variables: a current majority vote and a current majority vote counter. We initialize these to None or null and 0 respectively. To find the majority vote, we iterate through each vote in the votes list. For every vote, we compare it to a current majority. If the counter is 0, then we replace the current majority with the current vote. Then, if the current vote matches the current majority, we increment the counter, otherwise we decrement the counter.

That wasn’t super clear in paragraph form, so here’s some pseudo code:

# votes = list of integer votesmajority = None
majority_count = 0
for vote in votes:
if majority_count == 0:
majority = vote
if vote == majority:
majority_count += 1
majority_count -= 1
return majority

Some may be skeptical that this algorithm actually solves the originally posed problem. The full proof of correctness is well laid out in the original paper linked above, but hopefully I can provide some intuition. I like to visualize this as an m point tug-of-war competition, where there are m different vote options. Then, as each option sees another vote for itself, we move the rope closer to that option. And, when we see a vote for something other than the current majority, we move the rope one step away from the option and closer to the neutral center. Then we see that the order of the votes will not change the final position of the ropes, just the path they take to get there. If my intuition made no sense, as I suspect, then please check the proof in the paper linked above.

Now for the best part, the runtime analysis. Clearly this runs in O(n) time because we only iterate over the list of votes once. But even better this algorithm only uses O(1) space because it only uses a constant number of variables. I hope you are as wowed as I am by this cool algorithm.. Linear time and constant space!

You may have noticed that this algorithm may sometimes claim a majority where there is “close” to a majority. This was something Boyer and Moore acknowledged, and fixed by proposing a second pass over the votes to count the occurrences of the proposed majority and ensure it actually is a majority. This still makes the algorithm run in O(n) time.

Final Thoughts

I hope you think the Boyer-Moore majority vote algorithm is pretty cool considering it was what made me want to make this blog. It is pretty interesting the first time you see it to think that we can reduce the problem to a probable lower bound. Considering how relevant the will be in the future, maybe next time we will talk hash tables, or maybe other O(n) algorithms. Either way, catch you next time.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store