Q#100: Allocating votes from pooled elections

Student Elections

Note: I believe it should be free and readily available for everyone to gain value from data, hence I pledge to keep this series free regardless of how large it grows.

Suppose you’re running a school election, and are looking to tally up the results. The election is structured as follows:

  • Groups of students are split into voting pools, each with unique IDs. The voting pools are assigned a number of votes to allocate based on how many students are in the given pool
  • A voting pool can vote multiple times, but if they do their votes will be split evenly amongst the candidates

The voting information is stored in two tables (shown below): ‘election_data’ and ‘voter_pools’:

  • election_data contains each voting pool’s votes for the given candidates. You’ll notice many pools voted for multiple candidates
  • voter_pools contains the number of votes allocated to each unique pool

Given this information, write a SQL query to return the winner of the election, along with their associated votes.

Table: election_data

Table: voter_pools

TRY IT YOURSELF

ANSWER

Running a school election involves processing voting data efficiently to determine the winner. And as Data Scientist we are called to pull data and summarize it using SQL.

SQL Query:

WITH CandidateVotes AS (
SELECT
ed.candidate,
vp.voting_pool_id,
vp.num_votes / COUNT(DISTINCT ed.candidate) AS votes_per_candidate
FROM
election_data ed
JOIN voter_pools vp ON ed.voting_pool_id = vp.voting_pool_id
GROUP BY
ed.candidate, vp.voting_pool_id, vp.num_votes
)

SELECT
cv.voting_pool_id,
cv.candidate,
SUM(cv.votes_per_candidate) AS total_votes
FROM
CandidateVotes cv
GROUP BY
cv.voting_pool_id, cv.candidate
ORDER BY
cv.voting_pool_id, total_votes DESC
LIMIT 1;

Explanation:

  1. We create a Common Table Expression (CTE) named 'CandidateVotes' to calculate the number of votes per candidate per voting pool. The votes are evenly distributed among the candidates in each pool.
  2. The main query selects the voting pool, candidate, and the total votes for each candidate by summing up the votes_per_candidate within each pool.
  3. The results are grouped by the voting pool and candidate, and the rows are ordered by the voting pool ID and total votes in descending order.
  4. Finally, we use the 'LIMIT 1' clause to retrieve only the top row, which represents the winner with the highest total votes.

Plug: Checkout all my digital products on Gumroad here. Please purchase ONLY if you have the means to do so. Use code: MEDSUB to get a 10% discount!

Tips and Donations

--

--