Oscars 2016: Movies that got the most attention on Wikipedia

Some observations:

  • Winning an Oscar may show up as a brief peak of interest on Wikipedia, but it will probably fade quickly.
  • Unless you win the Oscar for Best Picture: that one clearly gets the most attention. So do the movies that win lots of awards.
  • Top current movies (Deadpool) get fewer pageviews during the Oscar telecast, but still enough to rank among the top 15 movies during that period.
  • Want to see similar results for the 2015 telecast? Scroll to the bottom of this post.

The most interesting part of this visualization is BigQuery’s ability to find all the movies on Wikipedia, and extract pageviews for them.

I’m currently playing with Wikidata (thanks Denny!), and a quick query with BigQuery can give me every human, book, river, etc. in Wikipedia (as encoded by Wikidata):

Let’s look at all the movies in Wikidata:

Cool! I know these movies — but this doesn’t tell me which ones were more popular during the Oscars. A quick join with Wikipedia’s pagecounts (more than 5 billion rows, easy for BigQuery) gives me the answer:
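
To make the join concrete, here is a toy in-memory version of the same filter-and-rank logic in Python: keep only the English pagecount rows whose title is a known movie, sum the requests per movie, and take the top rows. It is a minimal sketch with made-up rows, not real pagecounts, and the names `movies`, `pagecounts` and `top_movies` are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for the Wikidata movie list: page title -> label.
movies = {
    "Movie_A": "Movie A",
    "Movie_B": "Movie B",
    "Movie_C": "Movie C",
}

# Made-up (title, language, requests) rows standing in for pagecounts.
pagecounts = [
    ("Movie_A", "en", 120),
    ("Movie_A", "en", 80),
    ("Movie_B", "en", 150),
    ("Movie_C", "en", 60),
    ("Movie_C", "de", 40),      # dropped: not English
    ("Some_Actor", "en", 900),  # dropped: not in the movie list
]

def top_movies(rows, movies, n=15):
    """Sum English requests per movie and return the n largest totals."""
    totals = Counter()
    for title, language, requests in rows:
        if language == "en" and title in movies:
            totals[movies[title]] += requests
    return totals.most_common(n)

print(top_movies(pagecounts, movies, n=2))
```

BigQuery does the same thing at a very different scale: the dictionary lookup becomes a JOIN and the counter becomes GROUP BY with ORDER BY and LIMIT.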

Now I have the 15 movies that got the most pageviews during the Oscars ceremony. Last step before charting: getting the hourly pageviews for these 15 movies during the hours before and after the telecast:

SELECT * FROM (
  -- Hourly requests per movie, plus each movie's overall total for ordering
  SELECT a.title title, STRING(datehour) hour, FIRST(label) label, SUM(a.requests) reqs, SUM(reqs) OVER(PARTITION BY label) total
  FROM [fh-bigquery:wikipedia.pagecounts_201602] a
  JOIN (
    -- The 15 movies with the most English pageviews on the 29th, hours 0 to 6
    SELECT a.title title, FIRST(b.label) label, SUM(requests) requests
    FROM [fh-bigquery:wikipedia.pagecounts_201602] a
    JOIN (
      -- Every Wikidata item that is an instance of film (Q11424)
      SELECT en_wiki title, en_label label
      FROM [fh-bigquery:public_dump.wikidata_v3]
      WHERE instance_of.numeric_id=11424
    ) b
    ON a.title=b.title
    WHERE language='en'
    AND DAY(datehour)=29
    AND HOUR(datehour) BETWEEN 0 AND 6
    GROUP BY 1
    ORDER BY 3 DESC
    LIMIT 15
  ) b
  ON a.title=b.title
  WHERE DAY(datehour)>=28
  GROUP BY 1, 2
)
ORDER BY total DESC, hour
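
The outer part of the query does two aggregations at once: requests per movie per hour (the GROUP BY), and a per-movie total attached to every row (the SUM ... OVER(PARTITION BY label)), which is then used for ordering. A small Python sketch of that shape, using made-up rows and a hypothetical helper name `hourly_with_totals`:

```python
from collections import defaultdict

# Made-up (label, hour, requests) rows; several rows can share a label and hour.
rows = [
    ("Movie A", "2016-02-29 03:00:00", 50),
    ("Movie A", "2016-02-29 03:00:00", 30),
    ("Movie A", "2016-02-29 04:00:00", 70),
    ("Movie B", "2016-02-29 03:00:00", 200),
]

def hourly_with_totals(rows):
    """Return (label, hour, reqs, total) rows sorted by total desc, then hour."""
    per_hour = defaultdict(int)   # plays the role of SUM(requests) with GROUP BY
    per_label = defaultdict(int)  # plays the role of SUM(reqs) OVER(PARTITION BY label)
    for label, hour, requests in rows:
        per_hour[(label, hour)] += requests
        per_label[label] += requests
    result = [
        (label, hour, reqs, per_label[label])
        for (label, hour), reqs in per_hour.items()
    ]
    # Same ordering as ORDER BY total DESC, hour
    return sorted(result, key=lambda r: (-r[3], r[1]))

for row in hourly_with_totals(rows):
    print(row)
```

Carrying the total on every row keeps each movie's hourly series together and puts the most viewed movie first, which is a convenient layout for charting.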

The most fun part? I can tweak the query to jump in time to last year’s telecast, or to the Golden Globes day, or to see the most popular song during the Grammys, etc. Have fun!
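
One way to make that tweaking less error-prone is to treat the top-15 subquery as a template and pass in the table, the day, the hour range and the Wikidata class. The Python sketch below only builds the query text; `top_pages_query` is a hypothetical helper, and any table name or Wikidata ID other than the ones used in this post is an assumption you would need to check against the dataset.

```python
def top_pages_query(table, day, first_hour, last_hour,
                    instance_of_id=11424, limit=15):
    """Build the legacy SQL top-N query from this post for a given time window."""
    return f"""
SELECT a.title title, FIRST(b.label) label, SUM(requests) requests
FROM [{table}] a
JOIN (
  SELECT en_wiki title, en_label label
  FROM [fh-bigquery:public_dump.wikidata_v3]
  WHERE instance_of.numeric_id={int(instance_of_id)}
) b
ON a.title=b.title
WHERE language='en'
  AND DAY(datehour)={int(day)}
  AND HOUR(datehour) BETWEEN {int(first_hour)} AND {int(last_hour)}
GROUP BY 1
ORDER BY 3 DESC
LIMIT {int(limit)}
""".strip()

# The window used in this post: day 29 of the 201602 table, hours 0 to 6.
query = top_pages_query(
    "fh-bigquery:wikipedia.pagecounts_201602",
    day=29, first_hour=0, last_hour=6,
)
print(query)
```

The numeric parameters are passed through `int()` so that only numbers end up in the query text; the table name is inserted as given, so it should come from a trusted source.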

If this is your first time with BigQuery, get started quickly with these instructions, and for even more, subscribe to reddit.com/r/bigquery.

@felipehoffa

Bonus: find the most popular cat: