The last time the 112th Congress stood together on anything. | By Clerk of the U.S. House of Representatives [Public domain], via Wikimedia Commons

Congressional antics have kept us pretty entertained for the last few years. The 112th and the 114th Congresses have both held the dubious honour of being labelled “the worst of all time,” mostly as a result of their refusal to actually pass any bills.

What determines whether or not a bill will pass Congress is a question several academics have posed over the years. …

I wrote about some of my favourite datasets here. Then I found some more.

  • Here’s a dataset of Nixon’s White House tapes. I’m not saying I’ve already marked off a day in my calendar to go through this, but no one try to contact me on February 12th.
  • You can find a bunch of UK-centric policy datasets here. One that looks particularly good is every single question asked at Prime Minister’s Questions between 1997 and 2008. If you’re unfamiliar with the magic that is PMQs, I recommend you watch this.
  • You can find a tonne of stuff on US legislation (and monitor everything happening in your state) here.
  • Hatebase is the largest online repository of multilingual and usage-based hate speech. CrowdFlower also has a dataset specifically focused on Twitter hate speech.

I’m not going to go into the importance of technology in civic participation. It’s important and it’s useful.

I’m assuming anyone who was skeptical before has clicked those links and is now fully on board and can’t wait to work on a bunch of projects that are becoming more and more essential (by the minute, even) in ensuring an open and transparent government.

Here are a few good places to start:

Here’s a list of cheatsheets that have become absolutely indispensable over the past few months:

Regular expressions in Python: I’m only just starting to get what regular expressions do, and this cheat sheet has been incredibly useful along the way.

Pandas cheatsheet: I think I may love Pandas slightly more than I do some people in my life. This cheatsheet covers a lot of the more frequently used commands and functions in the Pandas library.

SQL cheatsheet: I don’t use SQL too often so I tend to forget commands pretty quickly. …

Over the past few weeks, I’ve been trying to hoard as many cool datasets as humanly possible (if knowledge is power, filling my hard drive with as many CSV files as possible is going to make me supreme leader of something).

Here’s a shortlist of my favourites:

(Quick caveat: a lot of these are from one of my absolute favourite newsletters, Jeremy Singer-Vine’s Data is Plural.)

Let me start this off with the obvious: data is one of the most important things you can have.

It’s pretty much essential to making well-informed decisions, it helps give you a better picture of what outcomes your decisions could lead you to, and it’s pretty great to have facts to back you up in an argument you really want to win.

I recently learnt to use (and love) the .describe() method in pandas to get a snapshot of datasets I’m analysing instead of going cross-eyed looking at every row in a CSV file. But while it’s definitely saved me…
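For anyone who hasn’t tried it yet, here’s a minimal sketch of what .describe() gives you. The column names and numbers are made up purely for illustration (they’re not from any real dataset), and the CSV is read from a string so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical data standing in for a real CSV file on disk
csv_text = """state,bills_introduced,bills_passed
A,120,14
B,85,9
C,240,31
D,60,4
"""

df = pd.read_csv(io.StringIO(csv_text))

# By default, describe() summarises only the numeric columns:
# count, mean, std, min, quartiles and max for each one
summary = df.describe()
print(summary)

# The summary is itself a DataFrame, so a single statistic can be
# looked up by row label and column name
mean_passed = summary.loc["mean", "bills_passed"]
```

With a real file, you’d swap the io.StringIO(...) part for the path to your CSV.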

Ritika Bhasker

Elections, tech, comics, football. Not necessarily in that order.
