Graphing Political Cooperation

Does the seemingly impassable political divide impact Senate cooperation?

Published in Web Mining [IS688, Spring 2021]

10 min read · Mar 19, 2021

Political discussion seems to rear its ugly head wherever you look. Whether it’s Facebook, Reddit, or your dinner table, someone has something to say, and they often have an opinion on your opinion. Actual political discussion aside, politics is an interesting topic to study from a data perspective. I don’t mean just the subject matter itself, but the numbers and data around voting, demographics, and forecasting.

Election Data Guy — Steve Kornacki of MSNBC

I’m sure most of us have watched an election in the past, and if you have, then you probably noticed all the sophisticated visuals and predictions shown on television. Oftentimes, the networks know who’s going to win a race or which bill is going to pass before voting is complete.

Since there is no election approaching, I figured it would be interesting to check in on our current government, specifically our new 50/50 Senate. One of the discussions stemming from the last election was about repairing the divisions created in recent years. So I figured we could take a look at which Senators are cooperating with one another, and what the Senate landscape looks like in terms of agreement.

My first thought was to check how each Senator votes compared to every other Senator (do they vote for the same bills, or do they just vote along party lines?). Unfortunately, rate limiting on the API I chose prevented this. So I turned to bill sponsors and their cosponsors. Can we see cooperation in bill sponsorship? Which Senators tend to be more “centrist” in this regard? This was something I felt I could accomplish within the constraints.

For this particular project, I want to show the result as a network graph. This will hopefully show just how partisan the Senate is when it comes to bringing forth like-minded bills.

Technology Used

I decided to tackle this exclusively with Python. While it might have been a bit easier to use Tableau or another GUI-based application, I wanted the challenge and the chance to sharpen my skills. So, as with my last post, I turned to Google Colab (sort of like Jupyter). I also needed a few Python libraries to assist, so I went with NetworkX and Matplotlib.

For my source of data, I went with the ProPublica Congress API. All I had to do was sign up and they sent me an API key for use. Since we are focusing on the Senate, I populated our URL variable with the base URL I’ll need to gather the names of Senators.

# Read the API key from a text file on Google Drive
with open("/content/drive/MyDrive/ProPublicaAPIKey.txt", "r") as fileObject:
    API_KEY = fileObject.read().strip()  # drop any trailing newline
URL = 'https://api.propublica.org/congress/v1/117/senate'

Note in the URL above, we use “117” for the 117th congress (our current congress).

Gathering the Data

The first step was to gather the data. To kick it off I needed a list of our current senators.

import requests
import pandas as pd

# Request the member list and load it into a DataFrame
response = requests.get(URL + "/members.json", headers={"X-API-Key": API_KEY})
r = response.json()
df = pd.DataFrame(r['results'][0]['members'])

This returned a considerable amount of Senator data. After a few more steps of trimming down the list I ended up with a DataFrame that I could begin to work with. This step actually took some trial and error as I wasn’t aware of exactly what data I would need in the end.

What I figured out as I progressed was that the IDs are good for pulling data from the API, but for the visual I want the full name of each Senator. Additionally, with the network graph alone, there was no easy way to tell which grouping belonged to which party.

First things first: I started by saving all the IDs to a list, so that I could use each ID in the endpoint for gathering the cosponsored bills.

Beyond that, I wanted to associate each Senator with their party for later visualization. So I created a DataFrame, concatenated the first, middle, and last name fields into a full name, trimmed off the ID and the individual name fields, and saved this information to a dictionary of the form {‘Full_Name’: [‘Party’]}.

id_list = df['id'].tolist()  # save the IDs to a list for iteration
df_party = df.fillna('')  # get rid of None types
name_cols = ['first_name', 'middle_name', 'last_name']
# Create a properly formatted full name, collapsing repeated spaces
df_party['Full_Name'] = df_party[name_cols].agg(' '.join, axis=1).str.replace(r'\s+', ' ', regex=True)

Stack Overflow source for code help.

Once this was complete, I was able to get a dictionary of all senators and their party affiliations.

df_party = df_party.set_index('Full_Name').T
party_dict = df_party.to_dict("list")

Senate members and their party
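The same name-and-party lookup can also be sketched in plain Python, which makes the handling of missing middle names easier to see. This is only an illustration under assumptions: the `members` records are made-up placeholders shaped like the `first_name`, `middle_name`, and `last_name` fields used above, the `party` field name is assumed, and `full_name` and `build_party_dict` are hypothetical helpers rather than code from the original notebook.

```python
def full_name(member):
    """Join the name parts, skipping any that are missing or empty."""
    parts = (member.get(key) for key in ("first_name", "middle_name", "last_name"))
    return " ".join(part.strip() for part in parts if part and part.strip())


def build_party_dict(members):
    """Map each senator's full name to a one-element list holding the party."""
    return {full_name(m): [m.get("party", "")] for m in members}


# Placeholder records for illustration only (not real API output).
members = [
    {"first_name": "Ada", "middle_name": None, "last_name": "Example", "party": "D"},
    {"first_name": "Ben", "middle_name": "Q.", "last_name": "Sample", "party": "R"},
]
party_dict = build_party_dict(members)
print(party_dict)
```

Skipping empty parts before joining avoids the double spaces that the regular expression in the pandas version cleans up afterwards.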

At this point, I have to iterate through my list of IDs in order to pull all of the cosponsored bills. To do this, I create an empty dictionary and loop over the previously saved ‘id_list’. The for loop takes each ID and places it within the endpoint URL. From there, we have to work with the JSON structure we are given: I take the response, then the first (and only) element in the ‘results’ list, then the list of bills. This list of bills is actually a list of dictionaries, one for each bill cosponsored. That provides a lot of information, but I only need the bill IDs for this exercise. So I iterate with a second for loop through each bill and pull out its ID, appending it to a list that is instantiated within the first for loop. At the end, I take the current Senator ID and look up the matching Full_Name, which becomes the key; the list of cosponsored bills becomes the value.

This is easier to read than the Medium code snippet.

Sample output of cosponsorDict put into a DataFrame
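The loop described above can be sketched roughly as follows. Several things here are assumptions rather than the author's code: the endpoint path `members/{id}/bills/cosponsored.json` and the `bill_id` field name are my reading of the ProPublica documentation and should be checked against it, the helper names are hypothetical, and the download uses the standard library's `urllib` in place of `requests` so that the sketch has no third-party dependencies. The demonstration at the bottom uses a stand-in fetch function with placeholder data, so nothing is actually downloaded.

```python
import json
import urllib.request

BASE = "https://api.propublica.org/congress/v1"


def fetch_cosponsored(member_id, api_key):
    """Download one member's cosponsored bills (assumed endpoint path)."""
    url = f"{BASE}/members/{member_id}/bills/cosponsored.json"
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_bill_ids(payload):
    """Pull the bill IDs out of results[0]['bills'] in the parsed JSON."""
    bills = payload["results"][0]["bills"]
    return [bill["bill_id"] for bill in bills]


def build_cosponsor_dict(id_to_name, fetch):
    """Map each senator's full name to the list of bill IDs they cosponsored."""
    cosponsor_dict = {}
    for member_id, name in id_to_name.items():
        cosponsor_dict[name] = extract_bill_ids(fetch(member_id))
    return cosponsor_dict


# Offline demonstration with a stand-in for the API call (placeholder data).
def fake_fetch(member_id):
    sample = {"X000001": ["s1-117", "s2-117"], "Y000002": ["s2-117"]}
    return {"results": [{"bills": [{"bill_id": b} for b in sample[member_id]]}]}


cosponsorDict = build_cosponsor_dict(
    {"X000001": "Ada Example", "Y000002": "Ben Q. Sample"}, fake_fetch
)
print(cosponsorDict)
```

Against the live API, the stand-in would be swapped for something like `lambda member_id: fetch_cosponsored(member_id, API_KEY)`. Passing the fetch function in as a parameter keeps the parsing logic testable without spending any of the daily request quota.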

Building the Graph

Here I take the keys of the dictionary I created above and add them as nodes to the graph.

import networkx as nx

g = nx.Graph()
g.add_nodes_from(cosponsorDict.keys())  # one node per senator
g.nodes()

For each item in the dictionary, I want to pull out the bills associated with each node and connect them to it. (Note: This is where I ran into some trouble. I’ll describe it at the end of this post.)

# Connect each senator to every bill they cosponsored
for key, value in cosponsorDict.items():
    g.add_edges_from([(key, t) for t in value])

Next I set up our color map. This is where the party_dict created above comes in handy. I added conditionals for whether the Senator is a Republican, Democrat, or Independent and associated each with a color. The cosponsored bills are gray. Additionally, since I’m more interested in the commonalities between Senators, I made their nodes larger than the bill nodes.
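A minimal sketch of such a color map is shown here. The specific colors, the node sizes, and the party codes `"R"` and `"D"` are assumptions (any other code is treated as Independent), and `style_nodes` is a hypothetical helper. The sample nodes are placeholders.

```python
PARTY_COLORS = {"R": "red", "D": "blue"}  # assumed party codes
INDEPENDENT_COLOR = "green"               # used for any other party code
BILL_COLOR = "gray"


def style_nodes(nodes, party_dict, senator_size=300, bill_size=20):
    """Return parallel lists of colors and sizes, one entry per node."""
    colors, sizes = [], []
    for node in nodes:
        if node in party_dict:  # senator node
            party = party_dict[node][0]
            colors.append(PARTY_COLORS.get(party, INDEPENDENT_COLOR))
            sizes.append(senator_size)
        else:  # anything not in party_dict is treated as a bill node
            colors.append(BILL_COLOR)
            sizes.append(bill_size)
    return colors, sizes


# Placeholder data for illustration only.
party_dict = {"Ada Example": ["D"], "Ben Q. Sample": ["R"], "Cy Third": ["ID"]}
nodes = ["Ada Example", "s1-117", "Ben Q. Sample", "Cy Third"]
color_map, node_sizes = style_nodes(nodes, party_dict)
print(color_map)
print(node_sizes)
```

With the real graph, `nodes` would be `g.nodes()`, and the two lists can be passed to the NetworkX drawing functions as `node_color` and `node_size`. Both lists must follow the same order as the nodes being drawn.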

Next I create the network graph. First I create a dictionary for the labels and below I draw the graph and then the labels before showing the plot.
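The drawing step might look something like the sketch below. The spring layout, figure size, edge transparency, and font size are all assumptions, since the original notebook's settings are not shown here, and `draw_cosponsor_graph` is a hypothetical helper. The tiny graph at the bottom is placeholder data.

```python
import matplotlib.pyplot as plt
import networkx as nx


def draw_cosponsor_graph(g, node_colors, node_sizes, labels, seed=42):
    """Draw nodes and edges first, then overlay the chosen labels."""
    fig, ax = plt.subplots(figsize=(12, 12))
    pos = nx.spring_layout(g, seed=seed)  # fixed seed for a repeatable layout
    nx.draw_networkx_nodes(g, pos, node_color=node_colors, node_size=node_sizes, ax=ax)
    nx.draw_networkx_edges(g, pos, alpha=0.2, ax=ax)
    nx.draw_networkx_labels(g, pos, labels=labels, font_size=8, ax=ax)
    ax.set_axis_off()
    return fig, pos


# Tiny placeholder graph: two senators sharing one bill.
g = nx.Graph()
g.add_edges_from([("Ada Example", "s1-117"), ("Ben Q. Sample", "s1-117")])
labels = {n: n for n in g.nodes() if not n.startswith("s1")}  # senators only
fig, pos = draw_cosponsor_graph(g, ["blue", "gray", "red"], [300, 20, 300], labels)
# plt.show()  # uncomment in a notebook to display the figure
```

Passing a `labels` dictionary that contains only some of the nodes is what allows the later sections to label just the Senators of interest.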

Drawing the Graph

As you can see above, it appears the parties aren’t that far apart on cosponsorship, but that is likely because of two obvious exceptions: Kelly Loeffler (who is no longer a U.S. Senator but, because of the runoff, participated in the latest Congress) and Vice President Kamala Harris, who typically comes in only for tie-breaking votes. Given the scope, this isn’t shocking. To get a better picture of the sitting Senators, I went back and removed Kamala Harris and Kelly Loeffler from the data and, subsequently, the graph.

After removing Kelly Loeffler and Kamala Harris

Without the labels, you can see that there is absolutely partisanship among the Senators, which is not unexpected, but some people tend to be farther from the middle than the rest. To figure out who these people are, let’s add the labels. Before I do that, I’ll mention that, given the clustering of nodes, adding every Senator’s name would make the graph almost impossible to read, with the exception of some of the outer nodes.

The last picture in the previous section included some code for labels based on centrality. Adjusting that code a little is enough to get labels based on whatever you want to see. In this first case, I adjusted the labels to show those with high betweenness centrality. This will tell me the outliers in the cluster, those who do not belong to a community and are therefore less likely to cosponsor a bill with the other party. Since there is a lot of label overlap, here is the printed list of labels.

{'Roy Blunt': 'Roy Blunt', 'Richard M. Burr': 'Richard M. Burr', 'Benjamin L. Cardin': 'Benjamin L. Cardin', 'Susan Collins': 'Susan Collins', 'Steve Daines': 'Steve Daines', 'Deb Fischer': 'Deb Fischer', 'Lindsey Graham': 'Lindsey Graham', 'Charles E. Grassley': 'Charles E. Grassley', 'Ron Johnson': 'Ron Johnson', 'Mark Kelly': 'Mark Kelly', 'John Kennedy': 'John Kennedy', 'Ben Ray Luján': 'Ben Ray Luján', 'Mitch McConnell': 'Mitch McConnell', 'Jon Ossoff': 'Jon Ossoff', 'Rand Paul': 'Rand Paul', 'Mitt Romney': 'Mitt Romney', 'Mike Rounds': 'Mike Rounds', 'Ben Sasse': 'Ben Sasse', 'Brian Schatz': 'Brian Schatz', 'Charles E. Schumer': 'Charles E. Schumer', 'Richard C. Shelby': 'Richard C. Shelby', 'Dan Sullivan': 'Dan Sullivan', 'John Thune': 'John Thune', 'Tommy Tuberville': 'Tommy Tuberville', 'Raphael Warnock': 'Raphael Warnock'}

If you look closely, you’ll see that a few of them do not typically cosponsor bills even with their own party. In some cases these are freshman Senators (Jon Ossoff), but in many they are long-time Senators (Lindsey Graham).

You’ll also notice there are some who are very far into their own respective party, most notably Rand Paul, whose node is on the far right of the Republican side of the graph.

But curiously, there is a good mix of individuals right in the thick of things. For these, I will use closeness centrality to label those who have a somewhat more cooperative relationship with their colleagues.

This shows reasonably well (though it is not terribly visible) that there appear to be a few Senators who would be considered close to all the other nodes on average. Since this is difficult to see, I printed the list of who is shown.

'Thomas R. Carper': 'Thomas R. Carper', 'John Cornyn': 'John Cornyn', 'Steve Daines': 'Steve Daines', 'Joni Ernst': 'Joni Ernst', 'Charles E. Grassley': 'Charles E. Grassley', 'Ben Ray Luján': 'Ben Ray Luján', 'Jerry Moran': 'Jerry Moran', 'Gary Peters': 'Gary Peters', 'Marco Rubio': 'Marco Rubio'}
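For reference, label dictionaries like the two printed above could be produced along these lines. Betweenness centrality counts how often a node lies on shortest paths between other nodes, and closeness centrality is based on the inverse of a node's average shortest-path distance to the others. The selection rule used here (keep the `k` highest-scoring Senators) is an assumption, since the exact cutoff used in the original notebook is not shown, and `top_labels` is a hypothetical helper. The graph is placeholder data.

```python
import networkx as nx


def top_labels(scores, candidates, k):
    """Return a {name: name} label dict for the k highest-scoring candidates."""
    ranked = sorted(candidates, key=lambda n: scores[n], reverse=True)
    return {name: name for name in ranked[:k]}


# Placeholder graph: each edge joins a senator to a bill they cosponsored.
g = nx.Graph()
g.add_edges_from([
    ("Ada", "b1"), ("Ben", "b1"),
    ("Ben", "b2"), ("Cy", "b2"),
    ("Cy", "b3"),
])
senators = ["Ada", "Ben", "Cy"]  # restrict labels to senator nodes

betweenness = nx.betweenness_centrality(g)
closeness = nx.closeness_centrality(g)

betweenness_labels = top_labels(betweenness, senators, k=1)
closeness_labels = top_labels(closeness, senators, k=1)
print(betweenness_labels)
print(closeness_labels)
```

Restricting `candidates` to Senator names keeps bill nodes out of the labels even though they also receive centrality scores.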

Issues

* Nodes showed up for bills. I had some problems trying to get weighted edges between Senators with common bills, so I just turned the bill nodes gray and shrank them to make them less obvious. I spent a lot of time trying to figure out a way to take the dictionary keys and create sets based on the common bills between each pair of Senators, but to no avail, so I had to improvise. This contributed to a lot of the clutter in the graph.

* Rate limiting of 5,000 calls per day is in place for the API. While there are ways around this (storing the data over a few days), I decided I would rather stream right from the API. Thus, I went with the cosponsor comparison versus the actual vote comparison.

* At a few points I had to go back and refactor my code. The description above explains where I had to do that (for instance, to create the full name and party dictionary). This paid off, as it made the graph a bit easier to read.

* Overlapping nodes are an eyesore. As are overlapping labels.

* There was a mismatch between Senators who have a middle name or initial and those who do not. I had to refactor part of the code so that it no longer ignored the middle initial, which allowed the graph to generate the correct color and label for each Senator.
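On the first issue above, one possible way to get weighted Senator-to-Senator edges is to convert each list of bills to a set and count the overlap for every pair of Senators. This is a sketch of that idea, not something the original project did. `shared_bill_weights` is a hypothetical helper and the data is made up.

```python
from itertools import combinations


def shared_bill_weights(cosponsor_dict):
    """Return {(senator_a, senator_b): number of bills both cosponsored}."""
    bill_sets = {name: set(bills) for name, bills in cosponsor_dict.items()}
    weights = {}
    for a, b in combinations(sorted(bill_sets), 2):
        shared = len(bill_sets[a] & bill_sets[b])
        if shared:  # skip pairs with nothing in common
            weights[(a, b)] = shared
    return weights


# Placeholder data for illustration only.
cosponsorDict = {
    "Ada Example": ["s1-117", "s2-117", "s3-117"],
    "Ben Q. Sample": ["s2-117", "s3-117"],
    "Cy Third": ["s9-117"],
}
weights = shared_bill_weights(cosponsorDict)
print(weights)
```

The result could then be loaded into a graph with `g.add_weighted_edges_from((a, b, w) for (a, b), w in weights.items())`, which would remove the bill nodes from the picture entirely. Senators who share no bills with anyone would need to be added as nodes separately. NetworkX's bipartite module also provides `weighted_projected_graph`, which appears to do a similar projection directly from the Senator-and-bill graph.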

Limitations:

* I suspect that cosponsored bills, while indicative of cooperation, are not necessarily an indication of overall bipartisanship. In a network graph of votes, I imagine you would still see a larger divide. However, this does appear to be a good representation of each Senator and how willing they are to cooperate with their peers.

* This does not show whether the divide is narrowing or deepening over time. The process would need to be repeated for other Congresses. However, the code is fairly reusable, and it’s just a matter of adjusting the API endpoints.

Conclusion

The 117th Congress appears to be divided pretty clearly along party lines. While there are a few Senators who appear to work more closely with their colleagues across the aisle, the graph shows a clear divide between the two parties and highlights those who might be considered more “partisan” than others.

References

ProPublica Congress API (projects.propublica.org)

NetworkX documentation (networkx.org)

Concatenating multiple DataFrame columns and removing multiple spaces (stackoverflow.com)

Stack Overflow (www.stackoverflow.com)

More Information

robertrose85/WebMining (github.com)