Twitter Auto-completer for Members of Congress

Aakash Pydi
Jun 9 · 2 min read

This project (Link to GitHub Repo) is a tweet auto-completer for members of Congress. I used the Twitter API and the DocNow Hydrator to build a custom dataset of tweets. I then trained a custom word2vec representation on that dataset with the Gensim library, and finally used a Keras LSTM model to auto-complete tweets.
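
To make the data-preparation step more concrete, here is a minimal sketch of how raw tweets could be turned into token lists (the usual input format for word2vec training) and into (context, next word) pairs for a next-word model such as an LSTM. The helper names `tokenize` and `next_word_pairs`, the regular expressions, the window size, and the sample tweets are all assumptions made for illustration; the post does not describe the preprocessing the project actually used.

```python
import re

# Hypothetical patterns: not taken from the project, just a plausible starting point.
URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[@#]?\w+(?:'\w+)?")  # words, @mentions, #hashtags

def tokenize(tweet):
    """Lowercase a tweet, drop URLs, and split it into word-like tokens."""
    text = URL_RE.sub("", tweet.lower())
    return TOKEN_RE.findall(text)

def next_word_pairs(tokens, window=3):
    """Yield (context, target) pairs: `window` preceding tokens and the token that follows."""
    for i in range(window, len(tokens)):
        yield tokens[i - window:i], tokens[i]

# Made-up sample tweets, for illustration only.
tweets = [
    "Proud to support our veterans today! https://example.com/x",
    "We must pass this bill now.",
]

corpus = [tokenize(t) for t in tweets]  # list of token lists
pairs = [p for toks in corpus for p in next_word_pairs(toks, window=3)]
```

In a pipeline like the one described above, a list of token lists such as `corpus` would be what gets handed to the embedding trainer, and the pairs (after mapping tokens to indices or vectors) would be the training examples for the sequence model.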

I used subsets of the following datasets from George Washington University stored on Harvard Dataverse:

Word2Vec Embeddings

Sample output: the ten words closest to ‘simple’ in the generated embedding space.
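
For readers curious what "closest" means here: nearest-word queries on word2vec embeddings are typically ranked by cosine similarity, which is also what Gensim's `most_similar` uses. The sketch below reimplements that ranking in plain Python on a few made-up three-dimensional vectors. The function names and the toy vectors are hypothetical and are not taken from the project's trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closest_words(word, vectors, topn=10):
    """Return up to `topn` (word, similarity) pairs, most similar first."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:topn]

# Toy vectors invented for illustration; real embeddings have many more dimensions.
toy_vectors = {
    "simple": [1.0, 0.0, 0.0],
    "easy": [0.9, 0.1, 0.0],
    "basic": [0.7, 0.7, 0.0],
    "budget": [0.0, 0.0, 1.0],
}

result = closest_words("simple", toy_vectors, topn=2)
```

With these toy vectors, "easy" ranks ahead of "basic" because its vector points in almost the same direction as the one for "simple", while "budget" is orthogonal to it and falls outside the top two.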

Two-dimensional representation of the embedding space (dataset of ~50,000 tweets).

Two-dimensional representation of the embedding space (dataset of ~1.5 million tweets).

Written on April 15, 2018
