Deep Politics - First Step Towards an AI Takeover

Tal Peretz
5 min read · Nov 5, 2017


Since we’re always striving to improve ourselves and our politicians, we decided to leverage Datahack 2017, the coolest hackathon in Israel, to build an AI version of our prime minister, Benjamin Netanyahu, in 42 hours!

TL;DR

Our goal was to create an AI with the ability to

  • Generate tweets
  • Generate charismatic speeches
  • Chat

All of which should be easy to attribute to Netanyahu, yet never repeat raw or previously seen content.

Results are in our fully functional website deepolitics.com.
(yep. we actually bought a domain for a hackathon. We’re that crazy)

Data Collection

We automatically extracted tweets and quotes, and manually extracted speeches and Q&As from interviews. We retrieved the Facebook posts from the DataHack GitHub.

Here is the distribution of sources:

Data Sources Distribution (measured by words)

Our dataset was minuscule.

So, how did we manage to keep all our promises?

Exploration

A quick look at the data left us with this impression:

Thoughts

Actions — Modeling

Since we were at a hackathon, we had a “Small Data” issue, and time wasn’t on our side either. We decided to go with the top-notch approach: keep it simple (and complicate later).

To make sure we would finish with a working product, we also worked our way from the simplest component to the toughest one.

Tweet Generation

We first removed irrelevant tweets, a step we could afford thanks to our “Small Data” situation. We then used Jeremy Singer-Vine’s markovify, a Markov chain implementation, to model Netanyahu’s original tweets.

That alone gave us a pretty good baseline in a very short time. We then extended the Markov model to respect sentence structure using spaCy’s part-of-speech tagger.
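Roughly, the markovify + spaCy combination looks like the sketch below. It follows the POSifiedText pattern from the markovify README; the corpus file name, model name and state size are illustrative rather than our exact setup.

```python
import markovify
import spacy

nlp = spacy.load("en_core_web_sm")

class POSifiedText(markovify.Text):
    """Markov model whose states carry part-of-speech tags,
    so generated tweets better respect sentence structure."""
    def word_split(self, sentence):
        return ["::".join((word.orth_, word.pos_)) for word in nlp(sentence)]

    def word_join(self, words):
        return " ".join(word.split("::")[0] for word in words)

# "bb_tweets.txt" is a hypothetical file holding the cleaned tweets
with open("bb_tweets.txt") as f:
    model = POSifiedText(f.read(), state_size=2)

# May return None if no valid sentence can be built from the chain
print(model.make_short_sentence(140))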

*We also tried our luck with a word-level RNN, but it was lousy and time-consuming.

Speech Generation

Our second, tougher challenge was generating charismatic speeches.

As before, we started with a simple Markov chain model. Only this time the problem was harder: although the individual sentences were “BBfied”, an impressive speech needed context, or better yet, a limited set of topics for its sentences.

First we pre-processed the speeches by tokenizing, removing stop words and stemming. With the clean dataset, we applied topic modeling with LDA, assuming it would help us create structure by generating sentences from certain clusters in a certain order.
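A minimal sketch of that pre-processing + LDA step, using gensim and NLTK for illustration (the library choice and topic count are assumptions, and `speech_sentences` is just a placeholder for the real speech corpus):

```python
import gensim
from gensim import corpora
from nltk.corpus import stopwords
from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import word_tokenize

# Placeholder corpus; in the real project this was every speech sentence we had
speech_sentences = [
    "placeholder speech sentence about security",
    "placeholder speech sentence about the economy",
]

stop_words = set(stopwords.words("english"))
stemmer = SnowballStemmer("english")

def preprocess(sentence):
    # tokenize, drop stop words and punctuation, stem what is left
    tokens = word_tokenize(sentence.lower())
    return [stemmer.stem(t) for t in tokens if t.isalpha() and t not in stop_words]

docs = [preprocess(s) for s in speech_sentences]
dictionary = corpora.Dictionary(docs)
bow_corpus = [dictionary.doc2bow(d) for d in docs]

# A handful of topics; each generated sentence can then be assigned to its
# dominant topic and the speech assembled cluster by cluster
lda = gensim.models.LdaModel(bow_corpus, num_topics=5, id2word=dictionary, passes=10)
```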

It was a nice attempt, but not nearly as creative as our final shot. Since we didn’t have much data, we harnessed Facebook Research’s pre-trained InferSent to compute sentence embeddings for our generated sentences.

Maybe it was the sleepless night, but I was under the assumption that to create a fluent speech, the iᵗʰ sentence should be semantically similar to the (i-1)ᵗʰ sentence. However, they should not be so similar that they end up identical in meaning.

Sounds like thresholds on cosine similarities to me :)

To avoid repeating near-identical sentences across the entire speech, we also measured the iᵗʰ sentence’s similarity to all of its preceding sentences and kept it below a threshold (0.94, to be exact).

Why? Trial, error and a lot of data gazing ◉_◉
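In code, the sentence-picking rule boils down to two cosine-similarity checks. A minimal sketch: only the 0.94 repetition cap comes from the paragraph above; the other thresholds and the function names are purely illustrative.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pick_next(prev_vec, speech_vecs, candidates, cand_vecs,
              low=0.3, high=0.8, repeat_cap=0.94):
    """Pick the next speech sentence: close enough to the previous one to stay
    on topic, but not a near-duplicate of anything already in the speech."""
    for sent, vec in zip(candidates, cand_vecs):
        sim_prev = cosine(prev_vec, vec)
        if not (low <= sim_prev <= high):
            continue  # too far off-topic, or essentially the same sentence again
        if any(cosine(vec, v) > repeat_cap for v in speech_vecs):
            continue  # near-duplicate of something we already said
        return sent, vec
    return None, None  # no candidate passed the filters
```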

Replying to Responses, or, Chat

The chat was the toughest component and, data-wise, our least impressive one.

What we did here was leverage every bit of data we had left: we took the limited number of questions from the manually extracted Q&As and appended all the quotes, Facebook posts and speech sentences. We then used Facebook Research’s pre-trained InferSent to compute sentence embeddings.

Now that the data is mapped to a space where semantically close sentences are actually closer, we can pre-process a “query” (my 4:00 am term for a question or input sentence) in the same way, map it to the same space, and retrieve the closest sentence.
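In other words, the lookup is a nearest-neighbour search in embedding space. A minimal sketch, assuming `infersent` is a loaded InferSent model (its `encode` call is the one from Facebook Research’s repo) and `corpus_vecs` holds the pre-computed embeddings of our questions, quotes, posts and speech sentences:

```python
import numpy as np

def closest_response(query, corpus, corpus_vecs, infersent):
    # Embed the query with the same InferSent model used for the corpus
    q_vec = infersent.encode([query], tokenize=True)[0]
    # Cosine similarity against every corpus sentence, return the best match
    sims = corpus_vecs @ q_vec / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q_vec))
    best = int(np.argmax(sims))
    return corpus[best], float(sims[best])
```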

If the closest match is a question, we reply with its original answer; if it’s a plain sentence, we return the sentence itself, and we’re done. ◕‿◕

Except that we weren’t.

What if the distance between the query and the response is too big? Wouldn’t it send our deep_bb enthusiast home in tears?

I believe it would ಠ_ಥ

Fortunately, we implemented an awesome fallback scheme.

We harnessed a ChatterBot instance trained on English greetings, conversations, literature and politics to serve as our 1ˢᵗ fallback.

That fallback covered a wider space of possible queries. To seal it completely, we created a constant pool of general “BBfied” responses to use when the chatbot’s confidence is low.
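Put together, the reply logic looks roughly like this. It’s a sketch assuming a recent ChatterBot API; the threshold values, the `BB_FALLBACKS` pool and the similarity score passed in from the InferSent lookup are all illustrative.

```python
import random
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

# Illustrative values and pool; the real thresholds and canned lines differed
SIMILARITY_FLOOR = 0.6
CONFIDENCE_FLOOR = 0.5
BB_FALLBACKS = ["(a canned, general-purpose BBfied line goes here)"]

fallback_bot = ChatBot("deep_bb_fallback")
trainer = ChatterBotCorpusTrainer(fallback_bot)
trainer.train(
    "chatterbot.corpus.english.greetings",
    "chatterbot.corpus.english.conversations",
    "chatterbot.corpus.english.literature",
    "chatterbot.corpus.english.politics",
)

def reply(query, best_match, best_sim):
    # 1st choice: the InferSent nearest-neighbour match, if it is close enough
    if best_sim >= SIMILARITY_FLOOR:
        return best_match
    # 1st fallback: the general-purpose ChatterBot
    response = fallback_bot.get_response(query)
    if getattr(response, "confidence", 0) >= CONFIDENCE_FLOOR:
        return str(response)
    # Last resort: a random canned "BBfied" reply
    return random.choice(BB_FALLBACKS)
```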

Data-Stack

Architecture

Epic deep_bb Tweets and Replies

Our deep_bb actually tagged the real Netanyahu and asked him to watch his speech.

Thanks to Shay Palachy, Inbar Naor and all the other extremely awesome organizers, mentors and sponsors.

We had a blast, and so did deep_bb

Be sure to check out deep_bb at deepolitics.com and follow it on Twitter @real_deep_bb

You are welcome to improve it on our GitHub

The “Politically Correct” Team
