Understanding Natural Language Processing through Recurrent Neural Networks

Vinay Bhaip
Phone2Action Engineering
4 min readAug 10, 2018

At Phone2Action, we are constantly looking for new and innovative technologies to solve challenging problems in our industry. Machine learning, specifically natural language processing, is one of many tools that we utilize in our platform to create a better user experience for those looking to take action on a particular campaign and effect change within our political system. Understanding the core algorithms that comprise natural language processing is essential in order to take advantage of this growing industry trend.

What is NLP?

NLP, or natural language processing, is the way machines parse and understand the human language (such as English or Spanish). This is used in a diverse range of settings, ranging from sentiment analysis, to machine translation, to chatbots. Natural language processing has been continuously improving, as more novel ideas emerge to make more robust models and leverage new, innovative analysis techniques.

How does NLP Actually Work?

Machine learning in general might seem like magic at first, but at the end of the day, it boils down to a series of mathematical functions that manipulate some input data and map it to a certain output. To understand NLP, we need to understand simple neural networks first. A neural network is a popular type of ML algorithm that consists of multiple layers. Within each layer there are multiple nodes: each node is connected to all the nodes in the subsequent layer, as illustrated in the diagram below. The inputs are passed into the first layer, and each are multiplied by a different weight, represented by the edges, and added to another value, known as a bias, to reach the value of the next node. This process is repeated over and over to reach a final layer, which has our desired output.

Another type of neural net is an RNN (Recurrent Neural Network), a unique type of neural network where the internal layer value is passed on as an additional parameter into the next node. At each time step, the RNN processes both the new input and the core information from the last node. This allows for the network to remember previous information being passed in, allowing for a sequential data flow.

How Should our Input Data Look like?

In order for our model to understand the input it has received, the data have to be in numerical form. Generally, we have a total set of words that we denote as the vocabulary the model has the ability to process. The naive approach that many first take would be to assign each word in our vocabulary a number, from 1 to n, with n being the total number of words we have. However, this creates the problem that two words might seem closer correlated to one another, when they aren’t. For example, if the word “ant” had an assigned value of 1, and the word “apricot” had a value of 2, they would seem closer than “ant” and another word with a larger value, like “zoo”, which of course doesn’t make sense.

To address this problem, we’ll use a technique called one-hot encoding. This means representing each word of our vocabulary as a vector of n length. For the first word in our vocabulary, we represent it as [1, 0, 0 … , 0, 0, 0]. The second word would have a 1 in the next index in the vector. If these vectors were visualized in an n-dimensional space, they would all be equidistant from each other, which allows no words to be “closer” to each other when represented.

Sequence to Sequence Models

Sequence to sequence models are one modern way to turn one sequence into another. This is accomplished by having one RNN, known as an encoder, to take in the input data and output one vector, and passing that vector into another RNN, known as a decoder, which converts that vector into another sequence.

An interesting aspect emerges with encoder-decoder models. The output vector of the encoder should contain all the semantic information of the original sequence. This is because we are translating between two different languages, so the output vector of the encoder has all the information needed to understand the sentence across the two languages. So, for example, when using an encoder decoder model for machine translation between two different languages, like English to Spanish, the output vector from the encoder model should contain the universal meaning of the sequence.

Overall, NLP is a fascinating field that is emerging with applications across various industries, and is a crucial aspect of machine learning to understand. By integrating NLP into our core advocacy suite, such as in chatbots, Phone2Action is able to create powerful tools that revolutionize digital advocacy and citizen-legislator interaction on a global scale.