10 Applications of Artificial Neural Networks in Natural Language Processing

Data Monsters
Aug 17, 2017

by Olga Davydova

Since artificial neural networks allow modeling of nonlinear processes, they have become a very popular and useful tool for solving many problems, such as classification, clustering, regression, pattern recognition, dimension reduction, structured prediction, machine translation, anomaly detection, decision making, visualization, computer vision, and others. This wide range of abilities makes it possible to use artificial neural networks in many areas. In this article, we discuss applications of artificial neural networks in Natural Language Processing (NLP) tasks.

NLP includes a wide set of syntax, semantics, discourse, and speech tasks. We will describe prime tasks in which neural networks demonstrated state-of-the-art performance.

1. Text Classification and Categorization

Text classification is an essential part of many applications, such as web searching, information filtering, language identification, readability assessment, and sentiment analysis. Neural networks are actively used for these tasks.

In one study, a series of experiments with convolutional neural networks (CNN) built on top of pre-trained word vectors was presented. The suggested model was tested against several benchmarks. In two of them, the task was to detect positive/negative sentiment. In another, there were more classes to predict: very positive, positive, neutral, negative, very negative. In a fourth, sentences were classified into two types, subjective or objective. In yet another, the goal was to classify a question into six question types (whether the question is about a person, location, numeric information, etc.). The results of numerous tests described in the paper show that after a little tuning of hyperparameters the model performs excellently, suggesting that the pre-trained vectors are universal feature extractors and can be utilized for various classification tasks.
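To make the idea of a CNN on top of word vectors more concrete, here is a minimal sketch of a single convolution filter followed by max-over-time pooling, a combination commonly used in sentence-classification CNNs. The article does not spell out the architecture, so treat this as an assumption: the function name `conv1d_max_pool`, the 2-dimensional vectors, and the filter weights are all made up for illustration.

```python
# Toy sketch: one convolution filter over word vectors + max-over-time pooling.
# All vectors and weights below are made-up illustration values (assumptions).

def conv1d_max_pool(embeddings, filt, bias=0.0):
    """Slide `filt` (window x dim) over `embeddings` (length x dim),
    apply ReLU, and return the largest activation over all positions."""
    window = len(filt)
    if len(embeddings) < window:
        raise ValueError("sentence is shorter than the filter window")
    activations = []
    for start in range(len(embeddings) - window + 1):
        total = bias
        for offset in range(window):
            word_vec = embeddings[start + offset]
            total += sum(w * x for w, x in zip(filt[offset], word_vec))
        activations.append(max(0.0, total))  # ReLU non-linearity
    return max(activations)  # max-over-time pooling


# Hypothetical 2-dimensional "pre-trained" vectors for a 4-word sentence.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter that spans two consecutive words.
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

feature = conv1d_max_pool(sentence, bigram_filter)
print(feature)
```

In a full model, many such filters with different window sizes would each produce one feature, and the resulting feature vector would feed a classifier layer.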

Another article shows that it’s possible to apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts with the help of temporal convolutional networks (ConvNets). Here, the authors assert that ConvNets can achieve excellent performance without knowledge of words, phrases, sentences, or any other syntactic or semantic structures of a human language. To prove their assertion, several experiments were conducted. The model was tested on a data set with 14 classes (company, educational institution, artist, athlete, office holder, means of transportation, building, natural place, village, animal, plant, album, film, written work). The results indicate both good training (99.96%) and testing (98.40%) accuracy, with some improvement from thesaurus augmentation. In addition, the model was tested on a sentiment polarity data set that the researchers constructed with two negative and two positive labels. The result is 97.57% training accuracy and 95.07% testing accuracy. The model was also tested on a data set with 10 classes (Society & Culture, Science & Mathematics, Health, Education & Reference, Computers & Internet, Sports, Business & Finance, Entertainment & Music, Family & Relationships, Politics & Government) and on a news data set where the task was categorization into four categories (World, Sports, Business, Sci/Tech). The results confirm that ConvNets require a large corpus in order to learn good text understanding from scratch.
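Character-level input means the network sees a sequence of encoded characters rather than words. The sketch below shows one plausible encoding: each character becomes a one-hot row over a fixed alphabet, and the text is truncated or zero-padded to a fixed length. The alphabet, the length, and the name `quantize` are assumptions for illustration; the article does not give the values the authors used.

```python
# Sketch of character-level one-hot encoding ("quantization") of text.
# ALPHABET and max_len are illustrative assumptions, not the paper's settings.

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def quantize(text, alphabet=ALPHABET, max_len=16):
    """Return a max_len x len(alphabet) matrix of 0/1 values.
    Characters outside the alphabet and padding positions are all-zero rows."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in text.lower()[:max_len]:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    while len(rows) < max_len:  # zero-pad short texts
        rows.append([0] * len(alphabet))
    return rows


matrix = quantize("Cat!", max_len=6)
print(len(matrix), len(matrix[0]))
```

A temporal ConvNet would then slide its filters along the first axis of this matrix, so no word segmentation is needed.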

Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao introduced recurrent convolutional neural networks for text classification without human-designed features. The team tested their model using four data sets: one with four categories (computers, politics, recreation, and religion); the Fudan Set, a Chinese document classification set that consists of 20 classes, including art, education, and energy; one covering five languages (English, Japanese, German, Chinese, and French); and one with Very Negative, Negative, Neutral, Positive, and Very Positive labels. After testing, the model was compared to several existing text classification methods. It turned out that neural network approaches outperform traditional methods for all four data sets, and the proposed model outperforms CNN and RecursiveNN.

2. Named Entity Recognition (NER)

The main task of named entity recognition (NER) is to classify named entities, such as Microsoft or London, into predefined categories like persons, organizations, locations, time, dates, and so on. Many NER systems have already been created, and the best of them use neural networks.

In one paper, two models for NER were proposed. The models require character-based word representations learned from the supervised corpus and unsupervised word representations learned from unannotated corpora. Numerous tests were carried out using different data sets in English, Dutch, German, and Spanish. The team concluded that, without requiring any language-specific knowledge or resources such as gazetteers, their models show state-of-the-art performance in NER.

3. Part-of-Speech Tagging

Part-of-speech (POS) tagging has many applications, including parsing, text-to-speech conversion, information extraction, and so on. In one work, a neural network model with word embedding for the POS tagging task is presented. The model was tested on a benchmark data set and achieved a tagging accuracy of 97.40%.
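For readers unfamiliar with the metric, tagging accuracy is usually the share of tokens that receive the correct tag. The sketch below computes it under that assumption; the article does not say exactly how the reported figure was calculated, and the tag sequences here are invented examples.

```python
# Sketch: token-level tagging accuracy, assuming accuracy means
# (correctly tagged tokens) / (all tokens). Example tags are made up.

def tagging_accuracy(gold_sentences, predicted_sentences):
    """Compare gold and predicted tag sequences sentence by sentence."""
    correct = 0
    total = 0
    for gold, pred in zip(gold_sentences, predicted_sentences):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        for gold_tag, pred_tag in zip(gold, pred):
            correct += int(gold_tag == pred_tag)
            total += 1
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct / total


gold = [["DT", "NN", "VBZ"], ["PRP", "VBP", "RB"]]
pred = [["DT", "NN", "VBZ"], ["PRP", "VBP", "JJ"]]

accuracy = tagging_accuracy(gold, pred)
print(f"{accuracy:.2%}")
```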

4. Semantic Parsing and Question Answering

Question answering systems automatically answer different types of questions asked in natural languages, including definition questions, biographical questions, multilingual questions, and so on. Using neural networks makes it possible to develop high-performing question answering systems.

In their paper, Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao describe a semantic parsing framework for question answering using a knowledge base. The authors say their method uses the knowledge base at an early stage to prune the search space and thus simplifies the semantic matching problem. It also applies an advanced entity linking system and a deep model that matches questions and predicate sequences. The model was tested on a benchmark data set, and it outperforms previous methods substantially.

5. Paraphrase Detection

Paraphrase detection determines whether two sentences have the same meaning. This task is especially important for question answering systems since there are many ways to ask the same question.

One paper suggests a method for identifying semantically equivalent questions based on a neural network. The experiments are performed on two sets of question data. It was shown that the proposed model achieves high accuracy, especially when the word embeddings are pre-trained on in-domain data. The authors compared their model’s performance with two baseline methods and demonstrated that their model outperforms the baselines by a large margin.

In another study, a novel recursive architecture that learns phrasal representations is presented. These representations are vectors in an n-dimensional semantic space where phrases with similar meanings are close to each other. Two data sets were used to evaluate the system. The model was compared to three baselines, and it outperforms them all.
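The notion of phrases being "close to each other" in a semantic space can be illustrated with cosine similarity between phrase vectors. This is only a sketch under stated assumptions: the article does not say which distance or classifier the study used, and the 3-dimensional vectors, the threshold, and the name `is_paraphrase` are hypothetical.

```python
import math

# Sketch: comparing phrase vectors with cosine similarity.
# The vectors and the 0.9 threshold are invented for illustration.


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical learned representations of three phrases.
phrase_vectors = {
    "how do I install it": [0.9, 0.1, 0.3],
    "what is the way to set it up": [0.8, 0.2, 0.4],
    "what time is it": [0.1, 0.9, 0.2],
}


def is_paraphrase(a, b, threshold=0.9):
    """Treat two phrases as paraphrases if their vectors are close enough."""
    return cosine_similarity(phrase_vectors[a], phrase_vectors[b]) >= threshold


print(is_paraphrase("how do I install it", "what is the way to set it up"))
print(is_paraphrase("how do I install it", "what time is it"))
```

In practice the vectors would come from the trained model, and the decision rule would typically be learned rather than fixed by hand.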

6. Language Generation and Multi-document Summarization

Natural language generation has many applications such as automated writing of reports, generating texts based on analysis of retail sales data, summarizing electronic medical records, producing textual weather forecasts from weather data, and even producing jokes.

In a recent paper, researchers describe a recurrent neural network (RNN) model capable of generating novel sentences and document summaries. The paper describes the model and evaluates it on a database of 820,000 consumer reviews in the Russian language. The design of the network lets users control the meaning of generated sentences. By choosing a sentence-level feature vector, it is possible to instruct the network, for example, “Say something good about a screen and sound quality in about ten words”. This language generation ability allows the production of abstractive summaries of multiple user reviews that often have reasonable quality. Such a summary makes it possible for users to quickly obtain the information contained in a large cluster of documents.

7. Machine Translation

Machine translation software is used around the world despite its limitations. In some domains, the quality of translation is not good. To improve the results, researchers try different techniques and models, including the neural network approach. The purpose of one study is to inspect the effects of different training methods on a Polish-English machine translation system used for medical data. Both neural network-based and statistical translation systems were trained for comparison. It was demonstrated that a neural network requires fewer resources for training and maintenance. In addition, a neural network often substituted words with other words occurring in a similar context.

8. Speech Recognition

Speech recognition has many applications, such as home automation, mobile telephony, virtual assistance, hands-free computing, video games, and so on. Neural networks are widely used in this area.

In one paper, scientists explain how to apply neural networks to speech recognition in a novel way, such that the network’s structure directly accommodates some types of speech variability, like varying speaking rate. Two tasks were used for evaluation, one of them a large-vocabulary voice search task.

9. Character Recognition

Character recognition systems also have numerous applications, like receipt character recognition, invoice character recognition, check character recognition, legal billing document character recognition, and so on. One article presents a method for the recognition of handwritten characters with 85% accuracy.

10. Spell Checking

Most text editors let users check if their text contains spelling mistakes. Neural networks are now incorporated into many spell-checking tools.

In one paper, a new system for detecting misspelled words was proposed. This system is trained on observations of the specific corrections that a typist makes. It overcomes many of the shortcomings that traditional spell-checking methods have.
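The idea of learning from a typist's own corrections can be illustrated without a neural network. The sketch below is a deliberately simple stand-in, a frequency table of observed corrections, and not the system from the paper; the class name `CorrectionMemory` and the example words are hypothetical.

```python
from collections import Counter, defaultdict

# Sketch: remembering a typist's corrections in a frequency table.
# This is a plain lookup table standing in for a trained model.


class CorrectionMemory:
    def __init__(self):
        # Maps a typed word to a counter of the corrections seen for it.
        self._seen = defaultdict(Counter)

    def observe(self, typed, corrected):
        """Record that the typist changed `typed` into `corrected`."""
        self._seen[typed][corrected] += 1

    def suggest(self, typed):
        """Return the most frequent past correction, or None if unseen."""
        counts = self._seen.get(typed)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


memory = CorrectionMemory()
memory.observe("teh", "the")
memory.observe("teh", "the")
memory.observe("teh", "ten")

print(memory.suggest("teh"))
print(memory.suggest("word"))
```

A neural approach would generalize beyond exact matches, for example to misspellings it has never seen, which a lookup table like this cannot do.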


In this article, we described Natural Language Processing problems that can be solved using neural networks. As we showed, neural networks have many applications, such as text classification, information extraction, semantic parsing, question answering, paraphrase detection, language generation, multi-document summarization, machine translation, and speech and character recognition. In many cases, neural network methods outperform other methods.