Understanding NLP Pipeline

An introduction to the phases of an NLP pipeline

Chaitanya Krishna Kasaraneni
Analytics Vidhya


Natural Language Processing (Source: Wootric)

Natural Language Processing (NLP) is one of the fastest-growing fields in computing. It is a subfield of artificial intelligence that deals with interactions between humans and computers through natural language. The main challenges in NLP involve speech recognition, natural language understanding, and natural language generation. NLP is making its way into a number of products and services that we use every day. This article gives an overview of a common end-to-end NLP pipeline.

The common NLP pipeline consists of three stages:

  • Text Processing
  • Feature Extraction
  • Modeling

Common NLP Pipeline

Each stage transforms text in some way and produces an intermediate result that the next stage needs. For example,

  • Text Processing: Take raw input text, clean it, normalize it, and convert it into a form that is suitable for feature extraction.
  • Feature Extraction: Extract and produce feature representations that are appropriate for the type of NLP task you are trying to accomplish and the type of model you are planning to use.
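
The first two stages can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `process_text` and `extract_features` are hypothetical, the cleaning rules are deliberately crude, and the bag-of-words counts used here are just one possible feature representation, chosen because they need no external library.

```python
import re
from collections import Counter
from typing import List


def process_text(raw: str) -> List[str]:
    """Text processing (assumed rules): lowercase, drop punctuation, split on whitespace."""
    lowered = raw.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace with a space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return cleaned.split()


def extract_features(tokens: List[str], vocabulary: List[str]) -> List[int]:
    """Feature extraction (assumed bag-of-words): count each vocabulary word in the tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Made-up example documents, used only to show the data flow between stages.
documents = ["NLP is fun!", "Is NLP hard? NLP is useful."]

# Stage 1: raw text -> normalized tokens.
tokenized = [process_text(doc) for doc in documents]

# Stage 2: tokens -> fixed-length count vectors over a shared, sorted vocabulary.
vocabulary = sorted({token for tokens in tokenized for token in tokens})
features = [extract_features(tokens, vocabulary) for tokens in tokenized]

print(vocabulary)  # ['fun', 'hard', 'is', 'nlp', 'useful']
print(features)    # [[1, 0, 1, 1, 0], [0, 1, 2, 2, 1]]
```

Each stage consumes the previous stage's output: the token lists feed the vocabulary and the count vectors, and those fixed-length vectors are the kind of intermediate result a modeling stage would take as input.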
