What is Natural Language Processing?

Abhishek Mahli
6 min readJan 25, 2023

--

Computer Science, AI, and Human Language — > The combination of these three makes Natural Language Processing(NLP).

Natural Language Processing

Let's start from scratch.

Question. What is Computer Science?
Answer. Computer Science is the study of computers and computational systems. It generally deals with software and software systems. Fields in computer science include Data Structures & Algorithms, Database Management, Networking, Artificial Intelligence, Machine Learning, Deep Learning, NLP, etc.

Computer Science Fields

Question. What is AI?
Answer. Artificial intelligence (AI) is a branch of computer science that deals with creating intelligent machines that work and learn like humans. AI systems can be trained to perform tasks such as image recognition, speech recognition, natural language processing, decision-making, and more. AI research aims to create systems that can perform tasks that typically require human intelligence, such as problem-solving, learning, and decision-making.

Artificial Intelligence Fields

Question. What is Human Language?
Answer. Language is a tool of communication between individuals. It can be sign Language or speaking Language. Human-speaking language is used to communicate between two Human Beings.
Eg. Hindi, English, French, etc.

Human Languages

Question. What is NLP?
Answer. Natural Language Processing(NLP) combines Computer Science, AI, and Human Languages. Natural Language Processing (NLP) is a subfield of AI that deals with the interaction between computers and human languages, in text or speech format. It involves the use of techniques from computer science, linguistics, and mathematics to enable computers to understand, interpret, and generate human language. NLP tasks include text-to-speech, speech-to-text, sentiment analysis, machine translation, named entity recognition, and more. These techniques can be used to improve natural language interfaces for computers, as well as to help computers understand and analyze large amounts of unstructured text data.

Real Word Applications.

  1. Contextual Advertisement

Example: You will see only those advertisements which are relevant to you. Not like the old Doordarshan ad which was shown to all the audiences.

Contextual Advertisement

2. Removal of Adult Content from Social media

Adult Content Removal

Suppose any children in your family while playing game on your mobile/laptop, get access to those content which are not suitable for them. NLP can play an important role to filter out unwanted content.

Example: Instagram doesn’t allow Nudity.

3. Chatbots

Chatbots

It is not feasible for any company to reply to or resolve the query of every customer. Here NLP plays an important role in making chatbots. Chatbots simply reply to the queries of customers and try to resolve them.

4. Email — Filtering, spam

Email Filtering

A spam filter is a program used to detect unwanted, and virus-infected emails and prevent those messages from getting to a user’s inbox. Like other types of filtering programs, a spam filter looks for specific criteria on which to base its judgments.

5. suggestion while typing on a Mobile Keyboard

Keyboard Suggestions

Word prediction programs prompt the user with a list of likely word choices based on words previously typed. Some word prediction software automatically collects new words as they are used and considers a person’s common vocabulary when predicting words in the future.

Common NLP Task

1. Sentiment Analysis
2. Speech to Text
3. Information Retrieval
4. Spell Checking and Grammer Correction
5. Text/Document Classification

Approach to NLP

  1. Machine Learning Based Approach

A machine learning approach to natural language processing (NLP) involves using algorithms and statistical models to analyze and understand natural language data. This approach is based on the idea that a machine can learn from data, and can be used to perform a wide range of NLP tasks, such as language understanding, text classification, machine translation, and text generation.

There are different types of machine learning models that can be used for NLP, including supervised learning, unsupervised learning, and reinforcement learning. Supervised learning models are trained on labeled data, and are used for tasks such as text classification, sentiment analysis, and named entity recognition. Unsupervised learning models are used to discover hidden structures in the data and are used for tasks such as language modeling, topic modeling, and word embedding. Reinforcement learning is used to train models that can take actions based on the input, such as language generation and dialogue systems.

Machine learning models are trained on large amounts of data, and they are able to generalize to new examples, making it possible to use them in real-world applications.

2. Deep Learning Based Approach

A deep learning approach to natural language processing (NLP) involves using deep neural networks to analyze and understand natural language data. These networks are composed of multiple layers, which allows them to learn complex, non-linear relationships in the data.

Deep learning models for NLP include:

  • Recurrent Neural Networks (RNNs) and their variants such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) are commonly used for sequence-to-sequence tasks like language translation, text summarization, and text generation.
  • Convolutional Neural Networks (CNNs) are used for text classification and sentiment analysis
  • Transformer-based models such as BERT, GPT-2, and RoBERTa. These models are used for a wide range of NLP tasks, including language understanding, text classification, and named entity recognition.

Deep learning approaches have shown remarkable results in many NLP tasks, and they are able to achieve state-of-the-art performance on a wide range of benchmarks. They are able to learn useful representations of the text by themselves, which means they require less feature engineering than other models. A large amount of data and computational resources are required to train these models.

3. Heuristic Based Approach

A heuristic approach in natural language processing (NLP) involves using a set of simplified rules or techniques to solve complex NLP problems. These techniques are based on the experience and knowledge of experts in the field and are used to quickly find approximate solutions to NLP tasks, such as language understanding, text classification, and machine translation. Heuristic approaches are often used in cases where a more complex, data-driven approach is not feasible due to limited resources or time constraints.

Challenges to NLP

  1. Sarcasm, Irony
    2. Slang
    3. Contextual Words
    4. Ambiguity
    5. Diversity

In conclusion, natural language processing (NLP) is a rapidly growing field that aims to enable machines to understand and generate human language. There are different approaches to NLP, including rule-based, heuristic, machine-learning, and deep-learning approaches. Each approach has its own advantages and limitations, and the choice of approach depends on the specific task and the available resources.

Rule-based approaches are based on a set of predefined rules and are good for tasks that can be broken down into a set of simple, well-defined steps. Heuristic approaches use a set of simplified rules or techniques to quickly find approximate solutions to NLP tasks. Machine learning approaches use algorithms and statistical models to analyze and understand natural language data. They can be used for a wide range of NLP tasks and can achieve state-of-the-art performance. Deep learning approaches use deep neural networks to analyze and understand natural language data and have shown remarkable results in many NLP tasks.

Overall, NLP is a complex and multi-disciplinary field that requires knowledge of linguistics, computer science, and machine learning. It has many potential applications, from language translation and speech recognition to text generation and language understanding. As the field continues to evolve, we can expect to see even more exciting developments and applications in the future.

--

--

Abhishek Mahli

I write in depth articles on random topics which seem interesting to me.