NLP: Natural Language Processing

Yavuz Alp
3 min read · Jun 3, 2024

Hello, I am Yavuz Alp Demirci. I am 19 and an undergraduate student at METU, majoring in CEIT. I am taking an AI-based programming course from the best of all teachers, Zafer ACAR, and on my medium.com blog I will explain and simplify, as much as possible, some of the topics I learned in this course.

Today's topic: NLP

What is NLP?

  • NLP (Natural Language Processing) is a field of artificial intelligence that focuses on the interaction between computers and human language. It involves the development of algorithms and models that can analyze, understand, and generate human language.
  • NLP involves two main areas: language understanding and language generation.

1- Language understanding encompasses parsing and analyzing text structure, extracting meaning and intent, and recognizing entities, relationships, and events.

2- Language generation includes producing human-like text output, translating between languages, and summarizing or paraphrasing text.

The techniques and approaches used in NLP span rule-based systems, statistical models, neural networks and deep learning, and knowledge-based methods, as well as hybrid approaches that combine several of these.

Some popular application areas:

1- Sentiment Analysis:

  • Determining the sentiment (positive, negative, neutral) expressed in text.
  • Analyzing customer feedback, social media posts, or product reviews.

2- Scientific Text Mining:

  • Extracting insights from large volumes of scientific literature and medical records.
  • Accelerating research and discovery in fields like biology, chemistry, and medicine.
  • This way we can interpret texts on any topic we want, whether academic or not.

3- Fraud Detection:

  • Analyzing text data (e.g., emails, financial reports) to detect potential fraud or anomalies.

4- Question Answering:

  • Building systems that can understand and respond to natural language questions.
  • Powering conversational assistants and chatbots.

And more...

Let's do some coding together, in a concise way that anybody can understand!

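# Import TextBlob, the package that provides the sentiment analysis we want
# (install it first with: pip install textblob)
from textblob import TextBlob

# Sentiment analysis: .sentiment returns a named tuple (polarity, subjectivity)
TextBlob('I hate you').sentiment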

-> Here you see a very simple sentiment analysis example using the 'TextBlob' package.

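Polarity refers to the overall sentiment or emotional tone of a piece of text; TextBlob reports it as a number from -1.0 (most negative) to 1.0 (most positive). Subjectivity refers to the degree of personal opinion or emotion in the text, from 0.0 (very objective) to 1.0 (very subjective).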
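To get a feeling for where such numbers can come from, here is a minimal toy sketch of a lexicon-based scorer. This is not TextBlob's actual algorithm: the word list `TOY_LEXICON`, its scores, and the function `toy_sentiment` are all made up for illustration. Each known word carries a (polarity, subjectivity) pair, and the text gets the average over the known words it contains.

```python
from typing import Tuple

# Hypothetical word scores, invented only for this illustration:
# word -> (polarity in [-1, 1], subjectivity in [0, 1])
TOY_LEXICON = {
    "love": (0.5, 0.6),
    "great": (0.8, 0.75),
    "good": (0.7, 0.6),
    "bad": (-0.7, 0.65),
    "hate": (-0.8, 0.9),
    "terrible": (-1.0, 1.0),
}


def toy_sentiment(text: str) -> Tuple[float, float]:
    """Average the scores of the known words; (0.0, 0.0) if none are known."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not hits:
        return (0.0, 0.0)
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return (polarity, subjectivity)


print(toy_sentiment("I hate you"))       # (-0.8, 0.9)
print(toy_sentiment("The sky is blue"))  # (0.0, 0.0)
```

A sentence with no opinion words comes out as neutral and objective, which matches the intuition behind the two scores. Real libraries use far larger lexicons and extra rules, for example for negation.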

What is tokenization and why is it important?

  • Tokenization is the process of breaking down a piece of text into smaller, meaningful units called tokens. In the context of Natural Language Processing (NLP), tokenization is a fundamental step in many text processing tasks, as it prepares the text for further analysis and processing.
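
# First, install what we need. Run this in a terminal (or prefix it with "!" in a notebook cell):
# pip install nltk   # nltk: Natural Language Toolkit

import nltk
from nltk.tokenize import word_tokenize, sent_tokenize

# word_tokenize needs the Punkt tokenizer data; newer NLTK releases may ask for "punkt_tab" instead
nltk.download("punkt")
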
# Sample text
text = "The quick brown fox jumps over the lazy dog. This is a sample sentence."

# Tokenize the text into words
word_tokens = word_tokenize(text)

print("Word Tokens:", word_tokens)

The output is a list of word tokens, with each period returned as its own token.

NLTK also offers WordPunctTokenizer, a tokenizer that splits punctuation marks off as separate tokens:

from nltk.tokenize import WordPunctTokenizer
tk = WordPunctTokenizer()

tk.tokenize("Don't hesitate to ask questions")

As we can see, it separates punctuation from the text: the apostrophe in "Don't" comes out as its own token.
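If you want to see the idea without NLTK, the same kind of split can be sketched with a single regular expression from Python's standard library. The function name `simple_word_punct_tokenize` is my own, and this is only an approximation for illustration: it keeps runs of word characters together and treats runs of punctuation as separate tokens.

```python
import re
from typing import List


def simple_word_punct_tokenize(text: str) -> List[str]:
    """Split text into runs of word characters and runs of punctuation."""
    # \w+      -> one or more letters, digits, or underscores
    # [^\w\s]+ -> one or more characters that are neither word chars nor whitespace
    return re.findall(r"\w+|[^\w\s]+", text)


print(simple_word_punct_tokenize("Don't hesitate to ask questions"))
# ['Don', "'", 't', 'hesitate', 'to', 'ask', 'questions']
```

Whitespace is never part of a token here, so it simply disappears from the result.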

This one's important :)

  • When starting an NLP project:
    1. convert everything to lower case
    2. remove punctuation and extra spaces
    3. remove digits
    4. remove line breaks (\n)
    5. remove stopwords
    6. tokenize
    7. reduce words to their roots with stemming or lemmatization
    8. vectorize

This is the pathway we follow when dealing with an NLP project.
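The steps above can be sketched in plain Python, using only the standard library. Everything here is a simplified assumption for illustration: the tiny `STOPWORDS` set, the `naive_stem` suffix stripper (a real project would use a proper stemmer or lemmatizer), and the bag-of-words `vectorize` function are all hypothetical stand-ins. Stopwords are removed right after splitting into tokens, because that is the easiest place to do it in code.

```python
import re
import string
from collections import Counter
from typing import List

# Tiny hypothetical stopword list; real projects use a much longer one.
STOPWORDS = {"the", "is", "a", "an", "and", "over", "this"}


def naive_stem(word: str) -> str:
    """Toy stemmer: strip one common suffix if at least 3 letters remain."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[str]:
    text = text.lower()                                               # 1. lower case
    text = text.translate(str.maketrans("", "", string.punctuation))  # 2. punctuation
    text = re.sub(r"\d+", "", text)                                   # 3. digits
    text = text.replace("\n", " ")                                    # 4. line breaks
    tokens = text.split()                        # 6. tokenize (also drops extra spaces)
    tokens = [t for t in tokens if t not in STOPWORDS]                # 5. stopwords
    return [naive_stem(t) for t in tokens]                            # 7. find roots


def vectorize(tokens: List[str]) -> Counter:
    """8. Bag-of-words vector: how many times each token occurs."""
    return Counter(tokens)


tokens = preprocess("The 2 quick brown foxes jumped\nover the lazy dogs.")
print(tokens)             # ['quick', 'brown', 'fox', 'jump', 'lazy', 'dog']
print(vectorize(tokens))
```

The word counts at the end are the simplest form of vectorization; they turn each text into numbers that a model can work with.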

What do I like most about NLP?

I think NLP gives you a comprehensive understanding of your data, and it makes things easier when you are dealing with data frames full of sentences (string data). I also have a lot of fun doing sentiment analysis and similar tasks. NLP is fun and a very broad field to study.

Thanks for reading my first blog, I hope you had fun!
