Published in Computronium Blog

Text Segmentation

Normalization, Tokenization, Sentence Segmentation + Useful Methods

Photo by Sergey Zolkin on Unsplash

What does normalizing a text do?

Previously, we called the .lower() method to turn all of the words lowercase, so that strings like “the” and “The” both become “the” and we don’t count them as two different words.
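As a minimal sketch of the idea above, here is what lowercasing does to word counts. The helper name count_words and the sample sentence are made up for illustration, and splitting on whitespace is a simplifying assumption standing in for a real tokenizer.

```python
from collections import Counter


def count_words(text, normalize=True):
    """Count whitespace-separated words, optionally lowercasing them first.

    Hypothetical helper for illustration; str.split() is a stand-in
    for a proper tokenizer.
    """
    words = text.split()
    if normalize:
        # Normalization step: "The" and "the" collapse into one entry.
        words = [w.lower() for w in words]
    return Counter(words)


sample = "The cat saw the dog and The dog saw the cat"

raw_counts = count_words(sample, normalize=False)
normalized_counts = count_words(sample)

# Without normalization the two spellings are counted separately.
print(raw_counts["The"], raw_counts["the"])
# With normalization they are merged into a single count.
print(normalized_counts["the"])
```

Without the lowercasing step, “The” and “the” end up as separate keys in the counter; with it, they share a single entry.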

What if we wanna do even more?

Jake Batsuuri

I write about software && math. Occasionally I design && code. Find my stuff batsuuri.ca
