Natural Language Processing in iOS

Emannuel Carvalho
6 min read · Jun 5, 2020


Photo by Raphael Schaller on Unsplash

Some time ago I presented a talk at CocoaHeads SP on how to use NLP in an iOS app. A lot has changed since then, so I thought it would be nice to post something about it.

Natural Language Processing

The idea of processing human language with computer programs has been around for a while. The tools, methods and approaches change rapidly, and there is a myriad of algorithms and techniques: some have been in use for decades, while others were created just a few years ago.

Some of the common tasks in NLP are tokenization, lemmatization, part-of-speech tagging, word embeddings and text classification. There are, obviously, many other tasks in NLP, but it would be impossible to cover all of them in one post, so I decided to focus on these.

For each of the tasks I will give a brief explanation, maybe with some use cases for an app, and then I will show how to implement it in iOS.

Tokenization

A text is usually represented in a program as a string. Tokenization handles the question (which might look trivial from a simplistic perspective) of how to split that string into units (paragraphs, sentences, words, etc.).

At first you might be tempted to split the string at every period for sentences and at every blank space for words, for instance. That could work for a few very short texts, but if you're handling a larger text, chances are that approach won't suffice. When tokenizing a sentence like "Mr. Hegarty lives in New York.", you'd want "Mr. Hegarty" to stay together in the same sentence, but a naive period-based split would break it apart.
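To see the problem concretely, here's what a naive period-based split does to that kind of sentence (a quick illustration, not production code):

```swift
let text = "Mr. Hegarty lives in New York. He likes it there."
let naiveSentences = text.components(separatedBy: ". ")
// naiveSentences == ["Mr", "Hegarty lives in New York", "He likes it there."]
// The abbreviation "Mr." was wrongly treated as the end of a sentence.
```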

Tokenization is the technique used to decompose a text into units (“tokens”) that can be used later in the processing.

In order to tokenize a text in iOS, you'll need to instantiate an NLTokenizer and call the enumerateTokens method.

Example function retrieving tokens from a passed string
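A minimal sketch of such a function might look like the following (the function name tokens(in:) is my own choice; word-level tokenization is assumed):

```swift
import NaturalLanguage

// Splits a text into word-level tokens using NLTokenizer.
func tokens(in text: String) -> [String] {
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text
    var result: [String] = []
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        result.append(String(text[range]))
        return true // keep enumerating
    }
    return result
}
```

You can pass .sentence or .paragraph as the unit instead of .word to tokenize at a different granularity.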

Lemmatization

As a former Linguistics student, I'm tempted to spend longer than I should discussing what lemmatization really is, but for the sake of simplicity I'll be brief for now, at the risk of disappointing my linguist friends.

The idea behind lemmatizing a word is turning both loved and loving into the same lemma: love, or turning is and were into be. This can be useful in a number of situations. Say you want your user to be able to search through a database of photo descriptions, and they want to find pictures with rain, for instance. They might type "raining" into the text field, but I bet they would like to get results like "… it rained all day…" or "… the rain didn't stop for a minute…", even though neither of those sentences contains the exact word "raining".

In order to get a word's lemma in iOS, you'll need to use the NLTagger class. You should initialize an NLTagger object with .lemma as one of its schemes, set its string property to the text you want to lemmatize and call enumerateTags.

Example function that retrieves the lemmas for a passed string
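A sketch of such a function, following the steps described above (the fallback to the surface form when no lemma is found is my own addition):

```swift
import NaturalLanguage

// Returns the lemma for each word in the text, falling back to the
// original word when the tagger can't produce a lemma.
func lemmas(in text: String) -> [String] {
    let tagger = NLTagger(tagSchemes: [.lemma])
    tagger.string = text
    var result: [String] = []
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lemma,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, range in
        result.append(tag?.rawValue ?? String(text[range]))
        return true
    }
    return result
}
```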

Part-of-speech tagging

A part-of-speech is the syntactic class to which a word belongs. A word can be a verb, a noun, a preposition and so on.

Determining the POS of a word in a text is far from trivial. In languages like English, where almost any word can get "verbalized", things can get really complicated.

In order to get the POS tags for the tokens in a text, we can use a very similar approach to that of lemmatization. The only difference is that instead of using the .lemma scheme, we should use .lexicalClass.

Example function that retrieves the lexical classes (or POS tags) for a given string
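A sketch of what that function might look like, returning each word paired with its lexical class (the tuple-based return type is my own choice):

```swift
import NaturalLanguage

// Returns (word, lexical class) pairs, e.g. ("lives", "Verb").
func posTags(in text: String) -> [(String, String)] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    var result: [(String, String)] = []
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, range in
        if let tag = tag {
            result.append((String(text[range]), tag.rawValue))
        }
        return true
    }
    return result
}
```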

Word embeddings

Representing words has always been a challenge. With the advancement of GPUs about a decade ago, and the consequent revival of neural networks, it became necessary to represent words numerically, and it would be even nicer if the representation could somehow encode the similarities and differences between words.

Word embeddings provide just that. They are a representation of words as n-dimensional vectors (where n usually goes from 50 to 300). That representation is suitable as an input to Machine Learning models, like neural nets. It also has some interesting consequences. For example, we can calculate the Euclidean distance between words (everybody remembers Pythagoras' Theorem, right?), and that distance is related to the semantic similarity between words. So you can imagine the vector for the word "dog" being closer to the vector for the word "cat" than to that of the word "space".

A vector representation can be useful for an iOS app in a number of ways. I would like to mention two use cases for word embeddings in an app: firstly, as an input for a Core ML model; and, secondly, in order to make some features somewhat "smarter".

In order to use a text as an input for a Core ML model, you'd have to encode it as a matrix where each row is the vector for a word in the text. The steps are basically: (1) tokenize your text string, as I mentioned above; (2) get the vector for each token; (3) pass the list of vectors as an input to your Core ML model.
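Steps (1) and (2) might be sketched like this (how the resulting matrix is actually fed to the model depends on the specific Core ML model, so step (3) is omitted):

```swift
import NaturalLanguage

// Sketch: tokenize a text and stack the word vectors into a matrix.
func vectorize(_ text: String) -> [[Double]] {
    guard let embedding = NLEmbedding.wordEmbedding(for: .english) else { return [] }
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text
    var matrix: [[Double]] = []
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        // Words missing from the vocabulary are simply skipped here;
        // a real pipeline might substitute a zero or "unknown" vector.
        if let vector = embedding.vector(for: String(text[range]).lowercased()) {
            matrix.append(vector)
        }
        return true
    }
    return matrix
}
```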

When Apple announced Core ML at the WWDC 2017, I remember one of my unanswered questions in the lab was how to easily use word embeddings in order to preprocess the text for a model. Back then, if you wanted to use word-embeddings, you’d have to do it “manually”, loading the vectors from disk at run time.

A lot has changed since then, and now getting a vector representation for a word is easier than adding a gradient background to a button!

All you need to do is instantiate an NLEmbedding using the wordEmbedding(for:) factory method, then call vector(for:) passing the word you need as a string.

Getting the word vector
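A minimal sketch, using the built-in English word embedding:

```swift
import NaturalLanguage

// wordEmbedding(for:) returns nil if no embedding is available
// for the given language on this OS version.
if let embedding = NLEmbedding.wordEmbedding(for: .english) {
    // vector(for:) returns [Double]? — nil if the word isn't in the vocabulary.
    if let vector = embedding.vector(for: "dog") {
        print(vector.count) // the embedding's dimensionality
    }
}
```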

Another interesting use case for word embeddings is making your app "smarter". Take the same example of an app where the user can search for a picture based on its description. Say your user wants to find a picture with a house on it, so they go ahead and type "house" in the search bar. It's quite possible that they would also like to get a picture whose description mentions a "mansion" or a "building". How could you implement that?

One of the cool features of word embeddings is getting the "close neighbors" of a given word. So, if your user searches for a word, you could implement your search to bring back not only the results where that word appears, but also the results where its closest neighbors appear.

Getting the neighbors for a given word in iOS is a piece of cake:

Getting the 5 nearest neighbors of a given word
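Something along these lines, using the neighbors(for:maximumCount:) method:

```swift
import NaturalLanguage

if let embedding = NLEmbedding.wordEmbedding(for: .english) {
    // Returns up to 5 (word, distance) pairs, closest first.
    let neighbors = embedding.neighbors(for: "house", maximumCount: 5)
    for (word, distance) in neighbors {
        print("\(word): \(distance)")
    }
}
```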

Text classification

Lastly, I'd like to mention text classification. The idea is, given a text, to determine which class it belongs to (for example, news article vs. sports text, or even positive vs. negative sentiment).

One of the ways to achieve text classification is using Core ML, as I mentioned above. Depending on the model you use, you may need word embeddings as a preprocessing step. But what I'd like to cover here is one of the most common types of text classification: sentiment analysis.

The NaturalLanguage framework in iOS has a simple high level API to determine whether a text is positive or negative.

In order to classify a text, you'll use the NLTagger introduced above.

Example function that retrieves the sentiment for a given text
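A sketch of such a function, using the .sentimentScore tag scheme (the default of 0 for texts where no score is produced is my own choice):

```swift
import NaturalLanguage

// Returns a sentiment score in -1.0...1.0 (negative to positive).
func sentiment(for text: String) -> Double {
    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    tagger.string = text
    let (tag, _) = tagger.tag(at: text.startIndex,
                              unit: .paragraph,
                              scheme: .sentimentScore)
    // The score comes back as the tag's raw string value.
    return Double(tag?.rawValue ?? "0") ?? 0
}
```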

That's not all!

There are still a lot of things you can do at the intersection of iOS and NLP. Obviously I didn't aim to be exhaustive in this post, but you can look up things like language detection, named entity recognition, document analysis and many other techniques that are easy to use and can have a very positive impact on your apps and, mainly, on the lives of your users.

I hope you liked it.

If you have any comments, questions, suggestions, etc. leave a comment! I'll be glad to answer!
