Natural Language Processing vs. Natural Language Generation
By Nate Nichols, Product Architect at Narrative Science
The field of Artificial Intelligence (AI) is equal parts exciting and bewildering right now. Major advances are being made in a variety of areas, but following along is difficult because there are so many technical terms and acronyms. And don’t even get me started on how many of the terms are similar. For instance, there’s Deep Blue, Deep Learning, Deep Forest, Deep Voice, and DeepStack. Anyone would be lost.
Given the nature of our business, we often encounter confusion between Natural Language Processing (NLP), Natural Language Generation (NLG), and Natural Language Understanding (NLU).
Let’s Start with NLP and NLG
Setting aside NLU for the moment, we can draw a really simple distinction:
- Natural Language Processing (NLP) is what happens when computers read language. NLP processes turn text into structured data.
- Natural Language Generation (NLG) is what happens when computers write language. NLG processes turn structured data into text.
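The distinction above can be sketched as two inverse functions. This is a toy illustration only; the function names and data shapes are made up for this post, and a single keyword rule stands in for a real NLP model.

```python
# Toy sketch of the NLP/NLG distinction: NLP reads text into
# structured data, NLG writes structured data back out as text.
# Names and rules here are illustrative, not any real API.

def nlp_read(text: str) -> dict:
    """NLP: turn unstructured text into structured data."""
    # A naive keyword rule stands in for a trained model.
    sentiment = "positive" if "great" in text.lower() else "neutral"
    return {"sentiment": sentiment, "length": len(text.split())}

def nlg_write(data: dict) -> str:
    """NLG: turn structured data into text."""
    return f"The review is {data['sentiment']} and {data['length']} words long."

structured = nlp_read("This product is great!")
print(structured)             # {'sentiment': 'positive', 'length': 4}
print(nlg_write(structured))  # The review is positive and 4 words long.
```

Note the symmetry: each function's output type is the other's input type, which is exactly the "reading" versus "writing" split described above.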
Until the last few years, NLP was the more dynamic research area; the focus was on getting more data into the computer (e.g. teaching a machine to “read” an email and determine whether it’s likely to be spam).
The problem has now flipped. Our computers have access to vast repositories of data, and the problem is trying to get actual value and insights back out from all that data. (This, of course, is the exact business problem that Quill, our Advanced NLG platform, helps solve.)
This distinction doesn’t mean that NLP and NLG are completely unrelated. Reading and writing are separate but related challenges for computers, just as they are for humans. For instance, we’ve had projects in the past that used NLP to generate structured data from text (e.g. assigning a topic to a tweet), and then used NLG to write text from that structured data (e.g. “You tend to tweet about politics…”). We also use a variety of NLP techniques internally to help test and tune our NLG engine.
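A round trip like the tweet example above might look like the following sketch. Everything here is hypothetical: the keyword lists, function names, and output wording are placeholders for what a real system would learn from data.

```python
# Hypothetical NLP -> NLG round trip: classify tweets by topic
# (NLP), then write a sentence about the result (NLG).
import re
from collections import Counter

# Toy keyword lists standing in for a trained topic classifier.
TOPIC_KEYWORDS = {
    "politics": {"election", "senate", "vote"},
    "sports": {"game", "score", "team"},
}

def classify(tweet: str) -> str:
    """NLP step: text in, structured label out."""
    words = set(re.findall(r"\w+", tweet.lower()))
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return "other"

def summarize(tweets: list) -> str:
    """NLG step: structured counts in, text out."""
    topics = Counter(classify(t) for t in tweets)
    top, count = topics.most_common(1)[0]
    return f"You tend to tweet about {top} ({count} of {len(tweets)} tweets)."

tweets = [
    "The senate vote is tonight",
    "Great game last night",
    "Who won the election?",
]
print(summarize(tweets))  # You tend to tweet about politics (2 of 3 tweets).
```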
It’s worth mentioning here that the private sector and academia have slightly different definitions of NLP. To most folks, NLP is “Computers reading language.” But in academia, the “Processing” part of NLP is taken more seriously and NLP basically means “Computers doing things with language.” In academia, then, NLG is a subfield of NLP, not its inverse.
At Narrative Science, our view is that NLG is a separate category of its own within the AI ecosystem.
What about NLU?
As mentioned earlier, NLU stands for Natural Language Understanding, and it is a specific type of NLP. The “reading” aspect of NLP is broad and encompasses a variety of applications, including things like:
- Simple profanity filters (e.g. does this forum post contain any profanity?)
- Sentiment detection (e.g. is this a positive or negative review?)
- Topic classification (e.g. what is this tweet or email about?)
- Entity detection (e.g. what locations are referenced in this text message?)
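Two of the simpler tasks above can be sketched in a few lines. The word lists below are toy stand-ins for the real lexicons and models such systems use, and the function names are made up for illustration.

```python
# Minimal sketches of two "reading" tasks from the list above.
# Word lists here are toy placeholders, not real lexicons.
import re

BLOCKLIST = {"darn", "heck"}  # placeholder "profanity" list

def contains_profanity(post: str) -> bool:
    """Simple profanity filter: does the post contain a blocked word?"""
    return any(word in BLOCKLIST for word in re.findall(r"\w+", post.lower()))

KNOWN_LOCATIONS = {"chicago", "paris", "tokyo"}  # toy gazetteer

def find_locations(message: str) -> list:
    """Entity detection: which known locations appear in the text?"""
    return [w.title() for w in re.findall(r"\w+", message.lower())
            if w in KNOWN_LOCATIONS]

print(contains_profanity("Well, darn it"))                    # True
print(find_locations("Flying from Chicago to Paris tomorrow"))  # ['Chicago', 'Paris']
```

Real systems replace these lookup tables with statistical models, but the input/output shape is the same: text in, structured data out.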
A more advanced application of NLP is NLU, i.e. genuinely understanding what the text says. NLU is used by conversational agents such as Alexa, Siri, and Google Assistant.
Each of these agents can digest a spoken request like “What’s the weather forecast tomorrow?” and understand it as a request for the forecasted weather at the current location one day hence. (Of course, if you’ve spent much time with these types of bots, you’ll understand that there is still a significant amount of progress to be made in Natural Language Understanding.)
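One common way to think about that understanding step is as mapping an utterance to an intent plus slots. The sketch below is purely illustrative: real assistants use trained models, and the intent and slot names here are invented for this example.

```python
# Toy NLU sketch: map a spoken request to an intent plus slots.
# The keyword rules and field names are hypothetical placeholders.
import datetime

def understand(utterance: str) -> dict:
    u = utterance.lower()
    intent = "get_weather" if "weather" in u else "unknown"
    # "tomorrow" means one day after today; otherwise assume today.
    offset = 1 if "tomorrow" in u else 0
    return {
        "intent": intent,
        "location": "current",  # assumed default when no place is spoken
        "date": (datetime.date.today()
                 + datetime.timedelta(days=offset)).isoformat(),
    }

print(understand("What’s the weather forecast tomorrow?"))
```

The structured result (intent, location, date) is what downstream code acts on; the hard part, which this keyword rule glosses over, is getting from free-form speech to that structure reliably.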
Like so many things in technology, NLP, NLG, and NLU are pretty straightforward concepts dressed up in jargon and acronyms that make them seem more complex than they really are. To reiterate:
- NLP is computers reading language
- NLG is computers writing language
- NLU is computers understanding language
I hope this helps clarify the differences between NLP, NLG, and NLU! Our goal is to educate AI newcomers on the terms as we believe that widespread adoption is best enabled by widespread understanding.