How AI Models Impact NLP: the Past and the Future

A decade ago, no one thought Natural Language Processing would come to be built on pre-trained networks: networks that process text nondeterministically yet produce meaningful sentences.

Published in The AI Archives · Feb 28, 2024 · 5 min read

by Ozgur (Oscar) Ozkan, Technical Founder @ Keymate.AI

Natural Language Processing and AI seemed like two distinct fields just a decade ago. Today, things have changed and keep changing with the rise of models like BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer). Recognizing patterns and relationships in unstructured text is now an AI capability, and NLP has become part of this progress.

Approximately 80% of accessible data today is in raw form, often unstructured and originating from various sources such as social media, business transactions, and more. NLP techniques are used to exploit this big data to discern patterns and extract information, which has applications in several commercial sectors. This is a result of the change we have witnessed over the past decade.

How the NLP Scene Changed in the Past 10 Years

The term “NLP” has expanded and transformed in the last 10 years. There are now two senses of NLP: the first is what computer scientists understand by the term; the second is what the public thinks NLP is. The popularity of LLM-based AI chat and voice has surpassed that of any other field of computer science.

In 2011, only a few people were working on NLP, and because it was fully text-based, it wasn’t considered exciting either. No one imagined AI and NLP would make a great couple. AI, back then, was famous for playing Go, chess, or Dota. NLP was programmatic, deterministic, and structured: even though text itself was unstructured, NLP tasks treated problems as if they first had to be made structured so that we could process them. If you had wanted to apply these AI methods to NLP back then, most people would have laughed at you, and your professor would probably have told you to work on image detection or image generation instead.

Prominence of Transformer Models in 2023

Transformer models like BERT and GPT have become increasingly successful in NLP. These models are adept at capturing contextual relationships in text and are suitable for various tasks such as machine translation, text generation, language understanding, sentiment analysis, and question answering. LSTM (Long Short-Term Memory) networks also remain significant in NLP for capturing long-term dependencies in sequential data.
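
To make this concrete, here is a minimal sketch of running two of these tasks with pre-trained transformers, assuming the Hugging Face transformers library is installed; the default models the pipelines download and the printed outputs are illustrative assumptions:

```python
# A minimal sketch: two of the tasks above, handled by pre-trained
# transformers via the Hugging Face pipeline API.
from transformers import pipeline

# Sentiment analysis with the library's default pre-trained model.
sentiment = pipeline("sentiment-analysis")
print(sentiment("Transformer models made NLP exciting again."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]

# Question answering over a short context.
qa = pipeline("question-answering")
print(qa(
    question="What do transformer models capture?",
    context="Transformer models capture contextual relationships in text.",
))
# e.g. {'answer': 'contextual relationships', ...}
```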

This progress has also redirected other approaches in NLP. Several questions have disrupted the research scene, both in business and in academia:

There is much academic research aiming to answer the question “Is ChatGPT a General-Purpose Natural Language Processing Task Solver?” If the answer is “Yes”, and if GPT-5 is on the horizon, this triggers another valid question: “Is it really worth inventing a better NLP task handler than GPT-4 right now?”

As a response, we are seeing Prompt Engineering on the rise, and prompt-based research seems to be here to stay. Results are heavily reliant on prompts, which is creating another paradigm where many people think that “a good prompt is all you need”.
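
To illustrate the paradigm, here is a minimal sketch of solving a classic NLP task (sentiment classification) purely through prompt wording, assuming the official openai Python SDK and an API key in the environment; the model name is illustrative:

```python
# A minimal sketch of prompt-based NLP: the task is defined entirely by
# the prompt, not by any task-specific training.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Classify the sentiment of this sentence as POSITIVE or NEGATIVE.\n"
    "Sentence: 'The new release fixed every bug I reported.'\n"
    "Answer with one word."
)

response = client.chat.completions.create(
    model="gpt-4",  # illustrative; any capable chat model would do
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # e.g. "POSITIVE"
```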

But when it comes to proprietary models, Prompt Engineering can be overkill for the easier NLP tasks.

NLP and AI in 2024

There are three changing paradigms for NLP in 2024:

Paradigm 1: Pre-Train then Fine-Tune

Paradigm 2: Prompt-based Learning

Paradigm 3: NLP as Text Generation

These paradigms will impact NLP in various ways.

NLP tasks are usually solved quite well by a pre-trained very large language model, but such general-purpose models often underperform against models fine-tuned for a specific task.
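
As a sketch of the first paradigm, fine-tuning a pre-trained model on one specific task might look like this, assuming the Hugging Face transformers and datasets libraries; the model and dataset names are illustrative:

```python
# A minimal sketch of "pre-train then fine-tune": start from pre-trained
# BERT weights, then train briefly on a task-specific dataset.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pre-trained weights, new task head

dataset = load_dataset("imdb")  # illustrative sentiment dataset

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(1000)),
)
trainer.train()  # the fine-tuned model now specializes in this one task
```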

Some direct NLP tasks, such as summarization or named entity recognition, are still topics of new research, because prompting a proprietary Large Language Model is too expensive for these repetitive tasks (see the sketch below). Businesses are also keen to use open-source solutions. The best-performing Large Language Model belonging to a single company has created other issues as well.
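
For example, a repetitive task like named entity recognition can run locally on an open-source model at no per-call cost. A minimal sketch, again assuming the Hugging Face transformers library (the pipeline downloads a default NER model; the example sentences are illustrative):

```python
# A minimal sketch: named entity recognition with an open-source model
# running locally, instead of per-call payments to a proprietary LLM.
from transformers import pipeline

ner = pipeline("ner", aggregation_strategy="simple")

for doc in [
    "BERT was released by Google in 2018.",
    "Keymate.AI was founded by Ozgur Ozkan.",
]:
    print(ner(doc))
# e.g. [{'entity_group': 'ORG', 'word': 'Google', ...}, ...]
# Each call costs only local compute, which matters at high volume.
```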

Challenges and the Future

One of the important challenges is the ability to evaluate a Large Language Model when it performs exceptionally better than the others. How will we evaluate the best model on certain tasks, and how can we automate evaluations of the best models? I personally think that having another big company with an equivalent, or at least comparable, LLM is the main factor in helping researchers cross-validate results and perform better research on prompting.
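
A minimal sketch of what such cross-validation could look like; the ask_model_a and ask_model_b functions are hypothetical wrappers around two comparable providers’ APIs, stubbed here so the sketch runs:

```python
# A minimal sketch of cross-validating two comparable LLMs on the same
# small task suite. The ask_model_* functions are hypothetical stubs
# standing in for real API wrappers.
def ask_model_a(prompt: str) -> str:
    return "hello" if "bonjour" in prompt else "4"  # stub

def ask_model_b(prompt: str) -> str:
    return "hello" if "bonjour" in prompt else "5"  # stub

test_cases = [
    ("Translate 'bonjour' to English.", "hello"),
    ("What is 2 + 2? Answer with one number.", "4"),
]

def accuracy(ask, cases):
    """Fraction of cases whose expected answer appears in the model output."""
    hits = sum(expected.lower() in ask(prompt).lower()
               for prompt, expected in cases)
    return hits / len(cases)

print("model A:", accuracy(ask_model_a, test_cases))
print("model B:", accuracy(ask_model_b, test_cases))
# Disagreements between two comparable models flag prompts worth a manual look.
```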

Another challenge: people love control, but we don’t have direct control over outputs, and policing outputs at scale is harder as well. Bias in generated text and guardrailing AI to act within defined roles are new problems to solve in 2024. Fact-checking and hallucinations remain real problems.

Businesses and decision makers are not ready for unstructured processing. Some companies still take the approach that they need to make the data more structured, but in fact you can’t make something structured if you don’t completely know its underlying structure. Interpretability of LLM responses will be another hot topic in the near future. I think solving these challenges will help AI align better with human and business needs.

The fact that pre-trained models are beating non-pre-trained models will possibly lead to a trend where GitHub (the forefront of open source) transforms Microsoft and other big tech, and Hugging Face (the forefront of open source for unstructured processing) transforms OpenAI and Microsoft in the long term. They will accept the victory of the small pre-trained LLM because of its better alignment with human or business needs compared to a larger proprietary LLM.

There is a famous saying that “software is eating the world”. Now, AI is eating the software. NLP is compute, and unstructured compute will eat structured software. All the VC money that was poured into startups building programmatic, structured software is now at risk. No one thought that would be the case: tech has eaten itself. Now all pre-GPT-4 founders are looking for quick ways to mitigate these risks.

In a year when we expect further developments in AI models’ capacity to work with unstructured data, and rising interest in AI solutions in both business and academia, many of the questions above will be answered. We will likely see more diverse models to compare, and a shift towards accepting unstructured data as the main playing field.
