Entering 2024 with thoughts regarding Large Language Models & their limitations

SRC Innovations · 7 min read · Jan 15, 2024

We’ve had people suggest several things about Large Language Models (LLMs) that we felt were worth touching on as we begin a new year, especially comments like these:

“LLMs are the way forward!”

and

“Who cares anymore about the other text based ML architectures?!”

“LLMs are actually really complex”

You might have heard similar statements too, implying that LLMs are all that businesses should now care about.

Proponents of statements like these tend to give examples such as:

  • they can write code better than some of my junior developers!
  • they can summarise big complex documents SO easily!
  • they have replaced my call centre online chat team and none of my customers have realised!

Whilst LLMs are without a doubt pretty damned cool, and some of the above reasons ARE valid (at least one also feels mildly unethical), there are multiple reasons why they aren’t the be-all and end-all of ML systems.

LLMs are big & complex

Whilst it is hard to find verified data about how much has gone into OpenAI’s ChatGPT models, the general estimates and sourced figures tend to be:

  • 10–45 TB of data
  • 140+ billion parameters
  • Enough parallel processing for training to be equivalent to 355 years of compute time
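To get a feel for where figures like “355 years of compute time” come from, you can sanity-check them with the commonly cited rule of thumb that training a transformer takes roughly 6 × parameters × tokens floating-point operations. The token count and GPU throughput below are illustrative assumptions, not verified OpenAI figures:

```python
# Rough training-compute estimate using the common ~6 * N * D FLOPs rule of thumb.
# All inputs are illustrative assumptions, not verified OpenAI figures.

params = 140e9     # ~140 billion parameters (the estimate quoted above)
tokens = 300e9     # assumed training tokens (GPT-3-scale corpora are often cited at ~300B)
flops_needed = 6 * params * tokens

gpu_flops = 30e12  # assumed sustained throughput of one training GPU (30 TFLOP/s)
seconds = flops_needed / gpu_flops
years = seconds / (3600 * 24 * 365)

print(f"~{years:,.0f} single-GPU years of training compute")
```

With these assumed inputs the estimate lands in the hundreds of GPU-years, the same ballpark as the figure above, which is exactly why only deep-pocketed companies can train such models from scratch.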

This has a whole bunch of implications. It immediately becomes something that only the bigger companies with deep, deep pockets can easily train for their own custom purposes. You COULD take one of the existing LLMs and — with its owner’s permission — additionally train it on your domain specific knowledge, but that’s still not going to be your LLM.

Google’s parent company has also gone on record saying that using an LLM to handle its current search volume could increase costs tenfold. Given they earn $60b a year and are still worried about the increased expenditure, you can see why it’s something only the biggest enterprises can afford.

Those 140+ billion parameters also require a lot more memory, and that training data requires a lot more storage and incurs more network costs.

As the complexity of ML models increases, so do the costs. Quite simply, not everybody can afford to build their own, and “renting” one isn’t always appropriate, nor realistic.

LLMs aren’t infallible

The whole hallucination issue has had a lot of coverage already, so I’m not going to go into detail about it, but the fundamental point is that you can’t trust ALL of the information you get from an LLM. It’s therefore still not a 100% viable replacement for some things. And if you’re spending that much money on building your own, you’d probably want something a lot more reliable. Or if you’re using another company’s LLM, you probably already had some reservations and have carefully examined (or plan to examine) its limitations, just like you’d do for any SaaS. Right?

Those copyright issues…

As mentioned above, OpenAI’s GPT models were trained on an estimated 45 TB of data that OpenAI got from various places. Several major content generation companies, like the BBC and the New York Times, have alleged that part of that data came from their copyrighted content, and are pushing for an overhaul of the legal landscape around companies like OpenAI using such content to generate revenue.

This makes the use of OpenAI’s models in a real production environment an item of concern… If the LLM that you are relying on suddenly gets shuttered, suspended, or crippled due to legal issues, what is the impact on you?

There are several companies attempting to create LLMs with a clearer chain of ownership for their training data, but they have yet to succeed to the level that OpenAI’s GPT models have. It’s also arguable whether they can reach the same level of capability when they have less data to train on.

That said, OpenAI and other LLM owners are already working on ways to better separate copyrighted content from their training data, so this is definitely a space to watch.

The Alternatives

There are alternatives to LLMs that avoid some of the above problems. One of the ones that we’re using within SRC Innovations (for various reasons) is Recurrent Neural Networks (RNNs) built on Long Short-Term Memory (LSTM) cells. These are well known for generating short strings of text, and have also been used in recent years to build chatbots, albeit trained on much more limited data sets and nowhere near as chatty. They do, however, become well versed in their domain.
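To illustrate how lightweight an LSTM is compared to an LLM, here is a minimal single-cell forward pass in plain NumPy. The sizes are arbitrary toy values, the weights are random, and a real model would stack trained cells, so treat this as a sketch of the mechanism rather than a working text generator:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM step: gates decide what to forget, what to write, what to expose."""
    z = W @ x + U @ h_prev + b      # compute all four gates in one matrix multiply
    H = h_prev.shape[0]
    f = sigmoid(z[0:H])             # forget gate
    i = sigmoid(z[H:2*H])           # input gate
    o = sigmoid(z[2*H:3*H])         # output gate
    g = np.tanh(z[3*H:4*H])         # candidate cell update
    c = f * c_prev + i * g          # new cell state (the "long-term" memory)
    h = o * np.tanh(c)              # new hidden state (the "short-term" output)
    return h, c

rng = np.random.default_rng(0)
H, X = 8, 4                         # tiny, arbitrary hidden and input sizes
W = rng.standard_normal((4 * H, X)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((5, X)):   # run five time steps of dummy input
    h, c = lstm_cell(x, h, c, W, U, b)
print(h.shape)  # (8,)
```

The entire state here is a handful of small matrices, which is why these models can be trained in-house on modest hardware, in stark contrast to the 140+ billion parameters discussed earlier.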

We like these because we can train them much faster and more effectively, on data whose provenance we can be very clear about, and sometimes even on data that we can generate ourselves. That’s much more feasible for a medium-sized company like us!

I’ve provided below a table showing some of the more common text-related deep learning models, along with order-of-magnitude estimates of the amount of data required for each of the listed models.

And don’t forget… Every increase in data, leads to a corresponding increase in training time!

Or even better: Combine them!

There’s also no reason why LLMs couldn’t be combined with other ML systems, or even mixed with algorithmic systems.

For example, you could have a text-based CNN trained on your internal document corpus that categorises documents by information architecture and classifies them according to your security groups. An LLM could then present the results to a user, and accept further input from the user to finalise the discussion!
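That pattern can be sketched as a small pipeline. Everything below is a hypothetical placeholder: `classify_document` stands in for a trained text CNN, and `ask_llm` stands in for a call to whichever LLM API you actually use:

```python
# Sketch of the "classifier in front, LLM behind" pattern.
# Both functions below are hypothetical stand-ins, not a real CNN or LLM API.

def classify_document(text: str) -> dict:
    """Placeholder for a CNN classifier: returns a category and a security group."""
    category = "finance" if "invoice" in text.lower() else "general"
    group = "finance-team" if category == "finance" else "all-staff"
    return {"category": category, "security_group": group}

def ask_llm(prompt: str) -> str:
    """Placeholder for an LLM call; a real system would hit an API here."""
    return f"[LLM response to: {prompt[:40]}...]"

def answer(user: dict, query: str, documents: list[str]) -> str:
    # Only hand the LLM documents that the user's security groups allow them to see.
    allowed = [d for d in documents
               if classify_document(d)["security_group"] in user["groups"]]
    prompt = f"Answer '{query}' using these documents: {allowed}"
    return ask_llm(prompt)

docs = ["Invoice #123 for consulting services", "Office coffee machine manual"]
print(answer({"groups": ["all-staff"]}, "how do I make coffee?", docs))
```

The key design point is that the cheap, locally trained classifier enforces security boundaries *before* anything reaches the LLM, so the expensive model only ever sees what the user is permitted to see.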

Or use an RNN to process a series of time-based metrics regarding visitor traffic to your systems, and then an LLM to provide external contextual information regarding events that may be causing the trends that are seen in the visitor traffic.
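The second pattern can be sketched the same way. Here a simple moving-average check stands in for the RNN (a real system would use a trained sequence model), and `ask_llm` is again a hypothetical placeholder for your LLM call:

```python
# Sketch: a lightweight model flags a traffic anomaly, then an LLM is asked
# to explain it. The moving-average check is a stand-in for a trained RNN.

def find_spikes(hourly_visits: list[int], window: int = 3, factor: float = 2.0) -> list[int]:
    """Return indices where traffic exceeds `factor` times the trailing average."""
    spikes = []
    for i in range(window, len(hourly_visits)):
        trailing_avg = sum(hourly_visits[i - window:i]) / window
        if hourly_visits[i] > factor * trailing_avg:
            spikes.append(i)
    return spikes

def ask_llm(prompt: str) -> str:
    """Placeholder for an LLM call; a real system would hit an API here."""
    return f"[LLM response to: {prompt[:50]}...]"

visits = [100, 110, 95, 105, 400, 120, 115]
spikes = find_spikes(visits)
if spikes:
    prompt = (f"Visitor traffic spiked at hours {spikes}. "
              "What external events around those times might explain it?")
    print(ask_llm(prompt))
```

Again the division of labour is the point: the cheap model does the continuous number-crunching, and the LLM is only invoked when there is something genuinely worth explaining in context.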

Regarding how AI might replace human jobs fully…

I’m going to go on the record now and say that this isn’t going to happen in 2024. [Editor’s note: There’s some dissent within SRC Innovations about whether this COULD occur in 2024, but this is JT’s blog post, so I guess I could let him make this statement… 😉] It might happen many, many years in the future, but it won’t be in 2024.

What will happen is that people might get replaced by other people who are better at using AI to help them perform their job.

Think of AI as another tool in your toolkit. If an experienced plumber is looking to hire another plumber, they’re gonna pick the one who is more handy with a wrench. Same story. In 2024 — and the years to come — businesses & people are gonna accomplish more if they properly utilise the tool that AI is, and competitors that don’t, are gonna fall by the wayside.

Sidebar: An example of how LLMs are not yet ready to replace people

Here’s an article where products on Amazon have turned up with names that have CLEARLY been generated by ChatGPT. This is an example of AI being used poorly.

https://arstechnica.com/ai/2024/01/lazy-use-of-ai-leads-to-amazon-products-called-i-cannot-fulfill-that-request/

As I’d mentioned… People can’t be replaced by AI in 2024, but if these same people had used their AI properly, they’d be able to get more done, faster.

In Conclusion

Learn to use AI as a tool, and pick the one that is appropriate for what you’re trying to accomplish.

If you would like SRC Innovations to discuss what’s appropriate for your AI needs, please reach out via our website: https://www.srcinnovations.com.au. We’re very ready to help. 😀

We’re also contemplating a flowchart about how an organisation should approach the use of AI. Please let us know if you’d be interested in that as a follow-up post. 😀

A note about the Images

Yes, those images are created by DALL-E, because I have 0 talent for art, and I didn’t want to bug my actual artistic crew (who are busy on something else) for a whole bunch of images. I did also put each of the generated images through Google Images to see if it looked like anything else that already existed. There are definite similarities in style, but nothing that looked exactly the same.

I hope you enjoyed my use of an AI tool to add some colour and whimsy to my post!

More importantly, I hope you found the overall post enlightening and helpful for your AI intents for 2024. Feel free to reach out if you wanted an opinion, guidance or even just with comments!

Happy 2024!

Originally published at https://blog.srcinnovations.com.au on January 15, 2024.

SRC Innovations is an IT Consultancy based in Melbourne.