Unlucky for some: 13 GenAI 🔮 predictions for 2024

Duncan Anderson
Published in Barnacle Labs · Dec 31, 2023 · 12 min read

As we usher in 2024, I'm excited to share my Generative AI predictions for the year ahead. There are 13 of them, so maybe I'll be unlucky in my prediction hit-rate… although we won't know for another 12 months. Maybe this post will discreetly disappear if I turn out to be too wrong, so make sure to read it whilst you can!

A lot transpired in 2023, and tools like ChatGPT have become integral to how many of us now work. ChatGPT now has 100 million weekly active users. Not monthly, which is the standard metric, but weekly. It reached 100m monthly active users within its first two months, the fastest-ever uptake of a consumer internet service. This rapid adoption signals huge interest in AI, and that interest fuels investment, further accelerating the pace of innovation.

Currently, we're transitioning from the "oh look, this is interesting" phase to the "let's make this earn its keep" stage. I run a GenAI consultancy and every conversation I have with clients focusses on how to extract value, so it's no surprise that I expect to see Generative AI in 2024 doing more serious work. No longer just a novelty, GenAI is now playing in the big leagues.

Without any further ado, here are my predictions…

šŸ† OpenAI retains its leadership position

I expect OpenAI to continue to lead in terms of model performance. GPT-4 is the current king, but OpenAI isn't standing still. Google's Gemini Ultra may snap at GPT-4's heels, but I expect GPT-4.5 to emerge and put clear blue water between them.

It's common for model providers to quote benchmarks suggesting their model is approaching OpenAI capabilities. Journalists then pick up these benchmarks and hint at "GPT-4 killer" abilities.

Except that's not the real world.

Benchmarks routinely lie, and providers selectively choose which ones to quote.

In the real world, GPT-4 is unrivalled in terms of its ability to reason and adhere to prompt instructions — all other providers still have a long way to go. I don't see OpenAI's leadership changing in 2024.

And that's despite the boardroom chaos that erupted this year, which certainly didn't help OpenAI's cause.

💰 The rise of paid-for and synthetic training data

I expect deals by model providers to buy access to proprietary training data to proliferate. The NY Times is suing OpenAI for using its data without paying. But before AI detractors celebrate in the streets, consider this: it's likely the NY Times just wants to monetise its data. It doesn't want OpenAI to stop, it wants to get paid.

It's likely the lawsuit will be settled, because that's in the interests of both the NY Times and OpenAI. OpenAI have already struck a deal with the Associated Press, so there's clear precedent.

Other providers with deep pockets will also sign content deals. It's rumoured that Apple is out offering deals to buy access to good training data for its Apple GPT initiative. Targets include Condé Nast, publisher of Vogue and the New Yorker; NBC News; and IAC. Nor should we ignore that such a deal between the NY Times and Apple might have an impact on the NY Times / OpenAI case.

If model providers end up paying content providers, that leaves some big questions. Like where does this leave open source providers without those deep pockets? Maybe this is how rich organisations buy their moat.

There's also a strong likelihood of a dispute arising around open source models using synthetic training data generated by commercial models like OpenAI's GPT-4. Sites like ShareGPT are collecting GPT conversations and this has become a rich source of training data for a rash of open source models in 2023.

But using the output of OpenAI models to train a competitive model is against OpenAI's terms:

“What You Cannot Do. You may not use our Services for any illegal, harmful, or abusive activity. For example, you may not: Use Output to develop models that compete with OpenAI.”

If OpenAI is forced to pay for access to training data, then expect them to come down hard on competing models trained on their output.

🫨 Open Source innovation continues to surprise

2023 saw some really fascinating developments within the open source GenAI community.

Techniques including LoRA and PEFT have been responsible for a huge amount of excitement because they make it possible to fine-tune and perform inferencing on a consumer laptop. That's fun, but it's also important because it reduces energy demands and therefore CO2 emissions.
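
To make that concrete, here's a minimal sketch of what LoRA fine-tuning looks like with Hugging Face's peft library (the base model and hyperparameters are illustrative choices, not a recipe):

```python
# A minimal LoRA sketch using Hugging Face's peft library.
# Model name and hyperparameters are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()
# Only a few million of the ~7B parameters are trainable, which is
# why fine-tuning like this fits on modest hardware.
```

Because only the small adapter matrices are updated, the memory and compute footprint of training drops dramatically.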

Quite recently we've seen the Mixture of Experts pattern deployed by Mistral.ai in their 8x7B model — again, something that's delivering eyebrow-raisingly good results on very modest hardware.
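
The core idea is easy to sketch: a small router network picks the top-k experts for each token, so only a fraction of the model's parameters are active on any given step. Below is a toy illustration of the concept, not Mistral's actual implementation:

```python
# Toy Mixture of Experts: a router selects the top-k experts per token,
# so only a fraction of the parameters run for any given input.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):                            # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)     # routing probabilities
        top_w, top_i = weights.topk(self.k, dim=-1)  # keep k experts per token
        out = torch.zeros_like(x)
        for t in range(x.shape[0]):
            for w, i in zip(top_w[t], top_i[t]):
                out[t] += w * self.experts[int(i)](x[t])
        return out

x = torch.randn(4, 64)      # 4 tokens, 64-dim embeddings
print(ToyMoE()(x).shape)    # torch.Size([4, 64])
```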

We are in a climate emergency, so anything that reduces the resource requirements of GenAI, and therefore its CO2 emissions, is A GOOD THING. Arguably, open source has delivered the most important GenAI innovations this year, and I believe the trend will continue in 2024. The need to make the technology work on modest hardware is a strong incentive to explore innovative approaches, an incentive that large and well-funded organisations don't have. As such, open source will play a critical role in GenAI during 2024.

šŸ§‘ā€šŸ’» AI Architecture gets serious

In 2023 everyone was talking about which model to use.

In 2024 it's going to be all about AI architectures — good apps don't just send input to a model and display the output. Instead, they use a variety of different models and techniques, stitching together something much more sophisticated. Nobody knows for sure (because OpenAI don't disclose this detail), but most of us suspect that GPT-4 isn't a single model, but rather a marketing name for a set of capabilities that surround a core model. The architecture is important and will only become more so. Fine-tuning, embeddings, RAG architectures, multiple models and more can all blend to produce complex systems.
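
As a flavour of what that stitching looks like, here's a minimal sketch of the RAG pattern using OpenAI's Python SDK: embed some documents, retrieve the one closest to the question, and ground the model's answer in it. The documents and model choices are illustrative.

```python
# Minimal RAG sketch: embed documents, retrieve the closest match,
# and include it in the prompt. Documents are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()
docs = ["Our refund window is 30 days.", "Support hours are 9-5 GMT."]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def answer(question):
    q = embed([question])[0]
    # ada-002 vectors are unit-normalised, so dot product = cosine similarity
    best = docs[int(np.argmax(doc_vecs @ q))]
    chat = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Context: {best}\n\nQuestion: {question}"}],
    )
    return chat.choices[0].message.content

print(answer("How long do I have to return an item?"))
```

Real systems add chunking, vector databases and re-ranking on top, but the shape of the pipeline is the same.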


The people in demand are going to be those who know how to do this stitching, when to use which technology for what, and which blind alleys to avoid. Traditional systems thinking, augmented with a deep understanding of ML and AI, is going to be a key skill in 2024 and beyond.

💪 LLMs go to work

2024 is going to see some very interesting uses of AI that are highly tailored for specific industries, organisations and use cases.

2023 was all about foundation models, but 2024 will see these being put to work on real use cases.

I predict that "Expert by your side" solutions will begin to proliferate. Just in the way that I "pair program" with ChatGPT, so will lawyers get to chat with their own legal GPT to develop legal arguments, automate the mundane and check case histories.

But of course it won't just be lawyers — a myriad of professionals would value their very own AI experts to consult with. Imagine if call centre operatives had an LLM trained on their company's policies and procedures. When the human gets stuck, they can ask the AI for help. Imagine doctors checking their diagnoses and asking an AI for other possibilities. The uses are bounded only by our imagination.

I now can't imagine programming without AI, and other professions are going to feel the same effect. What this means for service quality, productivity and economic development is an interesting question. Some have predicted dramatic advances. I tend to think they're right, but it'll take more than a year to come to fruition. Nevertheless, we're on that journey. I've seen how AI can have a massive impact on the profession of programming and see no reason the same cannot happen for other professions. It's just a matter of time, and of money focused on finding the right use cases. Lots will start to change, and 2024 is, realistically, just the start.

šŸ­ Small models find a role

Until recently, small models were something of a curiosity — useful to play with because of their low resource requirements, but not capable enough for serious use.

But Meta recently released their LLaMA Guard model, which is based on a small 7B LLaMA2 model tuned for moderation purposes. All it has to do is decide whether an input is safe or not, so a small model is perfect.

This is genius — using a small model for a very narrow use case makes perfect sense. It made me realise there's quite a rich set of such use cases — moderation, anonymisation, intent routing and more. I see a strong role for using small fine-tuned models to augment a larger model. They'll surround that larger model, weeding out and routing inputs to the correct place. 2024 will see a proliferation of this technique. Not everything needs benchmark-busting abilities, so small models will begin to find their place.
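
In code, the pattern is simple: a cheap classifier screens every input, and only safe, in-scope requests reach the expensive model. A sketch, with placeholder model names:

```python
# Sketch of the "small models surround a big one" pattern.
# The classifier name is a placeholder, not a real checkpoint.
from transformers import pipeline

gate = pipeline("text-classification", model="your-org/small-moderation-model")

def call_big_model(text: str) -> str:
    # Placeholder for the expensive GPT-4-class call
    # (e.g. a chat completion, as in the RAG sketch above).
    return f"[big-model answer to: {text}]"

def handle(user_input: str) -> str:
    verdict = gate(user_input)[0]   # e.g. {"label": "unsafe", "score": 0.97}
    if verdict["label"] == "unsafe":
        return "Sorry, I can't help with that."
    return call_big_model(user_input)
```

The gate costs a fraction of a big-model call, so it pays for itself on the very first request it filters out.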

📲 Inferencing starts to shift to local devices

Whilst I started this post by predicting new breakthroughs from OpenAI, I'm also fascinated by what's happening at the bottom of the ladder — making models smaller and faster, so they can run on consumer hardware. Google has Gemini Nano, which is planned to run on Pixel phones.

And Apple's been doing a lot of work creating an optimised software stack that exploits the abilities of its proprietary M-series processors.

The days of all inferencing being in the cloud are coming to a close. 2024 will see small and highly optimised, but still capable, models running on laptops, tablets and phones. Not everything will work this way, but a good proportion will. Those needing high reasoning abilities will still want to use something like OpenAI's GPT-4. However, there's a significant class of uses that don't need that. Most people are using the free version of ChatGPT, and that's limited to GPT-3.5-turbo, a model whose capabilities are within the cross-hairs of today's open source models. It's not unreasonable to think such usage could run locally on our devices within the foreseeable future, and the cost advantages of doing so are huge — not needing to set up a massive compute farm is incentive enough to go down this route.
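
Running a quantised model locally is already straightforward. Here's a sketch using llama-cpp-python, one of the popular local runtimes; the GGUF file path is a placeholder for whichever quantised model you've downloaded:

```python
# Local inference sketch with llama-cpp-python.
# The model path is a placeholder: any quantised GGUF model will do.
from llama_cpp import Llama

llm = Llama(model_path="./mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Q: Summarise the benefits of local inference in one sentence. A:",
    max_tokens=64,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```

No API key, no per-token fees, and nothing leaves the device.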

For the big cloud providers, this might come as something of a shock — the expectation of large amounts of lucrative cloud servers running inferencing might yet be blunted.

ā†”ļø LLMs everywhere

Microsoft is charging premium rates for its Copilot product, which puts GPT into Word, Excel and PowerPoint. It's great, but expensive. I know several people trying to make the business case work, and it's tough without absolute proof of the productivity benefits.

Of course, Microsoft has to charge a premium rate to cover those meaty server costs that GPT models require.

However, once we have decent models running locally on our devices (see above prediction) there's no reason that needs to cost anything much.

Perhaps Apple will put on-device Siri inferencing into Keynote, Pages and Numbers?

The price advantage of local inferencing is going to drive a lot of adoption — get prepared for LLMs everywhere and in everything! And, of course, hardware companies like Apple have a good reason to do this — it'll encourage us to buy hardware upgrades to get modern devices capable of exploiting the new AI features. Without the cloud inferencing costs, there's little to stop LLMs from being buried in anything they can add value to.

šŸŽ Apple emerges from the ML shadows

Apple has been something of an ML dark horse, but just a few weeks ago they released some very exciting open source software that optimises training and inferencing for its M-series processors. In true Apple style, their MLX framework is a doddle to get running.
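
How much of a doddle? A few lines give the flavour: MLX offers NumPy-style arrays that live in Apple silicon's unified memory and are evaluated lazily.

```python
# A tiny taste of Apple's MLX: NumPy-like arrays on Apple silicon's
# unified memory, with lazy evaluation triggered by mx.eval().
import mlx.core as mx

a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))
c = a @ b       # nothing computed yet: MLX builds the graph lazily
mx.eval(c)      # computation happens here, on the GPU by default
print(c.shape)  # 1024 x 1024
```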

I strongly suspect this is the start of a significant AI push by Apple that will see an LLM-powered Siri running locally on iPhones, iPads and Macs. With inferencing running locally, data privacy concerns vanish. And if Apple strikes content deals for training data (after all, they have the biggest cash pile of anyone), this might be the most "moral" LLM solution out there. That's classic Apple: you don't need to be first, you need to take the time to do it better. Cash to buy content deals, plus a highly optimised software/hardware stack that addresses privacy and green concerns, is Apple's moat, and I expect them to wield it. Watch this space.

This strategy is also in Apple's commercial interests — the ability to run AI-Siri locally will likely be restricted to higher-end or recent devices that have the compute power to make it viable. That's a reason people might want to upgrade their devices.

🦙 Meta gets serious

Meta is going to do something. They've got a first-class ML team, and LLaMA2 was one of the stand-out open source models of 2023, forming the basis of a thriving fine-tuning community.

But LLaMA2 wasn't good enough to hit the big time on its own — its major contribution has been as a basis for further fine-tuning by others.

Meta has more potential than has yet been realised. In 2024 we'll likely see LLaMA3, and I expect it to be more than a mere base for others to fine-tune.

šŸ§‘ā€āš–ļø Regulation takes a back seat

There's been a lot of talk and angst about the risks of AI in 2023. We've had the Biden Executive Order and the EU AI Act emerge as significant regulatory moves.

However, the EU has already rolled back on regulating foundation models, and instead the EU AI Act focusses on how AI is used. In the US, the Biden Executive Order defines high-level intents but leaves much to the imagination as to how they are implemented in practice. I strongly suspect the reality will be lower-touch than some fear — the desire not to upset the applecart of economic opportunity will likely win out.

As a result, the talk about AI regulation will calm down in 2024 and regulatory oversight will be light for the foreseeable future. Nobody wants to hamstring their tech startups, giving national competitors an opportunity. There's been a lot of talk, but actually stopping innovation and economic advancement would be a brave thing for politicians to do at this point.

🥇 Somebody gets serious about benchmarks

LLM benchmarks are a mess and, just possibly, somebody will start to do something about that.

There's a proliferation of benchmarks, but almost all of them are weak. And model providers currently choose which ones to report on — if your model performs badly on one, no problem, just use something else that sheds a brighter light on your product.

We need some form of "quality mark" that addresses benchmark failings and dictates which benchmarks must be reported. I'd have expected one of the big quality organisations to step up and do something, but they've been quiet so far. Something like the EU CE mark would be helpful.

Common testing standards, independently certified, would be a better start for governments wanting to "be seen to do something about AI" than ham-fisted attempts to slow AI advances.

There's opportunity in this space for either governments or private enterprise — somebody will do something in 2024 (I hope).

šŸ—³ļø Deep fakes in elections

2024 will be an election year in both the US and UK. In highly charged political contexts on both sides of the Atlantic, we're almost guaranteed to see AI used to generate fake content in an attempt to influence voters.

We've already seen some convincing fake images of Donald Trump, the Pope and others. But they're fairly easy to spot, and news outlets have been quick to flag them as fake.

What's interesting, however, is the role that such images can play even when we know they are fake. Donald Trump, for example, actively shared fake images of himself not to trick people, but because they contributed to an aura he was trying to create.

2024 will be the year that AI becomes a part of political campaigns, but not necessarily to dupe people. Rather, I expect AI to be used to pull on our heartstrings. That, of course, is something that either side of the political divide can exploit. How it plays out will be interesting to watch.

If you found these predictions insightful, perhaps you'd find our services at Barnacle Labs of interest — we're an expert GenAI consultancy. You can contact the author, Duncan Anderson, on LinkedIn.

At Barnacle Labs we also host a GitHub repo of useful GenAI content: https://github.com/Barnacle-ai/awesome-llm-list
