AI Laws, Elections, and Quantum: 10 Predictions for Data in 2024

Rafael Guerra
9 min read · Dec 10, 2023


Well, we’re reaching the end of 2023. There is a lot that can be said about this year, and unfortunately, some of it is pretty sad. But this was also a year of life-changing data science innovation — a year where we got a preview of the possible disruption and excitement of artificial intelligence, now more visible and relevant than ever before. In this post, I gather my research and personal thoughts to make 10 predictions for data science in the year to come. So without further ado, buckle up — here are my hot takes.

1. For better or worse, expect AI to be more widely used in government — and legal writing at large.

Earlier this year, a city council in Brazil passed what is believed to be the country’s first ordinance written entirely by ChatGPT, unbeknownst to most of the council members who voted on it. The episode may have been a deliberate attempt to show how good OpenAI’s system has become at mimicking legal writing, but I’d expect a lot more legal text to be AI-generated from now on. It will make the news the first few times, and then it will no longer be a big deal. While it may seem spooky at first that large language models are used in writing legislation, consider the upside. Most legal text — including legislation — is so dense that most of us can’t understand it at all. Simpler, more easily summarized legal text could be a good thing for general legal literacy, not just for legislation but for legal documents of any kind. I certainly wish I’d had a nice AI tool to explain my mortgage documents a few years back.

“We, the council members, propose that {ERROR: Invalid credentials}”

2. AI-powered tools will discover new drugs in 2024, but long approval processes mean they may not be available just yet.

Drug discovery is expensive. Companies have to conduct extensive research and testing before they can bring a treatment to market. That’s a great thing, of course — we don’t want drugs that are ineffective or make us sicker! But could the approval process be adapted if discovery is rapidly accelerated by AI? That remains to be seen. Many companies are innovating in the drug discovery space (shout out to my friend Marcel and his team at Ten63) in ways that could significantly disrupt the discovery lifecycle, and I would not be surprised if a major disease sees a breakthrough treatment — or even a cure — in 2024, largely helped by AI.

I’d very much like to live past 100, ideally free of disease — and even more ideally, eating as much sugar and saturated fat as I want. Oh, and if we could cure social anxiety, that would also be great!

3. As more people learn to spot AI-generated content, the value of high-quality human-created content will increase.

I don’t know about you, but I am getting tired of vendors cold-messaging me with generic AI text that shows minimal human thought. It is easier than ever to write copy for an entire website with AI-powered tools. It is also, in many cases, a bit lazy. I think some of the novelty and awe around AI-generated text will fade, at least for B2B and B2C communication. Instead, text that is clever and human will win. We may even start to like seeing a typo or two in our messages — though it would be a weird overcorrection if messages started including them on purpose to seem more human. It’s totally going to happen though, isn’t it?

If I had a dollar for every terrible AI-written LinkedIn InMail message I received this year, I’d have quite a few dollars.

4. More accurate tools for AI-generated content detection will emerge — but not in time for the new, more sophisticated models to come.

Beloved teachers and educators have messaged me this year with the same question time and time again: can I trust ZeroGPT to catch students' use of AI in essays? My answer was no. I mean no disrespect to ZeroGPT — they are trying their best. I am simply not aware of any tool accurate enough to tell us for sure whether someone used AI to write a text outright. And if they used it implicitly, to do research or to bounce ideas around, would that be a bad thing? I am certain ZeroGPT will iterate and improve, and other tools will come along that may do an even better job of catching patterns. But by then, GPT-5 and other models will be out, and accuracy will probably drift again. In general, it’s becoming more obvious to me that assessment, particularly in essay-based classes, will have to change. Maybe it’s time to go back to on-the-spot essay writing. Maybe it’s time for handwriting again. Maybe AI should be encouraged for spell-checking and research, while novel and creative analyses are rewarded. I’m not an expert in the field, but this is a great time for education researchers to shine!

I feel for teachers, but also for students. I would not be happy if a web tool flagged my original essay about Hamlet being a metaphor for adolescence as AI-generated. I was lucky enough to have teachers who encouraged out-of-the-box thinking. Hopefully, that mindset becomes the norm and not the exception.

5. Vector databases are going to be more widely adopted, and working with them will become an in-demand skill.

Earlier this year, I wrote about word embeddings — how AI models convert words into numbers and compute with them. It was my most viewed article. As word embeddings become more widespread with language model adoption, better infrastructure will emerge to store them. Vector databases, in particular, seem like a popular solution that will likely gain even more adoption in the year to come. So, if you are a data engineer or data scientist thinking about a new skill to pick up, this may be a good one — it might not get as much buzz, but it will be behind a lot of innovation and efficiency in the space.

This is not what a vector database looks like. But it does look cool and futuristic, doesn’t it?
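If vector databases still sound abstract, here is a minimal NumPy sketch of the core operation they are built to make fast at scale: given a query embedding, find the stored embeddings most similar to it. The toy vectors and labels below are made up for illustration; a real system would use an actual embedding model and an indexed vector store with approximate nearest-neighbor search rather than brute force.

```python
import numpy as np

# Toy "database" of embeddings: each row is the vector for one document.
# In a real vector database these would come from an embedding model.
doc_labels = ["mortgage contract", "chocolate cake recipe", "loan agreement"]
doc_vectors = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.3, 0.1],
])

def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query_norm = query / np.linalg.norm(query)
    matrix_norms = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix_norms @ query_norm

# Pretend this is the embedding of the question "explain my mortgage".
query_vector = np.array([0.9, 0.15, 0.0])

scores = cosine_similarity(query_vector, doc_vectors)
best_match = doc_labels[int(np.argmax(scores))]
print(best_match)  # -> "mortgage contract"
```

The whole value proposition of a vector database is doing this lookup over millions of high-dimensional vectors in milliseconds, which brute-force NumPy cannot do.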

6. We’ll start to hear about quantum computing being used in data science, though breakthroughs in quantum machine learning are unlikely anytime soon.

I am a little biased on this one, but my intellectual curiosity about quantum mechanics has made me actively research when and how quantum computing could be used in machine learning. Indeed, the parallelism that superposition and coherence make possible (maybe a topic for 2024?!) could be a game changer for data science modeling, but unless one of the big tech firms has a secret project going on in its quantum division, we probably won’t have the hardware or the algorithms for it quite yet. It’s never too early to start dreaming about it, though — and investors are going to need new buzzwords next year, so quantum could be a good one!

If you’ve never heard of wavefunctions, superposition, or entanglement, buckle up. When quantum goes mainstream, the jargon-fest will be unlike anything we’ve ever seen.
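For the brave, here is a tiny NumPy sketch of two of those jargon words using nothing but textbook linear algebra, no quantum hardware or SDK involved: a Hadamard gate puts a single qubit into an equal superposition, and a CNOT then entangles it with a second qubit into a Bell state.

```python
import numpy as np

zero = np.array([1.0, 0.0])          # the single-qubit basis state |0>

# Hadamard gate: creates an equal superposition of |0> and |1>.
H = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)

superposition = H @ zero
print(superposition)                 # [0.7071, 0.7071] -> equal amplitudes

# Two-qubit state |00> via the tensor (Kronecker) product.
two_qubits = np.kron(zero, zero)

# Apply H to the first qubit only, then a CNOT (control: qubit 0, target: qubit 1).
H_on_first = np.kron(H, np.eye(2))
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

bell_state = CNOT @ (H_on_first @ two_qubits)
print(bell_state)                    # [0.7071, 0, 0, 0.7071] -> entangled |00> + |11>
```

The catch, of course, is that simulating this on a laptop scales exponentially with the number of qubits, which is exactly why real quantum hardware is interesting in the first place.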

7. Politicians will leverage LLMs and generative AI in their political campaigns. Some use cases will be questionable.

This one seems so obvious, it is barely a prediction. We saw what happened in 2016, when our technical abilities moved faster than our social maturity. We have even more impressive technology now, but not an equally impressive education apparatus for spotting and countering bad uses of that technology. There’s hope, though. The earlier we talk about this and educate our family members — particularly those prone to believing fake news — the more likely we are to minimize the damage of large-scale AI-generated misinformation that may look or sound uncannily real. Some uses of AI in politics could even be beneficial. For one, it makes both the good and not-so-good sides of politicians more visible for us all to see. But I can also see the appeal of a chatbot that interacts with voters and answers questions in a way that feels more dynamic and approachable than a cold email.

Lol?!

8. Many companies will continue to benefit more from basic statistics and good data warehousing than they will from engineering neural models from scratch.

With high-quality large language models only one API call away, there’s not much incentive for many companies to reinvent the wheel and build everything in-house, particularly when that could entail training complex models that are not easy to debug or explain. Instead, more companies would benefit from good back-end engineering, statistical literacy, experimentation, well-documented unit tests and AI safeguards, sound product science, and yes, a well-structured data warehouse or data lake. Lack of awareness of existing data — or worse, bad data altogether — will be far more damaging to companies in the year to come than missing out on a proprietary model that might or might not be better than a commercial model that already exists.

Business decision-making would be so much better if everyone understood what the p-value truly means — and most importantly, what it doesn’t!
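In that spirit, a quick sketch with made-up numbers. The p-value below is the probability of observing a difference in means at least this large if the two groups really came from the same distribution; it is not the probability that the null hypothesis is true, and it says nothing about whether the effect is big enough to matter.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Made-up example: some revenue-adjacent metric for two product variants.
group_a = rng.normal(loc=10.0, scale=2.0, size=200)   # control
group_b = rng.normal(loc=10.4, scale=2.0, size=200)   # variant with a small true lift

result = stats.ttest_ind(group_a, group_b)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

# A small p-value says the observed gap would be surprising under "no difference".
# It does not say the lift is large, or that shipping the variant is worth it.
```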

9. Data science, as a field, will fragment a bit — new job titles will emerge and some will become less common.

It’s awesome to be in data science, and that awesomeness will continue in 2024. But companies now know what they want a little better than they did during the pandemic-era data science boom, and new job titles will emerge to make that differentiation more visible. Some of those titles may not come with as exciting a compensation package as the average data scientist makes today — but overall, I think more job titles in data science are a good thing. Many data professionals are confused by the expectations placed upon them: asked to think like a statistician, code like an engineer, and communicate like a business manager. Such professionals will still be highly valued in the market, but they will probably gain a different title. Data scientists, on the other hand, will become a more traditionally technical position, with a heavier emphasis on software engineering skills. I don’t personally bet on ‘Prompt Engineer’ becoming a mainstream title, but I do think ‘Machine Learning Engineer’, ‘AI Product Manager’, ‘Product Analyst’, and a range of AI compliance and regulatory positions will come to market and be quite hot.

Funnily enough, you can see the Schrödinger equation in this stock photo — an essential part of quantum mechanics. Maybe I am onto something if the stock footage for ‘data scientist’ already shows that!

10. No, we still won’t have AGI next year — and that’s probably a very good thing for now.

The Succession-like drama at OpenAI last month ended with speculation that Sam Altman had been working on a secret model rumored to be closer than ever to Artificial General Intelligence (AGI). As far as we know, that wasn’t true. We understand very little about what it would take to build AGI, in part because we still have iffy definitions of intelligence and of the ways to measure it. We’re probably not as close to AGI as we think — and that is probably a good thing for now. We are not ready. We have a long road ahead to ensure society-wide data literacy and to discuss possible safety nets for the economic shocks it could bring. It will be an exciting day when AGI arrives — but for now, we should use the time we have to prepare.

I have already made too many references to my beloved canceled show Westworld, so I will refrain from making another here, in a mere caption. Or will I? #ifyoucanttelldoesitmatterbernard

There’s more I could say, but ten seems like a good number for an online end-of-year list. What do you think of the predictions?! Are there any items you’d add? Let me know — and thank you for your readership. Let’s be hopeful for a 2024 that is just as exciting and innovative, but far more peaceful. And though there’s no reason we couldn’t have it all, may we choose the latter if we can only pick one.
