GPT-4o to ScarJo: Here’s what devs need to know

Fahim ul Haq
Published in The Startup
11 min read · May 30, 2024

AI has been dominating the news this month — with privacy, security, and ethics concerns front and center.

Let’s cut through the noise and boil it all down to exactly what devs need to know.

I’ll cover:

  1. 5 key AI stories developers should be following
  2. Unpacking critical AI trends in the tech industry (and predicting what comes next)
  3. What developers need to know to stay ahead

Let’s dive in.

May AI News Recap: 5 trending stories to follow

Lately it feels like every news story I’ve seen is about AI. Interestingly, most of them share a common theme: privacy, security, and ethical AI use. Before we dig into the impact for developers, I’ll quickly summarize a few trending stories you should definitely be aware of.

  1. GPT-4o
  2. OpenAI turnover
  3. Sky & ScarJo
  4. Microsoft Copilot+ PCs
  5. NVIDIA earnings

Let’s break it down.

1) GPT-4o

By now I’m sure you’ve seen the news: just last week, OpenAI rolled out their most advanced model yet.

There isn’t much to say on this topic that hasn’t already been said. But from what I’ve seen so far, 4o seems very impressive, especially with its real-world interactive abilities. Notable features include:

  • Improved text and image/video recognition capabilities
  • State-of-the-art audio speech recognition
  • 50+ natural languages covered
  • More lifelike response time and personality in its 5 original voices (perhaps too lifelike… more on that in a moment)

All of these factors amount to what is likely the most powerful model in the world today. It has also made me stop and consider the immense potential of LLMs trained not just on text but on video data as well.

GPT-4o’s splashy entrance resulted in increased mobile app downloads, and an associated jump in revenue for OpenAI. CEO Sam Altman also announced that they will be rolling out new features iteratively, so keep an eye out for more updates.

2) OpenAI Turnover

With the arrival of GPT-4o, OpenAI proved that they are still the undisputed leaders in generative AI (for now). But it hasn’t all been gravy lately for OpenAI.

Co-founder and chief scientist Ilya Sutskever left the company last week. He was also a key member of the board contingent that tried to oust CEO Sam Altman last year.

Sutskever was followed by Jan Leike, who headed up the superalignment team, the group at OpenAI largely focused on ethical AI use and societal impact. That team was promptly dissolved less than a year after it was founded.

Leike’s rationale for leaving echoes that of others who have departed OpenAI: safety and ethics concerns, and philosophical disagreement with the company’s direction.

In other words: new person, same story.

The “drama” at OpenAI isn’t so different from what many relatively early-stage/high-growth companies experience, so this turnover isn’t unprecedented (just at a slightly higher profile than most). But it’s still worth keeping an eye on, especially as each prominent individual who leaves OpenAI cites essentially the same reasons for doing so.

Of course this OpenAI story has quickly turned into a footnote compared to the next one…

3) OpenAI’s Sky & Scarlett Johansson

As I mentioned before, GPT-4o launched with 5 voices… and if you’ve ever seen the movie Her, one of the voices may sound eerily familiar to you.

Long story short, Sky, one of these new GPT-4o voices, sounds uncannily similar to the actress Scarlett Johansson, and the backlash has been severe.

There is a whole can of worms here around regulating deepfakes: who owns the rights to AI-generated content created using the likeness (or even merely an approximation of the likeness) of celebrities who haven’t given their consent? We have already started to see this play out with AI-generated music, via FKA Twigs’s congressional testimony, and now the debate has been kicked into an even higher gear with the Sky fallout.

If there’s one thing we know, it’s that there is an appetite for AI regulation in California. SB-1047, the most comprehensive piece of AI regulation in the US so far, recently passed the state Senate. And in Hollywood, we have already seen lengthy writer and actor strikes in the past year, largely precipitated by these same concerns.

I’ll talk more about the downstream impacts of these early attempts to regulate AI later on. As for now, I will be curious to see how this story develops, and the extent to which AI conversations continue to penetrate the mainstream.

4) Microsoft Copilot+ PCs

This is also a developing story with interesting downstream impacts. Microsoft recently rolled out a new line of AI-enabled laptops, using a Qualcomm-built processor (as opposed to Intel). I haven’t gotten my hands on one yet, but I will be curious to see how they catch on.

I think this is worth mentioning because we’ve seen privacy and ethics concerns start to creep into this conversation, as well. Through a new AI tool called “Recall,” Copilot+ PCs can capture screenshots of your activity every few seconds; reportedly, the data is encrypted and stored only locally.

For any employee using a company-issued machine, the screen capturing technology should be cause for further scrutiny — but we’ll see how the story develops, and whether the alarm is actually merited.

5) NVIDIA Earnings

I wasn’t originally planning to talk about this, but the earnings report forced my hand: NVIDIA just announced substantial Q1 earnings, capped with a 10-for-1 stock split.

What does that mean in practice? To put it bluntly, not much. It just makes the share price a bit more palatable to the everyday investor, and signals confidence in NVIDIA’s profitability and growth trajectory. One thing remains true: as the AI industry continues to boom, chipmakers stand to reap the rewards. I don’t see that trend slowing down anytime soon.

What does this all mean for developers?

There are two ways to slice these developments. One is from an industry perspective: who’s winning, who’s losing, and what comes next. The other is from an individual’s perspective: how does this affect developers in practical terms, and how can we best prepare for an AI-driven future?

It’s important to be aware of both sides. I’ll share my actionable advice for developers at the end, but first, let’s start by unpacking a few critical macro trends in the technology and business landscape.

Unpacking the AI landscape (and predicting what comes next)

We are watching a seismic shift in the tech industry play out in real time. Every day, AI becomes more integral to how products are built and to what users expect products to be.

In other words, companies big and small are reading the writing on the wall around AI. When it comes to differentiation, the market is rapidly splitting into two segments: AI-enabled products and legacy products. From an investor’s perspective, legacy products are a death sentence. AI is the future, and if you’re not already on the train, it’s too late. I think users will start to feel the same way sooner rather than later.

That means every company has a massive challenge on its hands to recalibrate and transform its product and processes in order to stay viable in an AI-driven world.

With this in mind, each of the news stories I mentioned previously shares a common theme: every tech company is feeling the pressure to incorporate AI and is scrambling to move fast, perhaps without thinking through all the downstream impacts. Recently, we’ve been seeing this urgency play out in clumsy and chaotic ways.

Just look at Slack: the other week, they abruptly announced that they would be using customers’ private conversations to train their own AI, without an easy way to opt out. If you are a large company processing a ton of data, this is not an easy issue to navigate (and in some cases, it could result in a GDPR violation), and the backlash against Slack has been strong.

The main takeaway here is this: companies don’t tend to pull shenanigans like that unless they are feeling a bit desperate. On a similar note, most privacy concerns surrounding Microsoft Copilot+ could have been avoided just with better documentation and upfront communication around how Recall actually works.

It’s indicative of the frantic climate that seemingly all the major players are overlooking basic privacy and security issues. Or at the very least, in their push to move fast and not get left behind, they aren’t taking the time to clearly communicate this information to customers, who are of course feeling their own form of AI anxiety. Either way, it’s not a great look.

Additionally, the ScarJo faux pas is the latest and biggest example of AI ethics concerns fully entering the mainstream. Celebrities are now embroiled in this very complex world and trying to navigate it. It raises fascinating questions, like: who actually decides whether a voice like Sky’s is “similar enough” to Johansson’s, even if the model wasn’t trained on her actual voice?

Public figures whose success depends on copyright law as it currently stands are feeling the pain. Rightly or not, they believe AI is enabling people to circumvent the protections that copyright affords them. So they are scrambling to protect themselves while legislation lags behind.

Yet diving deeper into that California bill (SB-1047), I have found it strangely worded, at least in the sense that it puts so much onus on companies building AI products (and on devs leveraging AI APIs to build AI-enabled products) to limit themselves that using AI at all may not be possible without grave legal risk. I understand that’s not the spirit of the law, but it will likely stifle innovation. Still, as companies push the envelope to stay relevant with their own AI-enabled products, perhaps overlooking basic privacy and security concerns as they do, it could serve as a bit of a wakeup call.

OK — so who wins the GenAI arms race?

Of all the players at the moment, I remain most impressed with Microsoft. They have adopted a two-pronged AI strategy: scaling their own AI division, led by Mustafa Suleyman, while remaining the biggest investor in OpenAI.

Satya is partnering with the best of the best today (and GPT-4o is definitely the best), while Microsoft invests in its own fully proprietary, self-hosted models. This approach gives them lots of optionality on cost while staying above the OpenAI drama (and let’s not forget, OpenAI is still hosted on Microsoft Azure data centers). Because of this dual strategy, Microsoft is well positioned to be the leader in the coming years.

That said, Google and Meta both have a key advantage that Microsoft doesn’t: they can fall back on ad revenue to fuel their growth. For as long as consumers value their time (and data) less than their money, these businesses will have rocket fuel. Want a great example? Look at Netflix: their stock is way up since introducing an ad-supported plan, once again proving the viability of an ad-driven approach, which has since been adopted almost ubiquitously across the streaming industry. Google and Meta will always have that ad revenue to help them capitalize on whichever AI bets they want to make, and that is a huge advantage.

OpenAI, on the other hand, needs to monetize their model and APIs in order to grow. For that reason alone, in the long run, I wouldn’t count out Llama (Meta) and Gemini (Google), as these trillion-dollar companies set their eyes on the generative AI prize.

Here’s what developers need to know to stay ahead

Now let’s boil everything down to what this means on a practical level for developers. This brave new AI-powered world is coming, whether we are ready or not.

So, as developers, what can we do to leverage AI intelligently, while staying competitive in a rapidly changing industry? The good news is that it’s actually pretty straightforward.

From an upskilling perspective, it’s critical to start building AI fundamentals as soon as possible.

You should definitely understand the building blocks of generative AI. These include concepts like LLMs, tokens, and transformers, along with ML fundamentals like neural networks. Then you need a working knowledge of AI implementation: for example, understanding OpenAI’s API, or learning how to leverage models through RAG (retrieval-augmented generation). You will need to learn this stuff eventually, so the sooner you do, the better.
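
To make that concrete, here is a minimal RAG sketch using the official openai Python SDK (v1+). Treat it as a sketch under stated assumptions, not a production implementation: the documents, model names, and question are placeholders I chose, and it assumes OPENAI_API_KEY is set in your environment.

```python
# Minimal RAG (retrieval-augmented generation) sketch: embed documents,
# retrieve the most relevant one, and ground the model's answer in it.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Tiny in-memory "knowledge base" standing in for your real documents.
docs = [
    "GPT-4o accepts text, image, and audio input.",
    "Retrieval-augmented generation grounds model answers in your own data.",
]

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of strings into vectors."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

doc_vecs = embed(docs)

def answer(question: str) -> str:
    # Retrieve: pick the document most similar to the question (cosine similarity).
    q_vec = embed([question])[0]
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = docs[int(np.argmax(sims))]
    # Generate: answer the question grounded in the retrieved context.
    chat = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using only this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return chat.choices[0].message.content

print(answer("What inputs does GPT-4o accept?"))
```

In production you would swap the in-memory list for a vector database, but the retrieve-then-generate loop stays the same.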

I recommend starting with a course like this one: Modern Generative AI with ChatGPT and OpenAI models.

Educative also offers plenty more immersive generative AI courses, where you can get hands-on experience building and training your own models, as well as learning how to leverage APIs and RAG to develop AI-enabled products.

One more thing every developer absolutely needs to be aware of: privacy and security.

At small companies and big companies alike, privacy is paramount. With legitimate concerns around protecting user data (and severe backlash if it’s handled carelessly, as we’ve seen), it’s important to be extra mindful of privacy when building AI-enabled products. If you’re leveraging AI APIs on the job, be sure to read the documentation carefully. OpenAI states that it doesn’t train its models on data submitted through its API, so that’s a relatively safe bet for now. However, if you or your company is leveraging other models, check their documentation and ensure they aren’t training publicly available models on data that shouldn’t be used that way.

Lastly, here’s the most important thing for developers to remember: the fundamentals of building great applications won’t change, whether AI is used or not.

Users still want their problems solved in a fast, efficient way, with their security and privacy taken care of and top of mind. This remains true no matter the modality of the application: mobile, web, desktop, and beyond. Take, for example, Microsoft Azure Table Storage vs. Amazon DynamoDB. Both are NoSQL databases with a few differences around implementation, but the building blocks and fundamentals are more or less the same.
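
To illustrate, here is a quick sketch of the same operation, writing one record under a key, in both services. It’s a sketch, not a reference implementation: the table names, keys, and connection string are placeholders, and it assumes the boto3 and azure-data-tables packages are installed with valid AWS and Azure credentials configured.

```python
# The same fundamental operation (one record written under a key) in two
# NoSQL stores. Table names, keys, and the connection string are placeholders.
import boto3
from azure.data.tables import TableClient

# Amazon DynamoDB: items live in a table keyed on a partition key.
dynamodb = boto3.resource("dynamodb")
users = dynamodb.Table("Users")  # assumes this table already exists
users.put_item(Item={"user_id": "u-123", "name": "Ada", "plan": "pro"})

# Azure Table Storage: every entity needs a PartitionKey and a RowKey.
table = TableClient.from_connection_string(
    "<your-connection-string>", table_name="Users"
)
table.upsert_entity(
    {"PartitionKey": "users", "RowKey": "u-123", "name": "Ada", "plan": "pro"}
)
```

The naming and key conventions differ, but the mental model, writing a record under a key and reading it back by that key, transfers directly between the two.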

I do think any developer working on enterprise-scale applications should also start looking seriously at Llama, which offers a lot of optionality around hosting.

This is a good way to ensure customer data won’t touch OpenAI or Microsoft servers (note that you would have to host it yourself, or find a third-party host). Apple even released a model a few weeks ago, called OpenELM, with surprisingly little buzz, at least by their standards. Consider checking it out, too.
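
If you want to experiment with the self-hosted Llama route, here is a minimal sketch using Hugging Face’s transformers library. The model ID is an assumption on my part; Llama weights are gated, so you’d need to request access on Hugging Face first, and a model this size realistically needs a GPU.

```python
# Minimal self-hosted Llama inference via Hugging Face transformers.
# The model ID is an assumption; Llama weights are gated, so request access
# on Hugging Face and run `huggingface-cli login` before this will work.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",  # place weights on available GPU(s); needs accelerate
)

prompt = "In one sentence, why would a company self-host its LLM?"
output = generator(prompt, max_new_tokens=64)
print(output[0]["generated_text"])
```

Because inference runs entirely on hardware you control, prompts and customer data never leave your environment.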

The only company that has been missing out so far is Amazon, so I’d expect them to debut their own model soon, or at least a very streamlined hosting option for models like Llama. I’d also keep an eye on Cloudflare: they’re likely to feel the squeeze, too, as they try to provide better services for application developers.

At the end of the day, things may seem overwhelming. There is a lot of chaos in the industry and a lot of information to be aware of. Just remember this: the landscape is new, and the skills may look a little different, but the fundamentals from a developer’s perspective are the same.

Keep growing and you’ll be fine.

Happy learning!


Fahim ul Haq

Co-founder at Educative.io. Creating hands-on courses for Software Engineers & Engineering Enablement solutions for dev teams. Previously: Facebook, Microsoft.