How to Find Success as an LLM Startup

Berkeley SkyDeck
9 min read · Feb 6, 2024


By Chon Tang

A little more than 12 months after large language models (LLMs) exploded onto the scene, here’s a statement that won’t surprise any founder or VC: LLMs remain an incredibly exciting area for startups. SkyDeck’s early investments in LLM startups grant it a distinct vantage point on investor sentiment and insight into how founders can navigate the upheaval.

The ability of LLMs to “understand” and generate natural language is revolutionary, and it’s inevitable that applications leveraging this technology will change the way we work, live, and play. Founders are competing to build these applications, as well as the stack of infrastructure tools that enable them, and VCs, of course, are looking to fund them.

From the most conservative viewpoint, LLMs will be used to augment knowledge workers in every industry. The more aggressive view holds that the eventual arrival of AGI will fundamentally transform our society, making thousands of careers obsolete, while enabling the creation of thousands more.

SkyDeck is at the heart of this revolution. In addition to being part of the broader UC Berkeley ecosystem that has contributed to the creation of OpenAI, Anthropic, Anyscale, DeepScribe, and just about every serious LLM effort out there, Berkeley SkyDeck has long been focused on applied AI in general, and LLMs in particular. We accelerated MindsDB (backed by Nvidia, Benchmark, and Mayfield), and were investing in LLM startups before the launch of ChatGPT.

In 2023, we doubled down on this category:

  • We hosted the world’s largest in-person LLM hackathon, with 1,600+ hackers from universities across the world joining us to build revolutionary ideas. This hackathon was hosted in collaboration with OpenAI, Microsoft, Pinecone, LlamaIndex, LangChain, Sequoia, NEA, Lightspeed, and others. We invested in 3 of the ideas that came out of this weekend hackathon, and we can’t wait to host another.
  • More than half of our new accelerator investments are either enabling or leveraging LLM technology, and many of them have subsequently raised venture funding.

However, this is not meant to be another article debating the risks, potential, and future direction of LLMs as a technology. Instead — leveraging our exposure to dozens of portfolio companies, thousands of pitch decks, and hundreds of funding rounds — we’ll focus on how investors think about opportunities in this rapidly evolving field just one year into this revolution, and the key questions founders should think about regarding their go-to-market and fundraising strategies.

DEV TOOLS

With any new technology, everyone immediately recognizes the need for tooling — we all know the value of “selling pickaxes during the gold rush.” Beyond the obvious investments into foundational model vendors like OpenAI, Anthropic, and Mistral, frameworks like LangChain, LlamaIndex, and MindsDB quickly absorbed large venture investments in Q1 2023, proving that developers were coalescing around their ecosystems.

It’s not a surprise these were all open source tools; developers are more willing to embrace them, contribute to them, and rely on them. As such, the primary KPIs for these investment rounds were GitHub stars and community engagement.

But, investors are now asking key questions:

  • GitHub is now overwhelmingly crowded with new releases for every aspect of the tech stack: agent frameworks, vector databases, model observability, data handling, even foundational models. How does any new solution cut through the noise of so many new releases? How does ANY new framework acquire a critical mass of developers?
  • Are frameworks anything more than “prompt wrappers”? Are they solving fundamentally difficult problems, such that developers don’t feel like they should build their own stack on top of OpenAI APIs?
  • How do open source frameworks successfully monetize their solutions? The number of actual commercial-ready LLM applications is still small, but even as they grow in number, what are the levers that allow the open source frameworks they rely on to actually monetize?

A few suggestions for founders as you navigate this space:

  • Avoid simply being a slightly better mousetrap. In a world where everyone is bombarded with Discord DMs and emails about the launch of yet another better mousetrap, adoption is the challenge. Look to be 10x better.
  • Look for new pain points as the first generation of applications begins to mature and head into production. Eventually, writing quick demos that only work once is no longer what keeps developers awake at night; reliability, manageable ops, scalability, LLM version control, and support for multiple models become key.
  • Think about monetization early. Building a tool that developers are happy with is no longer enough; showing that you have a vision for how you’ll generate revenue is increasingly important. Will it be via services? Hosted inference? Acquiring/selling data? Marketplace?

INFRASTRUCTURE

Beyond dev tools, infrastructure was another key pain point in 2023. We all know Nvidia struggled to keep up with demand as computation became critical, and we all experienced the pain of finding or affording a GPU cluster.

If you weren’t involved in training, OpenAI was gracious enough to offer enough inference API credits for many application developers to get off the ground, but we all know that spigot will soon be turned off. Increasingly, companies are facing significant credit card bills.

Where there is pain, startups inevitably rise up. There was a surge of GPU-specific cloud vendors (beyond AWS / Google Cloud), followed by a wave of inference-only API vendors that abstracted away the challenges of working with a wide range of foundational models.

Yet, investors now have questions about where this category is going:

  • Are foundational models even a good business? Commercial vendors have raised an obscene amount of money to buy GPUs in hopes of winning the “biggest model” arms race. Microsoft + OpenAI vs Google vs Meta… is there room for anyone else on the high end?
  • How does anyone monetize? On the low end, open source models (especially those driven by MoE and LoRA) are showing remarkable performance for a wide range of applications, and hundreds of players are contributing fully open-source solutions.
  • Will the hosted solutions be anything other than a race to zero? Nvidia is the big winner from all of the venture-backed spend in this category — will everyone else be reduced to competing based on their ability to optimize for electricity costs?
  • AWS and Google Cloud will be massive players — who will they work with, who will they compete with? We haven’t even seen the massive players in the cloud infra world really flex their muscles.

APPLICATIONS

Of course, all of the developer tools only have value if and when the applications they empower can create economic value for the end user. Applications are the ultimate driver of this industry.

It’ll be impossible to comprehensively capture all of the various problems that people are looking to solve, or the markets they are looking to investigate.

But there are notable trends I’d highlight:

You’re not unique.
A year ago, we had 2 LLM applications (out of 20 companies) in our batch at SkyDeck. Six months ago, we had roughly 12. At this point, I’d say the vast majority of our (non-biotech, non-robotics, non-hardware, non-hardtech) software startups rely on LLMs as a core part of their tech stack.

All of this implies that simply building an LLM application is no longer interesting, but a given, an expectation, a must. It represents the kind of disruption in technology that only happens once in a generation, and it’ll be as ubiquitous as the rise of personal computing or databases.

LLM demos aren’t enough.
With all of these LLM applications, investors also have enough data to conclude that while LLMs are easy to run demos with, they are exceptionally difficult to build commercial applications with. It turns out that models that answer questions correctly 85% of the time create a confusing experience for customers, who’ve come to expect our technology to behave in a deterministic, consistent manner. When we turn on the dishwasher, we expect it to wash our dishes 100% of the time.

Founders aren’t expected to fundamentally solve this challenge, but is the use case you’re tackling sufficiently important (to *your* customers) that they’re willing to put up with these minor inconveniences? Or can you build a workflow that cleverly obscures this behavior, such that your customers don’t have to rework their expectations completely?
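As a rough illustration of the kind of workflow the paragraph above describes, here is a minimal sketch (all names are hypothetical, and the “model” is a toy stand-in for a real LLM call) of validating each model response before the customer ever sees it, retrying on failure, and degrading gracefully rather than surfacing an unpredictable answer:

```python
import random
from typing import Callable, Optional

def reliable_call(
    model: Callable[[str], str],
    prompt: str,
    validate: Callable[[str], bool],
    max_retries: int = 3,
    fallback: Optional[str] = None,
) -> str:
    """Call an LLM, check its output against an application-specific
    validator, and retry until an answer passes or retries run out."""
    for _ in range(max_retries):
        answer = model(prompt)
        if validate(answer):
            return answer
    # Retries exhausted: degrade gracefully instead of showing the
    # customer a wrong or confusing answer.
    if fallback is not None:
        return fallback
    raise RuntimeError("model never produced a valid answer")

# Toy stand-in for a real LLM: correct 85% of the time.
def flaky_model(prompt: str) -> str:
    return "42" if random.random() < 0.85 else "banana"

random.seed(0)  # for a reproducible demo
result = reliable_call(
    flaky_model,
    "What is 6 * 7?",
    validate=lambda s: s.isdigit(),
    fallback="Sorry, please rephrase your question.",
)
```

The design choice here is that the customer only ever sees a validated answer or an explicit fallback, which restores a measure of the deterministic behavior customers expect.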

Investors will be looking for proof you have answers to this key question. In short, fundraising on the basis of a concept, no matter how sexy, is increasingly unlikely. We’ve tracked enough LLM applications over the past 12 months to know that the hard part isn’t imagining the use case, but rather executing on the product itself.

LLM startups are just startups.

On a highly related note, in most ways you should expect your company to be evaluated like any SaaS startup over the previous decade. You’ll need revenue, you’ll need usage. If you’re looking to raise a $3–5mm round, you might be expected to have 1–3 business customers ready to pay and be on track for $100k+ ARR.

Or, if you’re building a B2C solution, then the usual numbers like CAC, 4-week retention, and cohort analysis are as relevant as they ever were.

Pilot contracts need to be scrutinized carefully.

Except… sometimes, LLM startups have it even more difficult than non-LLM startups.

Traditionally, if a B2B software company is able to excite their enterprise customers enough to convince them to pay upfront for a pilot, that’s a strong signal for investors. Your product must be solving a serious problem with a compelling value proposition, or it would have never made it through the typical challenges of an enterprise contracting process.

However, in the LLM era, enterprise interest is no longer as strong of a signal as it used to be. CEOs read the same news headlines that we do, and everyone in the C-suite has their eyes peeled for the LLM-powered solutions that will fundamentally disrupt their industry. Cold inbounds regarding yet another CRM or data analytics tool might go unread, but companies that sprinkle LLMs into their sales pitch get business customers excited.

But investors with LLM startups in their portfolio now have plenty of examples of pilot deals that never converted into actual commercial contracts. Business customers still need to see a positive ROI for the software solutions they pay for; they’re simply willing to suspend their disbelief long enough to push the decision point back a few months, until after the pilot has been completed.

Data isn’t a moat.

For traditional deep learning, we knew that data (quality and quantity) was critical. Startups could successfully tell a story that, by accumulating data, they’d have a lasting edge in training a model superior to their competitors’. This story was validated in companies ranging from autonomous driving to TikTok.

The first generation of LLM startups borrowed the same line. By getting deep within a vertical, by gathering, cleaning, and understanding real world data, startups would build a massive moat over future competitors through better performing applications.

12 months in, that doesn’t yet appear to be true. The value of fine-tuning LLM behavior is still present, but it’s no longer clear the trade-offs are massive, especially when synthetic data (also generated by LLMs!) appears to be very effective. At the same time, few-shot and zero-shot RAG applications require deep insight, but not necessarily large amounts of customer data.
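To make concrete why RAG lessens the need for proprietary training data, here is a minimal, toy sketch of the pattern (the similarity function, document set, and prompt format are all illustrative assumptions; a production system would use embeddings and an actual LLM API rather than the bag-of-words scoring below):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: cosine(q, Counter(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a grounded prompt; a real app would send this to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping to Canada takes 5-7 business days.",
]
prompt = build_prompt("What is the refund policy?", docs)
```

The knowledge lives in retrievable documents supplied at query time, not in model weights, which is why a competitor with a comparable document set can reproduce the behavior without accumulating training data.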

Of course most developers are just focused on building a successful application right now, but investors need to consider where the market is headed 3–5 years from now. Even if a portfolio company successfully proves a high value use case for a massive vertical market… does that help if copycats can spin up compelling alternatives with ease?

We are now in year two of the LLM revolution, and I’m sure many of my insights will look silly in hindsight. I look forward to being educated by the founders building the solutions of the future. Come to SkyDeck, and we’ll help you think through all of these considerations together and carve out a story that convinces institutional VCs to join your journey!

Chon Tang is the Founding Partner of the Berkeley SkyDeck Fund. If you’re interested in learning more about Berkeley SkyDeck or applying, visit: skydeck.berkeley.edu
