Published in Memory Leak

Confluent’s S-1 Analysis — Forward Motion ⏩

Data-in-motion company Confluent recently filed its S-1 with $100M as a placeholder for the offering. Confluent pioneered the category of streaming data infrastructure based on the founders’ Apache Kafka project, created at LinkedIn in 2011. Kafka enables real-time data from multiple sources to stream constantly across the organization. It is one of the most popular open source projects in the world, with 19K GitHub stars. Confluent currently has 2.5K customers generating $263M in revenue over the last twelve months. Founded in 2014, Confluent has ~1.5K employees and is headquartered in Mountain View, CA.

Confluent’s S-1 emphasizes that all companies are not just adopting more software but are becoming software themselves. They claim rich and personalized customer experiences based on real-time data are the new imperative. For organizations to keep up, data sources need to be integrated in real-time to be relevant, and applications need to be able to react continuously at the speed of business. To accomplish this, businesses require data infrastructure that provides connectivity across the entire organization with real-time flow and processing of data, and the ability to build applications that react and respond to that data flow. Enter Confluent.

Factors beyond customer expectations are also driving the need for real-time streaming. The proliferation of microservices creates the need for more connectivity between services. IoT is becoming more prevalent, driving significant data growth. And as Machine Learning (ML) becomes mainstream, it requires increasing amounts of streaming data to be effective. I called out streaming technologies as a key data trend to watch in 2020.

Traditional data stores such as data warehouses, relational databases, and NoSQL databases are best at managing data at rest that can be queried at a point in time. When building systems using traditional databases, teams must build separate point-to-point connections for every system that needs to be connected, resulting in a proliferation of connections. In turn, this can force companies to resort to periodic data dumps and batch processing.

Confluent claims data-at-rest databases are too slow to serve the real-time nature of modern customer experiences and operational needs, and that they create “a big mess” of connection sprawl. Data in motion flips this data-at-rest design 180 degrees. Rather than bringing queries to data at rest, Confluent’s platform is architected to stream data in motion through the query. This continuous stream makes the data always available and enables companies to tap into flows of data being generated anywhere in the company and continually process it. In turn, Confluent becomes a central nervous system that connects a company’s disparate software systems, unifying the business and enabling it to react intelligently in real-time.
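To make the contrast concrete, here is a minimal Python sketch. It does not use Kafka's or Confluent's API: the function names, the order events, and the 100.0 threshold are all hypothetical, and a plain generator stands in for a stream.

```python
from typing import Iterable, Iterator


def query_at_rest(table: list[dict], min_amount: float) -> list[dict]:
    """Point-in-time query: scan whatever rows are stored right now."""
    return [row for row in table if row["amount"] >= min_amount]


def query_in_motion(stream: Iterable[dict], min_amount: float) -> Iterator[dict]:
    """Standing query: each event passes through the filter as it arrives."""
    for event in stream:
        if event["amount"] >= min_amount:
            yield event


def order_events() -> Iterator[dict]:
    """Hypothetical event source standing in for a stream of orders."""
    for order_id, amount in enumerate([20.0, 150.0, 75.0, 300.0], start=1):
        yield {"order_id": order_id, "amount": amount}


# Data at rest: collect everything first, then ask the question once.
stored = list(order_events())
batch_result = [row["order_id"] for row in query_at_rest(stored, 100.0)]

# Data in motion: the question is fixed and the events flow through it.
stream_result = [e["order_id"] for e in query_in_motion(order_events(), 100.0)]
```

Both approaches find the same orders here. The difference is timing: the batch version answers only when someone runs it, while the streaming version emits each match as soon as the event shows up.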

An interesting aspect of Confluent is that it has natural network effects. The S-1 states users typically start with an initial use case and grow over time. As more use cases are adopted, more applications and systems become connected, which then leads to more data in motion being processed by their platform. Streams of data naturally attract more applications which brings even more streams of data which creates a virtuous expanding flywheel. This network effect increases value to both individual participants and the whole organization.

Confluent offers two products: Confluent Cloud and Confluent Platform. Confluent Cloud is a fully managed cloud-native offering, available on all of the major cloud providers (AWS, GCP, and Microsoft Azure). It leverages a serverless architecture and offers data compatibility with fully managed Schema Registry; rapid development through 100+ fully managed pre-built connectors; and real-time processing with fully managed ksqlDB, a database that unifies the processing of data in motion and data at rest. Confluent Cloud is offered to customers via a pay-as-you-go model with no commitment, or via an annual or multi-year subscription model where customers draw down upon a committed dollar amount. Confluent Platform is an enterprise-grade self-managed software offering that can be deployed on-premises as well as across public and private cloud environments. Confluent Platform is offered to customers via an annual or multi-year subscription.

A few other features are worth highlighting. Customers use Confluent Control Center (C3) to manage and monitor data in motion as it scales across the enterprise. It is a web-based Graphical User Interface (GUI) for understanding the data-in-motion environment, meeting SLAs, and controlling key components of the data-in-motion platform. Confluent also has multi-region cluster support, self-balancing cluster support, and tiered storage, which allows deployments to recognize two tiers of storage: local disks and cost-efficient object stores (Amazon S3 or Google Cloud Storage). Service offerings include professional services, education, and certification programs.

Today Confluent estimates it addresses a $50B market, taking spend from four Gartner categories: 1) $31B in Application Infrastructure & Middleware (excluding Full Life Cycle API Management, BPM Suites, TPM, RPA, and DXPs), 2) $7B in Database Management Systems (excluding Prerelational-era DBMS), 3) $7B in Analytics and Business Intelligence (excluding Traditional BI Platforms), and 4) $4B in Data Integration Tools and Data Quality Tools (excluding other Data Integration Software). They calculate their total market opportunity will increase to $91B in these four market segments by 2024, representing a 22% compound annual growth rate (CAGR).
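As a quick sanity check on these market figures, the sketch below sums the four segments and recomputes the growth rate. The three-year horizon (2021 to 2024) is my assumption; the S-1 figures quoted above do not state the base year explicitly.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end / start) ** (1 / years) - 1


# Segment sizes in $B, as listed above.
segments_b = {
    "Application Infrastructure & Middleware": 31,
    "Database Management Systems": 7,
    "Analytics and Business Intelligence": 7,
    "Data Integration and Data Quality Tools": 4,
}

# The segments sum to 49, which is consistent with the rounded $50B headline.
segment_total_b = sum(segments_b.values())

# ASSUMPTION: three compounding years between "today" and 2024.
implied_cagr = cagr(start=50.0, end=91.0, years=3)
print(f"segments: ${segment_total_b}B, implied CAGR: {implied_cagr:.1%}")
```

Under that assumption the implied rate rounds to the 22% Confluent cites.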

There are numerous competitors. Fully managed alternatives include Azure Event Hubs (Microsoft Corporation), Amazon Kinesis and Amazon DynamoDB Streams (AWS), and Cloud Pub/Sub and Cloud Dataflow (Google). On-premises alternatives include TIBCO Streaming, Cloudera Dataflow, Red Hat (IBM) AMQ Streams, and Oracle Cloud Infrastructure Streaming. There are also other players the S-1 does not call out, including Vectorized, Pulsar, StreamNative, and

Confluent’s revenue is composed of subscriptions (Confluent Platform and Confluent Cloud) and services. Over the past twelve months the company generated $262.7M in revenue, up 53% YoY. This growth rate is above the 44% YoY of top-quartile publicly traded SaaS companies but below that of many recent tech IPOs. Confluent achieved $236.6M in consolidated revenue in FY20, up 58% YoY from $149.8M in FY19. For the first quarter of 2021, ending March 31, 2021, Confluent’s revenue expanded to $77M, up 51% YoY from $50.9M in FQ1’20. Subscription revenue accounted for 87% and 88% of total revenue during FY19 and FY20, respectively, representing 60% YoY growth. Services represented 12% of revenue in FY20. Subscription revenue accounted for 86% and 88% of total revenue during FQ1’20 and FQ1’21, respectively, representing a 55% increase YoY.
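The YoY figures follow directly from the reported revenue. A small sketch of the arithmetic, using only the numbers quoted above (the function name is mine):

```python
def yoy_growth(current: float, prior: float) -> float:
    """Year-over-year growth as a fraction, e.g. 0.58 for 58%."""
    return current / prior - 1


# Revenue in $M, as quoted above.
fy20_growth = yoy_growth(current=236.6, prior=149.8)   # FY20 vs FY19
fq1_21_growth = yoy_growth(current=77.0, prior=50.9)   # FQ1'21 vs FQ1'20

print(f"FY20: {fy20_growth:.0%}, FQ1'21: {fq1_21_growth:.0%}")
```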

In FY20, Confluent Platform achieved $177.2M in revenue, representing a 53% increase from FY19, and revenue from Confluent Cloud was $31.4M, representing a 117% increase from FY19. For FQ1’21, revenue from Confluent Platform was $54.1M, representing a 43% increase from FQ1’20, and revenue from Confluent Cloud was $13.9M, representing a 124% increase from FQ1’20. Revenue from pay-as-you-go arrangements represented an immaterial portion of Confluent Cloud revenue during the period. In the last quarter, Confluent Platform continued to represent the majority of subscription revenue at roughly 79%, versus Confluent Cloud at 21%. Confluent is focused on becoming a SaaS company and is still going through the transition.

Revenue is derived from both U.S. and international customers. During FY20 and FQ1’21 international revenue represented 34% and 36% of Confluent’s total revenue, respectively. Other than the United States, no other individual country accounted for 10% or more of total revenue for FY19, FY20, and FQ1’21.

The customer count continues to increase. As of March 31, 2021, Confluent had over 2.5K customers, up from 1.5K customers twelve months earlier. There are 561 customers with $100K+ in annual recurring revenue (ARR), up from 374 such customers in FQ1’20, a 50% YoY increase. The company has 60 customers with $1M+ in ARR, compared to 33 such customers for the same period last year, up 82% YoY. This data supports Confluent’s land-and-expand strategy. They have good penetration into large accounts, with 136 of the Fortune 500 companies being customers (27%). The 136 Fortune 500 customers contributed approximately 35% of revenue in FQ1’21. Importantly, Confluent estimates that Apache Kafka has been used by over 70% of the Fortune 500. No single end customer represented more than 3% of total revenue for FY20 or FQ1’21.

Confluent achieved a good twelve-month trailing net dollar retention rate of 125% for FY20, down from 134% in FY19. Management stated the decline in the net retention rate was primarily driven by existing customers becoming a larger portion of both the overall customer base and ARR, large initial deal sizes that incorporate potential growth, the impact of the COVID-19 pandemic, and the initial impact of existing customers transitioning to the usage-based Confluent Cloud offering. Confluent achieved 117% net dollar retention in FQ1’21. The median net dollar retention rate for a publicly traded SaaS company is 118%, so Confluent is in line.
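For readers less familiar with the metric, here is a sketch of one common way net dollar retention is computed. This is a generic definition, not necessarily the exact methodology in the S-1, and the customer names and ARR figures are made up for illustration.

```python
def net_dollar_retention(arr_year_ago: dict[str, float],
                         arr_now: dict[str, float]) -> float:
    """ARR today from the customers that existed a year ago, divided by
    those same customers' ARR a year ago. New customers are excluded;
    churned customers count as zero."""
    base = sum(arr_year_ago.values())
    retained = sum(arr_now.get(customer, 0.0) for customer in arr_year_ago)
    return retained / base


# HYPOTHETICAL cohort, ARR in $K.
arr_year_ago = {"acme": 100.0, "globex": 200.0, "initech": 100.0}
arr_now = {
    "acme": 200.0,     # expanded
    "globex": 300.0,   # expanded
    # "initech" churned, so it contributes 0
    "umbrella": 80.0,  # new customer, ignored by the metric
}

ndr = net_dollar_retention(arr_year_ago, arr_now)
print(f"net dollar retention: {ndr:.0%}")
```

A rate above 100% means expansion within the existing base more than offsets churn and contraction, which is why the metric is read as evidence for land and expand.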

Moving on to gross margin: this is revenue minus the cost of goods sold (COGS), which includes things like hosting costs and customer support, expressed as a percentage of revenue. Confluent achieved an overall gross margin of 68% in FY20, up from 67% in FY19. Interestingly, subscription gross margin declined slightly from 78% in FY19 to 76% in FY20 because of the growth of Confluent Cloud and the associated third-party cloud infrastructure costs. Services gross margin increased from (7%) in FY19 to 6% in FY20 because of the decrease in travel-related costs in light of COVID-19 travel restrictions. Our research suggests the median gross margin for publicly traded SaaS companies is 74%, so Confluent is slightly below. As Confluent Cloud continues to gain steam and affect COGS, it will be interesting to watch how the company manages its gross margin.
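The overall margin is a revenue-weighted blend of the subscription and services margins. The sketch below rebuilds it from the rounded percentages quoted in this post, so it is approximate; the 13% services share for FY19 is inferred as the remainder after the 87% subscription share rather than taken from the filing.

```python
def blended_gross_margin(mix: dict[str, tuple[float, float]]) -> float:
    """Weighted average of segment margins.
    `mix` maps segment name -> (share of total revenue, segment gross margin)."""
    return sum(share * margin for share, margin in mix.values())


fy20 = blended_gross_margin({
    "subscription": (0.88, 0.76),
    "services": (0.12, 0.06),
})

fy19 = blended_gross_margin({
    "subscription": (0.87, 0.78),
    "services": (0.13, -0.07),  # ASSUMPTION: services share = 1 - 0.87
})

print(f"FY20: {fy20:.0%}, FY19: {fy19:.0%}")
```

The blend lands on the reported 68% and 67%, and it shows why the overall margin ticked up even as the subscription margin fell: the services margin swung from negative to positive.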

Among operating expense items, Confluent spent the most on S&M, at 70% of revenue in FY20. Since Apache Kafka is open source, awareness and use of Apache Kafka typically begin before Confluent’s sales effort. The enterprise sales team then takes these initial engagements and works to monetize them. For Confluent Cloud, customers can get started via a free cloud trial and easily convert online to become paying customers.

The company has a poor magic number of 0.56. A magic number of 1 or more is generally taken to indicate efficient S&M spend. On profitability, our research suggests the median operating margin for publicly traded SaaS companies is (14%), so Confluent is well below its peers at (99%) in FY20. In terms of net income margin, Confluent achieved (97%) in FY20, worsening from (63%) a year earlier.
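The magic number has several variants. The sketch below uses one common form, annualized quarter-over-quarter revenue growth divided by the prior quarter's S&M expense, which may differ from the calculation behind the 0.56 above. The inputs are hypothetical and are not Confluent's figures.

```python
def magic_number(revenue_current_q: float,
                 revenue_prior_q: float,
                 sm_expense_prior_q: float) -> float:
    """Annualized net new quarterly revenue per dollar of prior-quarter S&M."""
    net_new_annualized = (revenue_current_q - revenue_prior_q) * 4
    return net_new_annualized / sm_expense_prior_q


# HYPOTHETICAL inputs in $M.
example = magic_number(
    revenue_current_q=110.0,
    revenue_prior_q=100.0,
    sm_expense_prior_q=80.0,
)
print(f"magic number: {example:.2f}")
```

Read this way, a value of 0.5 means each dollar of S&M bought about fifty cents of new annualized revenue.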

Confluent raised $456M in total funding from backers including Index, Sequoia, Altimeter, and Coatue. It last raised money in a $250 million Series E round in April 2020 at a $4.5B valuation.

Confluent’s IPO registration touches on a few trends. First, every company is becoming a software business. Second, businesses are moving to real-time data streams to support customer experiences, microservices, IoT, and ML. Finally, data gravity and network effects allow data infrastructure companies to easily grow with users. After being private for seven years, it will be exciting to watch as Confluent goes public.



Astasia Myers

Founding Enterprise Partner @ Quiet Capital, previously Investor @ Redpoint Ventures and Cisco Investments