How TikTok Can Save Clubhouse

The 5 problems Clubhouse faces and how TikTok solves them

Dayo Akinrinade
Dialogue & Discourse
6 min read · Jan 4, 2023

Audio has significant advantages as a medium over video and text. It’s more empathic, it’s easier to create, and it can be consumed passively, while doing something else, like jogging or cooking. But despite these advantages, the most iconic social audio startup of all — Clubhouse — has failed to break out.

Why?

Currently, most social audio products adhere to a single format: the now-familiar one that Clubhouse pioneered and that Twitter Spaces copied. In this format, a number of speakers talk on a stage while others listen from the audience. Listeners can raise their hand to join, and although the audio can be recorded for playback later, the vast majority of listening time is live.

While the ‘Clubhouse format’ is optimized to encourage conversation and sharing, the drawbacks to this format are readily apparent …

The Five Drawbacks of the Clubhouse Format

  1. Low listening-value-to-word ratio. Podcasts compete with social audio for passive-listening time, but podcasts are generally produced shows, with much of the content left on the cutting-room floor. Live, unedited social audio therefore tends to deliver lower-quality content per minute than podcasts.
  2. Live content is inherently difficult to discover. By definition, not everyone on a social audio network can be live at the same time. To get traction, live networks must amass “user density.” Twitter and LinkedIn can launch audio spaces in the same vein as Clubhouse because they already have user density. However, even with millions of users, not every user can be live and creating content all the time. The result is a timing problem: your favorite Creator is unlikely to be live when you are ready to listen.
  3. Even quality live content is mostly out-of-context. On social audio apps, listeners almost always join a talk in progress, so they lack context on what was discussed previously. This leads to frequent “room resets” as talented hosts try to bring everyone along, but that in itself illustrates the problem. In contrast, podcasts provide very clear context: the listener always starts at the beginning of an episode and knows what to expect.
  4. Recorded social audio content loses the FOMO appeal of the live offering. In Clubhouse’s format, the beating heart of the content is live conversation, and because live conversation has a FOMO component, the content becomes less appealing the moment it is recorded. It’s almost like listening to a game live versus watching the replay the next day. When Clubhouse added recording of all live shows, there was no noticeable impact on growth. Additionally, the drastically variable talk lengths, often in excess of two hours, make it logistically difficult to find time to listen to a talk in one sitting.
  5. The addition of the text-comments ‘backchannel’ surrendered the hands-free advantage of passive listening. One of audio’s biggest advantages as a medium is that it can be consumed passively; however, adding text comments to the core of the experience on Clubhouse had the unintended effect that both listeners and speakers must now continuously monitor the screen, complicating and confusing what had once largely been a hands-free experience.

Thus Spake Zuckerberg

While these five drawbacks may be holding the Clubhouse format back, they need not hold back social audio as a medium. Meta’s Zuckerberg has recently said, “Audio will be a first-class medium,” and noted that audio is likely to have multiple formats that gain traction, just as other mediums do.

“There’s room for smaller, shorter audio formats.” — Zuckerberg on PressClub

For example, within text, social networks have grown up around short-form (Twitter) and long-form (Substack & Medium). Within video, networks have grown up around live (Twitch) and recorded (YouTube) as well as short-form (TikTok).

While the format Clubhouse pioneered is great for fostering live conversation within a community, it is not built to produce high-quality content, the kind you might want to listen to later. Paul Davison, Clubhouse’s co-founder, admitted as much on a recent episode of the 20VC podcast, when he said:

“Clubhouse is about conversation not content.” — Paul Davison, founder, Clubhouse

In contrast, TikTok’s genius is in producing a high-quality feed of user-generated content that commands more than 60 minutes a day from its users. It is about content, not only conversation.

The 3 Lessons TikTok can teach Clubhouse

Now imagine if this genius were applied to audio… to create an alternative social audio format — one in which the content is curated in such a way as to be compelling enough to rival podcasts.

  1. Asynchronous audio allows the entire content library to be available to content discovery algorithms. While TikTok offers a Live feature, live is not integral to the TikTok experience, which makes it quite different from live on Clubhouse. TikTok became famous for, and was defined by, its asynchronous video format and its seemingly magical AI. The AI and the asynchronous nature of the video go hand-in-hand: only by having access to all content ever produced on TikTok can the algorithm be optimal. The social media world is ripe for an asynchronous social audio startup with an optimal discovery algorithm to break through.
  2. Short audio is easier to digest, easier to share, easier to create, and easier to discover — Short audio is easy for passive listeners to digest in the in-between moments, when you might just want a quick listening experience. With short audio, listeners know what length to expect and always listen from the beginning, solving the Clubhouse format’s out-of-context problem. Shorter content is also easier to share on alternative networks. In fact, videos created on TikTok account for a sizable portion of Instagram Reels and YouTube Shorts. Short audio is also more comprehensible to machine-learning algorithms, improving content discovery.
  3. Swipe-oriented design allows for faster algorithmic learning. TikTok’s design centers on the swipe because that design allows for a maximum number of signals to inform the content recommendation algorithm. Clubhouse is feed-based and not as algorithm-friendly as TikTok.

These three lessons from TikTok combine to solve the primary drawback of the Clubhouse format: the atrociously low listening-value-to-word ratio. The algorithm can curate a tiny fraction of exceptional listening from the vast amount of short audio files available to it.

While podcasts rely on planning and post-production editing to create high-quality content, this short-form social audio startup will rely on easy content creation to create vast amounts of content and will rely on a machine-learning algorithm to curate that content in a compelling way, leveraging swipe-based signals and speech-to-text data extraction.
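
To make the curation idea concrete, here is a minimal sketch in Python of how swipe-based signals might be turned into a ranking. It is an illustration only, not how TikTok or any real app works: the function name `rank_clips`, the "fraction listened before swiping" signal, and the smoothing parameters are all assumptions made for this example.

```python
from collections import defaultdict


def rank_clips(events, top_k=3, prior_weight=5, prior_mean=0.5):
    """Rank short audio clips by how much of each clip listeners hear.

    events: iterable of (clip_id, fraction_listened) pairs, where
    fraction_listened is the share of the clip heard before the listener
    swiped away (0.0 = swiped immediately, 1.0 = heard it all).
    All names and parameters here are hypothetical.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for clip_id, fraction in events:
        # Clamp the signal to [0, 1] so bad data cannot dominate a score.
        totals[clip_id] += min(max(fraction, 0.0), 1.0)
        counts[clip_id] += 1

    # Smoothed average: clips with few plays are pulled toward prior_mean,
    # so a single enthusiastic listener cannot outrank a proven clip.
    scores = {
        clip_id: (totals[clip_id] + prior_weight * prior_mean)
        / (counts[clip_id] + prior_weight)
        for clip_id in counts
    }
    # Highest score first; ties are broken by clip id for stable output.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


events = [("a", 1.0)] * 10 + [("b", 0.1)] * 10 + [("c", 1.0)]
print(rank_clips(events, top_k=2))
```

Every swipe adds one more data point per clip, which is why a swipe-oriented design can feed a ranking like this faster than a feed that is scrolled past without an explicit signal.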

My sense is social audio is on the verge of the next big thing. The success of TikTok suggests there must be an asynchronous, short-form, swipe-based social audio format that breaks through. It’s not a question of if but of when.

Dayo is the founder of the Wisdom app.

Founder @joinwisdom @africlick | @FT Top 100 Tech Leader | MSc Technology @UCLSoM | BSc Comp Sci @OfficialUoM ex @Accenture @Deloitte @thisisYSYS