
Neha Mittal
Jan 10, 2024 · 5 min read

Notice something off? Remind you of email spam? Time to rethink it.

We spent the last couple of weeks (technically the last 5ish years) exploring short-form copy. This time, we zoomed out, drawing parallels across the industry on how it’s crafted and delivered to users.

What is short-form copy?

Phrases under 30 words: SEO descriptions, email subject lines, button text, or taglines on a creative ad.

Why does short-form copy matter?

With so much generic content produced by AI today, brevity & on-point relevance are more important than ever. Tiny tweaks to content, such as email subjects, SEO snippets, and button text, can significantly boost traffic and conversions. Even changes as small as adding a new word or using Gen Z slang have driven gains of millions of daily active users for companies.

When you see the notification “Elon just t̶w̶e̶e̶t̶e̶d̶ posted,” remember we carefully inserted “just” because millions more opened it with that word. It was a small tweak, but it made a big impact.

Sounds like just words matter? (No pun intended.)

If these changes are clearly (and counterintuitively) very impactful, why aren’t companies going ham on copy experimentation?

As a matter of fact, 30% of A/B experiments are text changes. But let’s take it a notch further: why don’t we live in a world where every single user sees content that’s uniquely tailored to them?

Enter the long content iteration cycle.

If you have worked in Marketing or Growth, you are already smirking. Every small change in content directly presented to users goes through a 1–3 month (sometimes 6-month) cycle before it’s shipped. The iteration cycle is some flavor of these steps:

A typical copy change cycle

Step 1: Writing copy variations

Let’s go back to our notification example: “Elon just posted …”.

Traditionally, a content writer would take 1–2 days to write variations of this copy.

“{first name} just posted”, “{first name} recently posted”, “Recent post from {first name}”. You get the idea.

With Gen AI, it takes probably a few hours, after several prompt iterations.
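For the curious, here’s roughly what that looks like in code. This is a minimal sketch assuming the OpenAI Python SDK; the model name and prompt are illustrative, and in practice you’d iterate on the prompt several times (as noted above).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask for variants while keeping the dynamic placeholder intact.
prompt = (
    "Write 5 short variations of this push notification, under 10 words each. "
    "Keep the {first name} placeholder exactly as-is: '{first name} just posted'"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; any chat model works
    messages=[{"role": "user", "content": prompt}],
)
variants = [line.strip("-• ").strip()
            for line in response.choices[0].message.content.splitlines()
            if line.strip()]
print(variants)
```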

Step 2: Reviewing and approving the copies

The larger team is tagged for review in a Google Doc, and this is how the exchange goes:

Product: “Doesn’t even appeal to the user. Have you seen Duolingo notifications?”

Trust & Safety: “Remove the 32 sensitive terms and then we are gold.”

Legal: “[ACP] Let’s trade off the humor for plain language, given the reputational risk.”

Marketing: “The branding is off. We sound like a machine; can we stick to our voice?”

Localization expert: “The translations don’t quite catch the nuances of French, so we can’t ship in French-speaking markets. Can we crowdsource these to native French speakers?”

Engineer: “Umm, the remaining copies don’t even work. We already experimented with them last quarter. Can we iterate again?”

The Step 2 → Step 1 → Step 2 loop continues and can eat up months.

So we asked all of these folks: how do you evaluate which variants are “good”, anyway?

“You know, it’s subjective.”

“We just eyeball and pick a few.”

“We add ad hoc rules, like: don’t use exclamations.”

“Objective content quality evaluation doesn’t quite exist, we are building some quality principles.”

“This was not a problem when we had limited copies from writers. Gen AI scaled content generation, but it doesn’t scale quality. It’s like finding a needle in a haystack of highly generic copies.”
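Those ad hoc rules are easy to make explicit, and that’s often where teams start. A minimal sketch (the banned terms and thresholds below are stand-ins, not anyone’s real list):

```python
MAX_WORDS = 30  # "short-form" per the definition above
SENSITIVE_TERMS = {"free", "winner", "act now"}  # stand-ins for a real T&S list

def passes_basic_checks(copy: str) -> bool:
    """The 'ad hoc rules' made explicit: length, tone, banned terms."""
    if len(copy.split()) > MAX_WORDS:
        return False
    if "!" in copy:  # the "don't use exclamations" rule
        return False
    lowered = copy.lower()
    return not any(term in lowered for term in SENSITIVE_TERMS)

candidates = ["Neha just posted", "You're a WINNER, act now!", "Recent post from Neha"]
print([c for c in candidates if passes_basic_checks(c)])
# ['Neha just posted', 'Recent post from Neha']
```

Rules like these catch the obvious failures; they say nothing about whether a copy will actually perform.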

Step 3: Integrating it with the experimentation code

Integrating every small copy change in code takes time. Here’s why:

  1. Stitching dynamic content with static content: Combining dynamic content (e.g., {first name}) with static content (“just posted”) means embedding placeholder keys when the copy is written, pulling in values at runtime, and stitching the pieces together so the overall phrase still reads coherently (see the first sketch after this list).
  2. Rotating copies across users is an unsolved problem: Deciding how many variants to run per experiment, balancing novelty with consistency for each user, and assessing the effectiveness of algorithms like multi-armed bandits is a hard nut to crack (see the bandit sketch after this list).
  3. Latency and scalability: Every copy variation needs to be served via low-latency calls to reach users in real time.
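To make point 1 concrete, here’s a minimal stitching sketch using Python’s standard-library string.Template; the placeholder name and fallback are illustrative:

```python
from string import Template

# Copy is authored with named placeholders; values are pulled at runtime.
COPY = Template("$first_name just posted")

def render(copy: Template, user: dict) -> str:
    # safe_substitute leaves unknown placeholders intact instead of raising,
    # so a missing profile field degrades gracefully.
    return copy.safe_substitute(first_name=user.get("first_name", "Someone"))

print(render(COPY, {"first_name": "Elon"}))  # Elon just posted
print(render(COPY, {}))                      # Someone just posted
```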
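And for point 2, here is one common flavor of multi-armed bandit, an epsilon-greedy rotator. This is a toy sketch; a production system would also need per-user consistency and delayed-feedback handling:

```python
import random

class EpsilonGreedyRotator:
    """Mostly serve the best-performing copy; occasionally explore the rest."""

    def __init__(self, variants: list[str], epsilon: float = 0.1):
        self.variants = variants
        self.epsilon = epsilon
        self.shows = {v: 0 for v in variants}
        self.opens = {v: 0 for v in variants}

    def open_rate(self, v: str) -> float:
        # Unseen variants get +inf so each gets tried at least once.
        return self.opens[v] / self.shows[v] if self.shows[v] else float("inf")

    def pick(self) -> str:
        if random.random() < self.epsilon:
            return random.choice(self.variants)  # explore
        return max(self.variants, key=self.open_rate)  # exploit

    def record(self, variant: str, opened: bool) -> None:
        self.shows[variant] += 1
        self.opens[variant] += int(opened)

rotator = EpsilonGreedyRotator(["{first name} just posted",
                                "Recent post from {first name}"])
copy = rotator.pick()
rotator.record(copy, opened=True)
```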

Step 4: Running the experiment

This takes 2–4 weeks, depending on traffic, so that the experiment gains enough statistical power.
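Why weeks? A back-of-the-envelope power calculation shows the traffic needed. A sketch using the statsmodels library, with illustrative open rates:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Detecting a lift from a 4.0% to a 4.4% open rate (numbers are illustrative)
effect = proportion_effectsize(0.044, 0.040)
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"~{n_per_arm:,.0f} users per variant")  # ≈ 20k per arm for these numbers
```

Smaller lifts, or more variants, push the required sample (and the runtime) up fast.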

Step 5: Analysis

The stakeholder group reviews experiment metrics and records significant insights in a Google Doc, Excel sheet, or launch email. Occasionally, a disciplined engineer may add these details to a log of all learnings stored in a separate Excel sheet. However, these valuable insights often remain tribal knowledge within the engineering, product management, or marketing team and never programmatically inform the next content iteration.
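Compare that with what a structured learnings record could look like, one the next iteration could actually query. Everything below (field names, numbers) is illustrative:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CopyLearning:
    """One experiment insight, structured instead of buried in a doc."""
    copy_text: str
    surface: str        # e.g. "push_notification"
    metric: str         # e.g. "open_rate"
    lift_pct: float     # observed lift vs. control
    significant: bool
    run_date: date
    notes: str = ""

history = [
    CopyLearning("{first name} just posted", "push_notification",
                 "open_rate", 2.1, True, date(2024, 1, 10),
                 notes="'just' beat 'recently' in every locale tested"),
]
# The next iteration can query history instead of re-trying old copies.
```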

Step -1: Starting over

It’s a fresh quarter and we are ready to run our next copy iteration. Where do we begin?

From scratch. There is no history of previously tried copies (well, it’s in some Excel sheet). The reviews were scattered across docs, meetings, and Slack. The experiment analysis was in another doc. The brand and tone need to be set again.

Psssttt, painful. We just spent months adding a four-letter word to a notification copy.

So, this week we are asking ourselves:

The History & Versioning

  1. What if we have a string management system that maintains the versioning history of all your copy iterations?
  2. What if it lets you create “templates” of prompts for each product so you can always speak your unique voice?
  3. What if it also allows you to instantly incorporate your experiment insights back into the next iteration?

The Quality

  1. What if it gives you a fairly objective quality evaluation of AI-generated copies (e.g., performance, factual correctness, brand and safety adherence)?
  2. What if it offers in-built aggregate industry insights for every type of copy and lets you make them more sophisticated over time?
  3. What if it can evaluate translation quality with nuanced cultural context?

The Efficiency

  1. What if it makes it really easy for non-technical folks to discover, edit and play with dynamic parts of a copy?
  2. What if it truly lets each function collaborate effectively? What if it lets different teams share learnings across copy iterations (or differentiate them)?

The Vision

  1. Is it time to build the genetic algorithm for copy evolution? (Toy sketch below.)
  2. What if we create a world where every single user sees unique content that is tailored to them?
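To make the first question concrete: treat copies as genomes, synonym swaps as mutation, splices as crossover, and an experiment metric as fitness. A toy sketch, with brevity standing in for real fitness:

```python
import random

SYNONYMS = {"just": ["recently", "now"], "posted": ["shared", "published"]}

def mutate(copy: str) -> str:
    """Swap one word for a synonym, when we know one."""
    words = copy.split()
    i = random.randrange(len(words))
    words[i] = random.choice(SYNONYMS.get(words[i], [words[i]]))
    return " ".join(words)

def crossover(a: str, b: str) -> str:
    """Splice the front of one copy onto the back of another."""
    wa, wb = a.split(), b.split()
    cut = random.randint(1, min(len(wa), len(wb)) - 1)
    return " ".join(wa[:cut] + wb[cut:])

def evolve(population: list[str], fitness, generations: int = 5) -> list[str]:
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(ranked) // 2]  # keep the fittest half
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in parents]
        population = parents + children
    return sorted(population, key=fitness, reverse=True)

seed = ["Neha just posted", "Neha recently shared a post",
        "New post from Neha", "See what Neha just shared"]
# Real fitness would be a measured metric like open rate; brevity stands in here.
print(evolve(seed, fitness=lambda c: -len(c))[0])
```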

We are building justwords.ai in an attempt to answer these questions. We’d love to hear from you if you’d like to be an early partner, or even if you just want to share an opinion.

Write to us at founders@justwords.ai or find us on LinkedIn.
