Measuring the subjective — improving the quality of AI-generated music as a PM

Richard Cadman
Aug 18, 2018 · 6 min read

For context, I used to be a Product Manager at Jukedeck, a London-based AI music startup. I probably knew the least about music there.

Jukedeck’s warmup for Boiler Room at Slush Music Conference

Music quality is subjective, so we broke it down into three groups of levers we could pull: composition, arrangement and production.


We brainstormed tests to measure incremental improvements in music quality, and considered a range of variables.


To avoid analysis paralysis we picked the simplest test: present two tracks (one before a change, one after) and ask the user to rate them both on an arbitrary scale.


Using Google Scripts we could quickly run a test at scale. We’d serve up two random tracks (one old, one new) for a user to compare and rate.
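As a rough illustration of that pairing step, here is a minimal Python sketch. It is not the script we actually ran: the function name `make_comparison`, the track lists, and the shuffling of playback order (so the listener cannot tell which version is which) are all assumptions made for this example.

```python
import random


def make_comparison(old_tracks, new_tracks, rng=random):
    """Pick one old and one new track at random for a side-by-side rating.

    `old_tracks` and `new_tracks` are hypothetical lists of track
    identifiers (for example URLs or file names).
    """
    pair = [
        ("old", rng.choice(old_tracks)),
        ("new", rng.choice(new_tracks)),
    ]
    # Assumption: randomise the playback order so position does not bias ratings.
    rng.shuffle(pair)
    return pair


if __name__ == "__main__":
    comparison = make_comparison(["old_01.mp3", "old_02.mp3"],
                                 ["new_01.mp3", "new_02.mp3"])
    for version, track in comparison:
        print(version, track)
```

Keeping the version label next to each track means the ratings can be matched back to "old" or "new" afterwards, even though the listener never sees the label.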


We tracked two measures: 1) the percentage of users who preferred the new tracks to the old, and 2) the size of the improvement in quality.
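To make the two measures concrete, here is a small Python sketch of how they could be computed from paired ratings. The function name `summarise` and the input shape are hypothetical, and using the mean difference in scores as the second measure is an assumption for illustration only. Ties are counted as not preferring the new track.

```python
from statistics import mean


def summarise(ratings):
    """Summarise paired ratings.

    `ratings` is a list of (old_score, new_score) tuples, one per comparison.
    Returns (percentage preferring new, mean score uplift).
    """
    if not ratings:
        raise ValueError("no ratings collected")
    # Measure 1: share of comparisons where the new track scored strictly higher.
    preferred_new = sum(1 for old, new in ratings if new > old)
    pct_preferring_new = 100.0 * preferred_new / len(ratings)
    # Measure 2 (assumed definition): average of new score minus old score.
    mean_uplift = mean(new - old for old, new in ratings)
    return pct_preferring_new, mean_uplift


if __name__ == "__main__":
    pct, uplift = summarise([(5, 7), (6, 6), (8, 7), (4, 9)])
    print(f"{pct:.0f}% preferred the new track, mean uplift {uplift:+.2f}")
```

Because the scale is arbitrary, the uplift only means something relative to earlier runs of the same test. The percentage is easier to read at a glance.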

Written by Richard Cadman

Product Manager @Monzo. Formerly @Jukedeck @Newton_Europe @Cambridge_Uni
