AI in Music Production: Here’s What I Learned

Hans-Martin ("HM") Will
Nov 4, 2023

Over the past few months, I have followed the topic of AI in Music more closely. Primarily motivated by personal interest in electronic music production, it has also been a great vehicle to take a closer look at the underlying generative AI technology while naturally narrowing the scope of literature to review.

I am now reaching the sixth month of this journey, and it feels like a good time to summarize some of my learnings and takeaways. I am not including all the pointers and references that back up these points here; you will find many of them in my previous blog series. My main observations so far:

  • Functional music (“muzak”) as beachhead: We see AI-generated music being first adopted in applications where music is playing a functional, secondary role, rather than being in the forefront. Examples are stock audio music for presentations and commercials, gaming audio, general background music. Interestingly enough, it is not so much the (currently!) lesser artistic quality of AI-generated music that is driving those areas of adoption. Instead, these are applications where one wants to avoid paying royalties, and it is actually desirable to have music that comes across as familiar rather than novel and surprising.
  • We’ll see an oversupply of AI music generators: The more closely I follow this field, the more providers of AI-based music production tools and music generators I see. To a certain extent, I believe that is because music is an innate aspect of human nature, and therefore entrepreneurs are more likely to take on the risk that comes with creating a venture in this space. However, the end state is likely to resemble the existing music technology market, where a few larger platform vendors host content components (“plugins”) from many, many small vendors.
  • Access to training data is a key constraint: Overall, the community lacks training sets that are large, deep and broad enough to cover sufficiently wide musical spaces. In the language world, development of large language models was enabled by resources such as Common Crawl, which are easily accessible even to small organizations. In addition, the question of what constitutes “fair use” when applied to language fragments tended to favor using crawled content over abstaining from it. In the case of music, the situation is a lot more complex. First, music tends to be locked up by a few large vendors inside pay-walled content platforms. Second, concerns of IP infringement are elevated because the musical patterns that an AI may want to learn lend themselves more directly to being the basis of an infringement claim.
  • Gap between consumer and pro offerings: At present we can observe two major categories of applying AI in music production: 1. End-to-end generation of audio tracks, which requires only a limited amount of input (e.g. textual prompts) and typically targets non-expert users. 2. Tools that integrate into the workflow of professionals and target specific steps in the overall production process. This is also known as “Intelligent Music Production”. Between those two lies an opportunity for AI-enabled composition and production tooling that can be used iteratively and interactively, and that becomes an intelligent partner and accelerator: a copilot for making music.

With these factors at play, here are a few opportunities for AI-based music technology that we are seeing.

  • Monitoring of IP violations: With concerns about IP so central to the ongoing discussion, an immediate opportunity is to use AI technology for audio content analysis to detect violations and opportunities for additional royalty collection. There are already vendors in this space covering some aspects, but there is an opportunity to cover larger parts of the overall value chain, from musical motifs to TikTok videos.
  • Providing a framework for fractional licensing and attribution: Coming from a different direction, one can conceive of creating a framework that provides proper lineage and attribution of IP across the value chain. For example, for elements created using an AI model, fractional attribution may be given to the originators of the underlying training content to the extent it was applied. Technically, this would require an integration between models and content generation platforms to allow for the required watermarking and tracing, for example using distributed ledger technologies.
  • Providing a platform for AI-enabled music production: I’m coming back to the idea of a copilot for music production. In order for such a copilot to provide assistance throughout the creative process, from idea to final production, it needs access to the production across all its constituent elements. For example, it needs to see melodic, harmonic and rhythmic patterns across voices to understand their interactions and musical development. However, the current generation of Digital Audio Workstations (DAWs), while providing plugin interfaces at the level of individual components to be inserted into individual tracks, does not provide such a view to external extensions. We’ve seen isolated workarounds: For example, iZotope’s mixing and mastering plugins use a separate back-channel to enable AI-assisted adjustments across tracks. But that capability is limited to tracks that have one of iZotope’s components present, and it only works for those specific tools by that one vendor. So, I posit that there is an opportunity to create and establish a technology and platform architecture that supersedes the current DAW design, which goes back to the 1990s (yes, that’s 30 years back).
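To make the last point a bit more concrete, here is a minimal sketch of what a session-level view could look like. Everything in it is my own illustration: the names `Note`, `Track`, `Session`, `sounding_at` and `pitch_classes_at` are hypothetical and do not correspond to any existing DAW or plugin API. The idea is simply that an assistant can ask what is sounding across all tracks at a given beat, which a plugin inserted into a single track cannot do.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Note:
    pitch: int       # MIDI note number, e.g. 60 = middle C
    start: float     # onset, in beats
    duration: float  # length, in beats


@dataclass
class Track:
    name: str
    notes: List[Note] = field(default_factory=list)


@dataclass
class Session:
    """Hypothetical whole-project view, as opposed to a per-track plugin view."""
    tracks: List[Track] = field(default_factory=list)

    def sounding_at(self, beat: float) -> Dict[str, List[int]]:
        """Pitches sounding at `beat`, grouped by track name."""
        view: Dict[str, List[int]] = {}
        for track in self.tracks:
            pitches = sorted(
                n.pitch for n in track.notes
                if n.start <= beat < n.start + n.duration
            )
            if pitches:
                view[track.name] = pitches
        return view

    def pitch_classes_at(self, beat: float) -> Set[int]:
        """Pitch classes (0 = C, ..., 11 = B) sounding across all tracks."""
        return {
            pitch % 12
            for pitches in self.sounding_at(beat).values()
            for pitch in pitches
        }


# Example: a held bass note under a short two-note figure in the keys.
session = Session(tracks=[
    Track("bass", [Note(pitch=36, start=0.0, duration=4.0)]),
    Track("keys", [Note(pitch=64, start=0.0, duration=2.0),
                   Note(pitch=67, start=0.0, duration=2.0)]),
])

print(session.sounding_at(1.0))
print(session.pitch_classes_at(1.0))
```

In this toy example, a plugin sitting on the bass track would only ever see a single C, while the session-level query sees C, E and G together and could recognize the harmony they form. A real platform would of course need far more than this (audio, automation, routing, timing), but the cross-track query is the piece that today’s per-track plugin interfaces leave out.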

In parallel, here’s my take on how generative AI will affect and influence the success of human artists. To a certain extent, these are along trajectories that came with the general democratization of music production over the past two decades (“laptop artists”):

  • Standing out means breaking away from the crowd: With the barrier to creating music that sounds “professional” and “familiar” being lowered even further through the use of generative AI, the true opportunity for artists lies in standing out and providing something different than the average and established mainstream. Nothing really new here, but given that generative AI is creating artifacts driven by the center of gravity of statistical distributions — that is averages in the most direct sense — deviating from the common is the most effective way to overcome this competitive threat. That, of course, does not prevent an artist from using AI techniques as productivity enhancing tools.
  • Performer and entertainer versus content generator: In addition, beyond creating content that is unique, it will be even more important to build a profile as a performer and entertainer rather than as a mere content creator. We have reached a point where a single AI platform has, over the course of a few months, created more content than the overall inventory of one of the largest streaming services. So content in itself will be lost unless augmented by something else. Again, not a new trend, but one that AI is accelerating dramatically. Therefore, any form of building a direct relationship with the audience as a performer, entertainer or social media personality will ultimately overshadow the quality of musical content alone.
  • Developing true musical and entertainment skills is more important than ever: The previous point leads to this last corollary. In order to stand out from the crowd as a musician and to capture audiences, those underlying skills are the ones that will need to be continuously stretched and developed.

These points summarize my key takeaways from the past few months. Obviously, generative AI in music is a hot topic these days, and announcements and news are coming in daily. As such, we can look forward to many exciting new developments in this space. It will be interesting to see to what extent the observations and insights above will hold or be invalidated. I am also looking forward to any comments or feedback you may have.

Hans-Martin ("HM") Will

Technologist & Product Builder - AI, Data & Spatial Computing