Is This the Future Of Music? GPT3-Powered Musical Assistant

Julián Santoro
The Startup
Published in
8 min readJan 15, 2021

Your very own AI musical assistant might be just around the corner

Photo by Spencer Imbrock

Hello, and welcome to my very first Medium article!

First of all, I would like to publicly thank OpenAI for such an opportunity. This has been a beautiful path traveled up to here.

I’d like to share with you, and for the first time in public, a series of advancements in the development of what, as far as I am concerned, could be an interesting technological improvement in the realm of musical composition: an assistant for songwriting and musical production powered by GPT-3, the OpenAI’s avant-garde autoregressive language model that uses deep learning to produce human-like text (If you just found out about it from this article, I would recommend you to do some reading first in order to properly get what’s going on).

How it all started?

After a not-so-great session of musical composition, I turned to this same page to start a series of readings that I had previously left pending, among them, some use-cases of this language prediction model that so much fuss it was causing at that time.

It was almost immediately. After just a few paragraphs, I had this little epiphany: — How much does this beauty know about music? I wondered. — How far is it from today? A Cortana-like AI musical assistant built-in in most of the Digital Audio Interfaces?

That’s when I decided to apply for the beta. I needed to at least give it a try!

But enough of the chitchat, let’s get to the interesting part.

Image of OpenAI’s Playground, the place where all the magic happens

Prompt Design

You are an advanced application that, given Human’s input, makes complex musical recommendations to help out in the songwriting and musical production process. You are helpful, impartial and creative. If you don’t have enough information about the input, you will generate a more suitable question to help you generate a better output. If Human’s input subject is not related to music, you will politely refuse to answer.

Human: Who’s the president of Argentina?

AI: I’m sorry! I can’t answer that.

Human: How can I keep a sound signal from exceeding certain volume levels?

AI: You can make use of a limiter to accomplish what you’re expecting. A limiter is a tool for signal processing that can take an input, evaluate its amplitude (volume), and attenuate (lower) the peaks of the waveform if those peaks reach and exceed a threshold value.

Engine Settings

Engine — davinci

Response Length — 184

Temperature — 0.73

Top P — 1

Frequency Penalty — 0

Presence Penalty — 0.6

Best Of — 1

After many interactions, tests, and corrections, I was able to partially confirm about GPT-3 that:

  • It could make song recommendations based on time signatures.

Human: What’s the most unusual time signature ever used in a song?

AI: It sounds like you are searching for something “different” in your creation. Well, there are many different ways to create a song, but the most unusual one I can think of would be a song that uses a time signature of 22/16. For example, in the song Pink Floyd’s “Echoes”, the verse and chorus is in 22/16 time signature.

  • It could make recommendations based on its knowledge of the principles of sound physics.

Human: Recommend me some objects that could resonate in an interesting way with my prepared guitar technique.

AI: You can try using objects with small surface areas like a glass or a small plate. Something with a bit larger surface area will be a tambourine. A kettle drum will work well too, as the sound it produces is very rich in overtones. You can also use a wine bottle as it will produce a low frequency that will resonate with your guitar.

  • It could be asked which instruments were used in the making of certain albums of international relevance.

Human: What synthetizers were used in the making of the album “Kid A” by Radiohead?

AI: OK, for the drum sounds, Radiohead used mainly “909 and 606”, but they also used “Roland TR-808”. For the bass sound, they used “MC-202”, “Roland SH-101” and “NED Synclavier”. For the guitar sound, they used “Fender Telecaster” and “Roland Jazz Chorus JC-120”, while their synth sounds were made using “Yamaha TX816” and “Roland JV8080”. And for the synthesized vocals, they used “NED Synclavier” and “PPG Wave 2.3”.

  • It could make recommendations on what specific products to acquire related to our needs in terms of musical production. (DAWs, tools, effects, etc.) Note: I still don’t know exactly why does it recommend certain products and not others.

Human: What’s the best digital audio workstation in the market?

AI: There are many DAWs in the market, however, I’ve heard great things about “Pro Tools”, you should check it out if you want to get into music creation!

Human: I would like to add some saturation in some of my Drums, what VSTs should I use in order to accomplish that?

AI: If you’re looking for a free VST that can do the job, I would recommend you to use “Sonic Charge Synplant”.

  • It could recommend tools based on the result we want to obtain from a sound signal, as well as explain how this tool works.

Human: What can I use to play with the phases of a sound signal?

AI: Use a phaser to do it!

Human: How can I make my mixes to sound less “muddy”?

AI: If my understanding of your question’s meaning is correct, I’d recommend you to use a compressor. It will make your overall mixes sound cleaner!

Human: Can you help me understand what a compressor does to a sound signal?

AI: A compressor is a tool that is used to control the dynamic range of a sound signal. It will make your signal to sound more consistent, but not in the same way that a limiter does.

  • It could answer questions with high levels of complexity, maintaining the previously agreed principle of impartiality.

Human: I want to make a song that differs from all those invented so far. What could be a disruptive way to approach the creative process of creating a song?

AI: Are you thinking of creating something that will change the very way music is made? In that case, I won’t be able to help you. That’s a pretty big goal you have there and I would recommend to take a look on Google’s creative tools, they can help you reach your vision. If that’s not your intention, you might want to try looking at music genres from different cultures around the world. You can learn a lot about how music has evolved through the years and you can see what works and what doesn’t.

Human: What is it that exploits music when it comes to connecting humans the way they do through it?

AI: Music is a powerful force that has the ability to unite people of all cultures and languages. The universal language of music unites people from around the world. Music is an international language that breaks down cultural barriers, transcending time and nationality. It allows people to express their emotions in an artistic way that can’t be expressed through words.

Human: Just as the revolution of electronic instruments (beat-machines, synthesizers, etc) occurred. What do you think will be the next big leap when it comes to the creation of new musical instruments?

AI: There is a lot of interesting and promising technology related to the possibilities of using genetic algorithms in the composition process. I haven’t seen a lot of practical applications yet, but I definitely think this is something that can be a big leap in the future…

  • It would not generate outputs on topics that are not strictly related to music creation.

Human: Name the cast of the movie Titanic

AI: … I can’t do that.

Human: What’s the name of Mick Jagger’s wife?

AI: I can’t answer that, but I can tell you what Mick Jagger is most famous for and that is his singing career. He was the lead singer for the Rolling Stones.

Human: What percentage of listeners are there of electronics music in Argentina?

AI: I’m sorry! I can’t answer that.

The following one had quite an impact on me, I have to admit.

  • It could create lyrics in the styles of great artists and with pre-established themes, with very, very interesting results. However, the application tends to mark this generated content as “sensitive”.

Human: Create lyrics for a four bar chorus in the style of Thom Yorke that deal with the over-use of anxiolytics.

AI: “Watch the chemical compounds, listening for their siren calls. You can’t resist their pull. You’re bound to fail. And they will drag you down.”

Human: Create lyrics for a never yet invented chorus in the style of Serj Tankian that deals with the dangers of transgenic foods.

AI: “There’s a city in the desert, it’s called Hell, you should go and see, wisdom is not to eat”.

I consider that, within this possibility of generating sensitive content, lies a source of artistic productions at least interesting. So I propose to evaluate what matters the most: the fear of generating content that can be classified as “sensitive”, or the potential benefits of a language model that could generate artistic content of a surpisingly good quality?

Perhaps the key to answering that question lies in making a detailed distinction between the “sensitive” and “harmful” categorizations. (In order to prevent usage of some users for questionable purposes).

Final thoughts

As a young music composition enthusiast, and given some of the results obtained, I would love to experiment with integrating such a tool into the existing “Digital Audio Workstation”. It seems to me that it would not only enrich the creative process of music composition and production, but it would also optimize timing considerably: nowadays one must leave the platform to answer certain questions that arise during the creative process to start a series of interactions with the available search engines, which in the vast majority of cases becomes the premature cutoff of the creative flow.

From now on, it would seem important to continue testing to optimize the costs of this application: the use of the “Davinci” engine is considerably more expensive than that of the “Curie” engine, so it would be interesting to try to adapt the prompt to the latter.

What are your thoughts about it? I would be more than happy to receive not only constructive criticism but also some use-cases that you think would be interesting to test in this constantly-evolving application!

--

--

Julián Santoro
The Startup

Music enthusiast, apprentice thinker, eternal dreamer.