Programming Etude #1

Quincy
Jan 21, 2023

--

Poets of Sound and Time (Stanford CS 470 / MUS 356)

Description:

This assignment is about making an experiential poem with its own soundtrack. We use word embeddings (vector representations of our text) to synthesize and modulate the audio that makes up the poem’s soundtrack. More details here: https://ccrma.stanford.edu/wiki/356-winter-2023/etude1

Code and content are in this drive folder (Stanford login required): https://drive.google.com/drive/folders/14mhib2sev-Xu08A3ABd2zV6EFlf6MSP2?usp=share_link

Poem 1: PoemGPT

The role of AI in generating creative content is exciting to discuss because it carries real-world implications that will come into play in the near future. It gets at the core of why we appreciate art in the first place, and I think it’s fitting to use ChatGPT to generate a poem reflecting on this. Specifically, I ask it to ponder what Shakespeare meant when he wrote “the play’s the thing” in Hamlet. For context, I take the line as an opening to discuss whether the play (or art as a whole) is appreciated simply for its beauty and craft or as a vehicle for meaning imparted by the artist.

I use the glove-wiki-gigaword-50-tsne-2-filtered.txt word embedding model to generate a vector for the final word (the keyword) in each line. Three values per keyword, namely its distance from “play,” the 0th dimension of its vector, and the length of the keyword, are used to modulate filter cutoff frequency, delay wet/dry mix, and the likelihood that extra time will be added to each bar of the beat that plays behind the poem.
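To make the mapping concrete, here is a minimal Python sketch of the idea, not the code from the drive folder. The vectors are invented 2-D stand-ins for rows of the embedding file, the parameter ranges (200 to 8000 Hz, and so on) are arbitrary, "length" is assumed to mean character count, and the pairing of each feature with a particular audio parameter is my assumption for illustration.

```python
import math

# Hypothetical 2-D vectors standing in for rows of the t-SNE-reduced GloVe
# file; these numbers are invented for illustration only.
EMBEDDINGS = {
    "play": (0.0, 0.0),
    "stage": (3.0, 4.0),
    "thing": (-6.0, 8.0),
    "art": (1.0, -1.0),
}


def normalize(value, lo, hi):
    """Linearly rescale value from [lo, hi] to [0, 1], clamping out-of-range input."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def features(word, anchor="play"):
    """Return (distance from anchor, 0th dimension, character count) for a keyword."""
    vec = EMBEDDINGS[word]
    dist = math.dist(vec, EMBEDDINGS[anchor])
    return dist, vec[0], len(word)


def audio_params(word, max_dist=10.0, dim_range=(-10.0, 10.0), max_len=10):
    """Map keyword features to audio controls (assumed pairing and ranges)."""
    dist, dim0, length = features(word)
    return {
        # Farther from "play" -> brighter filter (assumed 200 Hz to 8 kHz range).
        "cutoff_hz": 200.0 + normalize(dist, 0.0, max_dist) * (8000.0 - 200.0),
        # 0th embedding dimension -> delay wet/dry mix in [0, 1].
        "wet_mix": normalize(dim0, *dim_range),
        # Longer keyword -> higher chance of adding extra time to the bar.
        "extra_time_prob": normalize(length, 0, max_len),
    }


if __name__ == "__main__":
    for keyword in ("stage", "thing", "art"):
        print(keyword, audio_params(keyword))
```

In a real patch these values would be sent to the synthesis engine once per line of the poem; the clamping in `normalize` keeps an outlier keyword from pushing a parameter out of its usable range.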

<video in drive folder>

Poem 2: planeShifting

The promise of word embeddings is not just to place similar words near one another but to map the relationships in meaning between words. For example, the operation <king> - <man> + <woman> should yield a vector close to <queen>. The second poem asks the user to input four words. The first three define a plane, and the vector from the first word to the fourth specifies a direction. The poem then translates the plane along that vector. If your inputs are “dogs,” “love,” “running,” and “cows,” you might expect the fully translated plane to span “cows,” “adore,” and “mooing.” The music for this poem is motivated by the idea that we are completing an etude. It is harmonically simple and built from repetitive patterns and note lengths.
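The translation step is the same offset arithmetic as the king/queen analogy, applied to three points at once. Below is a rough Python sketch under stated assumptions: the 3-D vectors are invented so that the example works out cleanly (a real embedding rarely does), and the helper names `translate_plane` and `nearest_word` are hypothetical, not taken from the project code.

```python
import math

# Invented 3-D vectors chosen so the analogy lands exactly; these are not
# values from any real embedding model.
VOCAB = {
    "dogs": (1.0, 1.0, 0.0),
    "love": (2.0, 3.0, 1.0),
    "running": (4.0, 1.0, 2.0),
    "cows": (6.0, 2.0, 1.0),
    "adore": (7.0, 4.0, 2.0),
    "mooing": (9.0, 2.0, 3.0),
}


def nearest_word(point):
    """Return the vocabulary word whose vector is closest to point."""
    return min(VOCAB, key=lambda w: math.dist(VOCAB[w], point))


def translate_plane(w1, w2, w3, w4, t=1.0):
    """Shift the three plane-defining points by t * (w4 - w1).

    t = 0 leaves the plane where it started; t = 1 is the full translation.
    """
    offset = tuple(t * (b - a) for a, b in zip(VOCAB[w1], VOCAB[w4]))
    return [
        tuple(x + dx for x, dx in zip(VOCAB[w], offset))
        for w in (w1, w2, w3)
    ]


def words_on_translated_plane(w1, w2, w3, w4, t=1.0):
    """Snap each translated point back to its nearest vocabulary word."""
    return [nearest_word(p) for p in translate_plane(w1, w2, w3, w4, t)]


if __name__ == "__main__":
    print(words_on_translated_plane("dogs", "love", "running", "cows", t=0.0))
    # -> ['dogs', 'love', 'running']
    print(words_on_translated_plane("dogs", "love", "running", "cows", t=1.0))
    # -> ['cows', 'adore', 'mooing']
```

Stepping `t` from 0 to 1 in small increments gives the intermediate words the poem passes through on the way.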

<video in drive folder>

Reflection:

Truth be told, this felt like an etude in why word embeddings fell out of favor. Try as we might to predict how the model will represent words and analogies, more often than not we’re met with gibberish. Tools aside, it was exciting to control music synthesis with words, and I found that freeing myself from more explicit techniques led to more fluid and unexpected results.

I’m interested in going the other direction. Can I analyze sound and construct vectors from it? Could spectral information control where in the vector space I draw words from? Could representing mode, and having the music modulate from major to minor, create meaningful and repeatable differences in the text? It would be fun to find out!
