On Notation: writing music with Sonic Pi

Gordon Guthrie
Jun 2, 2022

One of Joe Armstrong’s tricks when talking about the importance of notation was to invite you to do long division in Roman numerals:

MMLCXVII ÷ XIV

It is no accident that the numerical world before Arabic (well, Indian) numerals was Babylonian, built around the magical number 60 and her daughters, its factors:

LX ÷ XXX = II
LX ÷ XX = III
LX ÷ XV = IV
LX ÷ XII = V
LX ÷ X = VI
LX ÷ VI = X
LX ÷ V = XII
LX ÷ IV = XV
LX ÷ III = XX

The whole point of the old money (240d to the £, 20 shillings of 12d to the pound, with a ten bob note at 120 pence, a crown at 60d, a half-crown at 30d, a florin or two shillings at 24d, a sixpence, a thruppence and a tuppenny bit) was to be a division dodger. Pounds themselves were bundled in decimals: fivers, tenners, twenties.

The coins themselves were marked Lsd, for libra, solidi and denarii; the British pound sign £ is just a foncy L.

We are all familiar with musical notation:

Music for Frédéric Chopin’s Prelude, Opus 28, No 7

So what is our notation for music in Sonic Pi?

The basic data structure is a ring, that is to say an array that loops on itself, and it is written like this:

(ring :c4, :d4, :e4, :f4, :g4, :a4, :b4)

If you can read music you will recognise this as C Major in the treble clef:

C Major scale in the Treble Clef

We use the same notation to represent beats:

kick = (ring true, false, false, false)
snare = (ring false, true, false, true)
closed_hi_hat = (ring true, true, false, true)
open_hi_hat = (ring false, false, true, false)
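To see the looping at work outside Sonic Pi, here is a minimal sketch in plain Ruby. TinyRing and hits_per_beat are hypothetical names made up for this illustration, not Sonic Pi's own implementation; all they show is the wrap-around indexing that lets a four-step pattern repeat for as many beats as you like.

```ruby
# TinyRing is a hypothetical stand-in for Sonic Pi's ring, in plain Ruby.
# It only demonstrates wrap-around indexing; the real ring does much more.
class TinyRing
  def initialize(*items)
    @items = items
  end

  # Any integer index wraps around the length, so the pattern loops.
  def [](index)
    @items[index % @items.length]
  end
end

# List which drums sound on each beat, given a hash of name => ring.
def hits_per_beat(drums, beats)
  (0...beats).map do |beat|
    drums.select { |_name, ring| ring[beat] }.keys
  end
end

drums = {
  kick:  TinyRing.new(true,  false, false, false),
  snare: TinyRing.new(false, true,  false, true)
}

# Two bars of four beats from patterns that are only four steps long.
hits_per_beat(drums, 8).each_with_index do |names, beat|
  puts "beat #{beat}: #{names.inspect}"
end
```

Beat 4 plays the kick again because index 4 wraps back round to step 0.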

One of my many obsessions in coding is with readability — and in my opinion this has a lot to do with alignment (and people hate me for it, hi Sean! hi Timotej!).

Here’s some code from my Pometo implementation (of which more later):

match_types(X,        X)        -> X;
match_types(_,        variable) -> runtime;
match_types(variable, _)        -> runtime;
match_types(number,   boolean)  -> number;
match_types(boolean,  number)   -> number;
match_types(_X,       _Y)       -> mixed.

The use of monospace fonts allows the imposition of a vertical architectural grid over the bare implementation. Contrast the canonical indentation:

match_types(X, X) -> X;
match_types(_, variable) -> runtime;
match_types(variable, _) -> runtime;
match_types(number, boolean) -> number;
match_types(boolean, number) -> number;
match_types(_X, _Y) -> mixed.

Erlang is a pattern-matching language: this is a six-clause function, and which clause gets executed is determined by a pattern match against the arguments it is called with.

By using vertical alignment you can easily scan this code in two directions:

  • where will this function call go?
  • where did this value come from?

The benefits of this approach are a bit occluded with this example — I had to choose code that wouldn’t wrap around the page in Medium.

A better way is to view the programme architecture without the bother of being able to read the actual code. This is a pixelated and shrunk view of the code — you can clearly see the programme structure without reading any code at all:

Pixelated and shrunk code

(If you are writing a new pretty printer for a language I beseech you in the bowels of Christ to examine its output shrunk and pixelated…)

Applying this approach to our earlier Sonic Pi drum rhythms we get:

kick          = (ring true,  false, false, false)
snare         = (ring false, true,  false, true)
closed_hi_hat = (ring true,  true,  false, true)
open_hi_hat   = (ring false, false, true,  false)

This suddenly seems familiar, look at the simplest beat makers:

Screenshot of beatmaker

You see the same grid pattern in industrial strength DAWs like Ableton (hi, Brandon!), Vocaloid and Garageband.

When you are live coding music with Sonic Pi you are manipulating code on the fly to make sounds, and it’s important to be able to reason about the music you are making as you go. Alignment is part of this: I can now inspect my rhythms using alignment only, and I know what part plays when and what other part is playing alongside it.

I am a big fan of the programming language APL (I am learning it (slowly) by writing my own APL for the BEAM, pometo, from the Esperanto for little apple, little apl).

One of the maxims that APL programmers live by is that you need to reason about the programme, and that consequently a good APL programme can be expressed on a single page of paper:

Example APL programme

APL is at heart a very simple data manipulation language that operates on matrices (and recursively on matrices of matrices) and has a rich tool set of operators which can be combined to do so. It looks as lovely on the keyboard as it does terrifying on the screen, all them symbols:

Dyalog APL keyboard

“That looks disgusting, I wouldn’t programme that, too many damn symbols!” you say, well hoo-boy have I bad news for you, teen music maker:

Table of musical symbols

So in the spirit of APL the Sonic Pi data structures look a bit clunky, a bit hard to reason about when live coding. I want to reduce them. The first thing is to swap the booleans for numbers:

kick          = (ring 1, 0, 0, 0)
snare         = (ring 0, 1, 0, 0)
closed_hi_hat = (ring 1, 1, 0, 1)
open_hi_hat   = (ring 0, 0, 1, 0)

That’s a bit better.

I am a big fan of Edward Tufte — the king of information design, I have his reprint of Minard’s famous chart of Napoleon’s march on Moscow up in the hall:

Minard’s chart of Napoleon’s march on Moscow

One of Tufte’s maxims is use less ink. So let’s look at our new reduced ring representation. Where is the excess ink? All those commas. Applying the principles of APL, alignment and Tufte together, let’s get the most compact representation that packs the most information onto the page. Let’s swap the ring for a function that compiles a string to a ring:

kick          = tabb("1000100010001000100010001000100010001000")
snare         = tabb("0100010001000100010001000100010001000100")
closed_hi_hat = tabb("1101110111011101110011011101110111011101")
open_hi_hat   = tabb("0010001000100010001100100010001000100010")

I am sticking a whole range of bars together here, and if you look closely you can see I am dropping a cymbal variant in on the 5th bar.

This is going a bit too far: we don’t really have enough ink, so let’s add some more:

kick          = tabb(["1000","1000","1000","1000","1000"])
snare         = tabb(["0100","0100","0100","0100","0100"])
closed_hi_hat = tabb(["1101","1101","1101","1101","1100"])
open_hi_hat   = tabb(["0010","0010","0010","0010","0011"])

And this is a generalisable approach — the function name is tab for tablature, decorated with a b for boolean. But we can have tabn that takes [0–9] numbers or a hex that returns hexadecimal, etc, etc.

This is pretty good — I can reason about multiple drums over multiple bars on the fly when I am live coding. But we still have too much ink, those pesky quotes and commas.
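For what it's worth, here is roughly how tabb and tabn could be sketched in plain Ruby. The names come from the text above, but the bodies are my assumptions about the behaviour, not the working code: both accept either one long string or a list of bars, and both return a plain array (in Sonic Pi you would turn that into a ring by calling .ring on it).

```ruby
# Hypothetical sketches of tabb and tabn. The bodies are assumptions made
# for illustration, not the real implementation.

BEAT_SYMBOLS = { "1" => true, "0" => false }.freeze

# tabb: "1" becomes true and "0" becomes false.
# Accepts one long string or a list of bars; the bars are simply joined.
# Hash#fetch raises KeyError on any other character, so typos fail loudly.
def tabb(bars)
  Array(bars).join.each_char.map { |char| BEAT_SYMBOLS.fetch(char) }
end

# tabn: each digit 0-9 becomes an integer (a volume, say, or a sample index).
# Integer(char, 10) raises ArgumentError on anything that is not a digit.
def tabn(bars)
  Array(bars).join.each_char.map { |char| Integer(char, 10) }
end

p tabb(["1000", "1100"])
# prints [true, false, false, false, true, true, false, false]
p tabn(["0369"])
# prints [0, 3, 6, 9]
```

Splitting the pattern into bars is purely for the human reader; the compiled result is the same as for one long string.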

So far we have seen rhythmic tablatures written out in code, but there is no reason not to write melodic tablatures out either:

bass = tab2(["ceg.", "CEG."], :c4)

In my current working code this returns the notes c e g in the 4th octave then a rest :r and then the same pattern except in the 5th octave.
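A sketch of that behaviour in plain Ruby might look like the following. It is reconstructed only from the description above, so treat the details as assumptions: lower-case letters are notes in the base octave, upper-case letters are one octave up, a dot is a rest, and only the octave number of the second argument is used.

```ruby
# Hypothetical sketch of tab2, based only on the behaviour described in the
# text. Assumptions: lower case = base octave, upper case = one octave up,
# "." = rest (:r), and only the octave digit(s) of the base note matter.
def tab2(bars, base)
  octave = base.to_s[/\d+\z/].to_i   # :c4 -> 4
  Array(bars).join.each_char.map do |char|
    case char
    when "."      then :r
    when "a".."g" then :"#{char}#{octave}"
    when "A".."G" then :"#{char.downcase}#{octave + 1}"
    else raise ArgumentError, "unexpected character #{char.inspect}"
    end
  end
end

p tab2(["ceg.", "CEG."], :c4)
# prints [:c4, :e4, :g4, :r, :c5, :e5, :g5, :r]
```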

Sonic Pi works by you writing code in a Ruby dialect which is then passed to the server, lexed, parsed and turned into a runtime which is executable.

It is perfectly possible to intercept our tablatures and implement them directly in the Sonic Pi language. This is a common procedure. Consider this Ruby:

if "Do you like cats?".match(/like/)
  puts "Match found!"
end

The bit between the sigils / is actually its own programming language — regular expressions. Normally they are a bit more baroque:

def ip_address?(str)
  !!(str =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/)
end

Inside each of your cells are trapped ancient bacteria, mitochondria, little factories of goodness that you can’t live without but are basically not-human. So it is with programming languages: most of the ones you use contain regexes, an old primeval programming language with its own syntax that once bubbled freely in the primordial soup of low-powered computing.

If we decide that musical notation needs to be a first-class language in its own right, we can then design an embedded syntax for it applying our core principles:

  • notational importance
  • alignment-as-structure
  • less ink
  • as much on a single page as possible

Here’s one syntax:

kick   = /1... 1... 1... 1... 1... 1... 1... 1... 1... 1... 1.../b
snare  = /..1. ..1. .1.. .1.. .1.. ..1. ..1. ..1. ..1. ..1. ..1./b
closed = /11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.. 11.1 11.1 11.1/b
open   = /..1. ..1. .1.. .1.. .1.. ..1. ..1. ..11 ..1. ..1. ..1./b

The compiler would simply turn the notation into a ring for consumption in the code. If you printed kick you would see:

(ring true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false, true, false, false, false)
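What that compile step has to do is small. Plain Ruby can't grow a new literal without changing the parser, so in this sketch the pattern arrives as an ordinary string; compile_beats is a hypothetical name of mine, and the rule (drop the spaces, "1" is a hit, "." is a rest) is my reading of the notation above.

```ruby
# Hypothetical sketch of what a compiler for the /.../b notation might emit.
# The pattern is passed as a plain string because Ruby has no such literal.
# Spaces only separate bars for the reader, so they are dropped first.
def compile_beats(source)
  source.delete(" ").each_char.map do |char|
    case char
    when "1" then true
    when "." then false
    else raise ArgumentError, "unexpected character #{char.inspect}"
    end
  end
end

kick = compile_beats("1... 1... 1...")
p kick.length     # prints 12
p kick.first(4)   # prints [true, false, false, false]
```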

Here’s another possible syntax:

kick   = /-   |-   |-   |-   |-   |-   |-   |-   |-   |-   |-   |/b
snare  = /  - |  - | -  | -  | -  |  - |  - |  - |  - |  - |  - |/b
closed = /-- -|-- -|-- -|-- -|-- -|-- -|-- -|--  |-- -|-- -|-- -|/b
open   = /  - |  - | -  | -  | -  |  - |  - |  --|  - |  - |  - |/b

The difference, of course, is how easy it is to read. The 8th bar cymbal change is more clearly visible in the 2nd than the 1st example (go back and check).
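One bonus of explicit bar lines is that a compiler can check them. In this sketch (again with a made-up name, compile_bars, and assuming four steps to the bar) a bar of the wrong width is rejected outright, so a slipped space shows up as an error rather than as a rhythm that drifts.

```ruby
# Hypothetical sketch for the second syntax: "-" is a hit, a space is a rest
# and "|" closes each bar. Returns one array of booleans per bar.
# The default of four steps per bar is an assumption for this example.
def compile_bars(source, steps_per_bar = 4)
  source.split("|").map do |bar|
    unless bar.length == steps_per_bar
      raise ArgumentError, "bar #{bar.inspect} is not #{steps_per_bar} steps long"
    end
    bar.each_char.map { |char| char == "-" }
  end
end

p compile_bars("-- -|--  |")
# prints [[true, true, false, true], [true, true, false, false]]
```

Call flatten on the result to get the same flat list of booleans as the other notations.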

We can add melodies:

kick   = /1... 1... 1... 1... 1... 1... 1... 1... 1... 1.../b
snare  = /..1. ..1. .1.. .1.. .1.. ..1. ..1. ..1. ..1. ..1./b
closed = /11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1/b
open   = /..1. ..1. .1.. .1.. .1.. ..1. ..1. ..1. ..1. ..1./b
bass   = /C... E... F... A... C... C... C... C... E... C.../:c2
lead   = /1111 4444 6666 1111 1111 4444 6666 1111 1111 4444/chords

Where the final line is chords using the rock’n’roll I-IV-vi-I notation.
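As a rough idea of what the chords flavour could compile to, here is a sketch that turns each digit into a triad built on that degree of a major scale. The major key, the MIDI note numbers (60 is middle C) and all the function names are assumptions of mine, not part of any existing implementation.

```ruby
# Hypothetical sketch: turn scale-degree digits into diatonic triads.
# Assumes a major key and MIDI note numbers (60 is middle C).
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11].freeze

# MIDI note for a zero-based scale step, carrying over into higher octaves.
def scale_note(root, step)
  root + 12 * (step / 7) + MAJOR_SCALE[step % 7]
end

# Triad on a one-based degree: the degree itself, then a third and a fifth
# above it, all taken from the scale.
def triad(root, degree)
  [0, 2, 4].map { |offset| scale_note(root, degree - 1 + offset) }
end

# Spaces only separate bars for the reader, so they are dropped first.
def compile_chords(source, root)
  source.delete(" ").each_char.map { |char| triad(root, Integer(char, 10)) }
end

p compile_chords("1 4 6", 60)
# prints [[60, 64, 67], [65, 69, 72], [69, 72, 76]]
```

With C as the root those three come out as C major, F major and A minor, which is where the lower-case vi comes from: staying inside the scale makes the sixth chord minor without anyone having to say so.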

We can also decorate our melodies with representations that are traditional in musical notation, like durations:

melody = /___ |   ~|
         /ccc |e fg|/:c4

where the underscores say play a single note of duration 3, and the tilde says slide from the previous note to this one.

Designing a proper notation system is waaaay above my pay grade, I can write lexers and parsers and implementation malarky but designing a notation, no siree. Somebody who understands musical notation and theory wants to work with me, different story.

I am reading Howard Goodall’s book Big Bangs about developments in music. One paragraph really jumped out:

Counterpoint is the trick of playing two or more tunes at once, fitting them together neatly like ACROSS and DOWN clues in a crossword. Making a tune sound nice is quite difficult. Making two sound nice that are happening at exactly the same time is more than twice as difficult, believe you me. Making three or four or — in the case of Mahler or Stravinsky — 10 or 11 work on top of each other is mental Olympics of the most spectacular kind. To achieve this feat, you need to see the notes laid out in front of you like the figures and lines on a mathematical graph.

This para ends with this kicker:

Without notation, making two tunes fit together is like trying to play scrabble without the board or the plastic letters.

Writing a tablature notation to be embedded in Sonic Pi is not a neat coding trick, it’s a serious intellectual effort and should pay out big benefits.

This is kinda an RFC (a Request For Comments): a proposal to write such a notation to be embedded in Sonic Pi. I certainly can’t do it alone, so HMU.

Why am I doing this? For my sins I am live coding Sonic Pi at the Kaotic City Of Atlantis summer party in Berlin in June. Come and say hello.

Kaotic City Of Atlantis party poster

Also follow me on Twitter.

Gordon Guthrie

Former SNP Parliamentary Candidate — Quondam Computer Boffin