The quirks of programming music theory

Nick Rose
Dec 31, 2018 · 12 min read

In this article I write about a way to automate the transposition of musical pitches. A humble task, you might think. Transposition, after all, is basically the equivalent of adding an interval to a pitch or subtracting one from it. But writing code that negotiates the syntax of music theory fundamentals isn’t so straightforward.⁰ This is due to the inherent asymmetry of Western musical scales and something called enharmonicism: different spellings of pitch classes that represent the same value in set-theory integer notation.

Music theory primer
In music we’ve got what are called pitches. These are represented in writing by symbols like C♯6, A♭4, and F3. The first character of any pitch is always one of the seven pitch letters used in Western music, A through G inclusive. These letters can be inflected by accidentals such as flats (♭) or sharps (♯). When no accidental is visibly present, a natural (♮) accidental is implied. A pitch letter in the context of music theory is therefore always accompanied by one or more of these accidentals.¹ For example, B♭♭ is a valid expression, read out loud as “B double-flat.”²

A system excerpt from Hugo Wolf’s String Quartet in D minor — teeming with accidentals, including double-sharps

The combination of a pitch letter and accidental(s) defines a pitch class (henceforth, PC). In post-tonal theory a PC maps onto an integer value, the full gamut of values being 0 to 11, inclusive. PC C is typically mapped onto 0, C♯ onto 1, D onto 2, and so on. As we’ll see later, however, enharmonicism throws a wrench into the consistency of these one-to-one mappings of key-value pairs.

Successions of PCs with their integer and interval values labelled

Lastly, the integer character at the end of the pitch symbol is the octave the PC is associated with. A full 88-key piano spans a little over seven octaves, starting from the zeroth (0). The combination of a PC and an octave defines a pitch. An octave n ranges from pitches Cn to Bn; the next pitch after Bn will therefore be Cn+1. A pitch can be mapped onto a specific frequency, and every successive octave doubles the frequency of the pitch. For example, A3 is 220 Hertz (Hz), A4 is 440 Hz, A5 is 880 Hz, and so on. This can also be expressed as Pn = P0 * 2^n where P is a PC and n is an octave. In code³ you can think of each PC as an abstraction of an array of pitch frequencies. For example, here’s the generation of all eight of the B♭s found on a piano, using the previously mentioned formula, followed by an index lookup of B♭2’s frequency:

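import kotlin.math.pow

val bFlat0 = 29.135 // B♭0 in Hz, rounded (one equal-tempered semitone above A0 = 27.5 Hz)
val bFlats = (0..7).map { bFlat0 * 2.0.pow(it) } // B♭0 through B♭7: eight values
println(bFlats[2]) // 116.54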

You might therefore expect A0 to be the lowest possible pitch, but it actually has an unremarkable frequency of 27.5 Hz. It is labelled as belonging to the zeroth octave simply because that is where the piano’s supported range begins.

The bottom (left) side portion of a piano keyboard with frequencies and pitches labelled

A humble task
The goal of this article is to write a program which reliably transposes pitches into a syntactically correct musical output. For example, transposing D4 up a major second (M2) interval yields an E4. Simple, right? Let’s transpose that output up by another M2. This yields F♯4. Suddenly it’s not so simple — where did that come from, and why? Here’s another fun complication: a C♭4 pitch is labelled as having a higher octave than B♯3, yet the latter pitch actually has a higher frequency than the former!
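The C♭4 versus B♯3 complication can be checked numerically. The function below is only a sketch: frequencyOf and its parameters are hypothetical names, and it assumes twelve-tone equal temperament tuned to A4 = 440 Hz, which is consistent with the frequencies quoted earlier but is not something the rest of this article depends on.

```kotlin
import kotlin.math.pow

// Hypothetical helper for illustration only.
// letterValue: C = 0, D = 2, E = 4, F = 5, G = 7, A = 9, B = 11
// modifier:    -1 per flat, +1 per sharp, 0 for natural
// Assumes equal temperament with A4 = 440 Hz.
fun frequencyOf(letterValue: Int, modifier: Int, octave: Int): Double {
    // Semitones above C0; the accidental may push the value across an octave boundary.
    val semitonesFromC0 = 12 * octave + letterValue + modifier
    val a4 = 12 * 4 + 9 // position of A4 on the same scale
    return 440.0 * 2.0.pow((semitonesFromC0 - a4) / 12.0)
}
```

Under those assumptions frequencyOf(0, -1, 4), i.e. C♭4, comes out near 246.94 Hz, while frequencyOf(11, 1, 3), i.e. B♯3, comes out near 261.63 Hz: the pitch labelled with the lower octave is the higher one.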

Transposing pitches isn’t straightforward in part due to the asymmetry of the most common musical collections in Western music, the major and minor scales.⁴ A C major scale consists of the following set of PCs: C D E F G A B. Its apparent cohesion is undermined when the collection is mapped onto integer notation: 0 2 4 5 7 9 11 — the intervals between each integer (including the last back around to the first) are 2, 2, 1, 2, 2, 2, 1. Those intervals reveal the collection’s inherent asymmetry. So how can we reconcile this in code? Let’s start with a base set of simple, readable classes to cover the information we’ve gleaned thus far.

enum class PitchLetter(val integerValue: Int) {
    A(9), B(11), C(0), D(2), E(4), F(5), G(7)
}

inline class Accidental(val modifier: Int)

data class PitchClass(
    val pitchLetter: PitchLetter,
    val accidental: Accidental
)

inline class Octave(val value: UInt)

data class Pitch(
    val pitchClass: PitchClass,
    val octave: Octave
)

Ideally the outcome of this article will be something that provides a simple API surface for transpositions, such as:

Pitch#transpose(Interval)

What might this proposed Interval class look like? A naive first attempt could have it be this:

// Naive attempt
data class Interval(
val distance: Int
)

The transposition of a PC represented in integer notation can indeed be distilled to a single number that represents the interval. A PC as a combination of a pitch letter and accidental, however, requires more information. For instance, the intervals from C to D♯, E♭, and F♭♭ are all an integer distance of 3. If the code were to invoke cNatural4.transpose(Interval(3)) how would it know which pitch letter to land on?

Theoretically there are infinitely many spellings of PCs that map onto the same integer value; this is what is meant by the term enharmonicism. The purpose behind the concept is tonal function. For example, a chord spelled C-E-G-B♭ is interpreted differently from one enharmonically spelled as C-E-G-A♯. The former has a tendency to resolve toward an F major chord, and the latter toward a B major chord. This is also why the Interval class actually needs two fields to provide all the data necessary for an unambiguous transposition: one to define the letter distance, and one the integer distance. This is completely in line with how musical intervals are traditionally expressed, too: a minor 2nd, major 3rd, perfect 4th, and so on each imply a respective letter distance.

// Better
data class Interval(
val letterDistance: Int,
val integerDistance: Int
)
cNatural4.transpose(Interval(1, 3)) // D♯4
cNatural4.transpose(Interval(2, 3)) // E♭4
cNatural4.transpose(Interval(3, 3)) // F♭♭4
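To make the link with traditional interval names concrete, here is a small illustrative sketch. The constant names are hypothetical (they are not claimed to match anything in the repository); each one pairs the letter distance implied by the name with the integer distance in semitones.

```kotlin
// Re-declared here so the snippet stands on its own.
data class Interval(val letterDistance: Int, val integerDistance: Int)

// Hypothetical named intervals, for illustration only.
val minor2nd = Interval(1, 1)
val major2nd = Interval(1, 2)
val minor3rd = Interval(2, 3)
val major3rd = Interval(2, 4)
val perfect4th = Interval(3, 5)
val perfect5th = Interval(4, 7)

// Three different intervals that all span an integer distance of 3:
val augmented2nd = Interval(1, 3)        // C -> D♯
val minor3rdFromC = minor3rd             // C -> E♭
val doublyDiminished4th = Interval(3, 3) // C -> F♭♭
```

The last three values share an integerDistance of 3 yet are not equal to one another, which is exactly the information the single-field Interval could not carry.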

In order to reach the goal of transposing a Pitch, the majority of its composed fields will first need to understand what it means to be transposed. That is, a Pitch can’t transpose without its PitchClass knowing how to transpose, and that can’t be possible without its PitchLetter knowing how to transpose.

From the previous code example we saw that Interval.letterDistance now distinguishes the intended pitch letter output. Transposing in the context of our PitchLetter enum class is simply addition or subtraction of that value within a universe size of the total number of PitchLetters, or 7. The code below articulates this behaviour:

enum class PitchLetter(val integerValue: Int) {
    A(9), B(11), C(0), D(2), E(4), F(5), G(7);

    fun transpose(interval: Interval): PitchLetter =
        with(values()) {
            val modLetterDistance = interval.letterDistance.modulo(size)
            val key = (ordinal + modLetterDistance).modulo(size)
            get(key)
        }
}

Let’s unpack some of the details seen above. I’ve added a handy Int.modulo(Int) extension function to reconcile the cyclical quality of pitch letters: if on a piano you play a G note, the next white key isn’t going to be H, it’s going to wrap around back to A. Modulo calculation for this is also possible with the % operator and the .rem(Int) function, but they can yield negative values, which would crash the program when the get(Int) indexing is invoked. The custom Int.modulo(Int) function always yields a non-negative return value, from 0 up to but excluding the given universe size argument.⁵
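For readers who don’t want to open the repository, one plausible way to write such a function is sketched below. This is an assumption about its behaviour based on the description above, not a copy of the code in Util.kt, and it assumes a positive universe size.

```kotlin
// A guess at the described behaviour, not the repository's implementation.
// % keeps the sign of the left operand, so shift the result up by the
// universe size and reduce once more to land in 0 until universeSize.
fun Int.modulo(universeSize: Int): Int =
    ((this % universeSize) + universeSize) % universeSize
```

With this definition (-1).modulo(7) is 6, whereas -1 % 7 is -1. On Kotlin 1.5 and later the standard library’s Int.mod(Int) gives the same result for a positive divisor.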

Calling this transpose function and getting its result now looks like:

val perfect5th = Interval(4, 7)
val result = PitchLetter.C.transpose(perfect5th) // PitchLetter.G

Handling PitchLetter transposition didn’t provoke a great deal of friction. Determining the right accidental(s) after a PitchClass transposition and the right octave after a Pitch transposition, however, is a bit more challenging.

PitchLetters each have a base integer value that they map onto, e.g. C -> 0 and F -> 5. PitchClass is composed of a PitchLetter and an Accidental, which is just an inlined wrapper for an Int. The accidental field tilts the pitchLetter field’s integer value in either a negative or positive direction. That is, an accidental.modifier value of -1 implies one flat, whereas a value of 2, for example, implies a double-sharp. The two fields’ values added together give the one true integer value for the PitchClass.
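As a minimal sketch of that addition: the snippet below simplifies the article’s types by using a bare Int for the accidental instead of the Accidental wrapper, and it wraps the sum into the 0 to 11 range with the standard library’s Int.mod (Kotlin 1.5+). Both choices are my assumptions for illustration rather than the repository’s code.

```kotlin
enum class PitchLetter(val integerValue: Int) {
    A(9), B(11), C(0), D(2), E(4), F(5), G(7)
}

// Simplified for illustration: modifier is a bare Int (-1 = flat, 2 = double-sharp, ...).
data class PitchClass(val pitchLetter: PitchLetter, val modifier: Int = 0) {
    // Letter value plus accidental, wrapped into the mod12 PC space.
    val integerValue: Int
        get() = (pitchLetter.integerValue + modifier).mod(12)
}
```

Here PitchClass(PitchLetter.B, -1).integerValue is 10, and the enharmonic spellings D♯, E♭, and F♭♭ all come out as 3.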

Calculating a PC transposition then boils down to applying any accidentals to the output of the PitchLetter#transpose(Interval) call. Let’s look at a non-trivial example: using an Interval(-1, -2) to transpose PC C to B♭. First we’ll invoke transpose(interval) on the PitchClass.pitchLetter field, which yields PitchLetter.B, a PitchLetter with an integer value of 11. Going from PC C to B is close to what we want, but off by the one flat accidental that needs to be applied to the B. The PitchClass.accidental.modifier value necessary to yield a PC of B♭ is then going to be -1. But how do we come to determine that modifier value? The integer value of PC B♭ is 10, so we take the newly calculated PitchLetter.B integer value of 11 and determine what number is needed to turn that 11 into a 10; that number will be the transposed PC’s Accidental.modifier value. This is confusing, however, because of the cyclical quality of the PC integer values in a mod12 PC space. Getting from 11 to 10 can actually be achieved in two different directions: by the result of either 11 - 1 or 11 + 11 (counter-clockwise and clockwise, respectively, in the graphic below).

The cyclical mod12 space all PCs reside within

Those two answers will have drastically different effects on the Accidental.modifier value applied to PitchLetter.B; one will be treated as a single flat, and the other as eleven sharps. The latter seems absurd, but a computer doesn’t know that. It has to be given adequate instruction to make the right choice between those two possibilities, and I never actually specified whether the transposition from C would be down or up to B♭. Thus the following equation, which would work fine in a non-cyclical space, is ambiguous inside of one:

10 = 11 + n  // Is n == -1? Or 11?

The way out of the quandary in the above example is to depend upon the direction of the Interval.letterDistance field. Interval(-1, -2) applied to PC C means the PC is descending by a major 2nd because that field is negative. With this critical piece of information we can tell our program that -1 is in fact the correct accidental modifier to use in this context, but not strictly because that’s the direction that gives accidental.modifier a negative value. Let’s look at another example to see what I mean by that. Interval(-1, 0) applied to C yields B♯. This is a peculiar interval for sure, and likely not one found often in practical music. After performing the first part of PC transposition to get the right PitchLetter, B, it then needs to be determined what accidental to apply to it. The Interval.letterDistance field is descending, but the result we’re looking for is actually one step higher than B, namely B♯. The logic still depends on knowing whether the Interval.letterDistance field is descending or ascending, however. For the sake of not congesting this article with too much code (all of which can be found on my GitHub⁶) I’ll simply outline the algorithm.

  1. Get the directional distance from the PC to be transposed to the new PitchLetter.
  2. Subtract that number from the Interval.integerDistance to get the remaining distance to the desired PC integer value. This will be the Accidental.modifier value of the PitchClass output we want.
  3. Instantiate a new PitchClass using the values of the transposed PitchLetter and the value produced by the previous step.

For the C to B♯ via Interval(-1, 0) example these steps end up looking like:

  1. C down to B is -1.
  2. Interval.integerDistance is 0, so subtracting the directional letter distance of -1 from it is 0 - (-1), which yields 1!
  3. The previous step gave us our correctly defined accidental modifier — a sharp, since the value was positive.
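The three steps can be sketched roughly as follows. This is my own condensed reading of the outline, not the repository’s code: it uses a bare Int modifier in place of the Accidental wrapper, relies on the standard library’s Int.mod (Kotlin 1.5+) instead of the custom modulo, and only handles intervals smaller than an octave.

```kotlin
enum class PitchLetter(val integerValue: Int) {
    A(9), B(11), C(0), D(2), E(4), F(5), G(7)
}

data class Interval(val letterDistance: Int, val integerDistance: Int)

// Simplified for illustration: modifier is a bare Int (-1 = flat, +1 = sharp, ...).
data class PitchClass(val pitchLetter: PitchLetter, val modifier: Int = 0)

// Sketch only; assumes an interval smaller than an octave.
fun PitchClass.transpose(interval: Interval): PitchClass {
    val letters = PitchLetter.values()
    val newLetter = letters[(pitchLetter.ordinal + interval.letterDistance).mod(letters.size)]

    // Step 1: directional distance from this PC to the natural form of the new letter.
    val naturalGap = newLetter.integerValue - pitchLetter.integerValue
    val letterGap = when {
        interval.letterDistance > 0 -> naturalGap.mod(12)       // measure upward
        interval.letterDistance < 0 -> -((-naturalGap).mod(12)) // measure downward
        else -> 0                                               // same letter
    }
    val directionalDistance = letterGap - modifier

    // Step 2: the integer distance left over becomes the accidental modifier.
    val newModifier = interval.integerDistance - directionalDistance

    // Step 3: build the transposed PC.
    return PitchClass(newLetter, newModifier)
}
```

Calling PitchClass(PitchLetter.C).transpose(Interval(-1, 0)) with this sketch gives PitchClass(PitchLetter.B, 1), i.e. B♯, and Interval(-1, -2) gives PitchClass(PitchLetter.B, -1), i.e. B♭.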

Now that the method of PC transposition has been established, our final step is to transpose a Pitch. Since the work to transpose Pitch.pitchClass has already been done, all that’s left is tacking the right octave onto it. We’ll calculate this using the Interval.integerDistance field. The way PCs can be described makes this an interesting challenge, however. Earlier in this article I mentioned that a pitch like C♭4 is labelled as having a higher octave than B♯3. This is counterintuitive because when the former pitch is transposed to the latter it descends by one in both PitchLetter and Pitch.octave, yet ascends in integer value by 1: its frequency in Hertz has risen. One would expect that if the Interval.integerDistance were positive it would be impossible for the Pitch.octave to be reduced in value, but this is not the case.

Reconciling this can be achieved by ‘normalizing’ the PCs before determining whether there is an increment or decrement in octave value. By this I mean pushing a PC that has an accidental toward the ‘natural’ direction and accordingly augmenting or diminishing the Interval.integerDistance in parallel. Take for example the C♭4 Pitch. To get from it to B♯3 an Interval(-1, 1) is used. By calculating the difference between the former pitch’s Accidental.modifier and the latter’s and adding it to the Interval.integerDistance, we artificially treat the transposition as if it were happening between two Pitches without accidentals. We can then transpose the origin Pitch by that new value, and if it crosses one of the octave boundary Pitches (that is, dips below C or rises past B), then we can say for certain that the octave has either decremented or incremented.

Let’s put this concept into practice by applying it to our C♭4 -> B♯3 transposition via Interval(-1, 1):

  1. Get the difference between the ‘from’ and ‘to’ PCs’ Accidental.modifiers and add it to the Interval.integerDistance. This looks like -1 - 1 + 1 == -1.
  2. The ‘new’ Interval.integerDistance, based on the previous step, is now -1, and moving the natural C4 by that amount takes it below the octave boundary Pitch to hit B3. The octave has therefore been reduced by 1.
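The two steps above can be condensed into a small function. The name octaveShift and its parameters are hypothetical and this is a sketch of the idea rather than the repository’s code; it takes the already-transposed PC’s accidental modifier as an input and uses the standard library’s Int.floorDiv (Kotlin 1.5+).

```kotlin
// Hypothetical helper, for illustration only.
// letterValue:     natural integer value of the origin pitch letter (C = 0 ... B = 11)
// fromModifier:    accidental modifier of the origin pitch
// toModifier:      accidental modifier of the already-transposed pitch class
// integerDistance: Interval.integerDistance, which may exceed an octave
fun octaveShift(letterValue: Int, fromModifier: Int, toModifier: Int, integerDistance: Int): Int {
    // Step 1: remove both accidentals so the move happens between two natural pitches.
    val naturalDistance = fromModifier - toModifier + integerDistance
    // Step 2: count how many C/B octave boundaries the natural-to-natural move crosses.
    return (letterValue + naturalDistance).floorDiv(12)
}
```

For C♭4 to B♯3, octaveShift(0, -1, 1, 1) evaluates to -1, so the octave drops from 4 to 3. For a multi-octave move such as C4 up fourteen semitones to D5, octaveShift(0, 0, 0, 14) evaluates to 1.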

You might think after reading that Pitch transposition algorithm: why not just consider the Interval.letterDistance instead of all this cumbersome addition and subtraction? It was already -1 after all, implying that the C PitchLetter would have to reduce to B and thus will have moved to the octave below. In this particular example you’d be right, but an Interval can perform multi-octave operations. That is, it can have an Interval.integerDistance value greater than 12 or less than -12, in which case we’d certainly have to consider this other field of the Interval class when performing transposition. The Interval.letterDistance field is constrained to the mod7 universe of PitchLetters, and therefore wouldn’t work on its own for multi-octave transpositions.

And with that we’ve finally arrived at Pitch transposition. In my GitHub repository for this project I’ve posted all of the code, including a suite of unit tests that verify its behaviour.⁶ One great thing about having automated transposition is that it becomes very simple to generate copious amounts of meaningful data. For example, with the succinct block of code below I am able to print out 120 different musical collections. Feel free to clone the repository and try it out!

Pitch(PitchClass(PitchLetter.C), Octave(4u))
    .toCollection(Interval.chromaticScale).forEach { pitch ->
        println(pitch.toCollection(Interval.majorScale))
        println(pitch.toCollection(Interval.naturalMinorScale))
        println(pitch.toCollection(Interval.majorPentatonicScale))
        println(pitch.toCollection(Interval.minorPentatonicScale))
        println(pitch.toCollection(Interval.wholeToneScale))
        println(pitch.toCollection(Interval.majorOctatonicScale))
        println(pitch.toCollection(Interval.minorOctatonicScale))
        println(pitch.toCollection(Interval.circleOfFifths))
        println(pitch.toCollection(Interval.majorTriad))
        println(pitch.toCollection(Interval.minorTriad))
        println()
    }

⁰ Music theory isn’t broadly defined as being syntactical, e.g. in Roman numeral analysis determining which pivot chord to use for a modulation is usually subjective. The fundamentals of music theory, however, e.g. scale degrees, intervals, and so on, have an objective syntax.

¹ This applies only to flats and sharps — a pitch letter composed of multiple natural accidentals is no different from one natural accidental or none (which implies one).

² Double-sharps also exist, e.g. C♯♯, but are curiously spelled as C𝄪. Also, sometimes in sheet music you might see a pitch letter with two of the same accidental followed closely by the same pitch letter with a natural and one of the previously seen accidentals, e.g. C𝄪 followed by C♮♯. This is merely “syntactic sugar” to aid the performer of the work.

³ Here and for the rest of the article I’ll be using the Kotlin programming language.

⁴ From another perspective the major/minor scales do have symmetry. Stacking perfect fifths or fourths will yield these collections. For example, F C G D A E B is a collection of stacked fifths that results in the C major collection.

⁵ For its implementation, see: https://github.com/nihk/MusicTheoryQuirks/blob/master/src/main/kotlin/Util.kt

⁶ https://github.com/nihk/MusicTheoryQuirks
