An Intro to Music Theory for Hackers

The velocity of a wave is equal to its frequency times its wavelength (v = f * λ). Frequency is measured in Hertz (Hz), which is the number of wave crests that pass a fixed point per second. Wavelength is measured in meters from crest to crest (for audible sound in air, anywhere from under two centimeters to about 17 meters). Velocity is measured in meters per second.

Two things to note here. First, an audible sound wave has a fixed velocity when traveling through air (about 343 meters per second at room temperature); this is what you commonly hear called “the speed of sound”. Any sound wave will travel at this velocity; the only way to change the velocity is to change the medium that the wave is traveling through. Second, the amplitude of a wave has no effect on its velocity. A sound wave displaces air particles as it travels, and a larger amplitude means the particles are displaced further, but the wave still travels at the same speed. Amplitude affects the volume of a sound, not the pitch.
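To get a feel for the numbers, here is a minimal Python sketch that rearranges v = f * λ to solve for the wavelength. The function name wavelength_m and the three sample frequencies are my own choices for illustration; 343 m/s is the speed-of-sound figure quoted above.

```python
# Speed of sound in air, as quoted in the text (meters per second).
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz: float, velocity: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Rearrange v = f * wavelength to get wavelength = v / f."""
    return velocity / frequency_hz


# Low end of hearing, the note A4, and the high end of hearing.
for frequency in (20.0, 440.0, 20_000.0):
    print(f"{frequency:>8.0f} Hz -> {wavelength_m(frequency):.4f} m")
```

Because the velocity is fixed, doubling the frequency halves the wavelength, which is the inverse relationship that matters for the rest of this essay.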

Which means that for audible sound waves, the values that vary from note to note are the frequency and the wavelength, and those change inversely to each other. So for all intents and purposes the only thing that varies from one “note” to another is the frequency of the sound wave it produces. When someone refers to a “note” they are referring to a frequency. When someone refers to a “pitch” they are also referring to a frequency. A “higher pitch” means a higher frequency. The range of human hearing is roughly 20Hz to 20,000Hz; you can test that here if you’re curious what range you’re personally capable of hearing.

There are twelve “notes” used in western music: A, A#, B, C, C#, D, D#, E, F, F#, G, G#

If you’re any kind of sensible hacker this will immediately raise some questions like “why is there no B# or E#?”, “how are there only 12 notes when there are 88 keys on a piano?”, “how exactly does the Latin alphabet connect to physics anyway?” etc…

Back before we had oscilloscopes and other ways to computationally measure wavelengths, people still made music. And one of the things they figured out was that you could change the “pitch” of a sound wave by vibrating a string faster or slower. The faster you vibrated a string, the “higher” the pitch seemed to be. And as you raised the pitch higher and higher, it would eventually reach a point where the sound seemed to match a previous lower sound that it had made, almost as if the two pitches were different shades of the same color. You’re probably already somewhat aware of this idea: if you sing “do re mi fa so la ti do”, there’s something similar about the two “do”s even though you’re at a higher pitch when you sing the second one.

These similar sounds got classified as letters of the alphabet (A, B, C, etc.) and you could add a number next to them to differentiate which one was the higher sound. So there could be an A0, A1, A2, A3, A4, A5, A6, A7, A8, etc., where each higher number marked a higher pitch that still had that similar quality to the ones below it.

When humans first figured out how to measure wave frequencies in the 1800s, we discovered that the places where sounds were “similar” were where one frequency was exactly 2x the other. So the sound classified as A3 turned out to have 2x the frequency of the sound at A2.

That’s the first insight that you need. The “pitch” of a note refers to a frequency (in Hz). A sound wave of frequency 49Hz is called a “G”. A sound wave of frequency 98Hz is also called a “G”.

G0 = 24.50 Hz
G1 = 49.00 Hz
G2 = 98.00 Hz
G3 = 196.00 Hz
G4 = 392.00 Hz
G5 = 783.99 Hz
G6 = 1567.98 Hz
G7 = 3135.96 Hz
G8 = 6271.93 Hz
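The table above can be regenerated by repeated doubling. This is a small sketch, assuming the rounded G0 value of 24.50 Hz from the table as the starting point; the function name octave_frequencies is hypothetical. Because it starts from a rounded value, the last digit of the higher octaves can differ slightly from the table.

```python
# Starting frequency taken from the table above (already rounded to 2 places).
G0_HZ = 24.50


def octave_frequencies(base_hz: float, count: int) -> list[float]:
    """Return `count` frequencies, each exactly double the previous one."""
    return [base_hz * 2 ** n for n in range(count)]


for octave, hz in enumerate(octave_frequencies(G0_HZ, 9)):
    print(f"G{octave} = {hz:.2f} Hz")
```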

Things like “sharps” and “flats” are arbitrary constructs. The note we call A4 has a frequency of 440Hz. The note we call B4 has a frequency of 493.88Hz. The note we call A#4 has a frequency of 466.16Hz. Sharps aren’t real things, they’re just a hack because musicians used discrete alphanumeric values to represent continuous numbers and then went back and wanted to add things in between.
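The three frequencies above are consistent with a tuning called twelve-tone equal temperament, where each of the 12 steps in an octave multiplies the frequency by the same ratio, the twelfth root of 2. The essay itself doesn't name this tuning, so treat that as my assumption; the sketch below uses a hypothetical helper note_frequency that counts steps up or down from A4 = 440Hz.

```python
# Reference pitch from the text: the note called A4.
A4_HZ = 440.0


def note_frequency(steps_from_a4: int) -> float:
    """Frequency of the note `steps_from_a4` steps above (or below) A4.

    Assumes equal temperament: every step multiplies by 2 ** (1 / 12),
    so 12 steps multiply by exactly 2 (one octave).
    """
    return A4_HZ * 2 ** (steps_from_a4 / 12)


print(f"A4  = {note_frequency(0):.2f} Hz")
print(f"A#4 = {note_frequency(1):.2f} Hz")
print(f"B4  = {note_frequency(2):.2f} Hz")
print(f"A5  = {note_frequency(12):.2f} Hz")
```

Seen this way, A#4 isn't a lesser note squeezed in between A4 and B4. It is step 1 in a sequence where A4 is step 0 and B4 is step 2.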

You could just as easily use numbers instead of letters and call the notes 1,2,3,4,5,6,7,8,9,10,11,12 and it would probably be slightly more logical.

Because notes are just frequencies, you can think of notes as being of type float.

The array of all_notes is [C, C#, D, D#, E, F, F#, G, G#, A, A#, B]. Each section of a piano will contain these notes in sequence.

The array starts at C because, by convention, octave numbers change at C, so within a numbered octave C has the lowest frequency. C3 will be lower than C#3, which will be lower than D3, etc.

When you go from a lower C to a higher C, you call this going to a higher octave. Going to a higher octave means that you 2x the frequency. A piano is just a bunch of octaves lined up in sequence. When you take all the notes and go to a higher octave you are mapping over the array and 2xing the frequency of each note. The octave transformation function to increase one octave is { |note| note * 2 }.
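Here is the same octave transformation as a Python sketch. To have concrete floats to map over, it builds octave 4 from A4 = 440Hz assuming equal temperament (each step multiplies by the twelfth root of 2); that tuning and the function names octave_4 and raise_octave are my assumptions for illustration.

```python
ALL_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

A4_HZ = 440.0
A_INDEX = ALL_NOTES.index("A")  # A sits 9 steps above C in the array


def octave_4() -> dict[str, float]:
    """Frequencies for C4 through B4, measured in steps relative to A4."""
    return {
        name: A4_HZ * 2 ** ((index - A_INDEX) / 12)
        for index, name in enumerate(ALL_NOTES)
    }


def raise_octave(frequencies: dict[str, float]) -> dict[str, float]:
    """Map over every note and double its frequency."""
    return {name: hz * 2 for name, hz in frequencies.items()}


octave_5 = raise_octave(octave_4())
print(f"C4 = {octave_4()['C']:.2f} Hz, C5 = {octave_5['C']:.2f} Hz")
print(f"A4 = {octave_4()['A']:.2f} Hz, A5 = {octave_5['A']:.2f} Hz")
```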

An interesting observation to make is that this means the number of notes ever used in western music is fixed, and relatively small compared to the number of frequencies available in the range of human hearing. If you wanted to make unorthodox music, you could start by tuning an instrument so that every note in the 0th octave used a frequency that isn’t one of the standard ones. Every higher octave would be 2x those frequencies, so it would land on unused frequencies too. Almost every note you played would be something someone wasn’t used to hearing in songs.

A “scale” is an ordered array of notes that has a fixed size. Most scales are an array of 7 notes (that’s the “do”, “re”, “mi”, “fa”, “so”, “la”, “ti” thing). Some scales are an array of 5 notes (these are called “pentatonic scales”)

When you create a scale, you are first ordering the array of all_notes to begin at a specific note, then you are running a filter function to only take certain notes out of the array. When you say that a scale is an “A scale”, it means that you are sliding the all_notes array to begin at A instead of C. When you say it is an “A major scale”, the “major” refers to the filter function you use to take notes out of the array. An “A major scale” and an “A minor scale” would both slide the all_notes array to begin at “A” but then would use different filter functions to decide which notes to reduce the array down to.

To generate the 7-note “A Major scale”, we would slide the array so that A was at position 0. The array now looks like [A, A#, B, C, C#, D, D#, E, F, F#, G, G#]. We then filter it into a new array by taking elements [0, 2, 4, 5, 7, 9, 11], which gives [A, B, C#, D, E, F#, G#].

The minor scale uses a different filter taking elements [0, 2, 3, 5, 7, 8, 10] so this would be [A, B, C, D, E, F, G].

The “major scale” always uses the filter ([0, 2, 4, 5, 7, 9, 11]) and the “minor scale” always uses the filter ([0, 2, 3, 5, 7, 8, 10]). So a “D Major scale” would first slide the all_notes array to start at “D”. The array now looks like [D, D#, E, F, F#, G, G#, A, A#, B, C, C#]. Taking elements [0, 2, 4, 5, 7, 9, 11] then gives the scale [D, E, F#, G, A, B, C#].
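The slide-then-filter recipe translates almost directly into code. A minimal sketch follows; the function names rotate and scale are hypothetical, while the array and the two filters are the ones given above.

```python
ALL_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# The two filters from the text: which positions to keep after sliding.
MAJOR = [0, 2, 4, 5, 7, 9, 11]
MINOR = [0, 2, 3, 5, 7, 8, 10]


def rotate(notes: list[str], root: str) -> list[str]:
    """Slide the array so that `root` is at position 0."""
    start = notes.index(root)
    return notes[start:] + notes[:start]


def scale(root: str, pattern: list[int]) -> list[str]:
    """Slide all_notes to start at `root`, then keep only `pattern` positions."""
    rotated = rotate(ALL_NOTES, root)
    return [rotated[position] for position in pattern]


print(scale("A", MAJOR))  # ['A', 'B', 'C#', 'D', 'E', 'F#', 'G#']
print(scale("A", MINOR))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(scale("D", MAJOR))  # ['D', 'E', 'F#', 'G', 'A', 'B', 'C#']
```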

What’s super interesting is the emotions that humans attach to these various transformations. A simplification you learn when you’re first starting out with music is that major scales sound happy, and minor scales sound sad. Because they’re just mathematical transformations, I’m curious why this is.

A “chord” is when you play the notes at positions [0, 2, 4] of the scale simultaneously. For example, an A major chord is when you play A, C#, and E simultaneously. This is why there are many ways to play an A major chord on a guitar: any combination of A, C#, and E will work.

There is normally an “easiest” way to play a chord. For example the recommended fingering for an A major chord is shown below. You can repeat notes when you play a chord as well. So the A major chord strums 5 strings, but two of them are “A” and two of them are “E” and one of them is “C#” so it still makes the chord.

A major chord on guitar

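You can make what are called “7th” chords by adding the 7th note of the scale as well, so notes [0, 2, 4, 6] (I’m using the 0-index, musicians don’t). For A major that means adding the G# to the A, C#, E, which musicians call an A major 7th chord (Amaj7). The chord written as plain “A7” lowers that top note by one step, to G.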
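Chords are one more filter applied on top of a scale. The sketch below uses hypothetical helpers scale and chord; the four-note chord at positions [0, 2, 4, 6] of A major is the one musicians write as Amaj7. The string order in OPEN_A_STRINGS is my assumption about the common open fingering (the text above only gives the counts: two A, two E, one C#).

```python
ALL_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR = [0, 2, 4, 5, 7, 9, 11]
MINOR = [0, 2, 3, 5, 7, 8, 10]


def scale(root: str, pattern: list[int]) -> list[str]:
    """Build a scale by stepping through all_notes starting from `root`."""
    start = ALL_NOTES.index(root)
    return [ALL_NOTES[(start + position) % 12] for position in pattern]


def chord(root: str, pattern: list[int], degrees=(0, 2, 4)) -> list[str]:
    """Pick the given 0-indexed positions out of the scale."""
    notes = scale(root, pattern)
    return [notes[degree] for degree in degrees]


a_major = chord("A", MAJOR)
a_major_7th = chord("A", MAJOR, degrees=(0, 2, 4, 6))
print(a_major)      # ['A', 'C#', 'E']
print(a_major_7th)  # ['A', 'C#', 'E', 'G#']

# Five strummed strings, low to high (assumed order). Repeats are fine:
# the set of notes is still exactly the chord.
OPEN_A_STRINGS = ["A", "E", "A", "C#", "E"]
print(set(OPEN_A_STRINGS) == set(a_major))  # True
```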

You can change where you’re playing the chord on the guitar to change which additional notes your fingers are close to. The Red Hot Chili Peppers for example play their chords in non-normal ways, it’s really up to the musician.

Another way to look at a scale is, instead of looking at the elements you take from the all_notes array, such as [0, 2, 4, 5, 7, 9, 11] for the major scale, to look at the number of steps forward you take each time, i.e. if you were playing a scale on a piano, how many keys you would have to move your hand forward after each note. So playing a “major” scale would be [0, 2, 2, 1, 2, 2, 2] and a “minor” scale would be [0, 2, 1, 2, 2, 1, 2].

** NOTE: For some strange reason musicians tend to use 0.5 as the smallest unit of moving from one note to another. So going from A to B would be moving “1” unit upwards, but going from A to A# would be moving 0.5 units upwards. I don’t like this and normally try to eliminate fractions in my mental, written, and coded data models so in my mind going from A to A# is 1 and going from A to B is 2. **
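The two representations (positions in the array versus steps between notes) are a running sum and a pairwise difference of each other. A small sketch, using whole-number steps as described in the note above; the function names are hypothetical.

```python
from itertools import accumulate


def steps_to_positions(steps: list[int]) -> list[int]:
    """Running total of the steps gives the positions in the slid array."""
    return list(accumulate(steps))


def positions_to_steps(positions: list[int]) -> list[int]:
    """Difference between neighbouring positions gives the steps."""
    return [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]


print(steps_to_positions([0, 2, 2, 1, 2, 2, 2]))   # [0, 2, 4, 5, 7, 9, 11]
print(steps_to_positions([0, 2, 1, 2, 2, 1, 2]))   # [0, 2, 3, 5, 7, 8, 10]
print(positions_to_steps([0, 2, 4, 5, 7, 9, 11]))  # [0, 2, 2, 1, 2, 2, 2]
```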

Thinking about raw distances sometimes makes it a little more obvious how to transform a scale. For example, the major scale moves [0, 2, 2, 1, 2, 2, 2], but you can play what’s called an “augmented 4th” scale by changing it to [0, 2, 2, 2, 1, 2, 2]. In C major, that means playing an F# as the 4th note of the scale (position 3 with 0-indexing) instead of an F. Musicians also call this the Lydian mode. Augmented 4th scales tend to give music a “kooky” sound, and one is famously used in the Simpsons theme song. It’s interesting to take the common major scales and tweak a random note by one step just to see what happens. You can make your scale eerier or dreamier just by tweaking one note.
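To see exactly which note the tweak changes, here is a sketch that builds both scales from their step lists and compares them. It assumes a root of C (which is where the F versus F# comparison comes from); scale_from_steps is a hypothetical helper.

```python
from itertools import accumulate

ALL_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

MAJOR_STEPS = [0, 2, 2, 1, 2, 2, 2]
AUGMENTED_4TH_STEPS = [0, 2, 2, 2, 1, 2, 2]


def scale_from_steps(root: str, steps: list[int]) -> list[str]:
    """Walk forward from `root` by each step, wrapping around the array."""
    start = ALL_NOTES.index(root)
    return [ALL_NOTES[(start + offset) % 12] for offset in accumulate(steps)]


major = scale_from_steps("C", MAJOR_STEPS)
tweaked = scale_from_steps("C", AUGMENTED_4TH_STEPS)
print(major)    # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(tweaked)  # ['C', 'D', 'E', 'F#', 'G', 'A', 'B']

# List every 0-indexed position where the two scales disagree.
changed = [(i, a, b) for i, (a, b) in enumerate(zip(major, tweaked)) if a != b]
print(changed)  # [(3, 'F', 'F#')]
```

Only one note differs: position 3 when counting from 0, which is the 4th note of the scale in musician counting.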

A “key” is the set #{} of notes in a given scale. The difference between a set and an array is that the set is unordered. It’s just the seven (or whatever number of) notes that are in the scale. So when a guitar solo is in the key of “D Major” that just means that it uses only the notes in the set of #{D, E, F#, G, A, B, C#}. For example, the lead guitar part in the chorus of “Despacito” contains 110 notes. They are:

[D, C#, B, F#, F#, F#, B, B, B, A, B, G … G, G, G, B, B, B, C#, D, A … A, A, A, D, D, D, E, E, C# … D, C#, B, F#, F#, F#, B, B, B, A, B, G … G, G, G, B, B, B, C#, D, A … A, A, A, D, D, D, E, E, C#, D, C# … F#, E, F#, E, F#, E, F# … F#, G, G, D, D, G, G, G, A, G, F# … F#, F#, F#, G, F#, E … F#, E, F#, E, F#, E, F# … F#, G, G, D, D, G, G, G, A, G, F# … F#, F#, A, G, F#, E]

All of them are contained in the set of notes for “D Major”. It’s common for the first note to be the root note of the key.
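Checking whether a melody stays in key is a subset test. The sketch below uses only the first twelve notes of the list above as sample data; in_key and notes_outside_key are hypothetical helper names, and the second melody is made up to show what a failure looks like.

```python
# The key of D Major as an unordered set of note names.
D_MAJOR_KEY = frozenset({"D", "E", "F#", "G", "A", "B", "C#"})


def in_key(melody: list[str], key: frozenset[str]) -> bool:
    """True if every note in the melody belongs to the key."""
    return set(melody) <= key


def notes_outside_key(melody: list[str], key: frozenset[str]) -> list[str]:
    """List the notes (in order played) that fall outside the key."""
    return [note for note in melody if note not in key]


# First phrase of the note list quoted above.
OPENING_PHRASE = ["D", "C#", "B", "F#", "F#", "F#", "B", "B", "B", "A", "B", "G"]

print(in_key(OPENING_PHRASE, D_MAJOR_KEY))                   # True
print(notes_outside_key(["D", "F", "A", "C"], D_MAJOR_KEY))  # ['F', 'C']
```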

This is useful when you learn to solo for example, because if you want to solo in a key and only half the frets on the guitar are in that key, it makes your decision tree of which note to go to next simpler.

If you’re wondering if you can go out of your key, then the answer is yes. Jazz music for example commonly plays intermediate notes from outside the key. But most modern music stays in key.

Here’s what this is all building to. You can arrange the different major and minor scales by how many sharps are included in the scale. For example A major has 3 sharps in the scale, whereas A minor has 0 sharps in the scale. And you can lay it out into a list, then flex it back into a circle. It looks like this and is commonly called “The Circle of Fifths”.

This is commonly called the “Circle of Fifths”

When you play music, chords that are adjacent to each other in the circle of fifths sound good together. That’s why you’ll see the C-F-Am-G pattern so frequently in modern music, because they’re right next to each other in the circle. If you are playing an F minor chord, you can know that a C minor and an A flat Major will sound good with it, without even needing to hear what they sound like. If you have been following this essay, that should blow your mind.
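The circle can be generated rather than memorized. Position 4 of the major filter is 7, so the “fifth” of any note is 7 steps up the all_notes array; stepping by 7 repeatedly visits all twelve notes before returning to the start. The sketch below also counts sharps in each major scale. One caveat that is my own simplification: because the array only spells notes with sharps, the naive sharp count is only meaningful for the first six keys (C through B); past that, musicians switch to flat spellings.

```python
ALL_NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR = [0, 2, 4, 5, 7, 9, 11]

# Distance from a root to the note at position 4 of its major scale.
FIFTH = MAJOR[4]  # 7 steps


def major_scale(root: str) -> list[str]:
    """Major scale for `root`, using sharp-only note names."""
    start = ALL_NOTES.index(root)
    return [ALL_NOTES[(start + position) % 12] for position in MAJOR]


def circle_of_fifths(start: str = "C") -> list[str]:
    """Walk the array 7 steps at a time until all 12 notes are visited."""
    index = ALL_NOTES.index(start)
    return [ALL_NOTES[(index + FIFTH * k) % 12] for k in range(12)]


def sharp_count(root: str) -> int:
    """Number of sharp notes in the major scale built on `root`."""
    return sum(1 for note in major_scale(root) if note.endswith("#"))


circle = circle_of_fifths()
print(circle)

# Each step around the circle adds exactly one sharp (valid for C through B).
for root in circle[:6]:
    print(root, sharp_count(root))
```

Neighbours on the circle have scales that differ by a single note, which is one way to think about why their chords sit comfortably together.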