What is programming?

For Rachel, who asked.

cmchen

Apr 8, 2014

Programming is built upon two things: data and code.

Let’s start with data.

Data

In the theater, if you hang a paper moon from the ceiling, it becomes the moon.

A bit of cardboard becomes an impregnable tower. With just these props, an actor becomes a prince — a prince who falls so deeply in love with the moon that he falls from the tower to his death, and the audience sheds a tear.

And it’s not just on stage. If a bit of metal from a mine is forged into a spiky ring, it becomes a crown. Suddenly, whoever wears it can start wars or change the law. In fact, people will kill for it. We tend to think that only children play pretend, but everyone does it, all the time. We are creatures of meaning — and meaning is manufactured. Money is just paper, as they say. It is important to remember that the crown remains a lump of metal. It hasn’t changed much; what has changed is how we perceive it. We’ve all made an unspoken agreement to treat it quite differently than just a lump of metal, and that shared agreement is quite powerful.

In the audience’s mind, the actor IS a prince, and that bit of paper IS the moon. That’s what matters — that one thing transforms into another in our minds. The lie becomes the truth.

If the actor doesn’t trick himself into thinking the paper moon really is the moon, the audience won’t believe it either. But the actor also has to remember not to tear the moon — after all, it’s just a bit of paper.

Programming is much the same. It is an elaborate exercise in imposing meaning, usually onto small electrical charges or flashes of light. A programmer builds elaborate fantasies, but she too has to remember not to tear the moon.

This all may seem a bit fanciful, but it gets at something fundamental about programming. Consider an example: you’re watching Jurassic Park. You know the dinosaur isn’t real, that it’s just CG, but nonetheless your pulse quickens when the T. Rex attacks.

The controls that let us play the movie are just as imaginary. The buttons you press, the mouse cursor you press them with — they don’t exist. They’re as fake as the dinosaurs. We forget that they are just a few lights controlled by the computer, mostly because they work — they let us stop and start the movie. Any lie that works is as good as true. The buttons may not be as thrilling as the dinosaur, but they pull off the same trick of convincing us that they are real.

Everything that happens in a computer is just as fake — or real.

The actor spins paper into a moon, and the programmer spins a reality of her own.

So, how does it work?

Encoding

Image from www.raspberrypi.org under a CC BY-SA license.

Consider Morse code. If you can only send bleeps down a wire, we can agree on a system of dots and dashes to represent letters and numbers. There’s nothing special about dots or dashes — the point is that we can represent letters and numbers many ways, not just with writing (and speech). It’s all quite easy so long as we all agree on how the system of representation works.
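Here is a minimal sketch of that agreement in Python. The table holds only six letters of International Morse code, and the names `MORSE` and `to_morse` are made up for this illustration.

```python
# A tiny slice of the International Morse code table (six letters only).
MORSE = {
    "A": ".-",
    "E": ".",
    "N": "-.",
    "O": "---",
    "S": "...",
    "T": "-",
}

def to_morse(message: str) -> str:
    """Encode a message letter by letter, with a space between letters."""
    return " ".join(MORSE[letter] for letter in message.upper())

print(to_morse("sos"))  # ... --- ...
```

Both sides only need to share the same table. Swap the dots and dashes for anything else and the idea still works.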

Computers do something quite similar. They are very good at storing bits. Bits are just electrical charges: on or off, like a light bulb. Everyone knows that computers use zeros and ones, and bits are those zeros and ones. Off is zero and on is one, just like the dots and dashes of Morse code.

And just like Morse code, there’s a system for representing numbers using zeros and ones.

It’s called binary. The details aren’t that complicated, but it’s not that important to know them either. You don’t have to memorize the Morse code table to understand how it works.
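For the curious, here is one way to work out the binary digits of a number, sketched in Python. The function name `to_binary` is invented for this example; it divides by two over and over and collects the remainders.

```python
def to_binary(number: int) -> str:
    """Return the binary digits of a non-negative whole number."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        # The remainder after dividing by 2 is the next bit, filled in right to left.
        digits = str(number % 2) + digits
        number //= 2
    return digits

print(to_binary(5))        # 101
print(to_binary(65))       # 1000001
print(int("1000001", 2))   # 65 (Python's built-in way to go back)
```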

Here’s what’s important: just like Morse code, computers have a way to represent numbers. And they’re really good at storing huge amounts of numbers. That becomes important later.

Before we go on, let’s consider the term “code.” Morse code is a code, but not in the sense of a secret code. It’s an encoding. An encoding means a way of representing information. Binary numbers are also an encoding. Actually, if you think about it, our traditional form of writing is also an encoding.

So, computers are good with numbers. What if those numbers represented how bright a color is? Or how much money you have in your bank account? Numbers are surprisingly flexible.

We can represent text (like this essay) as numbers. The system shown above is called ASCII. As the chart shows, in ASCII an “uppercase A” is represented by the number 65 in decimal which is 1000001 in binary.

So, when you type an “uppercase A” on your keyboard, your keyboard sends a 65 to your computer, and your computer stores that 65 somewhere.

There’s nothing special about the number 65 — but in this system it represents an A. Again, it’s just like Morse code — an arbitrary system of representation that we’ve agreed upon.

This is a wonderful example of how you can express nearly anything as numbers. And if we can represent it as a number, we can represent it with bits.
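A small Python sketch makes this concrete. Python's built-in `ord` and `chr` convert between a character and its number; the helper names `text_to_numbers` and `numbers_to_text` are just for illustration.

```python
def text_to_numbers(text: str) -> list[int]:
    """Turn each character into the number that represents it."""
    return [ord(character) for character in text]

def numbers_to_text(numbers: list[int]) -> str:
    """Turn a list of numbers back into text."""
    return "".join(chr(number) for number in numbers)

numbers = text_to_numbers("A moon")
print(numbers[0])                # 65
print(format(numbers[0], "b"))   # 1000001
print(numbers_to_text(numbers))  # A moon
```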

The more numbers you have, the more meaning you can represent. One set of numbers can represent a name. Another set of numbers can represent your home address (we’ve all filled out that form). 140 numbers and you’ve got a tweet.

If you blow up any computer image, you’ll see that it is broken down into little rectangles of color: pixels. A pixel is usually represented by 3 numbers, one each for how much red, green and blue light it contains. So a picture (or frame of a movie) can be encoded as a bunch of numbers.
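As a rough sketch, here is a 2-by-2 picture written out in Python. It assumes each color value runs from 0 (none) to 255 (full), which is a common convention but not the only one; the names are made up for the example.

```python
# Each pixel is (red, green, blue). Assumption: values range from 0 to 255.
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
RED = (255, 0, 0)

# A tiny image: two rows of two pixels each.
image = [
    [BLACK, WHITE],
    [WHITE, RED],
]

def flatten(picture):
    """Lay the picture out as one long list of numbers, row by row."""
    return [value for row in picture for pixel in row for value in pixel]

print(flatten(image))
# [0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 0, 0]
```

Four pixels become twelve numbers. A real photo works the same way, just with millions of pixels.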

Thus: Data

So, that’s data. It’s all just numbers. If you buy a computer with 4GB RAM and a 250GB hard drive, those are measurements of how many numbers it can store. 250GB means 250 billion bytes, and a byte is 8 bits. Each bit, remember, is a single on/off value: zero or one.
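The arithmetic is easy to check in Python. This sketch uses the same definition as above (one gigabyte is one billion bytes); the function name `bits_in` is invented for the example.

```python
BITS_PER_BYTE = 8
BYTES_PER_GB = 1_000_000_000  # one billion bytes, as described above

def bits_in(gigabytes: int) -> int:
    """How many single on/off values fit in this many gigabytes?"""
    return gigabytes * BYTES_PER_GB * BITS_PER_BYTE

print(bits_in(250))  # 2000000000000
```

So a 250GB drive holds two trillion tiny on/off values.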

Computers are great at storing numbers, so we’ve found ways to use numbers to represent everything we care about. All sorts of complexity can be built out of these simple building blocks.

When you take a bunch of numbers and store them on a computer, we call it a file. Here are a few examples:

This is a text file that contains a draft of this text. You can see the numbers on the left and the decoded text on the right.

This is an mp3 file that contains “Come Down To Us” by Burial. You can see some metadata — the artist, the song title — decoded on the right.

And this is a PNG image file, which contains a screenshot of this talk.
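You can make this kind of view yourself. Below is a minimal Python sketch of a "hex dump": the numbers go on the left (written in hexadecimal, a compact way of writing numbers) and the decoded text on the right. The function name `hex_dump` and its layout are choices made for this example, not the exact tool used for the examples above.

```python
def hex_dump(data: bytes, width: int = 8) -> list[str]:
    """Show bytes as hexadecimal numbers on the left and as text on the right."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        numbers = " ".join(f"{byte:02x}" for byte in chunk)
        # Bytes that are not printable characters are shown as a dot.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        # Pad the numbers column so the text column lines up on every row.
        lines.append(numbers.ljust(width * 3 - 1) + "  " + text)
    return lines

for line in hex_dump("What is programming?".encode("ascii")):
    print(line)
```

To look inside a real file, you could pass in the result of `open(path, "rb").read()` instead, where `path` is whatever file you want to inspect.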

Learning programming is like going backstage. Behind the curtain, the set is just a facade and the dinosaur is just a mesh of polygons.

That’s all there is to data. Data is just ones and zeros, small electrical charges, upon which we’ve imposed all sorts of complicated meanings. It’s a terribly effective system. It means that if we can just figure out a way to store electrical charges in a machine, we can store poetry too. If we can send electrical charges down a wire, we can send a poem to France. And we can! You can do so much with just numbers!

So that’s data. Now let’s talk about code.

Code

So, a computer is only good at working with numbers. But as we know, that opens a lot of doors. How does it work?

Basically, a computer is a fancy calculator that takes instructions. An instruction might be: “add this number to that number.” It might be: “compare these two numbers — are they the same?” Or it might be something like: “Copy this number from here to there.” Believe it or not, that’s about all there is to it.

The magic happens when you combine these instructions. It’s like writing a recipe: “Take a spoonful of milk, add a pinch of salt, put it in a pan…” The glory of the recipe is that the cook can be pretty stupid — he just has to follow the instructions. Some day, they’ll make a badass cook out of a computer, because although computers are stupid, they’re awfully good at following instructions. A computer can only do a couple of simple things, but it does them very well.

Let’s explore a concrete example. Say you wrote a recipe like this:

Recipe: “Make A Letter Uppercase”.
Step 1. I’ll give you a number, which represents a letter. Remember, if it’s 65 that means A. If it’s 66 it means B. And so on.
Step 2. I want you to give me back that number. But first…
Step 3. Compare that number to 97, i.e. “lowercase a”. If they’re the same, change the number to 65, i.e. “uppercase A”.
Step 4. If it’s a “lowercase b”, change it to “uppercase B”.
Step 5. If it’s a “lowercase c”, change it to “uppercase C”.
…
Step 28. If it’s a “lowercase z”, change it to “uppercase Z”.
Step 29. If the number is anything else, just leave it unchanged.
Step 30. Now, give me back the number. Thanks, computer. You’re done!

This is code. It takes a letter and makes it uppercase.
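Here is roughly how that recipe might look in Python. This is a sketch, not the only way to write it: instead of spelling out 26 separate steps, it uses a loop to make the same 26 comparisons. The function name is invented for the example.

```python
def make_letter_uppercase(number: int) -> int:
    """Take a number that stands for a letter; give back the uppercase version."""
    # Steps 3 to 28: compare against each lowercase letter in turn.
    for offset in range(26):
        lowercase = 97 + offset  # 97 is "a", 98 is "b", ... 122 is "z"
        uppercase = 65 + offset  # 65 is "A", 66 is "B", ... 90 is "Z"
        if number == lowercase:
            return uppercase
    # Step 29: anything else is left unchanged.
    return number

print(make_letter_uppercase(97))  # 65
print(make_letter_uppercase(65))  # 65
print(make_letter_uppercase(33))  # 33
```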

Let’s make a few observations. Each “recipe” is made up of a series of steps. Each step is an instruction to the computer. Do this, now do that. The ordering of the steps matters, because the computer has to work its way through them one at a time. Lastly, most instructions are an operation on a number, i.e. data. Take this number and change it to that number.

Now, this particular recipe seems like an incredibly laborious way to convert a letter from lowercase to uppercase. If you were teaching a child to do the same task, you might say: “take a letter and if it’s lowercase, exchange it with the uppercase version of that letter.”

But remember: as advertised, the computer is stupid. The computer doesn’t know anything about uppercase or lowercase. It doesn’t know anything about English or even about letters — nor do we try to teach it about these things. We just spell out a fool-proof set of steps that will have the right effect. It’s like training a monkey to swap one kind of letter for another. The monkey doesn’t have to know anything about English, but nonetheless you can train it to turn “I love you, paper moon” into “I LOVE YOU, PAPER MOON” which makes for a much better tweet.

Programming requires us to reduce whatever we want to do down into small bite-size bits that the computer can do (like copying a number from here to there) and building up from there. It takes a bit of practice to boil things down to instructions the computer can handle, but it’s worth the trouble. It’s like training a monkey that can follow one million instructions per second and knows a billion other monkeys that will be happy to pitch in. That’s a useful monkey!

One of the things we saw about data is that all sorts of complexity could be built out of very simple building blocks. The same is true of code. For example, now that we have our “make a letter uppercase” recipe, we can make a more complicated recipe.

Recipe: “Make a Bunch of Text Uppercase”.
Step 1. I’ll give you some text (i.e. a bunch of numbers, not just one).
Step 2. I want you to give me back the text. But first…
Step 3. For each letter (i.e. each number), perform the “make a letter uppercase” recipe on it.
Step 4. Now, give me back the updated text (the numbers). Thanks, computer. You’re done!
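In Python, the second recipe might be sketched like this. The single-letter recipe is repeated here in a shorter form so the example runs on its own; it relies on the fact that in ASCII each uppercase letter's number is exactly 32 less than its lowercase one. The function names are invented for the example.

```python
def make_letter_uppercase(number: int) -> int:
    """The single-letter recipe: lowercase letters are the numbers 97 to 122."""
    if 97 <= number <= 122:
        return number - 32  # "a" (97) becomes "A" (65), and so on
    return number

def make_text_uppercase(numbers: list[int]) -> list[int]:
    """Step 3: perform the single-letter recipe on each number."""
    return [make_letter_uppercase(number) for number in numbers]

message = "I love you, paper moon"
numbers = [ord(character) for character in message]  # text in, as numbers
shouted = make_text_uppercase(numbers)
print("".join(chr(number) for number in shouted))    # I LOVE YOU, PAPER MOON
```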

Observe that this recipe builds upon the first recipe. Each recipe we make becomes a building block that other recipes can use. And in this second recipe, we can already start to forget that everything is numbers. We are already starting to think in terms of other things, like “text” and “uppercase.” It’s like standing on the shoulders of giants. We start at the lowest level with simple operations on numbers and very quickly arrive at another level entirely where our instructions are things like “if the user touches the cow, play the moo sound.” Underneath the hood, it’s still just numbers, but our mind moves on and forgets that.

This pattern — of building more complex recipes out of simpler ones — is very important. If we have a “check time” recipe and a “make noise” recipe, we can build an alarm clock.

Recipe: “Alarm Clock”.
Step 1. Check the current time.
Step 2. If the current time is the same as the alarm time, make a noise.
Step 3. Otherwise, go back to Step 1 and start over.
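Here is one way the alarm clock could be sketched in Python. The “check time” and “make noise” recipes are passed in as building blocks; the names `alarm_clock`, `check_time` and `make_noise` are made up for this example, and the demo uses a pretend clock so that it finishes right away.

```python
def alarm_clock(alarm_time, check_time, make_noise):
    """Keep checking the time until it matches the alarm time, then make a noise."""
    while True:
        current_time = check_time()     # Step 1: check the current time
        if current_time == alarm_time:  # Step 2: same as the alarm time?
            make_noise()
            return
        # Step 3: otherwise, go back to Step 1 and start over.

# A pretend clock that ticks through three minutes, so the demo ends immediately.
pretend_clock = iter(["06:58", "06:59", "07:00"])

alarm_clock(
    alarm_time="07:00",
    check_time=lambda: next(pretend_clock),
    make_noise=lambda: print("BEEP!"),
)
```

A real alarm clock would swap the pretend clock for something that reads the actual time, and would probably pause for a moment between checks.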

This is how programmers work: they decompose the problem they face down into simple bits, then work back towards a solution by assembling bits of code together.

Let’s look at some real code.

Here’s a piece of code from Paper. It says, basically: “If we receive a command to open the settings screen, show the settings screen. Otherwise, do something else…”

One more thing — remember when we were talking about how all data is numbers, and numbers can be used to represent text and colors and pictures and sounds? Imagine another system (sort of like ASCII) that works like this:

1. The number 1 means “add two numbers.”
2. The number 2 means “compare two numbers.”
3. The number 3 means “copy a number from one place to another.” etc.

Yes, these are the instructions that we talked about. Code is just another form of data. Is your mind blown? A bunch of colors is an image, and a bunch of instructions is a program.
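To make that concrete, here is a toy machine sketched in Python. It is entirely invented for this example and is far simpler than a real computer. The program is a plain list of numbers, read four at a time: the instruction number, two memory slots to read from, and a slot to write the result to.

```python
ADD, COMPARE, COPY = 1, 2, 3  # the numbering from the list above

def run(program: list[int], memory: list[int]) -> list[int]:
    """Carry out a program that is itself nothing but a list of numbers."""
    for start in range(0, len(program), 4):
        instruction, x, y, destination = program[start:start + 4]
        if instruction == ADD:
            memory[destination] = memory[x] + memory[y]
        elif instruction == COMPARE:
            memory[destination] = 1 if memory[x] == memory[y] else 0
        elif instruction == COPY:
            memory[destination] = memory[x]  # y is not used when copying
        else:
            raise ValueError(f"unknown instruction number {instruction}")
    return memory

program = [
    1, 0, 1, 2,  # add slot 0 and slot 1, put the result in slot 2
    2, 2, 3, 4,  # compare slot 2 with slot 3, put 1 (same) or 0 in slot 4
    3, 2, 0, 5,  # copy slot 2 into slot 5
]
memory = [2, 3, 0, 5, 0, 0]
print(run(program, memory))  # [2, 3, 5, 5, 1, 5]
```

Notice that `program` and `memory` are the same kind of thing: lists of numbers. The only difference is how the machine treats them.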

That’s code.

Recap

1. The computer is stupid.
2. Everything is data or code.
3. Data is just numbers. Well-organized, complicated sets of numbers.
4. Numbers are just bits.
5. Code is just instructions. Well-organized, complicated sets of instructions.
6. The computer is very good at a narrow set of tasks: copying numbers, comparing numbers, adding numbers, storing numbers, etc.
7. From simple elements we can build complicated, useful systems.

What should I learn next?

Here are some good questions…

1. What is a programming language?
2. What is a variable? What is a function?
3. What is a bug?
4. What is memory? What’s the difference between RAM and disk space?
5. What is an operating system?
6. What is a file format?
7. What is a client? What is a service?
8. What is the internet? What is the web? What’s the difference?
9. Who were Charles Babbage, Ada Lovelace, Alan Turing and John von Neumann?

That’s it.