How Do Computers Understand Us? A High-Level Overview of Programming Languages and How They Work

Aleks · Published in The Startup · 10 min read · Sep 9, 2020

Many folks have at one point or another asked themselves how a computer really understands what it is that we’re telling it to do. Programmers around the world will type up a storm of obscure-looking syntax on a non-traditional-looking text editor with an oddly designed color scheme, and in some way, shape, or form… ta-dah! An application, a website, a piece of software now exists in the flesh.

But how was some seemingly arbitrary mumbo-jumbo able to effectively and functionally communicate with the computer?

The aim of this article is to answer just that in layman’s terms. I’ve actively chosen to abstain from using overly technical jargon as this piece is intended to read as smoothly as Tennessee whisky (even if you have little-to-no technical experience). This naturally comes at a cost of omitting details and intricacies that exist in this elaborate process, but for the purpose of this article I believe that it is an appropriate trade-off to make.

To start off, let’s have a look at the stack of concepts below:

5. High-Level Languages

4. Low(er)-Level Languages

3. Assembly Code

2. Machine Code

1. Electricity & Hardware

To begin understanding how we communicate with computers, we’ll explore each one of these components from the bottom up! By the time we reach the top of the stack, the content and verbiage will be in much more familiar territory for most of us who either hear about or deal with modern-day technologies on a regular basis. However, before we get there, let’s take a few minutes to think about what underpins the tech we’ve grown to know and love.

(1/5) Electricity & Hardware:

Let’s really take it from the start. Our exploration will begin with understanding that a computer is built on a series of chips and pins that are configured in a way to recognize whether or not any given piece of foundational hardware has power coming to it.

The most fundamental unit this hardware gives us is what we call a bit.

Through coming up with clever ways to group these bits together, including building them into logic gates and other concepts that are beyond the scope of this article, we essentially create the hardware of a computer.
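To make that slightly more concrete, here’s a toy sketch (purely illustrative — real gates are built from transistors, not code) that treats bits as 0/1 values and wires them into two classic logic gates. It’s written in Python, a high-level language we’ll meet properly in section (5/5):

```python
# A toy model, not real hardware: treat 1 as "power" and 0 as "no power".
def AND(a, b):
    # The output has power only if BOTH inputs have power.
    return a & b

def OR(a, b):
    # The output has power if EITHER input has power.
    return a | b

print(AND(1, 0))  # 0 -- no power
print(OR(1, 0))   # 1 -- power
```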

(If you’re interested in learning more about this, I would highly suggest But How Do It Know by John Clark Scott as your go-to book on basic computer principles. It does an incredible job at explaining the above concept in its full glory, while using terms even a 5-year-old would understand.)

The key takeaway in this section is as follows: any given bit can only exist in one of two states — either receiving power, or not receiving power. That’s all.

This is fundamentally all that computers do. It is the combination of sending power to the right places at the right time that allows us to have graphical interfaces, memory, and all the other fun stuff we think computers simply “understand” or “remember”. Fundamentally, though, every bit either has power running to it, or it does not — that’s it.

Now the question becomes, what combination of power (and lack of power) do we want to employ? And, moreover, how do we control it and tell the computer what to do with it? That’s where machine code comes in.

(2/5) Machine Code:

Machine code is a (very) long sequence of 1’s and 0’s that tells the CPU (central processing unit, the core physical component that keeps the computer ticking) where & when to draw power, and where & when to shut it off.

This code is not typically human-readable! If you tried to open up a text file with machine code on it, it would come out as a load of unprintable characters. However, the key thing to understand here is that this is as close as we get to speaking to the computer hardware — by feeding it appropriate combinations of 1’s and 0’s, we tell it exactly what to do. Have a look at a chunk of machine code here that has been displayed as (readable) 1’s and 0's:
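As one illustrative example (assuming an x86 processor — other chips use different encodings), the five bytes below make up a single real instruction, mov eax, 1, which loads the number 1 into a register, written out bit by bit:

```
10111000 00000001 00000000 00000000 00000000
```

Five bytes for one tiny instruction — and a real program is made up of thousands upon thousands of these.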

When people say “computers only understand 1’s and 0’s”, they’re fundamentally referring to the concept of machine code. As we saw in the previous section, strictly speaking this isn’t entirely correct, as computers really understand “there is power” and “there is no power”. However, as computer scientists we have chosen the value 1 to represent the presence of electricity running to a specific node, and 0 to represent the lack thereof.

The choice to use 1’s and 0’s is only somewhat arbitrary, though. Arguably, we could have used A and B, or x and y, but there is a method to the madness in choosing 1 and 0.

The nature of either having power or not having power is a binary relationship — that is, there are two (and only two) possible states, only one of which can be true at any given time. Mathematically, we represent the binary number system with only 1 and 0; and given how deeply embedded mathematics is in computer science, 1 and 0 are the natural choice — they let us treat the computer’s on-or-off physics as numbers we can reason about and calculate with.
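As a minimal sketch of that idea — again using Python purely as a calculator — here’s how a string of bits maps onto an ordinary number:

```python
# Interpret a string of bits as a base-2 (binary) number.
bits = "10111000"
value = int(bits, 2)  # each 1 contributes a power of two: 128 + 32 + 16 + 8
print(value)          # 184
```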

So, that’s all fine and dandy, but as I mentioned earlier, the machine code we may try to look at isn’t even human-readable, so how do we possibly bridge the gap between what we type on the computer and the 1’s and 0’s?

Enter assembly code.

(3/5) Assembly Code:

Assembly code is different from machine code in that it is (slightly) more writable and readable by humans. Here’s a snippet of some assembly code:
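This one is an illustrative example in x86 syntax — it loads two numbers into registers and adds them together:

```
mov eax, 2      ; put the number 2 into register eax
mov ebx, 3      ; put the number 3 into register ebx
add eax, ebx    ; add them together -- eax now holds 5
```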

Computers do not understand assembly code. The code itself is a set of instructions telling the computer what to do, but the computer itself cannot interpret it as-is.

What this means is that at this point in our multiple levels of abstraction we’ve already reached the point where we do not communicate with the computer directly anymore. Is there a middleman that can help us deal with this?

Yes! That middleman is what’s called an “assembler”.

On a really simple level, when we write our assembly code, we need to run that code through the assembler which effectively “translates” (i.e. assembles) it into machine code.

The instructions we wrote in assembly language are now in machine code, which the computer can understand. Hoorah!
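To make that translation concrete — sticking with our illustrative x86 snippet, and noting that exact byte encodings vary by instruction and processor — here is one common encoding an assembler could produce, shown as hexadecimal bytes:

```
mov eax, 2      ; assembles to the bytes: B8 02 00 00 00
mov ebx, 3      ; assembles to the bytes: BB 03 00 00 00
add eax, ebx    ; assembles to the bytes: 01 D8
```

Notice how each line of assembly maps more or less directly onto a small handful of machine code bytes — assembly really is just a thin, symbolic layer over the machine.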

Moreover, assembly code is very efficient in its execution, and (compared to its machine code counterpart) is much more convenient to write in because it uses symbolic representation to tell the processor what to do.

(I realize that it still looks like an alien language to most of us, myself included, but it is a step better than 1’s and 0’s… and don’t worry, we can still do better.)

The concept of translating from one language to another is incredibly important. In fact, it is key to how the average programmer has the ability to get the computer to do what we want it to.

All that said, it is still incredibly uncommon in 2020 for developers to get their hands dirty with assembly language. Let’s take another step closer to familiarity with low-level languages.

(4/5) Low(er)-Level Languages:

Technically speaking, machine code and assembly language are what we call Low-Level Languages. We refer to them as such because they are very close to speaking to the hardware of the computer directly.

However, the distinction between what counts as a low-level language and what counts as a high-level language is not crystal clear.

There are many languages that tend toward the lower side, such as C++, COBOL, and Fortran, which take one further step away from the machine code and assembly language we previously discussed. Here’s a visual of some C++ code, for reference:
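As an illustrative stand-in, here’s a small C++ program that adds two numbers and prints the result:

```cpp
#include <iostream>

// A small illustrative program: add two numbers and print the result.
int main() {
    int a = 2;
    int b = 3;
    std::cout << a + b << std::endl;  // prints 5
    return 0;
}
```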

For some of you, we may be beginning to step into the realm of the familiar. These languages are sometimes talked about, and they’re definitely still employed by many coders and organizations. Typically speaking, they are even more readable and writable than assembly language, similar to how assembly language was more readable than machine code.

When you write code in a lower-level language, the translation process to assembly language is called “compiling”. Compilers are an incredibly important, but also incredibly complex, topic, so I will leave the details to the godfather of compiling, The Dragon Book.
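To give a rough feel for what compiling produces — and this is only a sketch, since real output varies by compiler, settings, and processor — an optimizing x86-64 compiler might turn the one-line C++ function int add(int a, int b) { return a + b; } into assembly along these lines:

```
add:
    lea eax, [rdi + rsi]   ; compute a + b in a single instruction
    ret                    ; hand the result back to the caller
```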

Although many developers now use high-level languages (covered next), there are still many benefits to using lower-level languages. One of the main ones is definitely fast performance — the closer you are to communicating with the hardware, the more efficiently your programs will run.

There are plenty of institutions that haven’t moved on from lower-level technologies because they are unable (and more often unwilling) to do a complete overhaul of their existing systems into less antiquated technologies (I’m looking at you, banks!). There is nothing wrong with low(er)-level languages; in fact, as I mentioned, they are highly efficient.

However, they do tend to be less accessible to those with very little experience, and still aren’t quite as familiar to the average person. To finish our journey into modern-day coding, let’s look at high-level languages.

(5/5) High-Level Languages:

High-level languages, as the name suggests, are the furthest level of abstraction away from directly communicating with the machine (i.e. they are high up). The high-level realm belongs to languages like Python, JavaScript, Swift, Java, Ruby, etc. Although some of these may still be unfamiliar to many of us, they are much more common household terms than x86 Assembly.

Many programmers who take their first steps in coding will begin with learning high-level languages. And quite frankly, they may still stay in this realm throughout their careers.

There is absolutely nothing wrong with this. High-level languages have allowed us to create some of the world’s most popular software, games, and applications, across all our favourite devices. However, it is important to understand the benefits and trade-offs of high-level languages.

The biggest pro of high-level languages is accessibility; they are relatively easy to learn by anybody who has the patience to struggle through the initial learning curve. Thankfully, the higher-level a language is, the closer it tends to be to English! A language like Python can almost be read by somebody with little-to-no programming experience because of how similar it is to our everyday vernacular. Have a look here:
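Here’s an illustrative snippet — read it out loud, and it nearly parses as an English sentence:

```python
# Loop over a list of ages and describe each one in plain terms.
ages = [19, 34, 52]
for age in ages:
    if age >= 30:
        print("This person is thirty or older")
    else:
        print("This person is under thirty")
```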

The biggest con of high-level languages is that to eventually get down to machine code, it takes quite a few steps of work for our interpreters, compilers, and assemblers. This in turn leads to inefficiencies that may limit what can be done in one programming language but might be feasible in another.
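You can actually peek at one of these intermediate steps yourself: Python’s built-in dis module prints the bytecode — a lower-level internal form — that the interpreter runs for a given function:

```python
import dis

def add(a, b):
    return a + b

# Show the lower-level bytecode instructions Python runs for add().
dis.dis(add)
```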

(You may also hear the term “transpiling” — strictly speaking, that refers to translating between two languages at a similar level of abstraction, such as TypeScript into JavaScript, rather than down the stack toward machine code.)

Colloquially, you may hear people saying that a language like C++ is very efficient, whereas one like Python is very inefficient. If you’re building a graphically and computationally intensive program like a video game or a mapping application, Python won’t cut it — C++ is almost always the way to go.

However, data analysis and other simple programs lend themselves very well to Python, which has proven itself an incredible tool across countless situations.

This does not make one programming language better than another, per se. You may choose to learn one over another depending on your use case, or your previous coding experience. For a brand-new beginner, Python is really manageable, Java is a touch harder, and C++ is harder still.

In summary, we learned that learning to code is like learning to speak a new language, and then relying on a series of translations that break our syntax down into sequentially more obscure symbols, eventually arriving at 1’s and 0’s, which then direct the computer on which bits to run power to, and which to shut off. If you refer back to the stack at the beginning of the article and read it from the top down, the whole process should make a bit more sense now:

- High-level languages get interpreted or compiled down into lower-level forms

- Lower-level languages get compiled into assembly language

- Assembly language gets assembled into machine code

- Machine code tells the hardware where to allocate and direct power

- The hardware uses combinations of bits to direct power in order to run the programs and applications we want it to

I will reiterate though that there is much, much more to this than could be covered in a short article. There are developers and wickedly good computer scientists who actually write the assemblers/compilers we all take for granted, and who help us continue to evolve the field in more ways than we can imagine. To describe their roles and responsibilities is far above my pay grade, but I do hope to be able to speak more to their contributions as I learn about them over the many years to come.

I hope you enjoyed reading this as much as I enjoyed writing it! As a follow-up to this article, I am considering elaborating on it by explaining how to go about writing your own programming language. Feel free to write me / comment below if this is something you’d be interested in reading, or if you have any other questions and comments about the article :)
