From Machine Code To Ruby

A Journey Up Layers Of Abstraction

Feb 28, 2018

Two programmers working with ENIAC c.1946

Computer
noun
:an electronic device for storing and processing data, typically in binary form, according to instructions given to it in a variable program.

Where do languages come from?

How does a language like Ruby or JavaScript come into being? Where does it start? How does it do the things we want it to do? The high-level languages we are accustomed to working with are actually far removed from the inner workings of the computer you code on. Between very (deceptively) basic machine code and the multitude of high-level languages we can choose from today, there are a few big steps and layers of abstraction that get us from the lowest-level languages to a high-level language like Ruby.

Machine Code and Binary

The bottom level in our layers of abstraction is machine code. Machine code is commonly confused with binary, but the two are not the same. Binary is a notation: a way of writing down, as 0s and 1s, the two-state physical signals a computer actually works with (one description puts it as the “distinct magnetic states on a hard drive”). It is a representation of machine code, but not machine code itself. The earliest programmers used punch cards or magnetic tape to record this code.

Machine code is really just a set of instructions that can be executed directly by a computer’s central processing unit (CPU), and each instruction performs a very specific task. Machines only understand the on-off, or true-false, logic of this binary system. Using this boolean logic and a base-2 number system, a combination of 0s and 1s can represent any number, and from there many other things, such as text. For programmers, these 0s and 1s are the lowest level of programming we need to worry about. All of this binary is, understandably, very difficult to read. For example, the simple phrase “Hello World”, with each character encoded as an 8-bit ASCII value, looks like this in binary:

01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100

and could have been entered using a switchboard or a similar device.

Imagine writing an entire program this way. To perform more complex tasks, programmers needed a way to write instructions without having to spell out every 0 and 1 by hand.
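To make the encoding above concrete, here is a minimal sketch in Ruby (the language this journey ends with) that turns a string into the same groups of eight bits. It assumes the text is plain ASCII, and the method name `to_binary` is made up for illustration.

```ruby
# Convert each byte of a string into its 8-bit binary form.
# Assumes plain ASCII text; `to_binary` is a hypothetical helper name.
def to_binary(text)
  text.bytes.map { |byte| byte.to_s(2).rjust(8, "0") }.join(" ")
end

puts to_binary("Hello World")
# 01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100
```

Note that this is the binary form of text data, not of executable instructions. Real machine code is stored as bits in the same way, but the bits mean "do this operation" rather than "this is the letter H".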

From here, what is the next layer of abstraction?

Assembly Languages

While early programming was done in this binary system of on-off values, it quickly became too difficult to write large programs this way. To solve this problem, assembly languages were developed.

Assembly languages are very similar to machine code and have a nearly 1:1 relationship with it: each assembly instruction typically corresponds to one machine instruction. The difference is that assembly languages use short English-like words (mnemonics) and numbers to represent machine code. They are considered low-level programming languages that provide very little abstraction from machine code. An assembly language therefore offers the same commands as the machine code it represents, but lets the programmer use words in addition to numbers, which makes it more readable. Each machine code and its associated assembly language is specific to a particular CPU architecture. Therefore, assembly languages are not interchangeable between different kinds of CPUs.

An assembly language gets translated to machine code by a program called an assembler. It parses the assembly source and emits the equivalent machine code.
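As a rough illustration of that translation step, here is a toy assembler sketched in Ruby. Everything about the instruction set is an assumption made up for this example: the mnemonics, the opcode values, and the 8-bit layout (4 bits of opcode, 4 bits of operand) do not correspond to any real CPU.

```ruby
# A toy assembler for an invented 8-bit instruction set.
# The mnemonics and opcodes below are made up for illustration;
# they do not correspond to any real CPU.
OPCODES = {
  "LOAD"  => 0b0001,  # load a value into the accumulator
  "ADD"   => 0b0010,  # add a value to the accumulator
  "STORE" => 0b0011,  # store the accumulator at an address
  "HALT"  => 0b1111   # stop execution
}.freeze

# Translate one line such as "ADD 3" into one 8-bit machine word:
# the high 4 bits hold the opcode, the low 4 bits hold the operand.
def assemble_line(line)
  mnemonic, operand = line.split
  opcode = OPCODES.fetch(mnemonic) do
    raise ArgumentError, "unknown instruction: #{mnemonic}"
  end
  value = Integer(operand || "0")
  raise ArgumentError, "operand out of range: #{value}" unless (0..15).cover?(value)
  ((opcode << 4) | value).to_s(2).rjust(8, "0")
end

program = ["LOAD 2", "ADD 3", "STORE 4", "HALT"]
program.each { |line| puts "#{line.ljust(8)} -> #{assemble_line(line)}" }
# LOAD 2   -> 00010010
# ADD 3    -> 00100011
# STORE 4  -> 00110100
# HALT     -> 11110000
```

Each line of assembly becomes exactly one machine word, which is the 1:1 relationship described above. A real assembler also handles things like labels and multiple instruction formats, which this sketch leaves out.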

One way in which assembly languages are easier to read and write than pure machine code is the ability to represent the bits and bytes of machine code in different ways. For instance, a common feature of assembly languages is writing values in hexadecimal (base 16) rather than binary (base 2). To see why this makes things slightly easier to read, let’s compare the two systems.

Reusing our Hello World example from before:

01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100

is rendered in hexadecimal as:

48656C6C6F20576F726C64
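The conversion is mechanical: one hexadecimal digit covers exactly four bits, so every 8-bit group becomes two hex digits. A small Ruby sketch of that regrouping, where `binary_to_hex` is a hypothetical helper name chosen for this example:

```ruby
# Regroup a string of space-separated 8-bit groups as hexadecimal.
# One hex digit covers 4 bits, so each byte becomes exactly two hex digits.
# `binary_to_hex` is a hypothetical helper name.
def binary_to_hex(bits)
  bits.split.map { |byte| byte.to_i(2).to_s(16).rjust(2, "0").upcase }.join
end

binary = "01001000 01100101 01101100 01101100 01101111 00100000 " \
         "01010111 01101111 01110010 01101100 01100100"

hex = binary_to_hex(binary)
puts hex               # 48656C6C6F20576F726C64
puts [hex].pack("H*")  # Hello World
```

The last line uses Ruby's `Array#pack` with the `H*` directive to turn the hex digits back into bytes, confirming that both notations describe the same data.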

One can imagine how large programs would be a little more manageable in this system. In fact, assembly languages are still used today when the programmer wants to access the computer hardware directly. And because of the one-to-one relationship with machine code, an assembly program translates directly into the instructions the CPU runs, with no extra layers in between. However, this system still doesn’t give the average reader much information as to what’s going on, and assembly languages are still very labor-intensive to write in. So where do we go from here?

Enter high-level languages.

High-level Languages

High-level languages started to be developed in the 1950s. These new kinds of languages were even further removed from machine code, with a very high degree of abstraction, but were far more readable. One of the earliest of these high-level languages was FORTRAN. FORTRAN (from the words “formula translation”) was widely adopted and is still used today for intensive numeric and scientific calculations. The original release of FORTRAN contained 32 different statements, some of which, like if statements, would be recognizable to any programmer today. FORTRAN was originally still entered into the computer by means of punch cards, with specific columns on each card reserved for specific parts of a statement. Our simple Hello World application now looks a little more appealing. This particular example is from FORTRAN 66, the standardized successor to FORTRAN IV.

C     FORTRAN IV WAS ONE OF THE FIRST PROGRAMMING
C     LANGUAGES TO SUPPORT SOURCE COMMENTS
      WRITE (6,7)
    7 FORMAT(13H HELLO, WORLD)
      STOP
      END

Between 1969 and 1973, a language called C was developed. It was created for the Unix operating system and remains popular to this day. C has had a big impact on the development of modern languages. Let’s take a look at Hello World in C.

#include <stdio.h>
int main()
{
   // printf() prints the string inside the quotation marks
   printf("Hello, World!");
   return 0;
}

So Where Does Ruby Come In?

Ruby was developed in the mid-1990s by Yukihiro Matsumoto (or Matz), with programmers in mind. Matz wanted to create a language that programmers would have fun writing in. But what is Ruby written in?

Ruby’s reference implementation is written in C. In fact, many modern languages are implemented in C. Because of this, the two languages share some similarities, although the differences between them are still large. Ruby is considered a general-purpose programming language, meaning it can be used to write software for many different kinds of applications. Finally, let’s see what our Hello World looks like in Ruby.

puts "Hello World"
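Even this one-liner sits on top of lower layers. As a small, hedged illustration: the C-based reference implementation (CRuby) compiles Ruby source into instructions for its own virtual machine before running them, and it exposes that step through `RubyVM::InstructionSequence`. This class is specific to CRuby, and the exact listing it prints differs between Ruby versions, so treat the output as a peek under the hood rather than something to rely on.

```ruby
# Ask CRuby to compile a line of Ruby and show the VM instructions it produces.
# RubyVM is specific to the CRuby implementation, and the exact
# listing differs between Ruby versions.
source = 'puts "Hello World"'
listing = RubyVM::InstructionSequence.compile(source).disasm
puts listing
```

These VM instructions are not machine code; they are interpreted by the virtual machine, which is itself a C program compiled down to the machine code discussed at the start.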

And so…

Sources

Programming language - Wikipedia (en.wikipedia.org)

When someone writes a new programming language, what do they write it IN? (stackoverflow.com)

What is Assembly Language? Webopedia Definition (www.webopedia.com)

Punched card - Wikipedia