# What the Hex?

## I am going to have to science the bit out of this.

There’s a scene early in *The Martian* (2015) where Matt Damon’s character Mark Watney, presumed dead by his crewmates and stranded on the red planet after a freak storm, resuscitates the defunct *Pathfinder* probe in order to communicate with NASA. While the probe is able to send images back to Earth, the best NASA can do with the probe remotely is change the position of its onboard camera.

Watney realizes that the camera’s 360 degrees of rotation would allow the NASA team to spell out messages to him letter by letter if he can provide them an alphabet to point the probe’s camera at. That solution presents another problem, though. The Latin alphabet as used in English has 26 characters, meaning the letter cards will have to be packed pretty close together in a circle around the probe, and it’s highly likely that he’ll have difficulty discerning which letter the camera is meant to be indicating.

> “It can’t be our alphabet. 26 characters plus a question card into 360 gives us 13 degrees of arc. That’s way too narrow. I’d never know what the camera was pointing at…
>
> Hexadecimals to the rescue.”

So instead Watney solves the problem by using hexadecimals and an ASCII table to decipher coded messages from NASA, and before you know it he’s able to talk to Earth in more than just yes/no questions.

# …wait, what?

I’ll admit to only barely understanding what transpired in this scene the first time I saw the film. In a film full of some pretty advanced scientific concepts, this scene stuck out to me because it seemed like a pretty simple idea that nevertheless went over my head. Apparently director Ridley Scott had difficulty depicting this scene because he didn’t fully grasp the concept himself.

I’m learning programming right now, and I’ve realized that even if higher-level languages like Ruby or JavaScript can be tricky to master, they are still pretty far removed from the basic machine code that powers computers using nothing but 1s and 0s. That low level of programming has always intimidated me, and was probably the main reason I had little interest in computer science as a career field half a lifetime ago, before programming became a more accessible profession. But since I started down the path to being a developer 6 weeks ago, I’ve learned that you can break down any complex action or concept to a manageable size if you just step through it piece by piece.

So let’s start with the word *hexadecimal*. It’s derived from the Greek *hex*, meaning six, and *decimal*, which comes from the Latin *decimus*, meaning tenth. Hexadecimal is a numeral system that uses 16 as its base, instead of the 10 used by “normal” decimal numbers.

Now if you were to explain a base-10 number system to someone who’d never done arithmetic before, you would first need to explain why it was advantageous to use a numeral system at all. After all, we could just as easily use a simple slash (“\”) to indicate quantities of objects. For example, one fish might be represented as *\ fish*, two fish as *\\ fish*, three fish as *\\\ fish*, and so on. In theory, you could represent any quantity of fish using just one character. In practice, however, this becomes unmanageable once you get into large quantities of fish. Even fifty fish becomes an illegibly long string of slashes.

`\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ fish`

Using a base-10 decimal system, we can easily represent a quantity of fish 1,000 times greater than the slashes above using a tenth as many characters. We’ll have to use a few more symbols than the one simple slash, but the tradeoff in efficiency is worth it. In fact, it’s so efficient that we only need a combination of two of these symbols to represent those 50 fish above.

The digits in a base-10 system are:

`0 1 2 3 4 5 6 7 8 9`

where

```
0 =
1 = \
2 = \\
3 = \\\
4 = \\\\
5 = \\\\\
6 = \\\\\ \
7 = \\\\\ \\
8 = \\\\\ \\\
9 = \\\\\ \\\\
```

So *\\\\ fish* could be written as *4 fish* instead, taking up a fraction of the space. Once we reach a quantity greater than *9*, we need to start over with *1* and add a placeholder digit, *0*.

```
10 = \\\\\ \\\\\
```

Now, to represent a quantity greater than 10, such as 4 greater than 10, we just combine the digits *10* and *4* to make *14*.

```
10 = \\\\\ \\\\\
 4 = \\\\
14 = \\\\\ \\\\\ \\\\
```

Once we reach a *9* in the ones place again (at *19*), we just repeat the pattern and increase the first digit by one.

```
...
17
18
19
20
21
...
```

We can follow this pattern all the way up to *99*, at which point we can begin the pattern again with *10* and an extra *0*: *100*. So in a base-10 system, every additional *0* added to the right makes the number *10* times greater than the preceding one.

```
10 * 1 = 10
10 * 10 = 100
10 * 100 = 1000
10 * 1000 = 10000
...
```
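If you’d like to see that place-value idea in running code, here’s a quick sketch in Ruby (one of the languages I mentioned learning above):

```ruby
# Reconstructing 14 from its digits: the left digit counts tens,
# the right digit counts ones.
value = 1 * 10 + 4 * 1
puts value       # => 14

# Appending a zero multiplies the whole number by ten:
puts 14 * 10     # => 140
puts 140 * 10    # => 1400
```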

# So why base16?

Obviously a system that uses ten digits makes a lot of sense if you’ve been using one your entire life, and it’s very intuitive given that we’re all born with two sets of five fingers to count on (the word *digitus* in Latin means “finger,” after all), so we’re accustomed to thinking in tens and fives.

When it comes to data stored on computers, multiples of 5 are not nearly as useful. Fundamentally, a computer circuit can exist in one of two states, off or on, and so all computer code is fundamentally *binary* (Lat. *binarius*, “consisting of two”). The smallest possible unit of data in a binary system is a bit (a *binary digit*), and bits are conventionally organized into groups of 8 called *bytes*. Since a single bit can have two possible values (0 or 1), each additional bit doubles the number of states we can represent, so half a byte (4 bits) can store **16** possible states.

```
bit 1: 0 or 1   (2 combinations so far)
bit 2: 0 or 1   (4)
bit 3: 0 or 1   (8)
bit 4: 0 or 1   (16)
bit 5: 0 or 1   (32)
bit 6: 0 or 1   (64)
bit 7: 0 or 1   (128)
bit 8: 0 or 1   (256)
```

With each of those 8 bits having 2 possible states, there are a total of 256 (2⁸) possible combinations that can be stored in a single byte.
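You can check that doubling yourself in Ruby, which accepts binary literals with a `0b` prefix:

```ruby
# Each additional bit doubles the number of possible combinations.
(1..8).each do |bits|
  puts "#{bits} bits -> #{2**bits} possible states"
end

# Ruby reads 0b-prefixed literals as binary:
puts 0b11111111  # => 255, the largest value one byte can hold
puts 2**8        # => 256 distinct values (0 through 255)
```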

If you were to map those 256 possible values to a decimal system, you would need three digits.

```
Binary    Decimal
00000000  000
00000001  001
00000010  002
00000011  003
....
00001001  009
....
11111111  255
```

So it’s possible, but as with the example of counting fish with slash marks, it’s not the most efficient system if our basic unit is a single byte. Using a base-16 (hexadecimal) system, we can represent any possible bit combination of a single byte using only 2 hexadecimal digits. We’ll have to use a few more symbols than in the decimal system, but for efficiency, scalability, and human-readability it’s worth it.
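Ruby makes this easy to verify; `Integer#to_s` and `String#to_i` both take a base argument:

```ruby
# The largest byte value in binary, decimal, and hex:
puts 255.to_s(2)    # => "11111111" (8 binary digits)
puts 255.to_s(16)   # => "ff"       (just 2 hex digits)

# And back again:
puts "ff".to_i(16)  # => 255
```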

The **hexadecimal** digits are:

`0 1 2 3 4 5 6 7 8 9 A B C D E F`

where

```
Dec    Hex
 0  =   0
 1  =   1
 2  =   2
 3  =   3
 4  =   4
 5  =   5
 6  =   6
 7  =   7
 8  =   8
 9  =   9
10  =   A
11  =   B
12  =   C
13  =   D
14  =   E
15  =   F
```

So *10* in **decimal** could be written as *A* in **hex**, saving a digit place. Once we reach a quantity greater than **hex** *F* (**dec** *15*), we need to start over with *1* and a placeholder digit, *0*.

```
Dec    Hex
16  =  10
```

Now, to represent a quantity greater than 16, such as 4 greater than 16 (20 in decimal), we just combine the hex digits *10* and *4* to make **hex** *14*.

```
Dec    Hex
16  =  10
 4  =   4
20  =  14
```
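Ruby reads `0x`-prefixed literals as hexadecimal, so the sum above is easy to double-check:

```ruby
# hex 10 plus hex 4 is decimal 20...
puts 0x10 + 0x4             # => 20

# ...which is written "14" in hex:
puts (0x10 + 0x4).to_s(16)  # => "14"
```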

Once we reach **hex** *1F* (31 in decimal), we just repeat and increase the first digit by one.

```
Dec    Hex
...
28  =  1C
29  =  1D
30  =  1E
31  =  1F
32  =  20
33  =  21
34  =  22
...
```

We can follow this pattern all the way up to **hex** *FF*, at which point we can begin the pattern again with *10* and an extra *0*: **hex** *100* (256 in decimal). Every additional *0* added to the right makes the number *16* times greater than the preceding one.

```
Hex                     Dec
10 * 1    = 10          16 * 1    = 16
10 * 10   = 100         16 * 16   = 256
10 * 100  = 1000        16 * 256  = 4096
10 * 1000 = 10000       16 * 4096 = 65536
10 * 1000 + 1 = 10001   16 * 4096 + 1 = 65537
...
```
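The same place-value multiplication, sketched in Ruby:

```ruby
# Each appended hex zero multiplies the value by 16:
%w[10 100 1000 10000].each do |hex|
  puts "hex #{hex} = #{hex.to_i(16)} in decimal"
end
# hex 10 = 16 in decimal
# hex 100 = 256 in decimal
# hex 1000 = 4096 in decimal
# hex 10000 = 65536 in decimal
```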

# #bringhimhome

So back on Mars: by swapping the letters of the alphabet for hexadecimal digits, Matt Damon’s character cuts the number of cards arranged around the probe from 27 (26 letters plus the question card) down to 17, widening each arc from 13 degrees to roughly 21 and making it far easier to recognize the camera’s position.

The code NASA transmits is derived from an ASCII table, in which each hexadecimal value between 0 and 7F (0–127 in decimal) maps to a single character. To decode the message, all Watney has to do is match the hex values to the letters of the alphabet in the table.

```
48 4F 57 41 4C 49 56 45
H  O  W  A  L  I  V  E
```

Which is essentially all a computer is doing when it reads and outputs data.
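In fact, Watney’s whole decoding procedure fits in a couple of lines of Ruby: split the hex pairs, convert each one to its ASCII code with `to_i(16)`, and turn each code into a character with `chr`:

```ruby
message = "48 4F 57 41 4C 49 56 45"
decoded = message.split.map { |hex| hex.to_i(16).chr }.join
puts decoded  # => "HOWALIVE"
```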

As an added bonus, with all the space afforded by just 7 bits, the ASCII table also includes punctuation (!), so NASA is able to transmit lines of code that allow Watney to connect the Pathfinder’s transmitter to the Mars rover’s more powerful communications software. Hexadecimals to the rescue.

The original ASCII (American Standard Code for Information Interchange) table was developed in the US in the 1960s and only required 7 bits to store each character. Most modern text encoding uses 8 bits but is built on the original framework, so the system is still in use to this day. Even if you’re not a programmer, you’ve no doubt seen hexadecimals in your web browser in a URL like: http://www.example.com/this%20is%20an%20example

where *%20* (hex *20*) maps to the [space] character in the ASCII chart, since URLs can’t contain spaces and certain other characters.
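Ruby’s standard library can undo this percent-encoding for you; `CGI.unescape` turns each `%XX` hex pair back into its ASCII character:

```ruby
require 'cgi'

puts CGI.unescape("this%20is%20an%20example")
# => "this is an example"
```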

Another place you’ll have seen hexadecimals before is when choosing colors in any software that allows for customization of *RGB* color values.

RGB stands for the additive primary colors: red, green, and blue. In the early days of computing, three bits (one per color) could store one of **8** (2³) possible color combinations, where 000 produces black (no color), 111 gives you white (all three colors combined), and the other six colors result from the combinations in between.

By increasing the bit depth to 8 bits per channel (*24-bit* color), each R, G, or B value can store 256 possible levels in a single byte, allowing for **16,777,216** (2²⁴) possible colors. That adds up to eight unwieldy digits in decimal, but every one of those sixteen-plus million color combinations can be described in a human-readable format using just 3 pairs of hexadecimal digits:

```
Black = #000000
Red   = #FF0000
Green = #00FF00
Blue  = #0000FF
White = #FFFFFF
```

It’s customary to abbreviate the number when two digits repeat, so the hex value for dark yellow #FFCC00 can also be written as #FC0.

That’s 16,763,904 in decimal, for comparison.
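As a final sketch, here’s a small (hypothetical) Ruby helper that splits a hex color into its decimal R, G, and B channel values:

```ruby
# Break "#FFCC00" into ["FF", "CC", "00"], then convert each pair
# from base 16 to decimal.
def hex_to_rgb(color)
  color.delete("#").scan(/../).map { |pair| pair.to_i(16) }
end

p hex_to_rgb("#FFCC00")  # => [255, 204, 0]
p hex_to_rgb("#FFFFFF")  # => [255, 255, 255]
```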

Hopefully this post has demystified hexadecimals for you; I know researching the topic further has made computer science seem a lot less overwhelming to me, all inspired by the 2015 film that took home the Golden Globe for Best Motion Picture, Musical or Comedy.