What the Hex?
I am going to have to science the bit out of this.
There’s a scene early in The Martian (2015) where Matt Damon’s character Mark Watney, presumed dead by his crewmates and stranded on the red planet after a freak storm, resuscitates the defunct Pathfinder probe in order to communicate with NASA. While the probe is able to send images back to Earth, the best NASA can do with the probe remotely is change the position of its onboard camera.
Watney realizes that the camera’s 360 degrees of rotation would allow the NASA team to spell out messages to him letter by letter if he can provide them an alphabet to point the probe’s camera at. That solution presents another problem, though. The Latin alphabet has 26 characters in English, meaning they will have to be pretty close together in a circle around the probe, and it’s highly likely that he’ll have difficulty discerning which letter the probe is meant to be indicating.
“It can’t be our alphabet. 26 characters plus a question card into 360 gives us 13 degrees of arc. That’s way too narrow. I’d never know what the camera was pointing at… Hexadecimals to the rescue.”
So instead Watney solves the problem by using hexadecimals and an ASCII table to decipher coded messages from NASA, and before you know it he’s able to talk to Earth in more than just yes/no questions.
I’ll admit to only barely understanding what transpired in this scene the first time I saw the film. In a film with some pretty advanced scientific concepts, this scene stuck out to me because it seemed like a pretty simple idea that nevertheless went over my head. Apparently director Ridley Scott had difficulty in depicting this scene because he didn’t fully grasp the concept himself.
So let’s start with the word hexadecimal. It’s derived from the Greek hex meaning 6, and decimal, which is derived from the Latin decimus, meaning tenth. Hexadecimal is a numerical system that uses 16 as its base, instead of 10 as is the case with “normal” decimals.
Now if you were to explain a base10 number system to someone who’d never done arithmetic before, you would first need to explain why it was advantageous to even use a numerical system in the first place. After all, we could just as easily use a simple slash (“\”) to indicate quantities of objects. For example, one fish might be represented as \ fish, two fish as \\ fish, three fish as \\\ fish, and so on. In theory, you could represent any quantity of fish using just one character. In practice, however, this becomes unmanageable once you get into large quantities of fish. Even fifty fish becomes an illegibly long string of slashes.
Using a base10 decimal system, we can very easily represent a quantity of fish 1,000 times greater than the above slashes using a tenth of the number of \’s. We’ll have to use a few more symbols than the one simple slash, but the tradeoff in efficiency is worth it. In fact, it’s so efficient, that we’d only need a combination of two of these symbols in order to represent those 50 fish above.
The digits in a base10 system are:
0 1 2 3 4 5 6 7 8 9
1 = \
2 = \\
3 = \\\
4 = \\\\
5 = \\\\\
6 = \\\\\ \
7 = \\\\\ \\
8 = \\\\\ \\\
9 = \\\\\ \\\\
So \\\\ fish could be written as 4 fish instead, taking up a fraction of the space. Once we reach a quantity greater than 9, we need to start over with 1 and add a placeholder digit, 0.
10 = \\\\\ \\\\\
Now, to represent a quantity greater than 10, such as 4 greater than 10, we just combine 10 and 4 to make 14: the 1 stays in the tens place and the 4 takes over from the placeholder 0.
10 = \\\\\ \\\\\
4 = \\\\
14 = \\\\\ \\\\\ \\\\
Once we reach 9 again, we just repeat and increase the first digit by one.
We can follow this pattern all the way up to 99, at which point we can begin the pattern again with 10 and an extra 0: 100. So in a decimal system based on 10, every additional 0 added to the right makes the number 10 times greater than the preceding one.
10 * 1 = 10
10 * 10 = 100
10 * 100 = 1000
10 * 1000 = 10000
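To make the place-value idea concrete, here is a minimal Python sketch that splits a number into its digits and the power of the base each digit stands for. The function name place_values is just an illustration of mine, not anything standard.

```python
def place_values(number: int, base: int = 10) -> list[tuple[int, int]]:
    """Split a non-negative integer into (digit, place value) pairs,
    most significant digit first."""
    if number == 0:
        return [(0, 1)]
    pairs = []
    place = 1
    while number > 0:
        # The remainder is the digit for the current place.
        number, digit = divmod(number, base)
        pairs.append((digit, place))
        place *= base  # each step to the left is worth `base` times more
    return pairs[::-1]


print(place_values(14))   # [(1, 10), (4, 1)]
print(place_values(100))  # [(1, 100), (0, 10), (0, 1)]
```

Reading the first result aloud: 14 is one ten plus four ones. The base parameter is there because the same trick works for any base, not just 10.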
So why base16?
Obviously a system that uses ten digits makes a lot of sense if you’ve been using one your entire life, and it’s very intuitive given that we’re all born with two sets of five fingers to count on (the word digitus in Latin means ‘finger’, after all), and so we’re accustomed to considering multiples of ten in groups of five.
When it comes to data stored on computers, multiples of 5 are not nearly as useful. Fundamentally, a computer circuit can exist in one of two states: off or on, and so all computer code is fundamentally binary (lat. binarius “consisting of two”). The smallest unit of data in a binary system is a bit (a binary digit), and bits can be organized into groups of 8 called bytes. Since a single bit can have two possible values (0 or 1), each additional bit doubles the number of possible states.
0 or 1 (2)
0 or 1 (4)
0 or 1 (8)
0 or 1 (16)
0 or 1 (32)
0 or 1 (64)
0 or 1 (128)
0 or 1 (256)
With each of those 8 bits having 2 possible states, there are a total of 256 (2⁸) possible combinations that can be stored in a single byte.
If you were to map those 256 possible values to a decimal system, you would need three digits.
So it’s possible, but like with the example of counting fish with slash marks, it’s not the most efficient system if our basic unit is a single byte. Using a base16 system, we can represent any possible bit combination of a single byte using only 2 hexadecimal digits. We’ll have to use a few more symbols than in the decimal system, but for the purposes of efficiency, scalability, and human-readability it’s worth it.
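If you want to check the byte arithmetic yourself, a few lines of Python will do it. This is only a sketch using built-in functions; the variable names are my own.

```python
# Each extra bit doubles the number of possible states.
states = [2 ** bits for bits in range(1, 9)]
print(states)  # [2, 4, 8, 16, 32, 64, 128, 256]

largest_byte = 0b11111111           # all eight bits switched on
print(largest_byte)                 # 255
print(len(str(largest_byte)))       # 3 (decimal digits needed)
print(format(largest_byte, "X"))    # FF (hex digits needed: 2)

# Every one of the 256 byte values fits in exactly two hex digits.
fits_in_two = all(len(format(value, "02X")) == 2 for value in range(256))
print(fits_in_two)                  # True
```

The "02X" format pads with a leading zero, which is why small values like 10 show up as 0A when people write out bytes.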
The hexadecimal digits are:
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 = 0
1 = 1
2 = 2
3 = 3
4 = 4
5 = 5
6 = 6
7 = 7
8 = 8
9 = 9
10 = A
11 = B
12 = C
13 = D
14 = E
15 = F
So 10 in decimal could be written as A in hex, saving a digit place. Once we reach a quantity greater than hex F (dec 15), we need to start over with 1 and a placeholder digit, 0.
16 = 10
Now, to represent a quantity greater than 16, such as 4 greater than 16 (20 in decimal), we just combine hex 10 and hex 4 to make hex 14.
16 = 10
4 = 4
20 = 14
Once we reach hex 1F (31 in decimal), we just repeat and increase the first digit by one.
29 = 1D
30 = 1E
31 = 1F
32 = 20
33 = 21
34 = 22
We can follow this pattern all the way up to hex FF, at which point we can begin the pattern again with 10 and an extra 0: hex 100 (256 in decimal). Every additional 0 added to the right makes the number 16 times greater than the preceding one.
10 * 1 = 10             16 * 1 = 16
10 * 10 = 100           16 * 16 = 256
10 * 100 = 1000         16 * 256 = 4096
10 * 1000 = 10000       16 * 4096 = 65536
10 * 1000 + 1 = 10001   16 * 4096 + 1 = 65537
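The counting pattern above can be sketched in Python as repeated division by 16: each remainder becomes one hex digit, starting from the right. The names to_hex and HEX_DIGITS are my own; Python's built-in format(n, "X") does the same job and is what you would use in practice.

```python
HEX_DIGITS = "0123456789ABCDEF"


def to_hex(number: int) -> str:
    """Convert a non-negative integer to a hexadecimal string by hand."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder (0-15) picks the next digit, right to left.
        number, remainder = divmod(number, 16)
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))


for n in (10, 16, 20, 31, 32, 255, 256, 65537):
    print(n, "=", to_hex(n))
# 10 = A
# 16 = 10
# 20 = 14
# 31 = 1F
# 32 = 20
# 255 = FF
# 256 = 100
# 65537 = 10001
```

Going the other way is a one-liner: int("1F", 16) gives 31.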
So back on Mars: by using hexadecimals instead of the letters of the alphabet, Matt Damon’s character cuts the number of symbols the camera has to choose between from 26 letters down to 16 hex digits, and can more easily recognize the camera’s position.
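The arithmetic behind Watney’s complaint is easy to reproduce. This sketch assumes the cards are spaced evenly around the full circle; the 27 comes from the quote (26 letters plus a question card), and using exactly 16 cards for the hex layout is my own simplifying assumption.

```python
def degrees_per_card(card_count: int) -> float:
    """Arc available to each card when cards are spread evenly over 360 degrees."""
    return 360 / card_count


print(round(degrees_per_card(27), 1))  # 13.3 (alphabet plus question card)
print(degrees_per_card(16))            # 22.5 (sixteen hex digits)
```

So each hex card gets well over one and a half times the arc of an alphabet card, at the cost of needing two camera positions per character.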
The code NASA transmits is derived from an ASCII table (left), in which each hexadecimal value (in red) between 0 and 7F (0–127 in decimal) maps to a single character. To decode the message, all Watney has to do is match the hex values to the letters of the alphabet in the table.
48 4F 57 41 4C 49 56 45
H O W A L I V E
Which is essentially all a computer is doing when it reads and outputs data.
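Here is roughly what that lookup looks like in Python, using the message above. It’s a sketch: the function names decode_hex_message and encode_hex_message are mine, and the built-ins chr, ord, and int do the real work.

```python
def decode_hex_message(message: str) -> str:
    """Turn space-separated hex pairs into text, one ASCII character per pair."""
    return "".join(chr(int(pair, 16)) for pair in message.split())


def encode_hex_message(text: str) -> str:
    """Turn ASCII text into space-separated two-digit hex pairs."""
    return " ".join(format(ord(char), "02X") for char in text)


print(decode_hex_message("48 4F 57 41 4C 49 56 45"))  # HOWALIVE
print(encode_hex_message("HOWALIVE"))                 # 48 4F 57 41 4C 49 56 45

# The standard library can do the decoding in one step as well.
print(bytes.fromhex("48 4F 57 41 4C 49 56 45").decode("ascii"))  # HOWALIVE
```

Each character costs two hex digits, so NASA has to point the camera twice per letter.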
As an added bonus, with all the space afforded by just 7 bits, the ASCII table also includes punctuation (!), so NASA is able to transmit lines of code that allow Watney to connect the Pathfinder’s transmitter to the Mars rover’s more powerful communications software. Hexadecimals to the rescue.
The original ASCII (American Standard Code for Information Interchange) table was developed in the US in the 1960s and originally only required 7 bits to store. Most modern text encoding uses 8 bits but is based upon the original framework, and so the system is still in use to this day. Even if you’re not a programmer you’ve no doubt seen hexadecimals used in your web browser in a URL like: http://www.example.com/this%20is%20an%20example
where %20 (hex 20) maps to the [space] character in the ASCII chart, since URLs can’t contain literal spaces and certain other characters.
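You can watch this happen with Python’s standard urllib.parse module. A small sketch, using the path from the example URL above:

```python
from urllib.parse import quote, unquote

encoded_path = "this%20is%20an%20example"

# unquote swaps each %XX escape back to the character it stands for.
print(unquote(encoded_path))          # this is an example

# quote goes the other way, escaping characters a URL can't carry as-is.
print(quote("this is an example"))    # this%20is%20an%20example

# And hex 20 really is the space character in ASCII.
print(chr(0x20) == " ")               # True
```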
Another place where you’ll have seen hexadecimals before is when choosing colors in any software that allows for color customization of RGB values.
RGB stands for the primary colors of light: red, green, and blue. In the early days of computing, three bits (each 0 or 1) could store one of 8 (2³) possible color combinations, where 000 creates black (no color), 111 gives you white (all colors combined), and the other six colors result from the combinations in between.
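Listing the three-bit combinations is a quick way to convince yourself there are exactly eight. A minimal sketch with itertools from the standard library; the variable name combos is mine:

```python
from itertools import product

# One bit each for the red, green, and blue channels: on (1) or off (0).
combos = ["".join(bits) for bits in product("01", repeat=3)]

print(combos)       # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(combos))  # 8
print(2 ** 3)       # 8
```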
By increasing the bit depth to 3 bytes (24-bit), each R, G, or B color value can store 256 possible levels in a single byte, allowing for 16,777,216 (2²⁴) possible colors. That adds up to eight meaningless digits in decimal, but every one of those sixteen-plus million color combinations can be described in a human-readable format using just 3 pairs of hexadecimal digits:
Black = #000000
Red = #FF0000
Green = #00FF00
Blue = #0000FF
White = #FFFFFF
It’s customary to abbreviate the number when each of the three pairs is a repeated digit, so the hex value for dark yellow #FFCC00 can also be written as #FC0.
That’s 16,763,904 in decimal, for comparison.
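To see how a hex color breaks down into its three bytes, here is a small Python sketch. The function name parse_hex_color is hypothetical, and it only handles the 3-digit and 6-digit forms discussed here.

```python
def parse_hex_color(code: str) -> tuple[int, int, int]:
    """Convert '#RRGGBB' or shorthand '#RGB' into (red, green, blue) levels 0-255."""
    digits = code.lstrip("#")
    if len(digits) == 3:
        # Shorthand: each digit is doubled, so FC0 means FFCC00.
        digits = "".join(char * 2 for char in digits)
    if len(digits) != 6:
        raise ValueError(f"expected 3 or 6 hex digits, got {code!r}")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


print(parse_hex_color("#FFCC00"))  # (255, 204, 0)
print(parse_hex_color("#FC0"))     # (255, 204, 0)
print(int("FFCC00", 16))           # 16763904
print(2 ** 24)                     # 16777216
```

Full red, a good amount of green, and no blue at all: far easier to read off the hex pairs than off the eight-digit decimal number.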
Hopefully this post has demystified hexadecimals for you; I know researching this topic further has made computer science seem a lot less overwhelming for me, all inspired by the 2015 winner for Best Comedy and/or Musical.