XKCD: HackMIT 2016 Puzzle

18 min readJul 28, 2016

Last week while browsing Hackers News, I stumbled upon a link to this years HackMIT puzzle and saw that the theme was inspired by the web comic XKCD. As a fan of both XKCD and puzzles, I immediately started cracking away at it. Further in this write up I will present each puzzle and its solutions. If you want to give the puzzle a go for yourself without spoilers, don’t read any further.

For those of you that are not familiar with HackMIT: It is a hackathon and solving the puzzles gives you a chance for a slot to register and attend. — I’m not a student myself, I just enjoy puzzles. —

HackMIT is MIT’s headline hackathon, with over 1000 undergraduate attendees from around the globe. Over a 24-hour period from September 17th to 18th, hackers collaborate and experiment on software and hardware projects. This is the weekend to meet other enthusiastic hackers, push your boundaries, and realize the projects of your dreams!

Puzzle submission

To start the puzzle one had to first find the URL: haxkcd.com where you got presented with a JavaScript Unix looking terminal prompt. After some digging around in the source files, it was clear what commands it responded too and why “Segmentation fault (core dumped)” kept happening.

Using the terminal was very straightforward and > $ help gave all the information needed to list, start and submit the puzzles.

haxkcd submission terminal @ haxkcd.com/

Puzzle-0: Think Fast!

The first puzzle in the game directed to a page displaying one out of six clocks shown below. Each one showing what looked like a location, a date and a time. A refresh of the page would show one out of the six clocks at random and hovering the mouse over any one of the images would show the text: “Think Fast. It all adds up in the end.”

Examining the source didn’t give much more to go on other than the same title text.

<html>
    <head>
        <link rel="stylesheet" href="../static/style.css" />
        <title>Think Fast.</title>
    </head>
    <body>
        <div title="Think Fast. It all adds up in the end." id="clock">
            <img src="../static/xkcdclock.png" width="100%">
            <div id="pmdot" class="False">PM</div>
            <div id="time">10:03</div>
            <div id="date">08/01/2007</div>
        </div>
        <div id="loc"><img src="../static/pinpoint.png" height="50px"> Honolulu, HI</div>
    </body>
</html>

A quick google search for xkcd comics referencing the words in the title, showed a comic strip depicting the clock in the picture and the words: Think Fast.

Factoring the times in 24h format resulted in two prime numbers, adding them all together gave the sum of 426 which is a reference to this xkcd comic: https://xkcd.com/426/ about geohashing. A link found in the bottom of the comic showed a sample implementation to geohashing and presented a map lookup over Boston Massachusetts, which happens to be one of the locations in the pictures.

I did however miss the sample implementation link at the bottom when I first solved this and didn’t notice it until after I had generated my own geohash shown below.

Dow: 16738.08
lat: 42.3751000 
lon: -71.1056100md5 hash: (echo -n 2015-09-17-16738.08 | md5sum): 32700bda88232d0a990a264703a5a11132700bda88232d0a
hex -> dec: .19702219090698908990a264703a5a111
hex -> dec: .5978111194014191Result:
lat: 42.197022
lon: -71.597811

The answer to the puzzle being the coordinates in latitude and longitude for one of the locations in the pictures.

Puzzle-1: Bobby Tables

Puzzle-1 was a page that claimed to be the homepage for an elementary school. The homepage showed a few links regarding the school, most of them didn’t work, but the ones of interest being Login/Register: allowing you to login as a student, register; or login as admin, and Fun Math Puzzles: redirecting to Khan Academy and a course about cryptography.

There were also a parent information notice that read:

“Please manually remove any symbols or quotes from your child’s name as it may interfere with the functionality of the site”.

After some further poking around, it turned out that student accounts could be created, but not activated without admin access. The page itself seemed to run on Node.js with MongoDB. This information, together with the parent notice mentioned above, made it safe to assume that it was vulnerable to some form of MongoDB injection.

ValidationError: Path `username` is required.
    at MongooseError.ValidationError (/home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/error/validation.js:22:11)
    at model.Document.invalidate (/home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/document.js:1416:32)
    at EmbeddedDocument.invalidate (/home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/types/embedded.js:191:19)
    at /home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/document.js:1292:17
    at validate (/home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/schematype.js:701:7)
    at /home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/schematype.js:738:9
    at Array.forEach (native)
    at SchemaString.SchemaType.doValidate (/home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/schematype.js:706:19)
    at /home/ubuntu/hackmit-puzzle-2016/bobby/node_modules/mongoose/lib/document.js:1290:9
    at process._tickCallback (node.js:415:13)

There were also a xkcd comic strip that seemed to fit the picture for this puzzle, called “Exploits of a Mom”.

One small injection script to the admin login and voila! the whole database got dumped in the response object.

var xhttp = new XMLHttpRequest();
var payload = JSON.stringify({
 "username": {"$gt": ""},
 "password": {"$gt": ""}
});xhttp.open("POST", "https://xkcde.com/admin/login", false);
xhttp.setRequestHeader("Content-type", "application/json");
xhttp.send(payload);

Database dump:

Admin PanelStudentsUsername: "test"
Password: 190a171100150005180cUsername: admin
Password: 0c0b090c4e46120800010b...

The resulting db dump only showed the students accounts that I had tried creating before (using the student registration tool) and each user accounts password hash. Since this was the only new information to go on and the fact that the Khan Academy link referenced cryptography. I made a new assumption that the key to the puzzle was in the hash encryption method.

First I generated some usable hash tables that would help make sense of what kind of hash we were dealing with. After a quick inspection it was clear that the hash were in hexadecimal and that the same length passwords (regardless of the actual order of the characters in the password) would generate a hash where the last 12 characters were always the same (very salty). After some further inspection I noticed that any password longer than 12 characters in length, would cut off and just write the remaining password characters in its plain ASCII hexadecimal representation.

Username: aaaaaaaaaaaaaaaa
hash: 0c0e0504410700080d141304 61 61 61 61 2073616c7479
 
Username: bbbbbbbbbbbbbbbb
hash: 0f0d06074204030b0e171007 62 62 62 62 2073616c7479Username: cccccccccccccccc
hash:  0e0c07064305020a0f161106 63 63 63 63 2073616c7479var dec = parseInt('0x63', 16); // hex: 63
var ascii = String.fromCharCode(dec);
console.log(ascii); //output "c"

Generating a few different hash tables showed that the same letters would produce the same hex symbol hash in the same position every time. From this I could conclude that the encryption used was a cipher and that a key word was most likely used to encrypt the password. The encryption method seemed to be a XOR cipher, where each char from the key word is converted to binary and using a XOR logical operation that returns true only when inputs differ. The method can be looked as that each letter of the shift(key) word is subtracted from the ascii value, for the same letter position in the shift word, until the end of the password length. Two equal letters will give a difference of zero.

My full generated hash table showed a larger picture and looking closer at any column containing the hexadecimal zeros revealed the shift word used to encrypt the password. (Column five is not showing anything for char ‘space’ [dec: 32] because it became obvious at that point).

CONVERSION:'space': 97['a'] - 32['space'] + 0 = 65 >> 0x41
'space': 98['b'] - 32['space'] + 0 = 66 >> 0x42
...HASH TABLE:a[97]:  0c 0e 05 04 41 07 00 08 0d 14 13 04 61 61
b[98]:  0f 0d 06 07 42 04 03 0b 0e 17 10 07 62 62
c[99]:  0e 0c 07 06 43 05 02 0a 0f 16 11 06 63 63
d[100]: 09 0b 00 01 44 02 05 0d 08 11 16 01 64 64
e[101]: 08 0a 01 00 45 03 04 0c 09 10 17 00 65 65
f[102]: 0b 09 02 03 46 00 07 0f 0a 13 14 03 66 66
g[103]: 0a 08 03 02 47 01 06 0e 0b 12 15 02 67 67
h[104]: 05 07 0c 0d 48 0e 09 01 04 1d 1a 0d 68 68
i[105]: 04 06 0d 0c 49 0f 08 00 05 1c 1b 0c 69 69
j[106]: 07 05 0e 0f 4a 0c 0b 03 06 1f 18 0f 6a 6a
k[107]: 06 04 0f 0e 4b 0d 0a 02 07 1e 19 0e 6b 6b
l[108]: 01 03 08 09 4c 0a 0d 05 00 19 1e 09 6c 6c
m[109]: 00 02 09 08 4d 0b 0c 04 01 18 1f 08 6d 6d
n[110]: 03 01 0a 0b 4e 08 0f 07 02 1b 1c 0b 6e 6e
o[111]: 02 00 0b 0a 4f 09 0e 06 03 1a 1d 0a 6f 6f
p[112]: 1d 1f 14 15 50 16 11 19 1c 05 02 15 70 70
q[113]: 1c 1e 15 14 51 17 10 18 1d 04 03 14 71 71
r[114]: 1f 1d 16 17 52 14 13 1b 1e 07 00 17 72 72
s[115]: 1e 1c 17 16 53 15 12 1a 1f 06 01 16 73 73
t[116]: 19 1b 10 11 54 12 15 1d 18 01 06 11 74 74
u[117]: 18 1a 11 10 55 13 14 1c 19 00 07 10 75 75
v[118]: 1b 19 12 13 56 10 17 1f 1a 03 04 13 76 76
w[119]: 1a 18 13 12 57 11 16 1e 1b 02 05 12 77 77
x[120]: 15 17 1c 1d 58 1e 19 11 14 0d 0a 1d 78 78
y[121]: 14 16 1d 1c 59 1f 18 10 15 0c 0b 1c 79 79
z[122]: 17 15 1e 1f 5a 1c 1b 13 16 0f 08 1f 7a 7a
        m  o  d  e     f  a  i  l  u  r  e

The shift word without the salt turned out to be the solution to puzzle-1 and moving on to puzzle-2.

Puzzle-2: Velociraptor escape

This puzzle felt to be the easier of the four. The puzzle consisted of a game called velociraptor escape, presumably based on the xkcd comic shown here (velociraptors in xkcd being a common theme):

The game displayed a stick figure in a square grid maze surrounded by velociraptors and an EXIT square. There were five levels in total and with each level the complexity and difficulty of the maze increased.

Solving the levels required inputting a sequence of commands to make the stick figure move trough the maze for each cycle of the maze program; avoiding any velociraptors that moved across the grid in unpredictable patterns every cycle. Each level also contained a max complexity and computations limit and I therefore assumed some form of code had to be written to compute the actual movement patterns.

Looking trough the source revealed some built-in command functions that allowed the stick figure to move in a direction or pause. There were also a GET request for the Grid layout for each level in a JSON array (each cycle of the grid being one frame).

{dimensions: [15, 15], limits: {code: 180, compute: 450}, rate: 200, start: [8, 6], finish: [14, 10],…}
dimensions:[15, 15]
finish:[14, 10]
frames: [[
  [1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1],
  [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0],
  [1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1],
  [0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1],
  [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
  [0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0],
  [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
  [1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0],
  [0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
  [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1],
  [1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0],
  [1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
  [1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
  [1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0]
  ...]

Trying out the built-in commands only resulted in a message saying: “Use fewer built-ins”, so that clearly didn’t work. At least I now had the Grid layout for the levels, allowing me to see the grid position a velociraptor would move to on the next cycle. By looking one grid frame ahead I could map out the grid for a safe path and construct the movement pattern.

Examining the source further revealed a library catalog containing the source files: interpreter.js, language-grammar.js, language-structure.js and parser.js. Looking in the language-grammar revealed some higher level language concepts the interpreter were using and that would probably be of help when supplying the movement patterns for the maze.

// higher level language concepts
'program': '[ extendedSpace ], statements, [ extendedSpace ]',
'statements': 'statement, { newlineStatement }',
'newlineStatement': '\
  spaceNewlineSpace, [ extendedSpace ], statement \
',
'statement': 'assignment | return | function | ifElse | if | call',
'function': '\
  identifier, [ space ], parameterList, [ extendedSpace ], block \
',
'call': '\
  identifier, [ space ], argumentList | \
  builtInFunctions, [ space ], argumentList \
',
'parameterList': '{ fatArrowIndentifier }',
'argumentList': '{ arrowExpression }',
'fatArrowIndentifier': 'fatArrow, [ space ], identifier, [ space ]',
'arrowExpression': 'arrow, [ space ], expression, [ space ]',
'ifElse': '\
  expression, [ space ], block, [ extendedSpace ], \
  elseWord, [ space ], block \
',
'if': 'expression, [ space ], block',
'block': '\
  leftBrace, [ extendedSpace ], statements, \
  [ extendedSpace ], rightBrace \
',
'return': 'returnWord, [ space ], expression',
'assignment': 'identifier, [ space ], eq, [ space ], expression',...

It took a bit of experimenting to figure out how the language was structured and what I needed. I started off with building simple expressions that would be easier to understand and construct. Eventually I could construct a function expression using a built-in command for movements.

Built-in:// config
  var UP = 0, RIGHT = 1, DOWN = 2, LEFT = 3, PAUSE = 4;'move': function(direction) {
        queueMovement(direction);
        return undefined;
      }-------------------------------------------------------Function expression:m => n 
{ move -> n }m -> 2 // call 'm' with parameter 2, i.e. move 2 (DOWN)

Using this simple function expression made completing the first four levels, staying within the limits of the code complexity and statements, quite easy. **(Each level was meant to introduce a new feature of the script language, but because of a bug in the puzzle interpreter my code would work on the first four.)

The last of the five levels seemed to be more difficult because of the mere size of the grid. So I decided to write a search algorithm that would work backwards trough the frames and calculate a safe path. However in the process of this, since it was already getting late, I got tired. I decided instead to write a simpler solution in Node that allowed me to manually work trough the maze, but blocking me if I tried to do a wrong move. Each successful move got logged in an array and if I got stuck I could print out the movement array and start over.

My new program allowed me to finish the level in three tries, the pattern turned out to be the Fibonacci sequence. Considering that it only took five minutes to write (and that I managed to get to bed for work on time) I called it a great success! (Link to the code)

The solution to the puzzle was given as a code once the last level of the game was completed. Onward to the last puzzle.

Puzzle-3: Building Narratives

The last and final puzzle were definitely, the harder of the bunch; or at least the one I spent the most time on.

Once again there was a page and this time with instructions that read:

The biggest season of the Sportsball competition just ended, and we’re headed into the Post-season. The Grand Sportsball Competition is coming up this year! Some serious money is on the line. Who’s gonna win the big sportsball game? Figure it out, and you could win big.

There were some statistics on the page that claimed to be generated by a random number generator for a fantasy softball league. Last years ranking was displayed at the top, on the bottom some highly irrelevant graphs and a submit predictions button.

In the source code there was a hidden input form and once the HTML type was changed to something other than hidden: it would reveal an input form where you could enter an array containing your predicted input (integers between 0–16, each representing a team and rank in the tournament).

The links in the navbar both redirected to the same comic strip.

https://www.explainxkcd.com/wiki/index.php/904:_Sports

Looking at the source code showed, under a file named: betting.js, the following function.

function getRandonNumber(nums) {
  // return 4; // chosen by fair dice roll.
  //           // guaranteed to be random.
  return nums.shift() / Math.pow(2, 52);
}

Where the comments in the code was a reference to another one of xkcd comics.

https://www.explainxkcd.com/wiki/index.php/221:_Random_Number

Both of these comics are hinting to random number generators that are in there nature not random at all, but instead weighted so that the number distribution will favor certain values more than others; or in other ways predictable. I assumed this was a key aspect of the puzzle, but I needed to find more information to verify my suspicion (it might just be creators of the puzzle attempt at trolling us).

Digging a bit deeper into the source code revealed that the model, i.e. the data, for the web application was being served from a GET request to the route: https://sports.haxkcd.com/holmberd/xorshift128plus?_=1469653705917 were the number at the end is a ‘Unix time stamp’ equal to running Date.now(); in JS. The data itself being a JSON array with each element being an integer of 15–16 characters length.

https://sports.haxkcd.com/holmberd/xorshift128plus?_=1469653705917[306882631881431,
2539235545221967,
4154568766432961,
3365437643233864,
4281899285148544,
...

I also noticed that the ranking for last years result (at the top of the page) was created by taking the first of element in the data-array || Array.shift(); and running it trough a function called number_to_permutation. This function did exactly what its name suggested: it took a number and returned an permuted array with numbers 0–16, but we will come back to this function later.

function number_to_permutation(num) {
  var elems = [];
  var result = [];
  var num_teams = window.teams.length;
  var i;
  for (i = 0; i < num_teams; i++) {
    elems.push(i);
    result.push(0);
  }
  var m = num;
  for (i = 0; i < num_teams; i++) {
    var index = m % (num_teams - i);
    m = Math.floor(m/(num_teams - i));
    result[i] = elems[index];
    elems[index] = elems[num_teams - i - 1];
  }
  return result;
}

Looking up the route name xorshift128plus showed that it is a class of pseudorandom number generator (PRNG) and that a PRNG-generated sequence is not considered to be truly random. I also found out that it is used by Chrome’s V8 engine random function and alas Node.js.

The PRNG-generated sequence is not truly random, because it is completely determined by a relatively small set of initial values, called the PRNG’s seed (which may include truly random values). — https://en.wikipedia.org/wiki/Pseudorandom_number_generator

XORSHIFT128PLUS:#include <stdint.h>/* The state must be seeded so that it is not everywhere zero. */
uint64_t s[2];uint64_t xorshift128plus(void) {
	uint64_t x = s[0];
	uint64_t const y = s[1];
	s[0] = y;
	x ^= x << 23; // a
	s[1] = x ^ y ^ (x >> 17) ^ (y >> 26); // b, c
	return s[1] + y;
}

Assuming that the data was generated by a xorshift128plus algorithm, purely by its name alone, was too big of an assumption to make. I decided to put it trough some tests to make sure it just wasn’t some other form of predictable data, or just another crypto.

First I wrote a small function that would request a new random data-array every 20 ms, run it trough the number_to_permutation() function and push each result in an array. After running for 10.000 request it would map all the array values based on team number and ranking. Finally it would calculate and show the distribution based on the number of times a teams number occurred at a given ranking. If the data distribution values were weighted in some deterministic way, or hidden in some other complexity, the result of this test would tell me something. The result after running my code were inconclusive, so I decided to proceed for test no.2. (Link to test-1 code)

My second test just meant running the data trough a dieharder test: which is a battery of statistical tests for measuring the quality of a random number generator. All the test showed good result for random quality and based on this I assumed that the random number generator was running the xorshift128plus algorithm.

Ubuntu man pages for dieharder: http://manpages.ubuntu.com/manpages/precise/man1/dieharder.1.html

For awhile I was uncertain how to proceed until I came across this blog (recommended read for anyone interested) about hacking the LA Times JavaScript Lottery. There the author mentions how the original state, i.e. the seed, for the xorshift128plus generator can be recovered by calculation from the random numbers already generated. This could also in turn be used to generate further keys from the same state. On his blog he had linked to a working Python example script on his github account: https://github.com/douggard/XorShift128Plus.

Even tho I mostly write my own code in JS/Node these days, the Python script turned out to be fairly easy to understand. In summary the script needed at least three generated random numbers as parameters and the type of browser engine that had generated them. The browser engine parameter required because of the difference in how Firefox and Chrome’s engines deals with converting 64 bit integers to doubles, explained nicely in the blog post mentioned above.

I assumed the server were running Node.js as before (which runs on the Chrome V8 engine) and therefore configured the script to run with the default Chrome. Next I generated a new array of data from the xorshift128plus route, choose three numbers from the data and ran them trough the getRandonNumber() function that I mentioned earlier (to divide them by 2⁵² to get the actual decimal number), and ran it trough the script, result shown below.

holmberd $ python xs128p.py
BROWSER: chrome
[0.9233837859218073, 0.4950068607291336, 0.9495068196539238][ostate0 = 12249876624380227944,
 ostate1 = 18016634099019338747, c926717972609807 = True,
 c2149474067303403 = True,
 c3779884477438292 = True][0.9233837859218073, 0.4950068607291336, 0.9495068196539238, 0.9546591844623613, 0.4845468892357394, 0.8286729746233681, 0.23621349859457674, 0.14539345927011627, 0.3716901298477995, 0.477300001006588, 0.404759709141816, 0.5639828186111149, 0.5011396805974779, 0.15851217670274242, 0.01587293787737032]

The ostate0 and ostate1 being the original state that was used to generate the current state of random numbers and the bottom numbers being the newly generated random numbers from the same state.

I spent a long time trying to figure out the meaning of the seed. At first I thought it would be something like the cipher on puzzle-1 where the shift word was the key to the puzzle, but it turned out that was not the case here. Eventually I figured out that if a choose the last three numbers in the data-array as parameters, it would generate five new random numbers that were not in the original array. The first number of these, i.e. the first random generated number outside of the boundaries of the array, was presumably the key number. (the numbers of elements in the array being 200, what I wanted was element number 201 because of how last years ranking function permuted the first of the array elements).

I selected this number[201] and multiplied it by 2⁵², to represent the power of the data-array integers, and ran it trough the number_to_permutation() to get an actual array containing the team number rankings that could be submitted as the prediction.

var key = 0.01587293787737032 * Math.pow(2,52);
var result = number_to_permutation(key);console.log(result); // output: [4, 10, 8, 3, 16, 5, 13, 1, 7, 11, 2, 0, 9, 15, 6, 14, 12, 17]

Once the correct prediction was submitted a code was generated on the page that was the solution to the last puzzle.

That’s it until next years puzzle, hope you enjoyed it!