Dissecting khash.h (Part 1: Origins)

In which the author encounters a strange hashing library

Mali Akmanalp
3 min read · Aug 25, 2016

Like everyone, I love a good enigma. One such enigma is _why. Back in the day I was trying to learn Ruby, as was in vogue at the time, and I came upon a brilliant book called “why’s (poignant) Guide to Ruby”. This was a programming book laden with cartoon foxes, chunky bacon and avant-garde humor. It was as much art project as it was technical training. It even had a ridiculous soundtrack. The style was uniquely approachable and friendly. I’d never seen anything like it before, and was instantly smitten. (If you haven’t read it, you must!)

Anyway, later, I found out that _why had authored a bunch of different software too. Among them was potion, a small and self-contained language written entirely from scratch! It also had a bunch of wacky ideas, but at a few thousand lines of code, this was the first programming language whose source I felt comfortable digging into.

I remember the directory listing immediately blowing my mind. For what I thought was a pretty complex language, there was a tiny number of files. The “core” directory had everything right there. Syntax? syntax.g! Strings? string.c! Garbage Collector? gc.c! Virtual Machine? vm.c! At the time, most of it went over my head, but I got it to compile, played around with it, mucked about with the syntax a little. It was neat.

The other thing that stood out to me was how weird the code was. I’d seen some C code before, but I’d assumed that strange and arcane stuff was reserved for the Linux kernel, which I figured did all that junk because it was somewhat special. Here, there was weird stuff and macros all over the place. But the file that really took the cake was khash.h:

What the hell are those triangles??? Are they ancient pyramids dedicated to a long-dead pharaoh? Is it merely a coincidence that “i&0xfU” looks like “Cthulhu”?

This was very different from what I saw in my CS classes. When I learned about hash tables, we used modulo. We handwaved away some details. Then we learned the names of a few hashing functions, and that was that. This was very much the opposite. Every line of code reeked of implementation detail.

The nice thing was that this file was small and self-contained. No hunting down stuff necessary. I knew what it was supposed to do: add stuff, look up stuff, delete stuff.
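For comparison, here is a minimal sketch of those three operations in the modulo-based, classroom style. It is not how khash.h does things. Every name in it (NBUCKETS, struct table, table_put, table_get, table_del) is made up for illustration, and it assumes unsigned integer keys, a fixed capacity, and linear probing to resolve collisions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 16 /* hypothetical fixed capacity; a real table would grow */

enum slot_state { SLOT_EMPTY = 0, SLOT_FULL, SLOT_DELETED };

struct slot {
    enum slot_state state;
    unsigned key;
    int value;
};

/* Zero-initialise with `struct table t = {0};` so every slot starts empty. */
struct table {
    struct slot slots[NBUCKETS];
    size_t count;
};

/* The textbook bucket choice: key modulo the number of buckets. */
static size_t bucket_of(unsigned key) { return key % NBUCKETS; }

/* Walk the probe sequence; return the slot holding `key`, or NULL. */
static struct slot *table_find(struct table *t, unsigned key) {
    size_t start = bucket_of(key);
    for (size_t n = 0; n < NBUCKETS; n++) {
        struct slot *s = &t->slots[(start + n) % NBUCKETS];
        if (s->state == SLOT_EMPTY)
            return NULL; /* a never-used slot ends the probe */
        if (s->state == SLOT_FULL && s->key == key)
            return s;
    }
    return NULL;
}

/* Add stuff: insert or overwrite. Returns false when the table is full. */
static bool table_put(struct table *t, unsigned key, int value) {
    struct slot *s = table_find(t, key);
    if (s != NULL) {
        s->value = value;
        return true;
    }
    size_t start = bucket_of(key);
    for (size_t n = 0; n < NBUCKETS; n++) {
        s = &t->slots[(start + n) % NBUCKETS];
        if (s->state != SLOT_FULL) { /* empty or deleted slots are reusable */
            s->state = SLOT_FULL;
            s->key = key;
            s->value = value;
            t->count++;
            return true;
        }
    }
    return false;
}

/* Look up stuff: copies the value into *out when the key is present. */
static bool table_get(struct table *t, unsigned key, int *out) {
    assert(out != NULL);
    struct slot *s = table_find(t, key);
    if (s == NULL)
        return false;
    *out = s->value;
    return true;
}

/* Delete stuff: leave a tombstone so later probes still pass through. */
static bool table_del(struct table *t, unsigned key) {
    struct slot *s = table_find(t, key);
    if (s == NULL)
        return false;
    s->state = SLOT_DELETED;
    t->count--;
    return true;
}
```

Even this toy version has to track three states per slot (empty, full, deleted), which is exactly the kind of detail the classroom treatment tends to handwave away.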

A curious detail:

Another pseudonym! Turns out, like _why, Attractive Chaos was also an enigma. They maintained a blog with a ton of interesting stuff and had also written a ton of software. I was doubly intrigued.

Alas, at the time I struggled for a bit to decode it all and then I gave up. Now, I’m back, with a vengeance. This series will follow my findings.

I’ve discovered that there is a short explanation of khash.h on the Attractive Chaos blog, but it’s not very satisfying. I want to go through the whole thing line by line.

Thanks for reading! Leave me a comment if you enjoyed this. Part 2 is here!


Mali Akmanalp

Programmer, enjoys python, open source fan, arch-enemy of messy data, all-round nerd. Fellow at @HarvardCID