Adding Some Sci-Fi Fun to a Book About Databases
Somehow, while writing a book about PostgreSQL, I decided I needed to weave in a science fiction narrative and a heavy dose of planetary science. I ended up with ~400 pages devoted to Postgres, Cassini and Enceladus, one of Saturn’s most mysterious moons.
A Thug’s Life: Becoming a DBA
Six months ago I decided to write a book about data and databases from a DBA’s perspective. Not many people I know (read: 0) decided at the start of their career to become a DBA; they just assumed the position because it was needed. I’m sure there are people who decided to become a DBA right out of school, but for the most part (in my experience) this is something that just happens.
I started out by telling my story. How out of school I became a geologist at an environmental company and was the one person on the team who knew Excel well enough to take care of the analysis results from a huge cleanup job. That dataset grew to over 50,000 results (huge for Excel at the time) so we stuck it into Access, I bought some books, and boom: I was a new DBA.
People find themselves in this position all the time. They know SQL a bit, or perhaps they know Postgres (or MySQL, Oracle, SQL Server, whatever). When a report needs to be created or maybe a troublesome query optimized, they jump in to help. Pretty soon, as the company grows, that becomes their main job.
There really is no training for this kind of thing. Sure, there are blog posts and books you can read about learning SQL and how to optimize things, but there is nothing about the mindset you cultivate over time as a data person.
That’s what this book is about.
But how do you communicate such a thing? Sure: I could relate my personal stories, but to be truthful they’re kind of dull. I decided to spice things up by using an interesting dataset, one that people would be familiar with and understand a bit. The test databases out there are fun, but they’re so contrived (DVD Rental, Northwind, Chinook) that they’re almost useless. Based 100% on theory, they lack any real-world suckiness that every DBA has to deal with.
I decided to have a look at the StackOverflow Data Dump, specifically for https://scifi.stackexchange.com. I’ve used it before and it’s kind of entertaining, but it doesn’t fit the overall idea. I wanted to immerse the reader in the very real-life (and likely) scenario of inheriting a database and having to learn things on the job, under pressure.
All of a Sudden: Enceladus
Motivated by space “stuff”, I decided to have a look at the Cassini dataset. The spacecraft had just burned up in Saturn’s atmosphere and there was a lot of interest in the mission. So I headed over to the Planetary Data System and had a look at the data that I could use… and was blown away.
There was data from a thing called the “Cosmic Dust Analyzer” (how great is that name!) and the Ion and Neutral Mass Spectrometer (INMS), both focused on “sniffing space” around Saturn, trying to find out more about the planet and its moons.
That’s when I stumbled on Enceladus.
Here’s the quote that I read, which sent me off a literary cliff:
Enceladus has no business existing, and yet there it is, practically screaming at us, ‘Look at me, I completely invalidate all of your assumptions about the solar system’!
That was in September. The next thing I knew, I was consumed by anything and everything there was to read about Cassini, Enceladus and Saturn.
Meet Dee Yan
At this point I wanted a main character who mirrored my experience learning about Enceladus and Cassini. I decided to go with an amalgam of various friends of mine (who shall remain anonymous). One friend in particular embodies the smarts, drive and wit that I felt this story needed, so I based the character almost completely on her. I mixed in a few experiences from my life and those of other friends (because you need to write what you know) and Dee came to life.
Quite a few people have asked me about her, and why I chose to have her do the things she does, or to react the way she does. All I can say is this: I had no control over it. Dee wrote herself into existence. Yeah, I know how that sounds, but it’s true.
Dee is 28 and needs a job. She’s not at all confident in her future and her industry, but she loves astrophysics and Python. She lives with her mom in the Marina district of San Francisco so she can afford an internship at Red:4, a fictional aerospace startup of which I am the CTO.
One lovely morning, Dee gets a message from me on Slack, asking her to drop by my office when she can. I let her know that our current DBA has just become our former DBA, and I need her to fill in for a few weeks while I find a replacement. Her job is to assemble data from the Planetary Data System and import it into PostgreSQL, normalizing it and fixing any errors she might find. We’re bidding on a massive project which that data will be a part of, and our freshly fired DBA just destroyed over 3 weeks of work on it. It needs to be reassembled while I go and buy more time from the potential client (JPL).
The Story Grows
I’m not a fiction writer, but I like to write. I’ve never had a situation where a story like this one just sprang out of nowhere. It’s terrifying, to be honest, as it’s like singing in front of a crowd for the first time, or trying to do standup comedy (both of which I’ll never do). It’s a creative effort that could utterly suck, or it could be interesting. I’m not being saccharine when I say that this book would never have been written if it weren’t for my friends on Slack.
My editor, Dian, was unbelievably helpful in every way. I never knew editors could be so helpful and so needed. I’ve only ever used an editor while writing technical books for large publishers, and that usually involves asking them to get out of the way. Dian was very different. Her literary skills blew my mind, and the fact that she’s a working PostgreSQL DBA helped as well!
I had never read anything like this before: tons of code and walkthroughs wrapped around a story about an icy moon which might host life.
We dive into trigonometry so we can calculate precise position and altitude during flybys, we inspect Cassini’s master schedule so we know which dates were devoted to Enceladus (and what they did there), and we roll together some interesting chemical results from the INMS so we can see, first hand, why scientists think there very well may be life on this little moon.
In terms of PostgreSQL: we import raw CSVs with quite a number of data entry errors and normalize them. We optimize queries over millions of rows of data with indexes, then put a full-text index in place using a materialized view. We roll up the mission schedule using window functions to find out which of Cassini’s instruments were used most often, and write a PL/pgSQL function containing the Pythagorean theorem so we can solve a major problem elegantly with a common table expression.
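To give a feel for the Pythagorean step, here is a minimal sketch in Python rather than the PL/pgSQL the book uses. It is not the function from the book. It assumes the spacecraft position is already expressed as x, y, z offsets in kilometers from the center of Enceladus, and it uses an approximate mean radius of 252.1 km. The function names and the sample readings are made up for illustration.

```python
import math

# Assumption: approximate mean radius of Enceladus, in kilometers.
ENCELADUS_RADIUS_KM = 252.1


def distance_from_center(x_km, y_km, z_km):
    """Straight-line distance from the moon's center (Pythagorean theorem in 3D)."""
    return math.sqrt(x_km ** 2 + y_km ** 2 + z_km ** 2)


def altitude_km(x_km, y_km, z_km, radius_km=ENCELADUS_RADIUS_KM):
    """Height above the surface: distance from the center minus the radius."""
    return distance_from_center(x_km, y_km, z_km) - radius_km


# Hypothetical flyby readings: (label, x, y, z), all offsets in km.
samples = [
    ("t0", 300.0, 400.0, 0.0),
    ("t1", 180.0, 240.0, 0.0),
    ("t2", 600.0, 800.0, 0.0),
]

# The closest approach is the reading with the smallest altitude.
closest = min(samples, key=lambda s: altitude_km(*s[1:]))
print(closest[0], round(altitude_km(*closest[1:]), 1))
```

In a database the same arithmetic would live in a function and be applied to every row of a flyby, with the minimum picked out by a query instead of `min`.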
I devoted 65% of the book’s overall story to Postgres and what it’s like to work with it on a daily basis. 30% of the book is devoted to Enceladus, Cassini and Dee’s life. The rest of it comprises a pervasive theme: what it’s like to work with data every single day, and the mindset that you must cultivate to spot problems.
The Feedback, So Far
The book has been on sale now for about 10 days, and the feedback has been generally positive. No one has told me to delete Word from my machine yet, so I suppose that’s a good sign.
Some people have found it entertaining and have enjoyed learning about space from the context of a person who works with data. Others have said they “finally understand SQL and why their DBA has been such a jerk” — that one made me laugh.
The book is for sale here, for $25.00. It’s available in EPUB, Kindle and PDF formats. I wrote it to be a fun/pleasure read as well as an informational resource. I was inspired very much by The Martian, and was hoping to capture some of that technical/mystery magic as well.
I hope you enjoy it!