Last Tweets of the Krell
Many readers are no doubt familiar with Forbidden Planet, the documentary film about the Krell civilization, which came to an unfortunate end just at the launch of what could have been their biggest achievement. Ever since the film’s release in 1956, xenoanthropologists have been stymied by a lack of source material on the Krell.
But today, April 1, 2017, researchers from the OpenNMT institute have announced a stunning success in translating previously undecipherable Krell messages, thanks to deep neural machine translation technology.
They note that the Krell language has three grammatical pluractionality markers: one for public speech addressed to everyone, one for cohort speech among a group of peers, and one for private speech between individuals. The researchers joked that these modes correspond to our Twitter, Slack, and text messaging. Krell names are unpronounceable for us, so we have chosen single-letter initials for the participants.
P: The president of an organization that we will translate as KrellTech.
A: The alpha technical expert at KrellTech.
S: A site reliability engineer who arrived before P and first sees the problems.
C: A co-worker at KrellTech who is critical of P and A.
P(to public) KrellTech is thrilled to announce this morning the launch of CognoMaterializer 1.0! The basic version of this service is available free of charge to all citizens of Altair IV. The possibilities are equal to the number ten raised almost literally to the power of infinity!
P(to cohort) Time to party! All KrellTech employees, friends, and family invited to the big bash at the Level 700 meeting room.
P(to cohort) Congratulations everyone, and special thanks to A for leading this project. Never have I seen a single individual successfully do so much, on their own, to create a complex project like this! The drinks are on me, A!
A(to cohort) Thanks, P! I’ll toast you for one drink but then I’ll have to crash — too many late nights bringing this project home! I’m fried!
Later That Night
P(to public) After midnight … we start day two of the CognoMaterializer era. A few reports of problems, due to user error. I promise we will have a fix by tomorrow for all affected users!
S (to cohort) @A, you better get in here fast. Some of these user reports look serious …
S (to cohort) @A, WHERE ARE YOU!!?? You don’t respond to messages, pages, or calls. I’m going to start a rollback to the previous version.
S (to cohort) @A, or @C, or anyone in @cognoteam, what’s up with the rollback script??? I tried to run it, and it just gave an error message.
C (to cohort) @A, why do you have unreviewed code from experimental running in production? I didn’t even know that was possible!!??
C (to cohort) I warned everyone not to cut the budget for user testing, but did you listen? Noooo, @P said we had no budget left. We brought a few users in for one-hour tests, but we never did the week-long home tests I specified as essential. We’re going to have to pull the plug on this whole thing.
S(to cohort) I tried the shutdown script, but the ping-and-restart restarts them faster than I can shut them down. Didn’t anyone try this before?? Can we contact our brick service provider and have them shut it down? Could we run around and pull all the plugs?
C(to cohort) umm, the brick is an 8,000-cubic-mile array of klystron relays powered by 9,200 thermonuclear reactors. I don’t think we can do a manual shutdown.
P (to public) Everyone, we’ve got a few minor issues — I recommend taking a pause from enjoying the CognoMaterializer until we get a quick update out. Just try not to think about anything bad. Or anything at all, really. And don’t go to sleep!!
C (to cohort) we’re 3-way-copulated
S (to cohort) we’re triple-3-way-copulated
Last Known Krell Message
S(to public) this is so bad … I don’t know if there is anyone left out there to see this. I’m sorry for the damage we’ve done to our civilization; I know we could have protected it if we had followed these practices:
- Many eyes. Don’t let one person work alone. All changes should be reviewed and approved by others.
- Understand use cases. The temptation is to concentrate on what you, as an engineer, do in building your service, but the truly important thing is what your customers will do when they use it.
- Engineers are judged by failure. Product designers are judged by how useful the product is when it is functioning. Engineers are judged by how gracefully the system degrades when it fails.
- Test what you build and build what you test. Make sure the use cases are tested, and that what you run is what was tested. Hermetic builds help assure that you know where everything came from, and can reproduce it.
- Adversarial testing. Be devious in imagining what could go wrong, and fix it before it does.
- Progressive rollouts. Trial the system with a small number of users before releasing to a large number.
- Monitor. Have enough people, and automatic systems, to gather feedback on how you are doing and sound an alert when there are problems.
- On Call. Make sure there is a schedule of qualified people on call to deal with problems, especially at the start and after any major change.
- Instant shutdown and rollback. Make sure you have the ability to cleanly shut down the system if something goes wrong, and in case of error to roll back to the previous working system. Bonus points if you can reliably cherry-pick new improvements and add them to a previous version.
- Practice emergencies. Don’t wait until a real emergency to see how good your response is; practice in advance.
- Safe User Interface. Don’t make it easy for users to do unsafe things. P insisted on a “one-thought” interface (and even patented it). It is safer to require a specific prompt to initiate action (“OK, Cogno”) and use a confirmation dialog for actions with big irreversible effects (“Are you sure you want to conjure a monster from the Id?”).
- Culture. C and I (and others) had concerns all along, but the culture encouraged us to keep quiet about them, discouraged discussion, and offered no process for fixing the problems.
- Postmortems. When something goes wrong, analyze and understand it … that is, if there is anyone left alive.
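
Editor's note: for readers who would rather not repeat the Krell experience, here is a minimal Python sketch of how the "progressive rollouts", "monitor", and "rollback" practices from the list might fit together. Nothing here comes from the Krell messages themselves: the names `progressive_rollout`, `RolloutResult`, and `error_rate_at`, and the 1% error threshold, are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RolloutResult:
    completed: bool        # True if every stage stayed under the threshold
    final_fraction: float  # fraction of users left on the new version
    history: List[float] = field(default_factory=list)  # stages attempted


def progressive_rollout(
    stages: List[float],
    error_rate_at: Callable[[float], float],
    max_error_rate: float = 0.01,  # assumed threshold, purely illustrative
) -> RolloutResult:
    """Expose the new version to a growing fraction of users.

    `error_rate_at(fraction)` stands in for a monitoring system: it reports
    the observed error rate while `fraction` of users run the new version.
    If any stage exceeds `max_error_rate`, stop and roll everyone back.
    """
    if not stages:
        raise ValueError("at least one rollout stage is required")

    history: List[float] = []
    for fraction in stages:
        history.append(fraction)
        if error_rate_at(fraction) > max_error_rate:
            # Roll back: nobody stays on the new version.
            return RolloutResult(completed=False, final_fraction=0.0,
                                 history=history)
    return RolloutResult(completed=True, final_fraction=stages[-1],
                         history=history)


if __name__ == "__main__":
    # Hypothetical monitor: trouble appears once 10% of users are exposed.
    result = progressive_rollout(
        [0.01, 0.1, 1.0],
        lambda fraction: 0.5 if fraction >= 0.1 else 0.0,
    )
    print(result)
```

In this sketch the rollback is just a return value. In a real system the rollback path would need to be exercised ahead of time, which is the point of the "practice emergencies" item in the list.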