Ever since I was little, I’ve been geeking out over NASA and space missions. About a year after I moved to the US, before I could even speak English fluently, I went door-to-door in my little town of Muncie, Indiana to sell raffle tickets and chocolate to raise $200 to fund my school trip to Space Camp in Huntsville, AL. In high school, I was riveted by the NASA documentaries and movies that Mr. Engel showed in science class. The moon landing was always one of the most inspirational stories I recall.
All of those stories about the Apollo missions focused on the astronauts. Once in a while, Gene Kranz would get a nod. So I never expected that the most exciting story about NASA would have little to do with the stars of the show.
Recently I read a Wired article called "Apollo 11: Mission Out of Control". I won't repeat it here because I won't do it justice, but basically it was the little-known story of how Armstrong and Aldrin almost had to abort the moon landing, all because of a software bug. As a software engineer, so much of that story still rings true for me today.
In that critical moment, hurtling like a lawn dart toward the surface of the moon, the Apollo guidance computer had crashed.
The moon landing nearly went sideways because of unintended user behavior combined with a non-deterministic bug.
…the dial for the rendezvous radar had been turned to the wrong setting…because of a design defect, every once in a while the system would bombard the computer with unnecessary requests…during the most difficult portion of the landing, 13 percent of the computer’s resources had been stolen…overloading the processing queue and forcing the restarts.
This is also a great illustration of Finagle's corollary: "Anything that can go wrong, will, at the worst possible moment". The worst possible moments are usually when the system is most stressed, so unintended user behavior or other triggers of rare edge conditions will have a greater impact.
Brutal Prioritization Saves Lives
But the moon landing was also saved by software, because the designers were so good at prioritizing.
Fortunately, the programmers … focused on the critical tasks of navigation, guidance, and control … trumping even the software that ran the display
Out in the cold of space, the only thing that can save you is even colder logic.
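To make the idea concrete for myself, here is a minimal Python sketch of priority-based load shedding. It is not how the Apollo guidance computer was actually implemented; the `Task` type, the `plan_cycle` function, the priority scale, and every cost number are hypothetical, chosen only so that losing 13 percent of capacity forces the least critical job to be dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int  # lower number = more critical (hypothetical scale)
    cost: int      # percent of one compute cycle (made-up numbers)


def plan_cycle(tasks, capacity=100):
    """Run the most critical tasks that fit in `capacity`; shed the rest."""
    run, shed, used = [], [], 0
    # sorted() is stable, so tasks with equal priority keep their input order.
    for task in sorted(tasks, key=lambda t: t.priority):
        if used + task.cost <= capacity:
            run.append(task)
            used += task.cost
        else:
            shed.append(task)
    return run, shed


# Illustrative workload only; these are not real Apollo figures.
TASKS = [
    Task("display_update", priority=3, cost=15),
    Task("navigation", priority=1, cost=30),
    Task("guidance", priority=1, cost=30),
    Task("control", priority=1, cost=25),
]

# Assume 13 percent of the cycle is lost to spurious requests.
run, shed = plan_cycle(TASKS, capacity=100 - 13)
print("run: ", [t.name for t in run])
print("shed:", [t.name for t in shed])
```

With the full 100 percent available, everything fits. Take 13 percent away and only the display update is shed, while navigation, guidance, and control keep running.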
Who You Delegate To Matters
Armstrong and Aldrin radioed Mission Control when the guidance system first crashed. At that time there was still a window to abort. The go/no-go call in the middle of the descent was ultimately made by mission specialists. Not by Gene Kranz (the big boss) and not by the MIT programmers who built the software. Gene Kranz didn't know enough about the details to make the call. And the programmers only thought about the software. The best people to make decisions are those who:
- Know the operational details of the system
- Are not emotionally compromised
- Understand the broader context
I think from our point of view, at MIT, something was missing inside the computer, something unknown was seriously affecting our software…But maybe we knew too much! Those guys could only see it from the outside. In a way, it was easier for them, and I think they got it right.
Other Tidbits, Just for Fun
I had forgotten the origins of Moore’s Law:
The Apollo program [ordered] hundreds of thousands of Fairchild components. The demand for miniaturization had led Gordon Moore, Fairchild’s head of R&D, to hypothesize that the number of components on an integrated circuit would double every year
Users often complain about cryptic error messages from their technology products. These are a vestige of the days when screens were tiny and displaying a lot of text was cumbersome. Moreover, the software users of those days were often (but not always) well trained.
Aldrin hurriedly keyed in the two-digit code 5–9-Enter, which translated, roughly, as “display alarm.” The console responded with error code “1202.” Despite his months of simulations, Aldrin didn’t know what this one meant; Armstrong, equally baffled, radioed Mission Control for clarification.
Space is Still the Final Frontier
Technological progress has completely revolutionized the world since the moon mission. My iPhone has more computing power than the whole Apollo spacecraft. Millions of people around the world now develop software enjoyed by billions. But once in a while, a great story comes along to remind us that software is still software. And space, be it Mars or beyond, will continue to capture our imagination for generations to come.