Years before it exploded, the Space Shuttle Challenger made a flight with ‘the fewest failures’

Sep 23, 2020

An insider’s gripping look back at STS-8

Image of STS-8 mission logo from NASA’s image gallery.

“T-35 seconds and counting, T-31, we have a go for auto sequence start. Challenger’s 4 redundant computers now assuming primary control of critical vehicle functions from now through lift-off. T-20 seconds and counting, SRB engine nozzle gimbal profile, now underway. T-11, 10, 9, 8, 7, 6, 5, we have engine start, 2, 1. We have ignition and we have lift-off. Lift-off, 32 minutes after the hour and the Shuttle has cleared the tower.”

So began STS-8, the eighth mission of the Space Transportation System, better known as the space shuttle, on August 30th, 1983.

Five days later, hours before sunrise, the phone rang next to Dayna Johnson’s bed as she slept in her apartment off of NASA Road 1 in metro Houston.

Johnson worked as an IBM Communications Co-op. Her role was to mingle with reporters at NASA press briefings at the Johnson Space Center, and to write and edit newsletters for the hundreds of IBM employees who worked on space shuttle programs in Houston and Cape Canaveral. She reached over and picked up the receiver on the phone, and pulled the cord toward her as far as it would go.

“Hello?” she asked.

In less than a month, the FCC would approve the first commercial portable cell phone — a Motorola DynaTAC 8000X that was over a foot long, weighed 1.75 pounds, took 10 hours to recharge and cost $3,995, the equivalent of $10,379 today. It would be years before cell phones were commonplace and affordable. For now, everyone had land line phones.

“Get to work now. We have an emergency,” she heard her boss saying. His name was Justin Fishbein. He was a Harvard grad and former reporter for the Chicago Sun-Times, and currently head of communications for IBM’s Space Shuttle Programs, which supplied the space shuttle’s onboard computers and software.

“How fast can you get here?” Fishbein asked.

“About 20 minutes,” Johnson said, and leapt out of bed to get dressed. What could be the emergency?

The crisis unfolds

While Johnson had been sleeping, space shuttle Commander Richard “Dick” Truly had been in contact with mission control.

“Roger, Houston, and I need to talk to you. We’ve had a redundant set split and, over,” Truly said. A set split meant that the redundant computers were no longer operating in sync.

It was five days, two hours and 22 minutes into the STS-8 space shuttle mission, and the Challenger was flying over Guam, the site of one of the ground station antennas that enabled communication with mission control.

“Roger, copy, we’ll be looking at the data here in a second,” mission control replied.

“Okay, while you’re getting the data up let me tell you what happened,” Truly said. “GNC wasn’t doing anything except clicking along” when the set split happened.

GNC stood for guidance, navigation and control. It was such a critical aspect of the space shuttle that four redundant computers were used to handle GNC at liftoff. The computers were known as General Purpose Computers, or GPCs, and they checked each other over 500 times a second.

Once in orbit, only two of the computers would handle GNC in sync, while the third computer would handle life support and other tasks, and the fourth computer would be loaded with the descent software and powered down. This “freeze dried” computer was to be used only in case of an emergency.
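The cross-checking described above can be pictured with a small sketch. It is an illustration only, not the actual flight software: the function name `find_outvoted` and the idea of comparing a single output value per computer are assumptions made for clarity.

```python
from collections import Counter

def find_outvoted(outputs):
    """Return the IDs of computers whose output disagrees with the majority.

    `outputs` maps a hypothetical computer ID to the value it produced
    at one comparison point. If no value has a strict majority, nobody
    can be singled out and every computer is returned as suspect.
    """
    counts = Counter(outputs.values())
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        # Tie or no majority: the set has split and a vote cannot pick a culprit.
        return sorted(outputs)
    return sorted(gpc for gpc, out in outputs.items() if out != value)

# Four computers at liftoff: one disagreeing machine is outvoted three to one.
liftoff = {1: 0x2A, 2: 0x2A, 3: 0x2B, 4: 0x2A}
print(find_outvoted(liftoff))   # [3]

# Two computers in orbit: a disagreement is a one-to-one tie.
orbit = {1: 0x2A, 2: 0x2B}
print(find_outvoted(orbit))     # [1, 2]
```

With four machines, a single dissenter stands out. With only two, a disagreement shows that something is wrong, but a vote alone cannot say which machine is at fault.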

“Since no jets are firing and we were just about to [fly over] Guam, I have elected to not take GPC 1 to halt. Both GPCs are still in run and it looks like a pure set split to me,” Truly told mission control.

“Okay, copy, standby one,” mission control replied.

Everyone involved with the space shuttle memorized dozens of acronyms to simplify communication. IBM’s co-op students — all of whom were in technical majors like aerospace engineering or software programming except for Johnson, a journalism major — jokingly called the little chocolate doughnuts they ate every morning LCDs.

“Okay, Dick, on your GPC, what we’d like for you to do is go to the Orbit Pocket Checklist, page 3–10, first GPC fail and consider the failed GPC as GPC number 1,” mission control said. The Orbit Pocket Checklist was a booklet filled with technical flow charts and step-by-step instructions for how to handle various processes and failures on the space shuttle.

NASA expected failures, and much of the training the astronauts did in a simulator was designed to prepare them for crises. Johnson had climbed into a space shuttle simulator at NASA on one of her first few days on the job, and had been surprised at the small size of the cabin. The astronauts had very little room in which to work.

“Okay, I’m turning to that page,” Truly said. “You want me to bring up the freeze dried, or stick with GPC 2?” If the freeze dried computer with the descent software was required, that could mean only one thing — it’s time to cut the mission short and descend.

“That’s a negative, Dick,” said mission control. “Do not bring up the freeze dried, and we’ve got about 20 seconds left here. We’ll see you at Hawaii in 7 minutes at 2 plus 37.”

Technically, a fifth computer with ascent and descent software was also onboard as a last-resort option. It was programmed outside of IBM in case all of IBM’s software had a generic error. However, to use the fifth computer would require abandoning the other four computers. If it were ever put to use, it would be an act of desperation.

The odd redundancy and reassignment of tasks among the four primary computers were due to the limitations of technology at the time. Each computer weighed 120 pounds and had less memory than a modern smartphone. They could only handle a limited number of tasks at a time.

The computers used for the earliest space shuttle flights contained only 106 kilobytes of memory, written as 106 K or KB. The space shuttle’s operating system and displays alone used 35 KB of memory, and overall the onboard software code required 400 KB. By contrast, a typical smartphone today has millions of KB of memory, and a laptop can have over a billion KB.

In sum, the computers of the 1980s didn’t have nearly enough memory for all of the space shuttle’s needs. The solution was to store the software code on tape drives. The astronauts would use the tape drives to load only the code necessary for each particular phase of the flight as it was needed. When each new phase started, the astronauts swapped in the code for that phase.
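The memory squeeze can be worked through with the figures above. The sketch below is hypothetical: the phase names and per-phase sizes are invented for illustration, and only the 106 KB, 35 KB and 400 KB figures come from the text.

```python
TOTAL_MEMORY_KB = 106      # memory per computer (from the text)
RESIDENT_KB = 35           # operating system and displays (from the text)
ALL_SOFTWARE_KB = 400      # full onboard software (from the text)

# Hypothetical phase sizes, invented for illustration only.
# They are not real figures and do not add up to the 400 KB total.
PHASE_CODE_KB = {"ascent": 65, "orbit": 60, "descent": 70}

def load_phase(phase):
    """Return the KB left free after swapping in one phase's code."""
    budget = TOTAL_MEMORY_KB - RESIDENT_KB
    needed = PHASE_CODE_KB[phase]
    if needed > budget:
        raise MemoryError(f"{phase} needs {needed} KB, only {budget} KB free")
    return budget - needed

print(ALL_SOFTWARE_KB > TOTAL_MEMORY_KB)   # True: it cannot all fit at once
print(TOTAL_MEMORY_KB - RESIDENT_KB)       # 71 KB available for phase code
print(load_phase("orbit"))                 # 11
```

Whatever the real phase sizes were, the constraint is the same: with 35 KB permanently occupied, each phase had to fit in roughly 71 KB, so the code had to be swapped rather than kept resident.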

“This is shuttle control, Guam has loss of signal,” a NASA spokesperson at mission control announced, indicating that Truly’s conversation with mission control had ended.

While the space shuttle was flying around earth at 17,500 miles per hour, taking 90 minutes to circle the globe, communication with mission control was limited. During the first few flights, communication was only possible about 15 percent of the time since there were not enough satellites in space yet to allow near-continuous contact.
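Those figures translate into very short windows for talking to the ground. The arithmetic below is a back-of-the-envelope sketch that assumes the 15 percent figure applies evenly to every orbit, which is a simplification.

```python
ORBIT_MINUTES = 90       # time to circle the globe (from the text)
CONTACT_PERCENT = 15     # share of time in contact on early flights (from the text)
SPEED_MPH = 17_500       # orbital speed (from the text)

contact_minutes = ORBIT_MINUTES * CONTACT_PERCENT / 100
miles_per_orbit = SPEED_MPH * ORBIT_MINUTES / 60

print(contact_minutes)   # 13.5
print(miles_per_orbit)   # 26250.0
```

Under that assumption, the crew could expect about 13.5 minutes of contact spread across each 90-minute orbit.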

However, on STS-6 in April 1983, the space shuttle Challenger’s maiden voyage a few months earlier, a Tracking and Data Relay Satellite (TDRS) had been deployed to increase the ability of the astronauts to communicate with mission control. Eventually NASA would attempt to deploy 10 of these satellites before the last space shuttle mission, STS-135, landed on July 21st, 2011, and the entire space shuttle program ended August 31st, 2011. For now, NASA would have to make do with only one TDRS and its system of ground station antennas.

“During this pass, Challenger’s Commander, Dick Truly, reported a redundant set split of the general purpose computer,” the NASA announcer continued, summarizing the flight status for the record. “That’s where two computers assigned to the same activity disagree. GPC no. 1 is considered the failed computer in this instance. And Truly is now performing malfunction procedures, troubleshooting that problem. Hawaii is the next station in 6 1/2 minutes at 5 days, 2 hours, 30 minutes mission elapsed time. This is shuttle control, Houston.”

Crisis management launches

While the astronauts were performing malfunction procedures, news of the crisis had made its way from outer space to Fishbein, head of communications for IBM’s Space Shuttle Programs, and then to Johnson, the IBM Communications Co-op.

Since IBM had provided the space shuttle’s redundant computers and software, their failure to sync hurled IBM into crisis management mode.

Johnson drove past a white Saturn V rocket lying on its side in NASA’s Rocket Park at the Johnson Space Center as she made her way to her nearby office at IBM before sunrise. The Saturn V rocket was longer than a football field. It was used during NASA’s Apollo program that launched 27 astronauts into space from 1967 to 1973, including six successful missions with men landing on the moon. Despite its illustrious past, the Saturn V would remain outside exposed to the elements for decades before NASA began attempts to restore it and shelter it.

Once Johnson arrived at IBM, her boss rushed up to her to explain the situation. “Start thinking of any questions that you can imagine a reporter might ask. A press conference will be held in a few hours at the Johnson Space Center. I’ll need to be prepared with questions and answers before I go over there,” Fishbein said.

Fishbein was an imposing figure, though somewhat short in stature. He had gushed praise over Johnson’s first newsletter article, about Hurricane Alicia, which had blown through Houston a few days before she arrived to start her co-op position. Johnson had written about how IBMers had helped each other with downed trees, electricity outages and flooding.

Johnson wanted to continue to impress her boss, so she quickly began to type up questions for him. She didn’t realize it, but her questions echoed the same questions that reporters had recently asked several key IBM employees in a panel interview for the Association for Computing Machinery (ACM). The transcript of the May 1983 interview would eventually appear in the September 1984 edition of the organization’s monthly magazine called Communications of the ACM.

The questions in the panel interview included: “Can you give us an idea of the failure rate of the GPCs?” “Can you explain what happens in a ‘fail to sync’?” “Have there been any unusual fail to syncs?” “Is there any single point of failure in the hardware system that could affect all five computers?”

Jack Clemons, IBM’s manager of avionics flight software development and verification for the shuttle onboard computers during the first few missions, had answered several of the questions. He had previously worked on Apollo and Spacelab, a European-built spaceborne science laboratory designed to go on space shuttle missions.

“The software has an operating system written in assembler and applications written in HAL/S, a high-order language developed by Intermetrics, Inc., of Cambridge, Massachusetts,” Clemons said at the time. He dismissed speculation that the software was named after the computer “HAL” in the movie 2001: A Space Odyssey.

“HAL/S is a real-time, structured engineering language that is very readable. Theoretically, at least, it should limit the amount of structure-induced errors because it makes the programmer pay attention to structure as the software is being developed,” Clemons said.

The government expected IBM would “deliver error-free code,” Clemons said, and had paid IBM hundreds of millions of dollars for the software. IBM had a team of roughly 100 software engineers and programmers writing the 500,000 or so lines of code, and about 80 people assigned to test and verify the code’s accuracy and reliability.

Despite achieving an exceptional accuracy record, the software never reached the lofty goal of being error-free. In fact, Clemons admitted that they had discovered the worst kind of flaw during astronaut training between the first and second space shuttle missions.

“All four of our flight computer machines locked up and went ‘catatonic.’ Had this been the real thing, the Shuttle would probably have had difficulty landing,” Clemons said. “This kind of scenario could only occur under a very specific and unlikely combination of physical and aerodynamic conditions; but there it was: Our machines all stopped. Our greatest fear had materialized — a generic software problem.”

Was the shuttle now dealing with another generic software problem?

Johnson considered it a race against time. What if one of the computers stopped working? What if they all stopped working?

Johnson typed her questions on an IBM Personal Computer XT (PC XT), a new model that had only been on the market for a few months. IBM’s very first PC had been introduced two years earlier in 1981 for a price of $1,565, equivalent to $4,455 today. That was the same year as the first space shuttle flight, and the year Johnson had graduated from high school. The PC XT, IBM’s second Personal Computer, had 128 KB of memory and weighed 32 pounds. The operating system was DOS 2.0.

Johnson didn’t know any of these details. All she knew was that the computer was cutting edge. She was thrilled to be using the latest technology and making $16 an hour. “Dayna, I’m picking my jaw up off the floor,” her best friend had said when she heard about the pay. That $16 would be the equivalent of $41 now, a rate of over $85,000 a year — an impressive wage for an intern.

While Johnson drafted her questions, a team of IBM software engineers and programmers frantically scrambled to diagnose the problem aboard the space shuttle and figure out a solution.

Failures stack up

Thanks to CNN, which was founded in 1980, the space shuttle had been featured in the news up to 24 hours a day since its first mission in 1981. Before CNN, national news in the United States was covered in the nightly news hour on the traditional three networks of ABC, NBC and CBS.

Reporters came from around the world to get updates first-hand on the space shuttle missions from the Johnson Space Center in Houston where mission control was located. The World Wide Web, which would bring the internet into everyday use, wasn’t invented until 1990, so gathering news was still a laborious endeavor typically done in person.

The anticipated press briefing that Johnson and Fishbein were preparing for quickly arrived. It was held at NASA’s Johnson Space Center on September 4th, 1983 at 12:30 p.m., less than 24 hours after the computer problem emerged. By then, the computer crisis had been resolved.

Harold Draughon, NASA’s flight director for STS-8, started with a rundown of the audio transcript from communications with the astronauts, and his assessment of how things were going.

“This particular mission has been extremely successful,” he said. “In fact essentially 100 percent, and failure wise as far as what the Orbiter has done, performance wise, it has been the cleanest spacecraft that I believe we’ve flown that has had the fewest failures, that’s my impression. It has had the fewest failures of any missions we’ve flown to date, with the Orbiter.”

Johnson was astounded at this flight overview.

In the audience were several renowned reporters, some of whom Johnson had already met. The reporters included Craig Covault, with Aviation Week and Space Technology; Lynn Sherr, with ABC; Mark Cramer, with CBS; Wayne Dolcefino, with KTRH; and Jim Carlton, with the Houston Chronicle.

Among these reporters, Johnson most admired Covault’s expertise, and Dolcefino’s astuteness and audacity. She was not surprised years later to learn that Dolcefino had gone on to win 30 Emmy Awards as an investigative reporter.

A NASA employee walked around the room holding a microphone up to reporters who had a question. They started to inquire about all kinds of things except the computers. Finally, someone brought up the subject.

Draughon, the flight director, downplayed the issue: “Everybody got their adrenaline going when we had the little computer problem, but like I say, in about 40 minutes, it was completely back under control and everyone knew where we were, and we were back making sure that we had picked up everything, and didn’t miss one thing that was scheduled in the timeline that goes with that activity.”

Perhaps because the computer glitch was downplayed, Johnson never noticed her boss get grilled with questions about the matter.

In a later press briefing, another NASA flight director, Jay Greene, who had been watching all the downlink TV from the shuttle and talking to the astronauts, put it differently: “We did all the troubleshooting steps and took the computer dumps on the GPC 1 computer, we got it re-IPLed, or we got it started again in sync with GPC 2, we operated on it all night long.”

IPL, short for Initial Program Load, referred to loading the operating system of a mainframe computer into the memory of the computer. While the effects can be compared to rebooting a personal computer, it was more complex than that. The task was complicated by the lack of continuous contact that mission control had with the space shuttle.

Greene explained that the software had been corrupted by a hardware issue. “GPC 1 it turns out experienced what is known as a transient hardware hit. One bit, one 1 became a 0, a 0 became a 1 due to a transient failure in a hardware register within the computer,” Greene said. Software code consists of strings of 0s and 1s that tell a computer what to do.

“In doing that, an instruction that was otherwise valid was trans — was changed to an invalid instruction and the invalid instruction was recognized by the computer as invalid and the computer said I’m not to execute it and it went to the wait point. When it got to the wait point, the other computer was doing all the right things so it wasn’t there, the two computers were not in sync point together, and they said something is amuck. They spit out the set split indication and then they proceeded processing normally although not in sync with one another,” Greene said.
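Greene’s description can be mimicked in a few lines. This is a toy model, not the GPC instruction set: the 4-bit opcodes, the `VALID_OPCODES` table and the `step` function are all made up for illustration.

```python
# Toy instruction set: hypothetical 4-bit opcodes, not the real GPC encoding.
VALID_OPCODES = {0b0001: "LOAD", 0b0010: "ADD", 0b0100: "STORE"}

def flip_bit(word, bit):
    """Flip one bit of a word, as a transient hardware hit might."""
    return word ^ (1 << bit)

def step(opcode):
    """Execute one instruction; an invalid opcode sends the machine to wait."""
    return VALID_OPCODES.get(opcode, "WAIT")

program_word = 0b0010                     # a valid ADD instruction
corrupted = flip_bit(program_word, 0)     # 0b0011 is not in the table

gpc1_state = step(corrupted)              # the computer that took the hit
gpc2_state = step(program_word)           # its healthy partner

print(gpc1_state, gpc2_state)             # WAIT ADD
print("set split" if gpc1_state != gpc2_state else "in sync")  # set split
```

A single flipped bit is enough to turn a legitimate instruction into one the machine refuses to run. Once one computer is waiting while its partner carries on, the two no longer arrive at the same point together, and in this toy model that mismatch is what gets reported as a set split.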

He told the reporters that the shuttle’s onboard computers were all functioning fine now, but a new problem had occurred with the Inertial Measurement Unit (IMU). The IMU was a type of sensing equipment that fed attitude and velocity data to the computers to aid navigation. Yet again, NASA and its vast team of experts resolved the crisis.

The Challenger would go on to land safely. The adrenaline rush would be over until the next space shuttle mission, when two onboard computers would fail at the same time, prompting the New York Times to run a story on December 9th, 1983 at the conclusion of STS-9 with the headline: “Shuttle Program Computer Failures Began on First Test Landing in 1977.”

The article would outline the major failures to date, including the very first space shuttle mission in April 1981, when a computer failure delayed takeoff for several days; the Challenger’s first mission in April 1983, when a computer broke down and sounded an alarm waking up the astronauts; STS-8 with the “garbled data” problem, as the sync issue was eventually described; and STS-9 in which two computers failed at the same time.

“When you have a problem like this you can’t know at the time whether it’s the computer or something leading up to it,” Fishbein told the New York Times.

The mission ends: an era becomes an omen

The anxiety that Johnson had experienced with the STS-8 computer malfunction was really the norm for anyone involved with the space shuttle. The flights were riddled with all kinds of failures, and yet considered a success as long as the team of experts could scramble to make things right.

The astronauts showed extreme courage in boarding the space shuttle, especially in the earlier missions when the technology was new and communications with earth were sparse.

Shortly after STS-8 ended, NASA forwarded the in-flight suit pants worn by commander Dick Truly to the National Air and Space Museum in Washington, DC. The current description on the museum’s website has a photo of the pants and says:

“Collection Item Summary: Astronaut Richard H. Truly wore this in-flight suit as commander of the six-day STS-8 mission aboard Space Shuttle Challenger in August 1983. Except during launch and reentry, Shuttle astronauts wear ordinary clothing as they live and work inside the orbiter. NASA issues identical blue cotton-blend jackets, trousers, and shorts for their in-flight wardrobe. Crews of the earliest Shuttle missions wore standard dark-blue shirts with their own mission emblem sewn on the front; later crews wore shirts of various colors and designs.”

It is unclear whether Truly’s in-flight suit was ever on display at the museum. The museum website says, “Pants, In-Flight Suit, STS-8, Truly: Display Status: This object is not on display at the National Air and Space Museum. It is either on loan or in storage.”

Sending the commander’s in-flight suit to a museum seemed like a mundane closure for STS-8, the eighth space shuttle flight. As for Johnson, she took her STS-8 mission patch and put it in a jewelry box for safekeeping, where it remains today.

In January 1986, the space shuttle Challenger met its end on STS-51-L, the 10th flight of the Challenger and the 25th space shuttle mission. The orbiter exploded, killing all seven crew members on board, including teacher Christa McAuliffe. The explosion also destroyed a second TDRS satellite that was to be deployed to increase communications between the space shuttle and mission control. Johnson still had friends working at IBM in Houston, and she called one of them to mourn together.

Later, in 2003, the space shuttle Columbia broke apart during reentry on STS-107, killing all seven astronauts on board. In some respects, it seems that only luck had prevented the ultimate disasters from happening sooner, or more frequently.

From STS-1 in 1981 to STS-135 in 2011 — the 135th and final space shuttle mission — many other technologies came and went. From Johnson’s high school graduation in 1981 to her 30th reunion in 2011, an entire era had come and gone.

Johnson had thought the space shuttle program would last forever. She no more anticipated its demise than anyone would expect cars and airplanes to go out of use and become a facet of history. Yet that’s what happened. Even so, the space shuttle technologies helped lay the foundation for many of today’s cutting-edge devices.

Computers and software now run GNC on self-driving cars, 18-wheelers, drones, robots, airplanes and more. Computers have become critical for everyday life.

How reliable are the computers and software that run the world? How safe are banking, transportation, utilities and weapons? Do they have redundant systems and near “error-free” software like the space shuttle had? Are there teams of experts ready to handle any glitch?

While computers were not the cause of the two space shuttle disasters, they inherently posed risks and caused some scares.

All of humanity is now, like the space shuttle astronauts, vulnerable to computers, software and technology, hoping luck will help avert the ultimate disaster as humankind flies into the future.

— — — — — — — — — — -

Author’s Note: This story is based on the author’s personal experiences as a Communications Co-op in IBM Space Shuttle Programs. Quotes from transcripts of space shuttle missions and NASA press briefings retain their original grammar and punctuation. NASA has an impressive collection of historical items online including audio archives and news releases. The image used with this story is from NASA’s image gallery.