The Space Shuttle Columbia disintegrating over Texas on February 1, 2003. CREDIT: AP Photo/Scott Lieberman

The Story of a Standard

Engineers have high standards. We also USE a lot of standards. By “standard” I’m talking about documents created by various technical bodies that spell out best practices and help to “standardize” how we perform our work. Standards come about for many reasons. They could be created due to a general consensus within an engineering community that one is needed to ensure basic design requirements are met. Some standards, however, are born as a response to tragedy. This is the story of an engineering standard created in the wake of one of the greatest disasters in the history of human space flight.

The standard I’m talking about is the NASA Standard for Models and Simulations (affectionately known as NASA-STD-7009). The disaster was the disintegration of the Space Shuttle Columbia (STS-107) during reentry on February 1, 2003. How are the two connected? To understand that, let’s look back at that mission.


The Last Flight of Columbia

STS-107 launched on the morning of January 16, 2003 at 10:39 am on a 16-day mission dedicated to physics, life, and space science experiments. The seven astronauts, including the first Israeli astronaut, would conduct approximately 80 experiments working in shifts, 24 hours a day. It was one of the few shuttle missions of that time dedicated to pure science rather than construction of the International Space Station.

STS-107 Mission Patch. Credit: NASA

About 81 seconds after launch, one large and at least two smaller pieces of foam insulation separated from the shuttle’s external tank and struck part of the shielding on Columbia’s wing designed to protect the shuttle from the intense heat of reentry. In spite of that strike (which was not discovered until a couple of hours after launch), Columbia reached orbit and the crew began their marathon of science.

When engineers reviewed images from launch (a standard practice after all launches) and noted the insulation striking the shuttle, NASA began to assess whether or not it had damaged the spacecraft and therefore put the crew at risk when Columbia returned to Earth. The main threat was that the heat protection had been compromised, which could result in the vehicle burning up during reentry. A Debris Assessment Team was formed to review the data and make an assessment of the risk posed by the foam strike. The team requested imaging of the shuttle in space and performed analyses to determine the possible extent of the damage caused by the foam.

Image of insulating foam debris from STS-107 launch. Credit: NASA

The Debris Assessment Team was unable to obtain on-orbit images of Columbia. Based on their analyses, past experience, and engineering judgement, the team members concluded that the foam may have caused localized damage to the thermal protection system, but not enough to be a threat. Those results were presented to the Mission Management Team, who agreed that the debris strike didn’t pose a significant risk to the safe return of Columbia.

Meanwhile in space, the STS-107 astronauts on Columbia continued their busy science schedule. The rest of the mission progressed flawlessly. Early on February 1, after almost 16 days in space, the crew of Columbia strapped themselves in and prepared to return to Earth.

To the Columbia crew and Mission Control, the re-entry into the Earth’s atmosphere began with no issues. By 8:53 am, Columbia passed over the California coast, traveling at Mach 23 and an altitude of 231,600 feet. At 8:54 am, Mission Control began to see some error reports from flight sensors, and at 8:58 am, the shuttle began to shed thermal protection tiles. At 8:59:32 am, Mission Control lost contact with Columbia, and videos made by observers showed Columbia disintegrating.

Emergency response crews were activated shortly afterward. When it became apparent that there were no survivors, teams began working to recover the shuttle debris and begin the long process of piecing together the exact cause of the accident. Ultimately they would recover a little over 1/3 of the lost shuttle.

Recovered pieces of the space shuttle Columbia. Image released May 15, 2003. Credit: NASA

The Investigation

Within hours of the accident, the NASA Administrator called for the formation of an independent team, the Columbia Accident Investigation Board (CAIB), to investigate the loss of the shuttle and seven crew members. The team spent almost seven months investigating the accident before issuing the final report. In that report, the board described the vast team that contributed to the investigation:

A staff of more than 120, along with some 400 NASA engineers, supported the Boardʼs 13 members. Investigators examined more than 30,000 documents, conducted more than 200 formal interviews, heard testimony from dozens of expert witnesses, and reviewed more than 3,000 inputs from the general public. In addition, more than 25,000 searchers combed vast stretches of the Western United States to retrieve the spacecraftʼs debris.

Columbia Accident Investigation Board Members — A portrait of the members of the CAIB. Standing (from left): Dr. Douglas D. Osheroff, Major General John L. Barry, Rear Admiral Stephen A. Turcotte, Brigadier General Duane W. Deal, Major General Kenneth W. Hess, and Roger E. Tetrault. Seated (from left): G. Scott Hubbard, Dr. James N. Hallock, Dr. Sally K. Ride, Admiral Harold W. Gehman, Jr., Steven B. Wallace, Dr. John M. Logsdon, and Dr. Sheila E. Widnall. (CAIB)

The board also commissioned testing and analysis to reproduce the effects of the foam strike to better understand and predict the damage to the shuttle wing. In the end the CAIB gave the following summary of the cause of the loss of Columbia:

The physical cause of the loss of Columbia and its crew was a breach in the Thermal Protection System on the leading edge of the left wing, caused by a piece of insulating foam which separated from the left bipod ramp section of the External Tank at 81.7 seconds after launch, and struck the wing in the vicinity of the lower half of Reinforced Carbon-Carbon panel number 8. During re-entry this breach in the Thermal Protection System allowed superheated air to penetrate through the leading edge insulation and progressively melt the aluminum structure of the left wing, resulting in a weakening of the structure until increasing aerodynamic forces caused loss of control, failure of the wing, and breakup of the Orbiter. This breakup occurred in a flight regime in which, given the current design of the Orbiter, there was no possibility for the crew to survive.

Before and After images of foam impact testing on Space Shuttle wing leading edge taken from the space shuttle Enterprise. Credit: CAIB Photo by Rick Stiles 2003

But finding the physics of the Columbia tragedy was only part of their work. There were many factors that led to the environment that facilitated the accident and many other factors that contributed to underestimating the risk of the foam strike. In the end, the CAIB report gave 29 recommendations based on over 100 findings to return the space shuttle to flight and get NASA back on course.

How does this relate to the need for a standard for models and simulations? The seeds of that effort came from findings based on the following incident.

The Wrong Tool for the Job?

When engineers learned about the foam strike, they requested an analysis from the Debris Assessment Team to determine the likelihood of damage to the shuttle. Generally, to perform those kinds of analyses, engineers used a tool called Crater, which was designed to look at damage caused to thermal protection systems from ice, foam, and metallic debris. The software had been calibrated for objects of about three cubic inches hitting thermal protection tiles.

The foam piece that hit Columbia was about 400 times that size.
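
To put that gap in perspective, here is a minimal Python sketch of the arithmetic. It is only an illustration built from the two figures quoted above (about three cubic inches, and about 400 times that size); the names are my own and have nothing to do with the actual Crater software.

```python
# Illustrative only: these numbers come from the article's description,
# not from the actual Crater software or its calibration data.
CALIBRATED_MAX_VOLUME_IN3 = 3.0   # "about three cubic inches"
FOAM_VOLUME_IN3 = 3.0 * 400       # "about 400 times that size"


def extrapolation_factor(value: float, calibrated_max: float) -> float:
    """Return how many times larger `value` is than the calibrated maximum."""
    if calibrated_max <= 0:
        raise ValueError("calibrated_max must be positive")
    return value / calibrated_max


factor = extrapolation_factor(FOAM_VOLUME_IN3, CALIBRATED_MAX_VOLUME_IN3)
print(f"Input is {factor:.0f}x the largest calibrated case")
```

A factor anywhere near 1 means the tool is being used roughly where it was calibrated. A factor in the hundreds means the result is an extrapolation, and should be presented as one.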

Figure from the CAIB report showing the extrapolation used in the foam strike analysis. Credit NASA

In addition, the individual running the software, though trained to use it, had limited experience with the tool and limited access to expert support. Initial results showed that the foam strike could fully penetrate the tile, but other engineers, who knew the software to be conservative, disputed that finding, reasoning that the denser layers of the thermal tiles, which were not represented well in Crater, would stop penetration.

There were other uncertainties involved in the analyses including exactly where the foam struck, its speed, and the angle of that strike. It was also possible that the foam struck the Reinforced Carbon/Carbon (RCC) heat shielding on the leading edge of the shuttle wing. This was a particularly sensitive section of the wing that saw the highest heating during reentry. Engineers used another tool designed to predict damage to RCC due to impact with ice. Since the foam was less dense than ice, engineers had to extrapolate data to make the tool fit the analysis needs.

Space Shuttle Heat Protection System — Credit: NASA

The results of the Debris Assessment Team analyses and data were presented to the manager in charge of assessing shuttle technical issues during flight (the Mission Evaluation Room manager). Ultimately the team decided that the foam strike, while it may have damaged thermal protection tiles, would not pose a threat to the mission. The Mission Evaluation Room manager took that recommendation to the Mission Management Team. After a verbal summary of the analyses performed and the conclusions reached, the Mission Management Team accepted the conclusions of the Debris Assessment Team without further technical questions or requests to review the analyses.

The CAIB report noted several concerns about that decision process, including that:

  • Engineers used modelling tools well outside the ranges for which they were correlated
  • The engineer using the tool had limited experience
  • Engineers ultimately relied more on judgement and past experience than validated analysis tools
  • The full analysis results, data, and associated uncertainties were not presented to the officials who made the ultimate decision on the safety of the shuttle and crew.

Beyond CAIB

The recommendations and findings from the CAIB were specifically intended to address issues surrounding the space shuttle program. However, NASA wanted to know if there were items from that report that could more broadly apply to other NASA programs. For this task, the administrator brought together a team of senior NASA leaders to look at all the CAIB recommendations, observations, and findings and see which ones could benefit all of the space agency. That group became known as the Diaz Team after the team lead, Al Diaz. Their report, “A Renewed Commitment to Excellence,” identified several opportunities for improving NASA. Based on the CAIB findings with respect to the Debris Assessment Team, the Diaz Team recommended that NASA “Develop a standard for the development, documentation, and operation of models and simulations.”

Cover of the “Diaz Report” — Credit: NASA

Specifically it called for the new Standard to address the following areas:

  1. Identify best practices to ensure that knowledge of operations is captured in the user interfaces (e.g. users are not able to enter parameters that are out of bounds).
  2. Develop process for tool verification and validation, certification, reverification, revalidation, and recertification based on operational data and trending.
  3. Develop standard for documentation, configuration management, and quality assurance.
  4. Identify any training or certification requirements to ensure proper operational capabilities.
  5. Provide a plan for tool management, maintenance, and obsolescence consistent with modeling/simulation environments and the aging or changing of the modeled platform or system.
  6. Develop a process for user feedback when results appear unrealistic or defy explanation.
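
As a rough illustration of the first item (and a little of the sixth), here is a minimal Python sketch of a tool that refuses inputs outside the range it was validated for and tells the user why. Everything in it is hypothetical: the class and function names are mine, the range is loosely based on the three-cubic-inch figure mentioned earlier, and none of it is taken from NASA-STD-7009 or any NASA tool.

```python
from dataclasses import dataclass


class OutOfBoundsError(ValueError):
    """Raised when an input falls outside the range a model was validated for."""


@dataclass(frozen=True)
class ValidatedRange:
    """A named input range, with units, that a model has been validated over."""
    name: str
    low: float
    high: float
    units: str

    def check(self, value: float) -> float:
        """Return value unchanged if it is inside the range, otherwise raise."""
        if not (self.low <= value <= self.high):
            raise OutOfBoundsError(
                f"{self.name}={value} {self.units} is outside the validated "
                f"range [{self.low}, {self.high}] {self.units}"
            )
        return value


# Hypothetical range, loosely based on the three-cubic-inch figure.
DEBRIS_VOLUME = ValidatedRange("debris_volume", 0.0, 3.0, "in^3")


def run_damage_model(debris_volume_in3: float) -> str:
    """Stand-in for a real analysis; only the input check is the point here."""
    DEBRIS_VOLUME.check(debris_volume_in3)
    return "inputs within validated range; analysis may proceed"
```

Calling run_damage_model(2.0) goes through, while run_damage_model(1200.0) raises an OutOfBoundsError whose message names the parameter and the validated range. Whether a real tool should stop outright or merely warn is a design choice; the point is that the user cannot step outside the calibrated range without being told.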

The Diaz Report was released at the end of January 2004. In September 2006, the NASA Chief Engineer, Chris Scolese, released a memo further directing the Modelling and Simulation standard to “include a standard method to assess the credibility of the M&S presented to the decision maker when making critical decisions (i.e., decisions that affect human safety or mission success) using results from M&S.”

Given those directions, a team of experts from across NASA was assembled to draft the Standard based on the guidance given by the CAIB, Diaz Team, and the NASA Chief Engineer. In July 2008, after extensive agency review, NASA-STD-7009 was released into the wild for use by NASA and any other organizations that wished to adopt it.

It would be hard to call any engineering standard compelling reading. And, unless you’re Mr. Scott from Star Trek, engineering standards are probably not your preferred form of literature. Still, the standard does provide very good guidance for those who develop, use, and review models and simulations. It establishes requirements and best practices for how to create, document, store, and review models and simulations and how to understand their credibility, and it covers a broad variety of them, not just the kind that were used by the Debris Assessment Team. So, if you fall into one of those categories, you might want to add NASA-STD-7009 to your summer reading list.

Foresight and Hindsight

Engineers try to foresee all eventualities that might pose risks to the missions they support, the products they produce, and the people who use those products. But engineers also know that it’s impossible to foresee everything and to try to do so would mean they would never be able to produce anything! Standards are one way that engineers try to minimize risks associated with those possibilities. When bad things do happen, either due to lack of foresight or lack of insight, engineers modify existing standards or create new ones to help ensure that they won’t make the same mistakes again. Making a mistake may be tragic at times, but to fail to learn from that mistake would be even more so.

The Crew of STS-107: Rear (L-R): David Brown, Laurel Clark, Michael Anderson, Ilan Ramon;
Front (L-R): Rick Husband, Kalpana Chawla, William McCool. Photo Credit: NASA
