Don’t forget to protect your hardware from the power cosmic

CWE Program
Dec 9, 2021

Photo credit: https://owlcation.com/stem/What-Are-Cosmic-Rays-and-What-do-They-Reveal-About-the-Universe

If you are making hardware that will be relied on for calculations or that could affect lives or money, you need to make sure that it can recover from a single event upset (SEU). An SEU happens when a single bit is flipped in hardware after being struck by an ionizing particle, such as those from cosmic rays or other radiation sources.

To provide a clearer picture, a single bit flip in an industrial control system could cause serious harm, while a single bit flip in an authentication process could lead to elevation of privilege, giving an attacker or a process access that bypasses controls. The identification of these anomalies goes back to at least the 1970s, when alpha particles emitted by radioactively contaminated epoxy used to encapsulate static RAM chips caused random bit upsets.

The Real Trouble with SEUs

While not a critical system, an example from Super Mario 64 speedrunning (a competition to finish the game in the minimum amount of time) illustrates the strange effects an SEU can have. During a run, a player's character was suddenly moved up onto a higher platform, which saved time. When no one could replicate the glitch, research determined that a single bit flip was responsible and that an unlikely SEU had probably occurred.

For enterprise computing, SEUs can cause all kinds of issues. This 2010 blog from Oracle discusses a bit flip that occurred during the installation of a binary in Debian and caused a segmentation fault. While it can't be proved that radiation caused the upset, a bit was flipped. SEUs are also happening in supercomputers at Los Alamos National Lab. For an audio version of some of the cases involving impossible vote counts (beware of vote errors in powers of 2) and deaths due to a braking malfunction, go here.
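The warning about vote errors in powers of 2 follows from simple arithmetic: flipping bit k of a stored integer changes its value by exactly 2^k. The sketch below is only an illustration of that check. The function name and the vote totals are hypothetical and do not come from any of the cases mentioned above.

```python
from typing import Optional


def single_bit_difference(expected: int, observed: int) -> Optional[int]:
    """Return the index of the flipped bit if the two values differ
    in exactly one bit position, otherwise None."""
    diff = expected ^ observed          # bits that differ between the two values
    # A nonzero value with exactly one bit set satisfies diff & (diff - 1) == 0.
    if diff != 0 and diff & (diff - 1) == 0:
        return diff.bit_length() - 1    # position of the single set bit
    return None


# Hypothetical totals: the observed count is off by exactly 4096 (2**12).
expected_votes = 514
observed_votes = 4610

bit = single_bit_difference(expected_votes, observed_votes)
if bit is not None:
    print(f"Values differ only in bit {bit}, an error of {1 << bit}")
```

A match does not prove that an SEU occurred; it only shows that the discrepancy is consistent with one flipped bit, which is a reasonable prompt for a closer look.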

In the medical field, implantable hardware needs to be protected against SEUs. This paper discusses SEUs in implantable cardioverter defibrillators (ICDs) and the effect on ICDs of radiation from natural sources, packaging sources, or radiological therapeutic sources. The authors attempt to estimate the frequency of upsets caused by treatments that involve radiation.

There are even regulations for space and aerospace systems. In 2008, a flight from Singapore to Perth experienced a bit flip that dropped the plane 656 feet in 20 seconds and injured passengers. In high-altitude applications, these errors can occur frequently. NASA has a whole lab in Greenbelt, MD, to test for this. The LEON GR740, for example, is the latest European space-grade processor. The device is estimated to experience a staggering 9 SEUs a day in geostationary Earth orbit.

How to Prevent or Recover from SEUs

The first thing you can do is design chips to be radiation hardened by design (RHBD). This paper discusses methods for hardening inverter, NOR, and NAND gates. A second method is to use triple modular redundancy (TMR). This technique uses a voting scheme in which the same calculation is done by three separate logic circuits or computers, and the answer is whatever at least two of the three agree on. This can be extended to a Byzantine Fault Tolerance algorithm with more voters, which tolerates up to a bounded number of faulty voters.
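The two-out-of-three vote can be sketched in a few lines. This is a software illustration only, assuming three redundant copies of one word and a bitwise majority function; real TMR is built into the hardware logic, and the names majority_vote and flip_bit are made up for this example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value
    that at least two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Simulate an SEU by inverting a single bit of a word."""
    return word ^ (1 << bit)


correct = 0b10110010

# One of the three redundant copies is hit by an upset.
copies = (correct, flip_bit(correct, 5), correct)
voted = majority_vote(*copies)
print(voted == correct)   # the single upset is outvoted

# If the same bit is upset in two copies, the voter is outvoted instead.
bad = majority_vote(flip_bit(correct, 5), flip_bit(correct, 5), correct)
print(bad == correct)     # the wrong value wins
```

The second case shows the limit of the scheme: TMR masks one faulty copy per bit position, which is why designs that must survive more faults add more voters.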

The real questions are: what is the risk of failure, and is it worth the cost of protecting against this weakness? If your system is a simple toy, there is no need for protection. If your system is a medical device keeping a person alive, the protection might be worth almost any cost.

In the case of bit flipping in the Toyota Camry, experts determined that the inability to deal with a bit flip triggered liability on Toyota’s behalf for unintended acceleration. Make sure that your design, or the hardware that you are purchasing, has mitigations to prevent CWE-1261: Improper Handling of Single Event Upsets. Don’t be the person who brings liability to your company for hardware that won’t handle an SEU.

From a CAPEC perspective, these single errors have not been generated by attackers, and there are no known techniques for reliably generating such upsets, so there are no attack patterns defined. Adversaries are most likely not going to irradiate hardware intentionally to cause errors (although this would make a fun movie scene). Despite this, nature has a funny way of causing errors in strange places that cascade into serious errors with reliability implications. So, if you are making hardware, make sure that it meets the needs of its intended environment.

The official blog of the CWE Program. Articles are written by program staff and our community partners. https://cwe.mitre.org