LETS TALK ABOUT RISK BABY🎶🎶

Illustrating Engineering Ethics for Safety Critical Systems

Risk Introduced by Emerging Technologies

Nicolas Malloy
Jul 16, 2019 · 8 min read

In this article we consider the use of autonomy for a hypothetical system (the SMART Catapult). This example is very relevant to design decisions we often see in industry. The increasing pace of innovation presents challenges to safety critical system designs. As such, prior to implementation, it is necessary to carefully consider emerging technologies which offer great potential benefit, yet hold significant risk for injury to people, property and/or the environment. Ethically, safety should always take precedence. In the hypothetical scenario below we provide an opinion regarding the level of risk from autonomous systems technology that an organization should be willing to accept in order to derive its potential benefits.

Weighing the Risk of Autonomy in Safety Critical Systems

The ACME Corporation has directed its Integrated Research and Development group to begin developing autonomous systems. It has also asked the group to determine how much of the risk introduced by the technology the organization should be willing to accept in order to derive its potential benefits. It has been decided that automation will be considered for the SMART Catapult. The motivation for this move is to position ACME as an industry leader in the design and development of automated Catapults.

An autonomous system is a persistent, goal-oriented computer program that reacts to its environment and runs without continuous direct supervision to perform some function for an end user or another program. Such systems represent an evolutionary step beyond conventional computer programs. They can activate and run themselves without input from or interaction with a human user, and they can also initiate, oversee, and terminate other programs (Rouse, 2013).

Autonomous operation is an emerging technology that carries both benefit and risk. Before reaching an objective decision, it is necessary to understand the risks associated with autonomous systems. With that information, it can be determined whether the proposed capability is a plausible fit for the system and for ACME’s bottom line.

The SMART Catapult is responsible for providing the convoy with the ability to launch the SMART Rock. It has 8 launch baskets, which allow for the launch of up to 8 SMART Rocks. The Catapult currently launches SMART Rocks manually, using what is commonly referred to as a “man in the loop” operational approach. This approach requires one or more operators to be involved in every action necessary to carry out a rock launch.

Three manual operator actions are required to execute a rock launch: Standby, Launch, and Confirm Launch. Each action is performed through an individual hardware switch (push button). The switches control the Catapult and its onboard software system, which generates the respective launch signals. Three associated software functions are tasked with generating those signals, and each launch signal is routed to the SMART Rock over a wire interface.
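The three-step sequence can be pictured as a small state machine in which each switch press is only honored in the right order. The following is a minimal Python sketch, not the Catapult's actual design: the class, state, and method names are hypothetical, and rejecting an out-of-order press without changing state is an assumption made for illustration.

```python
from enum import Enum, auto


class LaunchState(Enum):
    IDLE = auto()
    STANDBY = auto()
    LAUNCH = auto()
    LAUNCHED = auto()


# Operator action -> (state required before the press, state after it).
TRANSITIONS = {
    "standby": (LaunchState.IDLE, LaunchState.STANDBY),
    "launch": (LaunchState.STANDBY, LaunchState.LAUNCH),
    "confirm_launch": (LaunchState.LAUNCH, LaunchState.LAUNCHED),
}


class ManualLaunchControl:
    """Hypothetical model of the three push buttons for one launch basket."""

    def __init__(self):
        self.state = LaunchState.IDLE
        self.signals_sent = []  # launch signals routed to the SMART Rock

    def press(self, action):
        """Handle one switch press; return True if a signal was generated."""
        required, next_state = TRANSITIONS[action]
        if self.state is not required:
            # Assumption: an out-of-order press is ignored and sends nothing.
            return False
        self.state = next_state
        self.signals_sent.append(action)
        return True
```

In this sketch, pressing Confirm Launch before Standby and Launch returns `False` and generates no signal, which is the interlock behavior the switches are meant to provide.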

The hardware switches are safety-significant interlocks that reduce the risk of the system inadvertently initiating a launch. An inadvertent launch is possible, but because of the coupling of software and hardware it is highly unlikely. For one to occur, there must be either a total hardware failure or a combination of hardware failures and software malfunctions.

The current manual launch implementation has a long-standing and strongly accepted safety program that repeatedly receives design, implementation and test concurrence from the Safety Review Board. This is because each of the safety-significant software functions has been developed to comply with the System Safety Book and the Software Safety Book. Both documents provide specific details regarding the design, implementation, test and management of safety-significant systems and software.

Safety-significant systems that are designed to operate via manual controls typically require less rigorous safety programs than those that incorporate semi- and fully autonomous functionality. The System Safety Book defines rigor as a specification of the depth and breadth of software analysis and verification activities necessary to provide a sufficient level of confidence that a safety-significant software function will perform as required (DoD, 2012).

The combination of hardware and software in the system design allows the safety team to more easily predict the failure modes of the system. This is because the reliability of hardware components can be measured and incorporated into fault trees that give designers a basis for determining the likelihood of an unsafe failure mode occurring. Unfortunately, the reliability of software cannot be predicted so easily, because software does what you tell it to do. In other words, it either works or it doesn’t.
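To illustrate how measured hardware reliability feeds a fault tree, here is a minimal Python sketch of the two basic gates. It assumes the input events are statistically independent, and the failure probabilities are invented purely for illustration; they are not ACME data. Only hardware events are modeled, in keeping with the point above that software does not lend itself to this kind of prediction.

```python
import math


def and_gate(probabilities):
    """Probability that all independent input events occur."""
    return math.prod(probabilities)


def or_gate(probabilities):
    """Probability that at least one independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical per-demand failure probabilities for the three switches.
SWITCH_FAILURE = {"standby": 1e-4, "launch": 1e-4, "confirm_launch": 1e-4}

# Total hardware failure: every switch fails (AND gate).
total_hardware_failure = and_gate(SWITCH_FAILURE.values())

# For comparison: at least one switch fails (OR gate).
any_switch_failure = or_gate(SWITCH_FAILURE.values())
```

With these made-up inputs, the AND gate yields a probability on the order of 1e-12 while the OR gate yields roughly 3e-4, which shows why requiring all three interlocks to fail makes a hardware-only inadvertent launch so unlikely.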

It is true that software can become corrupt, but this is a result of its hardware dependency. The likelihood of a bug being introduced into software can be predicted by observing trends in the software’s problem reporting, but that is a function of software quality. Therefore, those metrics should not be used as a variable for reliability. By this rationale, a system that removes safety-significant interlocks introduces unpredictable failure modes and decreases the system design team’s understanding of how safe the operation of the system is. This in turn should motivate high-rigor tasking.

The Benefits of Autonomy

Introducing autonomy to the Catapult would drastically reduce the number of operator actions required to launch 8 rocks. The most achievable and beneficial implementation of automation is an approach that closely resembles dominoes. In an 8-rock launch sequence, the Concept of Operations would require Rock 1 to be launched manually by an operator who selects Standby, Launch and Confirm Launch. Upon completion of the first rock launch, the Catapult software would automatically generate the aforementioned signals and transmit them to Rock 2. This process would then be repeated for each of the remaining rocks in the sequence.
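The domino approach can be sketched in a few lines of Python. This is an illustrative model only: the function and parameter names are hypothetical, and stopping the chain when a launch does not complete is an assumption drawn from the phrase "upon completion" rather than a stated requirement.

```python
SIGNALS = ("standby", "launch", "confirm_launch")


def domino_sequence(rock_count, launch_completed=lambda rock: True):
    """Return a log of (rock, signal, source) tuples for one launch sequence.

    Rock 1 is launched by the operator; each later rock is triggered by
    software only after the previous launch has completed.
    """
    log = []
    for rock in range(1, rock_count + 1):
        source = "operator" if rock == 1 else "software"
        for signal in SIGNALS:
            log.append((rock, signal, source))
        if not launch_completed(rock):
            # Assumption: an incomplete launch stops the chain.
            break
    return log
```

Calling `domino_sequence(8)` produces 24 signals in total, of which only the first 3 originate from the operator.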

Manual launch of 8 rocks requires a total of 24 operator actions. With automation, only the 3 actions for the first rock remain, a reduction of 87.5%. Several benefits follow, such as a reduction in human error and a decrease in hours of operation (which in theory extends the operable lifetime of each of the three switches). The greatest benefit of all is increased operator situational awareness. Scott Reichenbach, an assistant chief with the New Cumberland Federal Fire Department in Pennsylvania, believes that situational awareness is a key concept in response and in command and control in any domain where ever-increasing technological and situational complexity affects the human decision maker. Complete, accurate, and up-to-the-minute situational awareness is essential for responders and others who are responsible for controlling complex, dynamic systems and high-risk situations. Inadequate or completely absent situational awareness is cited as one of the primary factors in accidents attributed to human error (Reichenbach, 2009). The time saved by eliminating most of the manual launch actions can be reallocated to other concurrent tasking that may be time critical or of safety significance.
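The arithmetic behind the 87.5% figure is easy to check. A short Python sketch, using hypothetical names and assuming three operator actions per manually launched rock as described earlier:

```python
ACTIONS_PER_LAUNCH = 3  # Standby, Launch, Confirm Launch


def operator_actions(rock_count, automated):
    """Operator actions needed to launch rock_count rocks."""
    # With the domino approach only the first rock is launched by hand.
    launches_by_hand = 1 if automated else rock_count
    return ACTIONS_PER_LAUNCH * launches_by_hand


manual = operator_actions(8, automated=False)  # 24 actions
domino = operator_actions(8, automated=True)   # 3 actions
reduction = 1 - domino / manual                # 0.875, i.e. 87.5%
```

More generally, for n rocks the reduction is 1 - 1/n, so the saving grows with the length of the launch sequence.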

The Disadvantages of Autonomy

Clearly, measurable benefits are achievable if autonomy is integrated into the Catapult. By the same token, there are many concerns surrounding such systems and the dangers they pose to human and environmental safety. Dr. Julie A. Adams and Sanford T. Freedman point out that these systems will pose many safety issues.

One safety concern that ties back to a previously mentioned benefit is that autonomous systems are unable to accurately identify environmental elements and make appropriate decisions based on the situation. Humans have an innate ability to interpret their environment and determine the appropriate action to take to ensure mission success while minimizing the danger to friendly forces and innocent bystanders. Current autonomous systems do not have capabilities at the same level as their human counterparts, which raises concern about their ability to ensure safety and still be effective (Adams & Freedman, n.d.).

If an inadvertent launch were to occur, the mishap could be catastrophic. An example of this kind of event is a SMART Rock igniting in a basket. The result would be a total loss of the Catapult. Such an event is unacceptable, so if autonomous functionality were implemented in the Catapult, it would be imperative that all necessary engineering controls be in place to reduce the likelihood of a mishap to improbable.

This is possible through additional safety features that must be implemented in the system. The most notable is continuous system monitoring coupled with built-in redundancy. Continuous monitoring aims to give system operators real-time intelligence on the status and performance of their devices. Through data analytics, it provides real-time insight into even slight changes in operating conditions that have the potential to result in safety incidents. This differs from traditional operational programs, whose intermittent information and limited real-time insight often lead to late or missed alerts (Garg, 2015).
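As a rough illustration of continuous monitoring with redundancy, the Python sketch below compares readings from redundant channels on every sample and flags the first one that looks unsafe. Everything here is an assumption made for the example: the function names, the use of a median as the voted value, the limits, and the choice to treat lost redundancy or channel disagreement as a reason to override.

```python
from statistics import median


def check_reading(readings, low, high, max_spread):
    """Return 'ok' or 'override' for one sample from redundant channels."""
    if len(readings) < 2:
        return "override"  # Redundancy lost: fail safe.
    if max(readings) - min(readings) > max_spread:
        return "override"  # Channels disagree beyond the allowed spread.
    if not low <= median(readings) <= high:
        return "override"  # Voted value is outside the safe band.
    return "ok"


def monitor(samples, low, high, max_spread):
    """Scan samples in order; return the index of the first override, or None."""
    for index, readings in enumerate(samples):
        if check_reading(readings, low, high, max_spread) == "override":
            return index
    return None
```

Because every sample is evaluated as it arrives, an anomalous condition is caught at the first sample where it appears rather than at the next scheduled inspection, which is the advantage over intermittent checks described above.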

Final Thoughts

For system autonomy to be exercised in real-time safety-critical systems, the likelihood of a mishap must be assessed as improbable. No other level of risk can be tolerated. Software’s unpredictable failure modes make it difficult for designers to ensure that the system performs as intended at all times. Additionally, the removal of key safety-significant interlocks calls into question the value of making such a change. Lastly, if the designers choose to use continuous system monitoring coupled with built-in redundancy, the probability of an unsafe condition is significantly reduced. For this feature to realize its full benefit, however, it must be able to immediately initiate a system override upon detection of anomalous behavior or incorrect conditions.

References

Adams, J. A., & Freedman, S. T. (n.d.). Unmanned System Autonomy, Situation Awareness, and System Safety. Nashville, TN: Vanderbilt University.

DoD. (2012). System Safety. Washington, DC: Department of Defense (DoD).

Garg, D. A. (2015, October 2). How Continuous Monitoring Improves Safety. Retrieved from http://www.automationmag.com/opinion/machine-safety/5462-how-continuous-monitoring-improves-safety

Reichenbach, S. (2009, March 1). Situational Awareness: Key to Emergency Response. Retrieved from Fire Engineering: http://www.fireengineering.com/articles/print/volume-162/issue-3/features/situational-awareness-key-to-emergency-response.html

Rouse, M. (2013, January). Software Agent. Retrieved from What Is It: http://whatis.techtarget.com/definition/software-agent


Nicolas Malloy

AV System Safety Engineer | Passionate about Resilience Engineering and Data Science