Quantitative measures for Safety-Critical Autonomous Systems: A quick primer

Autonomous systems are increasingly prevalent across various sectors, including transportation, healthcare, and manufacturing, where they augment humans in performing dull, dirty and dangerous tasks. AI/Autonomy Engineer Perrie Lim summarizes her research in a quick primer on quantitative measures for evaluating the performance and effectiveness of such systems in safety-critical deployments such as search & rescue and humanitarian assistance/disaster relief (HADR).

Published in d*classified · 8 min read · Nov 1, 2023

Introduction — Learning to Test; Testing to Learn

Testing and evaluating autonomous systems comes with significant challenges. One major hurdle is the limited resources available for development and testing due to budget and schedule constraints. These systems require thorough testing and verification to ensure they are reliable and safe. However, these processes are resource-intensive, and when resources are scarce, the result can be selective testing and an inadequate understanding of the system’s Operational Design Domain (ODD).

Another challenge arises from deployment in complex environments, which may include terrain changes, unfavorable weather conditions, and even communications interference. While testing & evaluation of non-deterministic learning-based systems remains an open challenge, we outline several quantitative measures that could support system evaluation using technology and processes that are available today.

Eyes on the goal — towards assured autonomy

Mission Success Metrics

This category provides a high-level view of the system’s exploration and navigational capabilities, which are treated here as the elements contributing to mission success in a search & rescue or HADR mission.

Positional Accuracy: Refers to the disparity between the robot’s perceived pose output and its actual physical location based on ground truth. A minimal deviation indicates high accuracy, which is crucial for tasks demanding precise navigation or alignment, such as precision landing on a small launch and recovery site or delivery of aid packages to disaster survivors on a rooftop [5]–[13].
Reliability and Repeatability: Measures the robot’s consistency in task execution. It’s not just about doing a task right once but ensuring it’s done right every single time. Consistent performance, even in varying conditions, indicates a robust and reliable system [14], [15].
Quality of Information Gain: Assesses the system’s efficiency by considering the time or energy expended to capture a comprehensive dataset from its surroundings. It’s about maximizing information retrieval while minimizing resource consumption [14], [16]–[18].
Path Planning: Focuses on the robot’s decision-making. The goal is for the robot to determine the optimal route in real time, considering obstacles, mission objectives, and efficiency. It’s not merely about reaching the destination, but doing so in the most efficient manner, e.g. in the fewest time steps [5], [19], [20].

Robot path planning splines (purple branches) towards goal (purple sphere)

Exploration Efficiency: Evaluates how swiftly the robot can survey unfamiliar terrain and generate accurate maps. This metric places a premium on speed without compromising the quality of the environmental understanding [7], [19]–[23].
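As a minimal sketch, Positional Accuracy and Repeatability from the list above might be computed from logged runs as follows. The function names, the sample data, and the 0.15 m tolerance are illustrative assumptions, not taken from the cited works. Root-mean-square error is used here as one common way to summarise deviation from ground truth.

```python
import math
import statistics


def position_rmse(estimated, ground_truth):
    """Root-mean-square Euclidean error between estimated and true positions."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def repeatability(final_errors, tolerance):
    """Summarise repeated runs of the same task.

    final_errors: one end-of-run position error per repetition.
    tolerance:    largest error still counted as a success (assumed threshold).
    """
    if not final_errors:
        raise ValueError("at least one run is required")
    successes = sum(1 for err in final_errors if err <= tolerance)
    return {
        "success_rate": successes / len(final_errors),
        "mean_error": statistics.mean(final_errors),
        "error_std": statistics.pstdev(final_errors),  # spread across runs
    }


# Hypothetical data: 2D positions in metres, sampled at matching timestamps.
estimated = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
rmse = position_rmse(estimated, ground_truth)

# Hypothetical end-of-run errors from four repetitions of one landing task.
runs = repeatability([0.05, 0.10, 0.30, 0.08], tolerance=0.15)
```

A low `rmse` combined with a high `success_rate` and a small `error_std` would suggest a system that is both accurate and consistent. A large spread across runs flags a repeatability problem even when the average error looks acceptable.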

These metrics play a key role in comprehending how adeptly the system accomplishes its predefined mission objectives.
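The Path Planning and Exploration Efficiency metrics can be sketched in the same spirit. The ratio and rate below are simple illustrative definitions chosen for this example, not the formulations used in the cited works. The straight-line distance is only a true lower bound in obstacle-free space, so in cluttered environments a planner-computed optimal length would be a better reference.

```python
import math


def path_length(waypoints):
    """Total length of a piecewise-linear path."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def path_efficiency(waypoints):
    """Straight-line start-to-goal distance divided by the travelled length.

    1.0 means the robot drove directly to the goal; lower values mean detours.
    """
    travelled = path_length(waypoints)
    if travelled == 0:
        raise ValueError("path has zero length")
    return math.dist(waypoints[0], waypoints[-1]) / travelled


def exploration_rate(explored_cells, total_cells, elapsed_s):
    """Fraction of the map explored per second of mission time."""
    if total_cells <= 0 or elapsed_s <= 0:
        raise ValueError("total_cells and elapsed_s must be positive")
    return (explored_cells / total_cells) / elapsed_s


# Hypothetical run: an L-shaped path, and 600 of 1000 grid cells mapped in 120 s.
efficiency = path_efficiency([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)])
rate = exploration_rate(explored_cells=600, total_cells=1000, elapsed_s=120.0)
```

Here the robot travels 7 m to cover a 5 m displacement, so `efficiency` is 5/7, and `rate` is 0.5% of the map per second.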

Robustness Metrics

Robustness metrics are instrumental in assessing the system’s resilience and adaptability in dynamic and unpredictable environments [4], [15], [24]. Robustness, understood as the robot’s adaptability in diverse and changing conditions, can be assessed through:

Perturbation Tests: Evaluations of positional and detection accuracy under scenarios such as dynamic obstacles, scenery alterations, navigation in feature-scarce terrain, and varying or deceptive sensor inputs [25], [26].

Time-synchronization and sensor/signal-related metrics, such as:

Anticipatory actions, such as Time Delay and Trajectory adjustments [15], [19], [27].
The congruency of real-time and pre-set maps (Map Congruence) [26].
Spurious detections and signals, for example landmarks falsely identified by sensors [8].
The system’s ability to reorient itself (Relocalisation Capability) [5], [26].
The accuracy of data interpretation, captured in metrics such as the absolute value, mean, and variance of the Estimation Error [2], [8], [28].

Impact Assessment: Observes the mission completion rate, and the variance in completion times, in perturbed versus unperturbed states.
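Two of the items above lend themselves to a short sketch: the Estimation Error statistics and the Impact Assessment comparison between perturbed and unperturbed runs. The record layout (a completed flag plus a completion time) and all sample values are assumptions made for illustration.

```python
import statistics


def estimation_error_stats(estimates, truths):
    """Largest absolute error, mean absolute error, and variance of the error."""
    errors = [abs(e - t) for e, t in zip(estimates, truths)]
    if not errors:
        raise ValueError("no samples")
    return {
        "max_abs": max(errors),
        "mean": statistics.mean(errors),
        "variance": statistics.pvariance(errors),
    }


def summarise_runs(runs):
    """runs: list of (completed, completion_time_s) tuples."""
    times = [t for done, t in runs if done]  # only completed missions are timed
    return {
        "completion_rate": len(times) / len(runs),
        "mean_time": statistics.mean(times) if times else None,
        "time_variance": statistics.pvariance(times) if times else None,
    }


def impact_assessment(baseline_runs, perturbed_runs):
    """Compare mission outcomes with and without perturbations."""
    base = summarise_runs(baseline_runs)
    pert = summarise_runs(perturbed_runs)
    return {
        "baseline": base,
        "perturbed": pert,
        "completion_rate_drop": base["completion_rate"] - pert["completion_rate"],
    }


# Hypothetical landmark range estimates versus ground truth, in metres.
err = estimation_error_stats([1.0, 2.5, 2.0], [1.0, 2.0, 3.0])

# Hypothetical missions: four nominal runs, four with injected sensor noise.
baseline = [(True, 100.0), (True, 110.0), (True, 90.0), (True, 100.0)]
perturbed = [(True, 120.0), (False, 0.0), (True, 160.0), (True, 140.0)]
impact = impact_assessment(baseline, perturbed)
```

In this made-up example the perturbation costs a quarter of the missions and raises both the mean and the variance of the completion time, which is the kind of degradation the Impact Assessment is meant to expose.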

Example of Quantified Robustness: Scoring based on combined key metric indicators [26]

The assessment of robustness is a critical component for gauging how well the system executes its mission in complex scenarios.
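One way such a combined score could be formed is a weighted average of normalised indicators. The sketch below is purely hypothetical: the indicator names, the weights, and the averaging scheme are assumptions for illustration and do not reproduce the scoring used in [26].

```python
def robustness_score(indicators, weights):
    """Weighted average of indicators already normalised to [0, 1] (1 = best).

    indicators: mapping of indicator name -> normalised value.
    weights:    mapping of the same names -> non-negative importance.
    """
    if set(indicators) != set(weights):
        raise ValueError("indicators and weights must use the same names")
    for name, value in indicators.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} is not normalised to [0, 1]")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(indicators[name] * weights[name] for name in indicators)
    return weighted / total_weight


# Hypothetical indicator values and weights chosen only for this example.
score = robustness_score(
    indicators={
        "map_congruence": 0.90,
        "relocalisation_success": 0.80,
        "completion_rate": 0.75,
    },
    weights={
        "map_congruence": 1.0,
        "relocalisation_success": 1.0,
        "completion_rate": 2.0,
    },
)
```

A single number like `score` is convenient for comparing configurations, but it hides which indicator degraded, so the per-indicator values are worth reporting alongside it.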

System Resource Expenditure and Constraints

Efficient resource management is fundamental to achieving peak performance [4]. This category scrutinizes metrics such as processor and memory usage, system throughput, real-time jitter, latency, and round trip time, ensuring resources are allocated effectively.

Processor and Memory Usage: Focuses on Computational Time, the speed at which data is processed and transformed into actionable insights. It also evaluates how efficiently the system gathers valuable data within constrained time or energy budgets [7], [16], [25], [29], [30].
System Throughput: A critical metric for understanding performance, especially when determining the appropriate level of control decentralization. Simulation modeling remains a preferred tool for these evaluations [29].
Real Time Jitter: Provides insight into asymmetries between the sending and receiving frequencies of data packets, thereby gauging how well balanced the real-time capabilities of the system components are [31].
Latency: Measures the interval between message generation in one component and its subsequent reception in another, revealing system responsiveness [31].
Round Trip Time: Represents the duration from dispatching a message to receiving acknowledgment, highlighting system efficiency and feedback loop speeds [31].
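The three timing metrics above can be derived from message timestamps. The following is a minimal sketch, assuming each message is logged with a send time, a receive time, and an acknowledgment time in milliseconds. One-way latency is only meaningful if sender and receiver clocks are synchronised. The jitter definition used here (mean mismatch between send intervals and receive intervals) is one illustrative reading of the description above, not the formula from [31].

```python
import statistics


def latencies(send_times, recv_times):
    """One-way delay per message (requires synchronised clocks)."""
    return [r - s for s, r in zip(send_times, recv_times)]


def interval_jitter(send_times, recv_times):
    """Mean absolute mismatch between send intervals and receive intervals."""
    send_gaps = [b - a for a, b in zip(send_times, send_times[1:])]
    recv_gaps = [b - a for a, b in zip(recv_times, recv_times[1:])]
    if not send_gaps:
        raise ValueError("at least two messages are required")
    return statistics.mean([abs(r - s) for s, r in zip(send_gaps, recv_gaps)])


def round_trip_times(send_times, ack_times):
    """Delay from dispatching each message to receiving its acknowledgment."""
    return [a - s for s, a in zip(send_times, ack_times)]


# Hypothetical log of four messages published at 10 Hz (timestamps in ms).
send = [0, 100, 200, 300]
recv = [5, 108, 204, 309]
ack = [11, 115, 210, 318]

mean_latency = statistics.mean(latencies(send, recv))
jitter = interval_jitter(send, recv)
mean_rtt = statistics.mean(round_trip_times(send, ack))
```

Round trip time needs only the sender's clock, which makes it the easiest of the three to measure reliably on distributed hardware.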

Selecting the right metrics goes beyond system-level metrics and ventures into a more detailed examination of individual components. It is a critical decision that serves a dual purpose: optimizing system performance and assessing component-level proficiency. For example, component-level metrics such as ‘Curvature Change’, ‘Average Distance to Obstacles’, and ‘Exploration Rate’ are essential in system-level domains like Path Planning and Exploration Efficiency. Measures such as ‘Throughput Time’ and ‘Makespan’ help quantify these, ensuring a thorough evaluation of the system and its parts.
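To make two of the component-level metrics concrete, here is a hedged sketch. The article does not define ‘Curvature Change’ or ‘Average Distance to Obstacles’ formally, so the definitions below (summed absolute heading change along the path, and mean distance from each waypoint to its nearest point obstacle) are assumptions chosen for illustration.

```python
import math


def total_heading_change(path):
    """Sum of absolute heading changes (radians) between consecutive segments."""
    headings = [
        math.atan2(b[1] - a[1], b[0] - a[0]) for a, b in zip(path, path[1:])
    ]
    total = 0.0
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the difference into [-pi, pi) so a turn is never over-counted.
        delta = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        total += abs(delta)
    return total


def average_obstacle_distance(path, obstacles):
    """Mean distance from each waypoint to its nearest point obstacle."""
    if not path or not obstacles:
        raise ValueError("path and obstacles must be non-empty")
    nearest = [min(math.dist(p, o) for o in obstacles) for p in path]
    return sum(nearest) / len(nearest)


# Hypothetical staircase path and two point obstacles (metres).
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
obstacles = [(0.0, 2.0), (3.0, 0.0)]

turning = total_heading_change(path)
clearance = average_obstacle_distance(path, obstacles)
```

The example path makes two 90-degree turns, so `turning` equals pi radians. A smoother planner would score lower on this measure, while a more cautious one would score higher on `clearance`.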

In summary, understanding and using these performance metrics is key to refining autonomous exploratory systems. For both newcomers and experts, these metrics serve as benchmarks in a field where measuring how well a robot system aligns with its specifications is challenging. Despite the various performance indices proposed over time, defining system limits remains intricate, which underlines the need for careful metric application and comprehension.

References

[1] T. W. Fong, J. D. Frank, J. M. Badger, I. A. Nesnas, and M. S. Feary, “Autonomous systems taxonomy,” Tech. rep., NASA, Washington, DC, 2018.

[2] J. S. Albus, “Metrics and performance measures for intelligent unmanned ground vehicles,” in Proceedings of the Performance Metrics for Intelligent Systems Workshop, 2002.

[3] P. H. Chang, “A dexterity measure for the kinematic control of robot manipulator with redundancy,” MIT, p. 52, 1988.

[4] C. I. Schlenoff and Z. Kootbally, “Standards and performance metrics for on-road automated vehicles workshop | NIST,” 2023.

[5] “Performance Evaluation for Autonomous Mobile Robots,” in Proceedings of the 5th International Conference on Agents and Artificial Intelligence, 2013.

[6] C. Harris, R. Evans, and E. Tidey, “Assessment of a visually guided autonomous exploration robot,” in Unmanned/Unattended Sensors and Sensor Networks V, 2008.

[7] F. Chen, J. D. Martin, Y. Huang, J. Wang, and B. Englot, “Autonomous exploration under uncertainty via deep reinforcement learning on graphs,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.

[8] F. Lotfi and H. D. Taghirad, “A framework for 3D tracking of frontal dynamic objects in autonomous cars,” Expert Syst. Appl., vol. 183, no. 115343, p. 115343, 2021.

[9] R. Kümmerle, B. Steder, C. Dornhege, M. Ruhnke, G. Grisetti, C. Stachniss, and A. Kleiner, “On measuring the accuracy of slam algorithms,” Autonomous Robots, vol. 27, no. 4, p. 387, 2009.

[10] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, “A benchmark for the evaluation of rgb-d slam systems,” in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pp. 573–580, IEEE, 2012.

[11] A. Pålsson and M. Smedberg, “Investigating simultaneous localization and mapping for agv systems,” Master’s thesis, Chalmers University of Technology, 2017.

[12] H. Strasdat, J. Montiel, and A. J. Davison, “Real-time monocular slam: Why filter?,” in Robotics and Automation (ICRA), 2010 IEEE International Conference on, pp. 2657–2664, IEEE, 2010.

[13] D. Schroeter and P. Newman, “On the robustness of visual homing under landmark uncertainty,” in Intelligent Autonomous Systems, vol. 10, pp. 278–287, 2008.

[14] C. Nicholas, C. Fung, J. M. Nieto-Granda, and J. G. Gregory, “Autonomous exploration using an information gain metric.”

[15] A. Gerstenberg and M. Steinert, “Evaluating and optimizing chaotically behaving mobile robots with a deterministic simulation,” Procedia CIRP, vol. 84, pp. 219–224, 2019.

[16] D. Calisi, A. Farinelli, L. Iocchi, and D. Nardi, “Multi‐objective exploration and search for autonomous rescue robots,” J. Field Robot., vol. 24, no. 8–9, pp. 763–777, 2007.

[17] Y. Sun and C. Zhang, “Efficient and safe robotic autonomous environment exploration using integrated frontier detection and multiple path evaluation,” Remote Sens. (Basel), vol. 13, no. 23, p. 4881, 2021.

[18] S. Bai, F. Chen, and B. Englot, “Toward autonomous mapping and exploration for mobile robots through deep supervised learning,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.

[19] D. Calisi and D. Nardi, “Performance evaluation of pure-motion tasks for mobile robots with respect to world models,” Auton. Robots, vol. 27, no. 4, pp. 465–481, 2009.

[20] N. S. Michaluk, “Design methods for cost-effective teams of mobile robots in uncertain terrain.”

[21] R. H. Kabir and K. Lee, “Efficient multi-robot exploration with energy constraint based on Optimal Transport theory,” arXiv [eess.SY], 2020.

[22] H. Carrillo, P. Dames, V. Kumar, and J. A. Castellanos, “Autonomous robotic exploration using a utility function based on Rényi’s general theory of entropy,” Auton. Robots, vol. 42, no. 2, pp. 235–256, 2018.

[23] H. Ardiny, S. Witwicki, and F. Mondada, “Autonomous exploration for radioactive hotspots localization taking account of sensor limitations,” Sensors (Basel), vol. 19, no. 2, p. 292, 2019.

[24] A. Lampe and R. Chatila, “Performance measure for the evaluation of mobile robot autonomy,” in Proceedings 2006 IEEE International Conference on Robotics and Automation (ICRA 2006), 2006.

[25] “How to evaluate SLAM performance (for autonomous mobile robot applications),” Kudan global, 30-May-2022. [Online]. Available: https://www.kudan.io/blog/evaluate-slam-performance-for-autonomous-mobile-robots/.

[26] P. Li, C.-Y. Yang, R. Wang, and S. Wang, “A high-efficiency, information-based exploration path planning method for active simultaneous localization and mapping,” Int. J. Adv. Robot. Syst., vol. 17, no. 1, p. 172988142090320, 2020.

[27] F. Niroui, “Developing an intelligent robot control architecture for autonomous exploration in urban search and rescue,” 2018.

[28] E. Olson, J. Strom, R. Goeddel, R. Morton, P. Ranganathan, and A. Richardson, “Exploration and mapping with autonomous robot teams,” Commun. ACM, vol. 56, no. 3, pp. 62–70, 2013.

[29] G. Fragapane, R. de Koster, F. Sgarbossa, and J. O. Strandhagen, “Planning and control of autonomous mobile robots for intralogistics: Literature review and research agenda,” Eur. J. Oper. Res., vol. 294, no. 2, pp. 405–426, 2021.

[30] A. Dai, S. Papatheodorou, N. Funk, D. Tzoumanikas, and S. Leutenegger, “Fast frontier-based information-driven autonomous exploration with an MAV,” arXiv [cs.RO], 2020.

[31] M. Testouri, G. Elghazaly, and R. Frank, “FastCycle: A message sharing framework for modular automated Driving Systems,” arXiv [cs.RO], 2022.


Written by d*classified