Stopping cyberattacks in their tracks
As computers become more and more pervasive, it’s readily apparent how much software ecosystems affect our lives. Over the past few decades, we have seen our connected lives evolve from the snail’s pace of dial-up modems to high-speed, always-on-and-listening devices connected to the cloud. Yet nothing seems to come without a cost: In this always-on, real-time data world, computer scientists have been drawn into an arms race against hackers. And while we have made great strides in cybersecurity innovation, the hackers continue to adapt.
The hackers of yesteryear would attack targets by exploiting vulnerabilities on servers and computers. As computer systems were hardened and cybersecurity practices matured, hackers were forced to move to a different target: the software supply chain.
Similar to a physical supply chain — the network of connected organizations, resources and information that produce the everyday goods of life — a software supply chain comprises the interconnected pipelines that process and transfer resources to create software products. In a software supply chain, raw materials like source code and machine learning models are transferred between vendors, which alter, extend, compile and test them to develop the final software product — such as a smartphone application — that consumers can use.
Hackers have found that, while it is hard to hack an individual’s computer, they have more success poisoning software supply chains in order to introduce malicious software to consumers’ devices. This tactic has affected organizations of every scale: the BBC, through the compromise of “event-stream” (a tool that allows developers to manage data streams), and major sectors of the U.S. government, through the SolarWinds hack, considered the most sophisticated cyberattack in history. Such software supply chain hacks are so effective that they have been increasing, with an alarming rise of more than 500 percent in the last three years.
My goal, to put it bluntly, is to stop these attacks. My research group at Purdue uses techniques ranging from applied cryptography to computer systems engineering, data science, trusted hardware, and the enforcement of regulatory and legal compliance. We link these multidisciplinary approaches to build systems that track the highly interconnected software supply chain. We employ data science techniques to better understand the flow of software “artifacts” (byproducts of the software development process) and to build ways to enforce best practices and compliance, so that malevolent software is stopped before it enters supply chains and affects consumers. This is of particular importance to high-assurance ecosystems, such as the power grid or the International Space Station.
My work in the field is highly practical, and I can proudly say that you may be protected by it while you’re reading this blog post. I collaborate with such software vendors as Google and Microsoft, as well as open-source communities like the Linux Foundation and Reproducible Builds, to create verifiable software development practices that everybody can use.
One example is in-toto, the first software supply chain security framework, which I helped develop. in-toto is an easy-to-use, open-source tool that allows software consumers to have a transparent understanding of how software is produced across the multi-step software supply chain, to help ensure that attackers have not inserted malicious code or otherwise compromised the software.
You can think of in-toto as the label on a bottle of juice that allows you to see which, if any, ingredients you may be allergic to, or whether the ingredients match your dietary aims or restrictions. And just as an expiration date or an FDA seal of approval helps you determine whether to trust a brand of juice, in-toto helps you decide whether to feel confident in the trustworthiness and integrity of the software.
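To make the "label" idea concrete, here is a minimal sketch of the kind of check such a framework can perform. It is not in-toto's actual API or file format: the names `record_step` and `verify_chain` are hypothetical, and an HMAC with a per-step secret key stands in for the real digital signatures a production system would use. Each step records hashes of what it consumed and produced, and the consumer checks that every record is signed by the expected party and that no artifact changed between steps.

```python
import hashlib
import hmac
import json


def digest(data: bytes) -> str:
    """SHA-256 fingerprint of an artifact's contents."""
    return hashlib.sha256(data).hexdigest()


def _payload(link: dict) -> str:
    """Canonical text of the fields covered by the signature."""
    fields = {k: link[k] for k in ("step", "materials", "products")}
    return json.dumps(fields, sort_keys=True)


def sign(key: bytes, payload: str) -> str:
    # Assumption: HMAC stands in for a real public-key signature.
    return hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()


def record_step(name: str, key: bytes, materials: dict, products: dict) -> dict:
    """Hypothetical helper: record what a step consumed and produced.

    `materials` and `products` map file names to file contents (bytes).
    """
    link = {
        "step": name,
        "materials": {n: digest(d) for n, d in materials.items()},
        "products": {n: digest(d) for n, d in products.items()},
    }
    link["signature"] = sign(key, _payload(link))
    return link


def verify_chain(links: list, keys: dict) -> bool:
    """Check signatures and that each step consumed exactly what the
    previous step produced. `keys` maps step names to the key of the
    party authorized to perform that step."""
    previous_products = None
    for link in links:
        key = keys.get(link["step"])
        if key is None:
            return False  # step performed by an unknown party
        if not hmac.compare_digest(sign(key, _payload(link)), link["signature"]):
            return False  # record was altered or signed with the wrong key
        if previous_products is not None and link["materials"] != previous_products:
            return False  # artifact changed between two steps
        previous_products = link["products"]
    return True
```

In this sketch, a consumer would call `verify_chain` with the records shipped alongside the software and the keys of the parties they expect to have performed each step. If an attacker swaps the source code between the "write" and "package" steps, the hashes no longer line up and verification fails.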
Beyond in-toto, we are developing ways to collect information, build new data structures, and host internet infrastructure to understand the actors in the supply chain. One such effort is the sigstore project, a Linux Foundation project founded by Google, Red Hat and Purdue University.
sigstore hosts information about various software supply chain actions in what is called a “tamper-evident ledger”: a public log, similar to a blockchain, in which any attempt to alter or remove past entries can be detected. The log contains a secure list of transactions and records, so consumers and regulators can query information about the software being created to verify its origin and authenticity.
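The idea behind such a ledger can be sketched in a few lines. This is an illustration only, not sigstore's implementation: it uses a simple hash chain, in which every entry's hash covers the entry before it, whereas production transparency logs typically rely on Merkle trees for efficiency. The class `TamperEvidentLog` and its methods are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


class TamperEvidentLog:
    """Hypothetical append-only log built as a hash chain."""

    def __init__(self):
        self.entries = []

    def head(self) -> str:
        """Hash of the latest entry; auditors can remember this value."""
        return self.entries[-1]["hash"] if self.entries else GENESIS

    def append(self, record: dict) -> str:
        new_hash = entry_hash(self.head(), record)
        self.entries.append({"record": record, "hash": new_hash})
        return new_hash

    def verify(self, trusted_head=None) -> bool:
        """Recompute the chain; optionally compare against a head hash
        that an auditor saved earlier."""
        prev = GENESIS
        for entry in self.entries:
            if entry_hash(prev, entry["record"]) != entry["hash"]:
                return False  # an entry was modified after the fact
            prev = entry["hash"]
        return trusted_head is None or prev == trusted_head
```

Because each hash depends on everything that came before it, changing an old record breaks every later hash. An operator who rewrites the whole chain to hide the change still cannot match the head hash that an outside auditor recorded earlier, which is what makes the tampering evident.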
With sigstore, Purdue’s supply chain monitor provides ways to track and understand the evolution of the software ecosystem, and can generate recommendations about software choices — as well as warn individuals of possible supply chain malfeasance.
Santiago Torres Arias
Elmore Family School of Electrical and Computer Engineering
College of Engineering