Leaked Keys and Secrets: A Pressing Challenge In Application Security
A recent paper about leaked secrets on GitHub, presented by researchers at North Carolina State University in late February, has mostly flown under the radar. But this study shouldn’t be ignored. It applied the most rigorous methodology to date in scrutinizing leaked secrets in public code repositories, unveiling rich data about the scope of this problem.
The findings?
- Every single day, thousands of keys and secrets are leaked on GitHub alone.
- Over 100,000 repositories are affected.
- More than 80% of leaked secrets are not removed and stay visible for two weeks or longer.
Today’s post will take a deeper look at this study, what it tells us about the current state of application security, and how moving target defense (MTD) can help developers and their teams be part of the solution.
What Sets This Study Apart
That data security can be compromised on platforms like GitHub comes as no great shock. GitHub has long been aware of this problem and has taken steps to try to prevent passwords, authorization tokens, API keys, and other secrets from winding up in committed code.
Where this study broke new ground, then, was in applying a thorough and systematic process for investigating the extent of leaks on GitHub. Its robust methodology included:
- Broader scanning: the authors queried code using both GitHub’s Search API and a snapshot of GitHub repositories available in Google’s BigQuery. Overall, they scanned billions of files for the presence of potentially sensitive information.
- Nuanced search: the researchers developed a list of distinct signatures — 11 for platforms and 15 for API services, many among the most popular and highest-risk if compromised — to hunt for leaked secrets.
- Refined analysis: files that potentially held secrets were subjected to multiple filters to eliminate false positives and to only identify code likely to truly reflect a disclosed key or secret.
- Tracked over time: files from the GitHub Search API were gathered over a 6-month timeframe, providing both a meaningful sample and an ability to monitor whether secrets were removed after being leaked.
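The signature-plus-filter idea in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual pipeline: it uses one signature (the publicly documented shape of an AWS access key ID) and one filter (a Shannon entropy check). The threshold and the function names `shannon_entropy` and `find_candidate_keys` are assumptions made for this example.

```python
import math
import re
from collections import Counter

# One example signature: AWS access key IDs are "AKIA" followed by
# 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")

# Illustrative threshold chosen for this sketch, not a value from the study.
MIN_ENTROPY_BITS = 3.0


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidate_keys(source: str) -> list[str]:
    """Return signature matches that also look random enough to be real keys."""
    candidates = AWS_KEY_ID.findall(source)
    # Skip the fixed "AKIA" prefix so only the variable part is measured.
    return [c for c in candidates if shannon_entropy(c[4:]) >= MIN_ENTROPY_BITS]
```

A string that merely has the right shape, such as `AKIA` followed by sixteen identical characters, matches the signature but is dropped by the entropy filter. This is the kind of false positive that a layered approach is meant to screen out.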
In virtually all aspects of the study design, the authors invested the time and effort to improve the scope of their search and the accuracy of their findings.
Moreover, the methodology deliberately erred on the side of conservative results. This is laudable in an era in which research is often valued more for its “wow” factor than its academic rigor. It also serves as a potent and sobering reminder that the findings of this study — in themselves quite alarming — are likely just the tip of the iceberg.
Big Picture Findings
The major takeaways from this study relate to the sheer scale of the leaked secrets problem. Leaks are not an occasional occurrence. By tracking months’ worth of data, the study shows that secrets hemorrhage into public repos every day.
The study identified thousands of keys and secrets accidentally committed to GitHub. These leaks affect authentication tokens and API keys for services that are widely used and whose compromise can have tremendous consequences. Most leaked secrets were associated with a single owner, which makes them more likely to be truly sensitive.
Once leaks occurred, the secrets typically remained on GitHub. While some were quickly retracted, over 80% of leaks identified in the study were still found on GitHub more than two weeks after they were first committed, indicating that current methods for finding and eliminating secrets after they are posted are not effective.
The magnitude of leaks identified by the study is eye-opening, but it is even more troubling considering that the study covered only 13% of open-source repositories and limited types of keys and secrets. There is no doubt that this study is just a partial insight into the mammoth nature of this problem.
Other Important Study Findings
The study is full of interesting nuggets, and it’s natural to be drawn to the specific examples of leaks, such as those that implicated the AWS credentials for a European government agency and a major website for U.S. college applicants.
What captured our attention, though, were findings that illustrated how other security efforts are falling short. For example, many keys are just one piece of multi-factor authentication, which theoretically offers defense against a leak. However, the study authors found that the presence of one leak dramatically increased the susceptibility of other secrets.
In addition, a well-known program like TruffleHog, which works to sniff out sensitive data from code, identified fewer than 30% of the leaks found in the study. While useful, it is wholly insufficient on its own as a means of finding and eradicating leaks.
Lastly, the authors debunked the notion that these leaks resulted only from the work of inexperienced GitHub users. Analysis of the repositories with leaks showed that experienced developers and those with considerable GitHub activity were just as likely to commit these inadvertent disclosures.
Improving API and Key Security With Moving Target Defense
Managing the proliferation of keys and secrets is one of the most pressing challenges in DevSecOps and DevOps in general. While there is no silver bullet solution, moving target defense can play an important role in protecting these assets.
MTD changes the security dynamic by fragmenting, encrypting, and constantly morphing keys and secrets so that they are not a stationary target. CryptoMove’s Tholos key vault is a prime example of how this cybersecurity strategy can pay huge dividends in application security.
Developers can store keys and secrets in the Tholos key vault and use the CryptoMove API as a placeholder in their source code. This means that actual keys never wind up being accidentally committed to GitHub.
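The general placeholder pattern can be sketched as follows. This is not the CryptoMove API: `get_secret`, `MissingSecretError`, `build_auth_header`, and the `PAYMENTS_API_KEY` name are hypothetical, and an environment variable stands in for the vault lookup. The point is only that the source code carries the name of a secret, never its value.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a named secret is not available at runtime."""


def get_secret(name: str) -> str:
    """Look up a secret by name at runtime instead of embedding its value.

    Hypothetical helper: the environment variable stands in for a vault lookup.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not configured")
    return value


def build_auth_header(secret_name: str = "PAYMENTS_API_KEY") -> dict[str, str]:
    """Build a request header from a secret that is resolved by name."""
    return {"Authorization": f"Bearer {get_secret(secret_name)}"}
```

Because only the name `PAYMENTS_API_KEY` appears in the repository, committing this file publicly exposes nothing that an attacker can use directly.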
The Tholos interface simplifies key management for developers. Rotating keys or revoking access can be done easily within CryptoMove and without having to scour through the source code. As a bonus, the Tholos key vault promotes DevOps best practices by including tools for organizing and sharing keys as well as setting schedules for key rotation and expiration.
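To make the idea of rotation and expiration schedules concrete, here is a small sketch of the bookkeeping involved. It does not describe how Tholos implements scheduling. The `KeyRecord` fields and the `key_status` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    """Metadata about a stored key; the field names are hypothetical."""

    name: str
    created_at: datetime
    rotate_every: timedelta
    expires_at: Optional[datetime] = None


def key_status(record: KeyRecord, now: datetime) -> str:
    """Classify a key as 'expired', 'rotation_due', or 'ok' at the given time."""
    # Expiration takes priority: an expired key should not be used at all.
    if record.expires_at is not None and now >= record.expires_at:
        return "expired"
    if now - record.created_at >= record.rotate_every:
        return "rotation_due"
    return "ok"
```

Keeping this metadata next to the key, rather than scattered through source code, is what lets a team rotate or revoke a key in one place.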
Early access to Tholos is available now, and CryptoMove is continually working to build out new functionality. For example, a new feature in the pipeline is a tool that meticulously scans GitHub for secrets and automatically imports them into CryptoMove, rapidly eliminating vulnerabilities and promoting unified key management.
The authors of the research study conclude by saying that addressing the leakage of secrets will likely require “changing the architecture of libraries to automatically and securely manage secrets for developers.” The CryptoMove Tholos key vault provides developers with a tool to implement precisely that kind of change.