Ghost Secrets: Hidden threats in the codebase

Aqua Security found that credentials, API tokens, and keys added to code by developers could be exposed for years even after they were considered deleted.

Translated from Phantom Secrets: The Hidden Threat in Code Repositories, by Jeffrey Burt.

In recent years, the increasing complexity of modern software development environments has led to a growing problem of programmers exposing secrets in their codebases, making them readily available to cybercriminals.

GitGuardian has been tracking this issue for several years, detailing in its annual State of Secrets Sprawl report that the number of exposed secrets discovered on GitHub is climbing each year. The latest report, released this year, shows that the company detected nearly 12.8 million new secrets in GitHub commits in 2023, an increase of nearly 3 million from the previous year.

In 2020, the first year of the report, that number was 3 million. Of the 1.1 billion commits GitGuardian scanned in 2023, 8 million exposed at least one secret.

Fears were heightened last week by researchers from Aqua Security, who said they had discovered secrets, including API tokens, credentials, and passwords, that had been exposed for years. They also found that hardcoding a secret into code even once can expose it permanently, even after it is considered deleted.

Even more worrying: Most scanning methods miss these "ghost secrets," and researchers have found that nearly 18% of secrets in Git repositories may be overlooked.

"We uncovered important secrets, including the credentials of the cloud environment, internal infrastructure, and telemetry platforms that were exposed to the internet," Yakir Kadkoda and Ilay Goldman, researchers at Aqua Nautilus, Aqua's security division, wrote in a report. "Through various Git-based processes whose impact on developers and AppSec professionals is unclear, and the behavior of source code management (SCM) platforms, secrets remain exposed even after being considered removed."

The developers and their secrets

For years, developers have been hardcoding secrets into software for faster configuration and other legitimate purposes. Today, with the rise of cloud computing and the increasingly complex and fragmented nature of programming, the use of open source and third-party code and code reuse have become the norm. That makes programming faster, but it also exposes software to growing cyber threats and supply chain risks.

Countless security vendors have sounded the alarm about exposed secrets, with Kadkoda and Goldman writing that they have been "educating developers not to hardcode secrets into their code" for years. In addition, the global secrets management software market is expected to grow rapidly, with one forecast stating that the market will grow from $67 billion last year to $104.6 billion by 2031.

The ghost secret problem is largely due to the way SCM systems (such as GitHub, Bitbucket, and GitLab) hold deleted or updated code commits in their Git-based infrastructure, according to the Aqua Nautilus team. This means that even secrets that have been used once in the code, or that are thought to have been deleted, may still be exposed.

To write the report, Aqua researchers scanned the top 100 organizations on GitHub, which includes more than 52,000 publicly available repositories.

"In the course of our research, we uncovered some big secrets, including gaining access to the full cloud environment of some of the world's largest organizations, infiltrating the internal fuzzing infrastructure of sensitive projects, gaining access to telemetry platforms, and even network devices, Simple Network Management Protocol (SNMP) secrets, and camera footage from Fortune 500 companies," Kadkoda and Goldman wrote. "These findings could cause significant attacks on the affected organizations."

Mozilla and Cisco as cautionary tales

In one case, researchers discovered an API token for Mozilla's FuzzManager, an internal tool for collecting and analyzing fuzz testing data for security vulnerabilities. The token gives them access to Mozilla's internal fuzzing data, which is often kept secret to prevent malicious actors from exploiting unpatched vulnerabilities. In another case, they discovered a privileged API token for the Cisco Meraki dashboard, which allows organizations to manage their networks. An attacker who finds such a token can take control of network resources and access sensitive information, including SNMP secrets and camera footage.

In another case, they discovered an Azure service principal token in a Git commit at a large healthcare company. The token gives the holder a high level of access to the company's Microsoft Azure resources, including its internal Azure Kubernetes Service and Azure Container Registry. A malicious actor in possession of the token could take control of the company's Kubernetes cluster.

All organizations that exposed secrets were notified, and the secrets have been revoked.

Still, the problem of ghost secrets remains. Aqua scanned the repositories using two commands, git clone and git clone --mirror, and found that nearly 18% of the secrets present in the mirrored versions of the repositories were missed. The problem is that commits can still be accessed through the "cached view" on the SCM platform, so any secrets removed from the cloned and mirrored versions of a repository can still be accessed by anyone who knows the commit hash.
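To make the gap between the two clone types concrete, here is a minimal Python sketch. It is not Aqua's tooling. It assumes git is installed and that two local copies of the same repository already exist, one made with git clone and one with git clone --mirror; the function names and the paths in the usage note are hypothetical.

```python
import subprocess


def list_commits(repo_path):
    """Return the set of commit hashes reachable from any ref in repo_path.

    Uses `git rev-list --all`, which walks every ref the local copy holds.
    A mirror clone copies all refs from the remote, so it can hold commits
    that a regular clone never fetches.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--all"],
        check=True,
        capture_output=True,
        text=True,
    )
    return set(result.stdout.split())


def commits_missed_by_clone(clone_commits, mirror_commits):
    """Commits present in the mirror but absent from the regular clone."""
    return mirror_commits - clone_commits
```

A scan could then call commits_missed_by_clone(list_commits("repo"), list_commits("repo.git")) and feed the extra commits to a secret scanner. Commits that survive only in the platform's cached view would appear in neither set, which is the gap the researchers describe.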

Get a cached view

The researchers outlined four strategies for retrieving cached view commits, ranging from brute-force commit hashes and using REST API endpoints to viewing the GUI for pull requests and using GitHub historical datasets.
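Why brute-forcing commit hashes can be practical comes down to arithmetic. The sketch below is an illustration, not the researchers' method. It assumes the platform resolves abbreviated commit hashes; Git itself allows abbreviations as short as four hexadecimal characters, while a full SHA-1 commit hash has 40. The function name is hypothetical.

```python
HEX_DIGITS = "0123456789abcdef"


def prefix_search_space(prefix_len):
    """Number of distinct hexadecimal prefixes of the given length."""
    return len(HEX_DIGITS) ** prefix_len


# Four-character prefixes versus full 40-character SHA-1 hashes.
short_space = prefix_search_space(4)
full_space = prefix_search_space(40)
```

Here short_space is 65,536, a number of candidates a script can enumerate quickly, while full_space is 2 to the power of 160, which is far beyond guessing.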

Cybersecurity experts say organizations will have to address ghost secrets if they want to stop cyber risks to developers.

Eric Schwake, director of cybersecurity strategy at Salt Security, told The New Stack: "This issue is critical because it points to a fundamental flaw in the way secrets are managed in Git-based systems, which affects many organizations. Exposing secrets such as API tokens and credentials can lead to serious consequences, such as unauthorized access, data breaches, and financial losses. Even after deletion or updating, the persistence of 'ghost secrets' can exacerbate the problem, posing a long-term risk. Because APIs are the foundation of modern applications, they are becoming a target for attackers."

Sarah Jones, research analyst for cyber threat intelligence at Critical Start, says organizations will need to take a multi-layered approach to mitigating such risks. Jones told The New Stack: "Developers need to be thoroughly trained on secure coding practices, proper secret management with specialized tools, and the need to prevent accidental leaks. Automated scanning tools can identify secrets before they are pushed to public repositories, adding an extra layer of security to the code review process. In addition, organizations should implement specialized secrets management solutions to ensure secure storage and fine-grained access control."
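As a rough illustration of the kind of automated scanning Jones describes, the Python sketch below checks text for a few secret-like patterns before it is committed. It is a simplified assumption of how such a check might look, not any vendor's scanner: the pattern set and the find_secrets name are hypothetical, and real tools use far larger rule sets plus entropy and validity checks.

```python
import re

# Illustrative patterns only; a real scanner would carry many more rules.
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase
    # letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-encoded private key header.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A secret-sounding name assigned a quoted literal of 8+ characters.
    "hardcoded_assignment": re.compile(
        r"\b(?:api[_-]?key|token|secret|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for suspicious lines."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A pre-commit hook could run find_secrets over each staged file and block the commit when the list is not empty. Catching the secret before the first push matters because, as the research shows, removing it afterward may not make it unreachable.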

Malicious actors like developers

Both Schwake and Jones say that developers will continue to be tempting targets for threat actors because they have access to sensitive information and systems, and that the attack surface will expand due to the increased use of open-source code and cloud-native development. In addition, as DevSecOps practices are integrated into the development lifecycle, attackers will continue to shift their focus to exploiting vulnerabilities in the development process itself, Schwake says.

"However, the situation is gradually improving," he said. "As security breaches become more frequent and their impact becomes more severe, developers are beginning to recognize the importance of security. Organizations should invest in security training programs and integrate security tools into development workflows. Adopting DevSecOps practices can also help foster a culture of shared responsibility for security, encouraging developers to take responsibility for security at work."

He adds, "We're also seeing an increasing importance of posture governance throughout the API development lifecycle to prevent security issues as early as possible."
