Secrets in version control systems (VCS) like git are the current state of the world, despite being widely recognized as a bad practice. Once source code enters a git repository, it can organically spread to multiple locations, along with any secrets it contains. So why are secrets in git repositories so common?
A seasoned developer may be scratching their head, wondering why anyone would put secrets inside a git repository. But the fact is, secrets inside git repositories are the current state of the world.
Previously we have discussed why it is common to choose the path of least resistance when it comes to accessing and distributing secrets. Git acts as the central point of truth for a project, so it makes sense, at least from a convenience point of view, that secrets are stored inside a private git repository to make distribution and access easy.
But storing secrets like this is playing with fire: it only takes a very small incident to get burnt.
Beyond intentionally storing secrets in git, it is very easy to lose track of secrets that are not managed properly. They may be hardcoded into source code, stored in a text file, shared over Slack or buried inside a debug application log. On top of that, developers often work in large distributed teams with access to a plethora of secrets, while facing shorter release cycles and an ever-growing number of technologies to master.
Projects can be cloned onto multiple machines, forked into new projects, distributed to customers, made public, and so on. Each time a project is duplicated, its entire history is duplicated with it.
It is obvious why storing secrets in public repositories is bad: they are freely available to everyone on the internet, and public repositories are very easy to monitor. GitHub, for example, has a public API to fetch all public commits. But what about private git repositories?
Private repositories don't publish your source code openly to the internet, but they don't offer adequate protection for such sensitive information either. Imagine a plain text file containing all your credit card numbers: you hopefully wouldn't put it into the company's git repository. Secrets are just as sensitive.
A few things to consider when storing secrets in private repositories:
Everyone in the organization with access to the repo has access to the secrets within (one compromised account can provide an attacker access to a trove of secrets).
Repositories can be cloned onto multiple machines or forked into new projects.
Private repositories can be made public, which can expose secrets buried in the git history.
Another important consideration is that code removed from a git repository is never actually gone.
Git keeps track of all changes that are made. Code that is removed (or, more correctly, code that is committed over) still exists within the git history.
Interestingly enough, code is removed from a project at nearly the same volume as it is added. This means that the code within a repository runs much deeper than its first layer, and secrets could be buried deep within the git history under a mass of long-forgotten commits.
[Figure: git contributions graph]
Comment: The contributions graph above, from the HashiCorp Vault repository, is a typical view of a project's history. The regularity found in project contribution graphs is both surprising and interesting (check out some projects' graphs; it seems to be a rule of nature).
If you perform a search on GitHub for the commit message ‘removed aws key’, you will find thousands of results. And that's just within public repositories.
Publicly disclosed examples of recent data breaches through leaked credentials.
If this seems like an issue for only large companies to worry about, it’s not. Attackers are constantly exploiting personal services through secret keys too. In one example, bad actors scanned GitHub for AWS keys and used them to mine cryptocurrency, leaving .
Code reviews are great for detecting logic flaws, maintaining good coding practices and keeping code quality high. But they are not adequate protection against committed secrets.
This is because reviews generally only consider the net difference between the current and proposed state, not the entire history of a branch. Branches are commonly cleaned up before being merged into the master branch: temporary code is added then deleted, unnecessary files are added then removed. These files, which are high-risk candidates for containing secrets, are then no longer visible to the reviewer (unless they go through the entire history of the branch).
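The gap between the net difference and the branch history can be illustrated with a small simulation. This is only a sketch: it models each commit as a hypothetical snapshot (a dict mapping file paths to contents) rather than calling git, and the file names and token value are made up for illustration.

```python
def added_lines(before, after):
    """Lines present in `after` but not in `before`, grouped by file."""
    added = {}
    for path, text in after.items():
        old = set(before.get(path, "").splitlines())
        new = [line for line in text.splitlines() if line not in old]
        if new:
            added[path] = new
    return added


def net_diff(snapshots):
    """What a reviewer typically sees: the first snapshot versus the last."""
    return added_lines(snapshots[0], snapshots[-1])


def per_commit_diffs(snapshots):
    """What the history holds: each snapshot versus its parent."""
    return [added_lines(a, b) for a, b in zip(snapshots, snapshots[1:])]


# Hypothetical branch: a debug file with a token is added, then deleted.
base = {"app.py": "print('hi')"}
commit_1 = {"app.py": "print('hi')", "debug.env": "API_TOKEN=not-a-real-token"}
commit_2 = {"app.py": "print('hi')\nprint('done')"}
branch = [base, commit_1, commit_2]

print(net_diff(branch))          # only the app.py change appears
print(per_commit_diffs(branch))  # the first entry still contains debug.env
```

The net difference shows only the change to `app.py`, while the per-commit view still contains the token from `debug.env`. Once the branch is merged with its history intact, that token is part of the repository even though no reviewer saw it.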
Also published .