With high-profile attacks becoming a regular part of the news cycle (see Yahoo, Target, Sony...), security is an increasingly relevant consideration for anyone developing a product.
Unfortunately, security is often more about what we don’t think of than what we do, and it’s important to cover as many bases as possible.
A trend has emerged recently of scraping sites such as GitHub for sensitive information, such as passwords, access keys, and databases. This has become a surprisingly effective way to steal information. You may be wondering why people would upload their passwords to a public site, and the answer is simple: they are unaware they’re doing it. To find out how big this problem is, we did a bit of scraping ourselves…
How widespread are Git security issues?
We took a sample of data from the public GitHub stream to get an idea of the scale of this issue. Out of the 78.3 thousand commits we checked, 62 of them matched sensitive file patterns. Though this may not sound like a lot (a mere 0.07%), when scaled, the impact is quite large.
Taking today’s data alone from the GitHub Archive, there were 459,991 push events. Applying that 0.07% rate to this number gives roughly 322. That’s over 300 database logins, server credentials, and SSH private keys becoming public information each day! Note also that we’re talking commits here, not projects.
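The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the figures quoted in the text; the variable names are ours. It also prints the unrounded match rate, which is closer to 0.08% than 0.07% and would push the daily estimate to roughly 364, so the "over 300 per day" conclusion holds either way.

```python
# Figures quoted in the article.
sampled_commits = 78_300     # "78.3 thousand commits"
sensitive_matches = 62       # commits that matched sensitive file patterns
daily_push_events = 459_991  # push events in one day of GitHub Archive data

# Share of sampled commits that matched, before any rounding.
observed_rate = sensitive_matches / sampled_commits

# The article works with a rate of 0.07%.
article_rate = 0.0007

estimate_article = daily_push_events * article_rate
estimate_observed = daily_push_events * observed_rate

print(f"observed rate:             {observed_rate:.4%}")
print(f"estimate at 0.07%:         {estimate_article:.0f}")
print(f"estimate at observed rate: {estimate_observed:.0f}")
```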
With many developers pushing code several times per day, proper understanding and usage of gitignore is essential to protecting company secrets: the easiest way for criminals to break in is for you to hand them the keys. Generally speaking, configuration files that contain passwords, keys, and similar info should not be public. Luckily, there is a solution!
The gitignore file exists so that you can keep files out of your repository without having to exclude them by hand on every commit. Untracked files that match a pattern in the gitignore are skipped when you stage changes, so they won’t end up in your commits by accident (files Git is already tracking are not affected). Not only does this leave system-specific files untouched, it also gives you some assurance that sensitive files won’t be uploaded. Let’s take the following directory as an example:
If we wanted to exclude the file “example.txt”, we would simply create a file named “.gitignore” containing this line:
example.txt
Easy, right? If we wanted to exclude every file ending in “.txt”, we would simply add the line:
*.txt
Each line pertains to a specific file or set of files to exclude. Here’s an example of a full-fledged gitignore file (specifically one for Ruby):
## Documentation cache and generated files:
## Environment normalization:
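To get a feel for how a pattern such as example.txt or *.txt selects files, here is a rough Python sketch built on the standard library's fnmatch module. It only approximates gitignore matching for bare file names: it does not handle negation, directory rules, or paths. The file names are made up for illustration.

```python
from fnmatch import fnmatchcase


def is_ignored(filename, patterns):
    """Return True if a bare file name matches any simple glob pattern.

    Approximation only: real gitignore rules also cover negation ("!"),
    directory-only patterns, and path anchoring, none of which is handled here.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# Hypothetical directory contents.
files = ["example.txt", "notes.txt", "main.rb"]

# A gitignore with the single line "example.txt".
only_example = [name for name in files if is_ignored(name, ["example.txt"])]

# A gitignore with the single line "*.txt".
all_txt = [name for name in files if is_ignored(name, ["*.txt"])]

print(only_example)
print(all_txt)
```

The first list contains only example.txt, while the wildcard pattern also catches notes.txt. In a real repository, git itself can tell you which rule matched a file via git check-ignore.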
There are quite a few other useful gitignore features, such as excluding whole directories or whitelisting specific files. We’ve included a few helpful links below if you’re interested in some of the more advanced functionality.
Before adding a gitignore file to your project, it’s worth checking to see if one already exists. This is important even if you’re working solo, because many services and libraries ship with a gitignore file of their own. Once you’ve verified that you need a fresh one, you can use gitignore.io, a great tool for finding or generating gitignore files. This tool will give you a baseline gitignore, to which you can add important files or remove rules you’re not using.
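Adding your own rules on top of a baseline can be scripted. The sketch below is one possible approach, not part of any tool mentioned here: a hypothetical helper, add_rules, appends only the rules that are not already in the file. The rule names (.env, config/secrets.yml) are example values, and the demo runs in a temporary directory so it does not touch a real project.

```python
import tempfile
from pathlib import Path


def add_rules(gitignore_path, rules):
    """Append rules that are not already present; return the rules added."""
    path = Path(gitignore_path)
    existing_text = path.read_text() if path.exists() else ""
    existing = {line.strip() for line in existing_text.splitlines()}
    missing = [rule for rule in rules if rule not in existing]
    if missing:
        # Start on a fresh line if the file does not end with a newline.
        needs_newline = existing_text != "" and not existing_text.endswith("\n")
        with path.open("a") as handle:
            handle.write(("\n" if needs_newline else "") + "\n".join(missing) + "\n")
    return missing


# Demo in a throwaway directory with a one-line baseline gitignore.
with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / ".gitignore"
    target.write_text("*.log\n")
    added = add_rules(target, ["*.log", ".env", "config/secrets.yml"])
    final_text = target.read_text()

print(added)
print(final_text)
```

Because existing rules are skipped, running the helper repeatedly does not pile up duplicate lines.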
Here are a few additional resources if you’re interested in learning more:
Gitignore documentation: https://git-scm.com/docs/gitignore
Power user CLI tools to make gitignoring easy: https://github.com/joeblau/gitignore.io/wiki/Advanced-Command-Line-Improvements
A collection of useful gitignore templates: https://github.com/github/gitignore
If you’re interested in doing some digging yourself into what sorts of things aren’t being gitignored, here’s a link to the code we used to do our own digging, along with a readme that shows how to extend this for your own personal pattern matching. Happy hunting!
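For a sense of what this kind of pattern matching can look like, here is a minimal Python sketch. It is not the code referred to above: the pattern list and the file paths are hypothetical, chosen to resemble the kinds of files discussed in this article (private keys, environment files, database configuration).

```python
import re

# Hypothetical patterns for illustration; not the list used for the article.
SENSITIVE_PATTERNS = [
    re.compile(r"(^|/)id_rsa$"),         # SSH private key with the default name
    re.compile(r"\.pem$"),               # PEM-encoded keys and certificates
    re.compile(r"(^|/)\.env$"),          # environment files
    re.compile(r"(^|/)database\.yml$"),  # Rails-style database configuration
]


def find_sensitive(paths):
    """Return the paths that match at least one sensitive file pattern."""
    return [
        path
        for path in paths
        if any(pattern.search(path) for pattern in SENSITIVE_PATTERNS)
    ]


# Made-up list of files touched by a commit.
commit_files = [
    "README.md",
    "config/database.yml",
    "deploy/id_rsa",
    "deploy/id_rsa.pub",
    "src/app.rb",
    ".env",
]

flagged = find_sensitive(commit_files)
print(flagged)
```

Anything this flags in your own repositories is a good candidate for a gitignore rule. Note that the public key (id_rsa.pub) is deliberately not matched.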