I originally got interested in S3 Buckets because I kept seeing organisations get breached and suffer data leaks. I thought to myself, "How cool would it be to find one of these and report it?!" A noble idea at the time. I then realised that there were few resources for actually finding buckets. Sure, there were tools to collect the names, but few places to actually search through discovered buckets.
That all led to my searchable list of S3 Buckets. I then realised that several other researchers felt the same way. Sure, they were collecting names and going through them, but there weren't really many searchable places for S3 Buckets (think of something like Shodan). So, there we have it, a crash course on why I do what I do.
If you're new to the Amazon S3 Bucket nonsense, you may be wondering how the hell we're even getting all these bucket names; I get this question quite frequently.
Bruteforcing is the oldschool method. :D It's oldschool, but you can still find a lot of buckets this way if you have a good wordlist. DigiNinja (https://twitter.com/digininja) has a really slick tool written in Ruby that can be used not only to bruteforce but also to plunder buckets.
Bucket Finder: https://digi.ninja/projects/bucket_finder.php
There are several other bucket bruteforcing tools out there; however, DigiNinja's tool has worked fine for me.
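To make the bruteforce idea concrete, here is a minimal Python sketch (not Bucket Finder itself, which is written in Ruby). It builds candidate names from a wordlist and maps the HTTP status returned by the bucket URL to a rough verdict. The suffix list, the function names, and the use of the global `s3.amazonaws.com` endpoint are assumptions made for illustration.

```python
import urllib.error
import urllib.request

# Assumption: the global virtual-hosted endpoint. Regional endpoints differ.
S3_URL = "https://{name}.s3.amazonaws.com/"


def candidates(words, suffixes=("", "-backup", "-dev")):
    """Build an ordered, de-duplicated list of candidate bucket names.

    The suffixes are hypothetical examples; a real wordlist would be larger.
    """
    names = []
    for word in words:
        word = word.strip().lower()
        if not word:
            continue
        for suffix in suffixes:
            names.append(f"{word}{suffix}")
    return list(dict.fromkeys(names))


def classify(status):
    """Map the HTTP status of a bucket URL to a rough verdict."""
    if status == 200:
        return "exists, publicly listable"
    if status == 403:
        return "exists, access denied"
    if status == 404:
        return "no such bucket"
    return "unknown"


def probe(name, timeout=5):
    """Send one HEAD request for a candidate name (this hits the network)."""
    request = urllib.request.Request(S3_URL.format(name=name), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as err:
        return classify(err.code)
    except OSError:
        # DNS failures, TLS errors and timeouts all end up here.
        return "unknown"
```

Calling `probe(name)` for each entry of `candidates(wordlist)` gives the basic loop; anything that does not come back as "no such bucket" is worth a closer look.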
Yet another method to find buckets is to use a scraper to read through website source and inspect requests going to CloudFront (Amazon's CDN) or directly to S3. I've been working on my own, and there's even a POC by Random_Robbie (https://twitter.com/Random_Robbie/) out on GitHub (written in Go).
* On a side note, I'm also inspecting HTTP and DNS requests going through my home network to see if any resources are being requested from S3.
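The scraping approach boils down to pulling S3 hostnames out of page source. The sketch below is one way to do that with regular expressions, assuming the page has already been downloaded into a string. It only handles direct S3 URLs (virtual-hosted and path style); CloudFront URLs do not carry the bucket name, so they are out of scope here. The pattern and function names are illustrative, not taken from any of the tools mentioned above.

```python
import re

# Bucket names are 3-63 characters: lowercase letters, digits, dots, hyphens.
BUCKET = r"([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])"
# Matches s3.amazonaws.com, s3-us-west-2.amazonaws.com, s3.eu-west-1.amazonaws.com
S3_HOST = r"s3(?:[.-][a-z0-9-]+)?\.amazonaws\.com"

# Virtual-hosted style: <bucket>.s3.amazonaws.com/...
VHOST_STYLE = re.compile(BUCKET + r"\." + S3_HOST, re.IGNORECASE)
# Path style: s3.amazonaws.com/<bucket>/... (the lookbehind makes sure "s3"
# starts the hostname, so virtual-hosted URLs are not matched a second time)
PATH_STYLE = re.compile(r"(?<![a-z0-9.-])" + S3_HOST + "/" + BUCKET, re.IGNORECASE)


def extract_buckets(source):
    """Return the sorted, unique bucket names referenced in a page's source."""
    found = VHOST_STYLE.findall(source) + PATH_STYLE.findall(source)
    return sorted({name.lower() for name in found})
```

For example, a page that loads `https://my-bucket.s3.amazonaws.com/images/logo.png` and `https://s3-us-west-2.amazonaws.com/static-assets/app.js` would yield `my-bucket` and `static-assets`.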
Certstream is probably one of the biggest methods for grabbing S3 Buckets. So, what the heck is it? To quote the Certstream site (https://certstream.calidog.io):
Real-time certificate transparency log update stream.
See SSL certificates as they're issued in real time.
One can easily collect a few thousand S3 Buckets in only a few minutes. So, what tools are there to collect S3 Buckets from Certstream?
On a side note, several researchers use Certstream events to find possible phishing sites. ;)
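As a rough sketch of how Certstream events can feed bucket collection: each `certificate_update` message carries the certificate's domain names under `data.leaf_cert.all_domains`, and those domains can be turned into bucket name guesses. How the guesses are derived is an assumption here (the full domain, the domain without `www.`, and the second-level label); real tools may use different or larger permutation sets.

```python
SEEN = set()  # candidate bucket names collected so far


def domains_from_event(message):
    """Return the domain names carried by a certificate_update message."""
    if message.get("message_type") != "certificate_update":
        return []
    return message.get("data", {}).get("leaf_cert", {}).get("all_domains", [])


def bucket_candidates(domain):
    """Derive hypothetical bucket names from one domain (illustrative rules)."""
    domain = domain.lower().lstrip("*.")  # drop wildcard prefixes like "*."
    labels = domain.split(".")
    names = [domain]
    if labels[0] == "www" and len(labels) > 2:
        names.append(".".join(labels[1:]))
    if len(labels) >= 2:
        # Naive: ignores multi-part suffixes such as co.uk
        names.append(labels[-2])
    return list(dict.fromkeys(names))


def on_event(message, context=None):
    """Callback in the (message, context) shape used by Certstream clients."""
    for domain in domains_from_event(message):
        SEEN.update(bucket_candidates(domain))
```

With the `certstream` Python package installed, a callback like `on_event` would typically be handed to its `listen_for_events` function; the candidates in `SEEN` still need to be checked against S3 to see which ones actually exist.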
So now you know some of the tools and methods to collect/gather some S3 Buckets.
Putting it together
So, collecting buckets is awesome. One of the biggest issues that I encountered, however, was how the heck I was supposed to do anything useful with a massive list of bucket names. This is where I started monkeying with SQL servers and the Python Flask API library. With these two things, I was able to build a system to view and actively query the buckets that I was collecting.
There were, of course, several failures when deploying...
If you're wondering, the searchable site (https://protoxin.net/s3/) is just using DataTables with a backend AJAX request to my api server.
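For a feel of what such a backend involves, here is a small sketch using SQLite from the Python standard library. It stores unique bucket names and answers a search with a dict in the shape that DataTables expects from a server-side source (`draw`, `recordsTotal`, `recordsFiltered`, `data`). The table layout, function names, and the choice of SQLite and server-side processing are assumptions for illustration; the actual site may be built differently.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the bucket table; the schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS buckets ("
        "name TEXT PRIMARY KEY, world_readable INTEGER)"
    )
    return db


def add_bucket(db, name, world_readable=None):
    """Insert a bucket; duplicates from different collectors are ignored."""
    db.execute(
        "INSERT OR IGNORE INTO buckets VALUES (?, ?)",
        (name.lower(), world_readable),
    )
    db.commit()


def search(db, term="", start=0, length=25, draw=1):
    """Return one page of matches shaped like a DataTables response."""
    pattern = f"%{term.lower()}%"
    total = db.execute("SELECT COUNT(*) FROM buckets").fetchone()[0]
    filtered = db.execute(
        "SELECT COUNT(*) FROM buckets WHERE name LIKE ?", (pattern,)
    ).fetchone()[0]
    rows = db.execute(
        "SELECT name, world_readable FROM buckets "
        "WHERE name LIKE ? ORDER BY name LIMIT ? OFFSET ?",
        (pattern, length, start),
    ).fetchall()
    return {
        "draw": draw,
        "recordsTotal": total,
        "recordsFiltered": filtered,
        "data": [list(row) for row in rows],
    }
```

In a Flask route, the dict returned by `search` could be passed through `jsonify` and consumed directly by the DataTables AJAX call.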
Again, these are all things that aren't 100% difficult. It just took some time to build things out and make sure it didn't just explode in a blaze of glory.
While having all these buckets is great, it doesn't really help researchers if they cannot see whether buckets are world readable. Note, world readable and authenticated world readable are two VERY different things. Regardless, I was frustrated that, even for my own research, checking permissions was being done with other tools and wasn't exactly what I wanted. This led to me making my first [public] tool, which takes bucket names from one list, ensures the values are unique, pulls their permissions, and then outputs them to another list along with whether or not each bucket is world readable.
This tool does require you to have a valid AWS API Keypair.
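The difference between world readable and authenticated world readable shows up in the bucket ACL as two different grantee groups. The sketch below classifies an ACL that has already been fetched, assuming the dict shape returned by S3's GetBucketAcl call (for example through boto3's `get_bucket_acl`). It is an illustration of the idea, not the tool described above, and it looks only at ACLs; bucket policies can also grant public access.

```python
# Predefined S3 grantee groups
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTH_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
READ_PERMISSIONS = {"READ", "FULL_CONTROL"}


def readability(acl):
    """Classify a bucket ACL as 'world', 'authenticated', or 'private'."""
    groups = {
        grant.get("Grantee", {}).get("URI")
        for grant in acl.get("Grants", [])
        if grant.get("Permission") in READ_PERMISSIONS
    }
    if ALL_USERS in groups:
        return "world"  # anyone on the internet can list the bucket
    if AUTH_USERS in groups:
        return "authenticated"  # anyone with any AWS account can list it
    return "private"


def unique_names(lines):
    """Normalise an input list of bucket names and drop duplicates."""
    cleaned = (line.strip().lower() for line in lines)
    return list(dict.fromkeys(name for name in cleaned if name))
```

Running `unique_names` over the input file and `readability` over each fetched ACL gives the "name plus verdict" output list.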
Things To Consider
You will be issuing a lot of DNS requests...a lot.
The other thing to consider when pursuing this type of research is that you may find sensitive data. Consider responsible disclosure, please. Better to do some good than a lot of evil.
What are some examples of things I've found and reported?
- Voter information
- Customer db backups
- Election system info
- Email lists
- Private keys (SSL and SSH)
Again, please responsibly report things you find. I'm not expecting everyone to do good; there are always bad people with malicious intent. One must also remember that putting all your stuff on a public-facing data store isn't smart if you haven't first done any type of threat modeling and permission auditing.