Retrieving, Storing and Querying 250M+ Certificates Like a Boss

Ryan Sears
May 17, 2017 · 6 min read
Pictured Above: Google’s BigQuery infrastructure hard at work.

Let's harvest some certificates!

Enter the Axeman

With a logo this chic, you’d think this was written in ECMAScript 6

To the cloud!

Now THAT’S being a noisy neighbor

Where to put all this data?

The Ingest

gsutil -o GSUtil:parallel_composite_upload_threshold=150M \
    -m cp \
    /tmp/certificates/* \
    gs://all-certificates
Absolutely note the two options I’ve set at the bottom!
Now that’s a spicy dataset! 👨‍🍳

On to the fun part!

15.5 seconds to munge 272M+ rows? Yes please!
2 seconds to get this data is *ludicrous* speed

A bit of a head-scratcher

That’s 91,246 submissions spread among every CTL we watch 😱

Lastly, an offer to the community

Cali Dog Security

A small software company based in the heart of Silicon Valley that aims to make security products hassle-free and ubiquitous. Focusing on a strong user experience and quality engineering, we build tools that solve problems no one else has tackled before.

Written by Ryan Sears, founder of Cali Dog Security & builder of things.
