SSL monitoring with Elasticsearch

As I explained in a previous article, we have to manage a lot of domains, and each of these domains contains a lot of records.

HTTPS provides a certain level of security for our users, and because our users’ privacy is really important to us, we started deploying all our websites with HTTPS enabled by default. It took us a while, as we had a few modifications to implement here and there in our code. We also needed to set up HTTPS for our video delivery network, and we didn’t want to impact the user experience too much.

Over the last two years, Google has also started giving more importance to websites with HTTPS enabled. Over time, we released more and more websites with HTTPS.

With so many websites in production, we needed a tool to monitor all our certificates. When will they expire? Did we deploy the new certificate on all the endpoints? Is the setup secure? With thousands of records, automation was required.
There are a lot of websites out there that offer certificate monitoring, and we could have just used one of these. But we wanted something simple, cheap and efficient at the same time. Also, the Qualys SSLLabs Test is one of the best tools available to verify that an HTTPS setup is done right.

Qualys has an API for their tool, and they even provide an easy-to-use Go CLI tool to query it: https://github.com/ssllabs/ssllabs-scan
We used this tool to push the JSON output to an Elasticsearch cluster, and the only thing left was to create a few visualizations in Kibana.

Expiry of our certificates over time
Endpoint grade repartition

These are two examples of the kind of data you can get from the Qualys SSL test system.
Their API is rate limited, but the Go tool has everything built in to follow the rate-limiting guidelines and avoid overloading the API.

The only thing you need is a list of all your records in a file; you can then trigger the processing.


Continue reading to get the extra details of our continuous monitoring system

  • Step 1 — Extract a list of all our records

We use OctoDNS to manage all our zones, so it was easy to get a list of all our records. We added a CSV exporter to OctoDNS to retrieve all the records.

octodns-export --config-file config/production.yaml --output-dir csv/ config

Details:
- octodns-export: our new CSV exporter
- config-file: path to the main OctoDNS configuration file, which lists all the zones and providers
- output-dir: where to save the CSV files. OctoDNS exports one CSV file per zone
- config: this last argument tells OctoDNS to retrieve the data from the config files. We could have specified one of our providers here, but since not all zones are configured in every provider, the best way to get everything is from the config files.

zone,type,record,ttl,value,geo,healthcheck
pornhub.com.,MX,.pornhub.com.,3600,"{u'preference': 10, u'exchange': 'smtp.pornhub.com.'}",,
pornhub.com.,TXT,default.pornhub.com.,3600,Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed at arcu molestie augue viverra porttitor ut ac odio.,,
pornhub.com.,A,blog.pornhub.com.,3600,111.111.111.111,,
pornhub.com.,CNAME,www.pornhub.com.,10800,pornhub.com.,,
pornhub.com.,CNAME,pl.pornhub.com.,10800,pornhub.com.,,
pornhub.com.,CNAME,de.pornhub.com.,10800,pornhub.com.,,
pornhub.com.,CNAME,cz.pornhub.com.,10800,pornhub.com.,,
pornhub.com.,CNAME,help.pornhub.com.,10800,.pornhub.com.,,
pornhub.com.,CNAME,es.pornhub.com.,10800,pornhub.com.,,
pornhub.com.,A,.pornhub.com.,3600,111.111.111.111,,

From that CSV, we extract all the record names. We also replace any wildcard with a word, as it makes things easier for Qualys. We exclude non-relevant records such as localhost, ftp, dev, mail and a bunch of others.

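#!/bin/bash
# Keep only A, AAAA and CNAME records, replace the wildcard "*" with "star",
# and drop any record name matching the exclusion pattern.
find csv/ -type f -print0 | xargs -0 -n1 -P4 awk -F',' '{gsub(/\*/,"star",$3); if (($2 == "A" || $2 == "CNAME" || $2 == "AAAA") && $3 !~ /stage|origin|ftp|sql|dev|mail|localhost/) print $3}' > domains.lst

$ wc -l domains.lst
32957 domains.lst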
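
For readers who prefer not to rely on awk, the same filter can be sketched in Python. This is only an illustration under a few assumptions: the function names are made up for this example, every file under csv/ is assumed to follow the column layout shown above (zone, type, record, ...), and the exclusion pattern is copied from the awk one-liner.

```python
import csv
import re
from pathlib import Path

# Record types kept by the awk filter.
WANTED_TYPES = {"A", "AAAA", "CNAME"}
# Same exclusion pattern as the awk filter.
EXCLUDED = re.compile(r"stage|origin|ftp|sql|dev|mail|localhost")


def extract_domains(rows):
    """Yield scannable record names from rows of (zone, type, record, ...)."""
    for row in rows:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        record_type = row[1]
        # Replace the wildcard first, then test the exclusions, like the awk script.
        record = row[2].replace("*", "star")
        if record_type not in WANTED_TYPES:
            continue  # also skips the header row, whose type column is "type"
        if EXCLUDED.search(record):
            continue
        yield record


def extract_from_dir(csv_dir):
    """Collect domains from every file under csv_dir (one CSV file per zone)."""
    domains = []
    for path in sorted(p for p in Path(csv_dir).rglob("*") if p.is_file()):
        with path.open(newline="") as handle:
            domains.extend(extract_domains(csv.reader(handle)))
    return domains
```

Writing the result of extract_from_dir("csv/") to a file, one name per line, should give the same kind of domains.lst as the shell version. Only the first three columns are read, so unquoted commas in the value column do not affect the result.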
  • Step 2 — Analyse each domain with Qualys SSLLabs-scan tool

Now that we have this huge list of domains, the next step is to run it through the ssllabs-scan tool. To make sure we go through every single domain without having to set up a complex tracking system, we decided to use a queuing system. AWS Batch is a really nice tool, described as a “Fully Managed Batch Processing System at Any Scale”.
With Batch, we were able to use EC2 Spot Instances to run ssllabs-scan at minimal cost. We push each DNS record as a job in the queue, and we wait until the entire queue is processed. It takes time, but we know that eventually it will be done.

We use a Docker container (https://github.com/MindGeekOSS/ssllabs-scan/blob/stable/Dockerfile) to execute the jobs and send the results directly to an Elasticsearch cluster.
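
Our container does the indexing itself, but if you only have the JSON output of the stock tool, indexing it boils down to a bulk request. Here is a minimal sketch of how such a request body could be built in Python. It is not how our fork is implemented; the function name is made up for this example, and it assumes each report is already parsed into a dictionary and that you are on a recent Elasticsearch version (older ones also expected a document type in the action line).

```python
import json


def to_bulk_body(reports, index_name):
    """Build an Elasticsearch _bulk request body (NDJSON) from scan reports.

    Each report is one parsed JSON object from the scanner output.
    """
    lines = []
    for report in reports:
        # Action line: index this document into index_name with an auto-generated ID.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the report itself, serialized on a single line.
        lines.append(json.dumps(report))
    if not lines:
        return ""  # nothing to send
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string can be sent with a POST to http://<host>:9200/_bulk using the Content-Type application/x-ndjson.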

# AWS Batch JSON job definition
{
  "containerProperties": {
    "image": "devops/ssllabs-scan:latest",
    "vcpus": 1,
    "memory": 2000,
    "command": [
      "-elasticsearch",
      "-elastic_host",
      "http://<host>:9200",
      "-elastic_index",
      "<index_name>",
      "-usecache",
      "<dns_record>"
    ]
  }
}

# Docker command
docker run --rm devops/ssllabs-scan:latest -elasticsearch -elastic_host http://<host>:9200 -elastic_index <index_name> -usecache <dns_record>
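
Pushing each DNS record as a job can be sketched with the AWS SDK for Python. Treat this as an illustration rather than our actual submission script: the helper names, the queue name and the job definition name are assumptions, and batch_client is expected to be something like boto3.client("batch"). The command flags are the ones from the job definition above. Batch job names cannot contain dots, so the record name is sanitized first.

```python
import re


def job_name_for(record):
    """Turn a DNS record into a valid AWS Batch job name.

    Job names may only contain letters, digits, hyphens and underscores,
    must start with a letter or digit, and are limited to 128 characters.
    """
    name = re.sub(r"[^A-Za-z0-9_-]", "_", record.rstrip("."))
    if not name or not name[0].isalnum():
        name = "scan" + name
    return name[:128]


def build_job(record, queue, definition, elastic_host, index_name):
    """Build the keyword arguments for one Batch SubmitJob call."""
    return {
        "jobName": job_name_for(record),
        "jobQueue": queue,
        "jobDefinition": definition,
        # Override the container command so that each job scans one record.
        "containerOverrides": {
            "command": [
                "-elasticsearch",
                "-elastic_host", elastic_host,
                "-elastic_index", index_name,
                "-usecache",
                record,
            ]
        },
    }


def submit_all(batch_client, records, **job_settings):
    """Submit one job per record and return the job IDs."""
    job_ids = []
    for record in records:
        response = batch_client.submit_job(**build_job(record, **job_settings))
        job_ids.append(response["jobId"])
    return job_ids
```

A call could look like submit_all(boto3.client("batch"), domains, queue="ssl-scan-queue", definition="ssllabs-scan", elastic_host="http://<host>:9200", index_name="<index_name>"), where the queue and definition names are placeholders for your own.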
  • Step 3 — Create visualizations in Kibana

The last part is to create the visualizations in Kibana. We decided to use Elasticsearch and Kibana because we didn’t want to spend time building a visualization tool. The CLI tool outputs JSON that we can index directly in Elasticsearch, and Kibana provides more than enough visualizations, so we did not need to create yet another tool.
Often, the path of least resistance is simply to reuse the amazing tools that other teams have created and released for free.

Kibana dashboard created from the Qualys ssllabs-scan results
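
The two charts shown earlier (certificate expiry over time and endpoint grade distribution) correspond to two standard Elasticsearch aggregations. The sketch below builds the equivalent query bodies in Python. The field names are passed in as parameters because they depend on how your documents are indexed; the expiry field is assumed to be mapped as a date and the grade field as a keyword, and calendar_interval assumes a recent Elasticsearch version.

```python
import json


def expiry_histogram_query(expiry_field):
    """Count certificates per month of expiry (date_histogram aggregation)."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "expiry_over_time": {
                "date_histogram": {
                    "field": expiry_field,
                    "calendar_interval": "month",
                }
            }
        },
    }


def grade_distribution_query(grade_field):
    """Count endpoints per grade (terms aggregation)."""
    return {
        "size": 0,
        "aggs": {
            "grades": {"terms": {"field": grade_field}}
        },
    }


if __name__ == "__main__":
    # "cert_not_after" and "grade" are hypothetical field names.
    print(json.dumps(expiry_histogram_query("cert_not_after"), indent=2))
    print(json.dumps(grade_distribution_query("grade"), indent=2))
```

Either body can be sent to the _search endpoint of the index to check the numbers behind a Kibana chart.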
  • Step 4 — Resolve all the issues

We worked with our security team to fix all the issues we found while going through the results. We also realized that a big cleanup of our DNS records was required. We cleaned up more than 30,000 records and fixed quite a few issues related to vulnerable protocols or expired certificates.

We run this process once a month to make sure that all our websites receive an A or A+ on Qualys. Remember that your SSL setup is very important too: keep an eye on your certificates and make sure that your servers are properly configured.
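
If you want to turn the monthly review into an automated check, the rule is simple enough to sketch in Python. This is an illustration only: the flattened endpoint structure and its key names ("grade", "not_after_ms") are hypothetical, the expiry is assumed to be stored in milliseconds since the epoch, and the 30-day warning window is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone

# Grades we accept, as stated above.
ACCEPTABLE_GRADES = {"A", "A+"}


def needs_attention(endpoint, now, warn_days=30):
    """Return the reasons an endpoint fails the check (empty list if fine).

    endpoint is a hypothetical flattened result with the keys:
      "grade"        - grade string, e.g. "A+", "B", "F"
      "not_after_ms" - certificate expiry in milliseconds since the epoch
    """
    reasons = []
    if endpoint["grade"] not in ACCEPTABLE_GRADES:
        reasons.append("grade " + endpoint["grade"])
    expiry = datetime.fromtimestamp(endpoint["not_after_ms"] / 1000, tz=timezone.utc)
    if expiry <= now:
        reasons.append("certificate expired")
    elif expiry - now <= timedelta(days=warn_days):
        reasons.append("certificate expires within %d days" % warn_days)
    return reasons
```

Calling needs_attention(endpoint, datetime.now(timezone.utc)) for every indexed endpoint gives a list of what to fix before the next run.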