Our response to Meltdown/Spectre

Yesterday the entire IT industry was thrown into quite a bit of turmoil with the disclosure of two major security flaws, Meltdown and Spectre. The how is rather ingenious, but what they allow is access to any memory on a machine. One of the most basic assumptions in IT security is that this isn’t possible.

Fortunately, the one that is relatively easy to exploit (Meltdown) is also easy to patch. The challenge going forward is with Spectre, and we as an industry will have to make even more aggressive assumptions about the capabilities of adversaries.

Patching Meltdown

Linc runs completely on Amazon Web Services, on a mix of AWS Lambda & EC2 Container Service (ECS), together with a ton of other managed services (CloudFront, Application Load Balancer, S3, and DynamoDB). The only parts vulnerable to Meltdown are Lambda & EC2/ECS. By the time the news hit, Amazon had already patched most EC2 hosts so that EC2 instances on the same physical host cannot affect one another. Lambda is now fully patched.
AWS notified us of a new version of the Linux distribution we use on our ECS clusters, and the updates have just finished.
There is potentially a performance penalty after applying this patch, but we don’t expect any noticeable degradation for our particular workloads. Initial data supports that, but we will keep monitoring it.
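
For the curious, here is a rough sketch (in TypeScript, with the AWS SDK) of how one could confirm that every container instance in an ECS cluster reports the patched AMI; the cluster name and AMI id are placeholders, and this is illustrative rather than our actual tooling:

    import { ECS } from "aws-sdk";

    const ecs = new ECS({ region: "us-east-1" });
    const EXPECTED_AMI = "ami-xxxxxxxx"; // the patched ECS-optimised AMI id (placeholder)

    async function checkCluster(cluster: string) {
      const { containerInstanceArns = [] } = await ecs
        .listContainerInstances({ cluster })
        .promise();
      if (containerInstanceArns.length === 0) return;

      const { containerInstances = [] } = await ecs
        .describeContainerInstances({ cluster, containerInstances: containerInstanceArns })
        .promise();

      for (const instance of containerInstances) {
        // The ECS agent reports the host AMI as the "ecs.ami-id" attribute.
        const ami = (instance.attributes || []).find((a) => a.name === "ecs.ami-id");
        const status = ami && ami.value === EXPECTED_AMI ? "patched" : "needs replacement";
        console.log(`${instance.ec2InstanceId}: ${ami ? ami.value : "unknown"} (${status})`);
      }
    }

    checkCluster("production").catch(console.error);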

That leaves only our personal devices used to log into AWS.
Everything that has a patch available has been patched, and I am reasonably sure we are only waiting for an iOS patch.

TL;DR: We are almost completely patched up on Meltdown

Protecting against Spectre

This one is going to be much harder because we have to assume that anyone who can run code on a machine can read everything in memory. 
The bad news? Our business is running JavaScript supplied by our customers on our infrastructure. The good news? We are a Front-end Delivery Platform, so almost everything is public information.

To understand how we protect against Spectre, we need to talk threat models. There are two categories. The major one is a Denial of Service or malware injection into one or more of our hosted web applications. The other is exfiltration of sensitive data.
Let’s start with the easier of the two.

Exfiltration of sensitive data

As mentioned above, Linc doesn’t store a ton of sensitive data. The most sensitive data we store ourselves is email addresses. They are stored in DynamoDB, and almost no services have access to them.
Access keys and secrets are stored in Auth0 and credit cards in Stripe. Neither company has released any statement on their response yet, but they both have incredibly competent development/operations teams, and I am confident they are on this.
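
To illustrate what “almost no services have access to them” means in practice, here is a sketch (TypeScript, AWS SDK) of the kind of least-privilege IAM policy that limits a single service role to that one table; the role name, policy name, and table ARN are placeholders, not our real configuration:

    import { IAM } from "aws-sdk";

    const iam = new IAM();

    // Grant read/write on one specific table to one specific role and nothing else.
    // Everything not explicitly allowed here is denied by default.
    const policy = {
      Version: "2012-10-17",
      Statement: [
        {
          Effect: "Allow",
          Action: ["dynamodb:GetItem", "dynamodb:PutItem"],
          Resource: "arn:aws:dynamodb:us-east-1:123456789012:table/users", // placeholder ARN
        },
      ],
    };

    iam
      .putRolePolicy({
        RoleName: "user-service-role", // placeholder: the one role that may touch the table
        PolicyName: "users-table-least-privilege",
        PolicyDocument: JSON.stringify(policy),
      })
      .promise()
      .catch(console.error);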

DoS or Malware Injection in a customer application

Spectre by itself cannot change any data, only read it, which makes a successful attack a lot harder. The three attack vectors we are most exposed to are:

  • Get access to customer Access/Secret keys.
  • Modify the JavaScript at rest on S3.
  • Modify the lookup table from domain to code

Get access to customer Access/Secret keys.

Getting this access would be very difficult, but even if an attacker were able to get the keys, any operation they are used for would be logged and sent to our Webhook API and Slack integrations for customers to monitor.
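
As a sketch of the idea, each operation performed with a key could be fanned out to the customer’s webhook endpoint and Slack incoming webhook; the event shape and function names here are illustrative, not our actual implementation:

    import fetch from "node-fetch";

    interface ApiKeyEvent {
      accessKeyId: string;
      action: string; // e.g. "deploy.create"
      sourceIp: string;
      timestamp: string;
    }

    // Fan a single event out to the customer's own webhook endpoint and their
    // Slack incoming webhook so unexpected key activity is visible immediately.
    async function notifyCustomer(event: ApiKeyEvent, webhookUrl: string, slackUrl: string) {
      await fetch(webhookUrl, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(event),
      });

      // Slack incoming webhooks accept a simple { text } payload.
      await fetch(slackUrl, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          text: `Key ${event.accessKeyId} used for ${event.action} from ${event.sourceIp} at ${event.timestamp}`,
        }),
      });
    }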

Modify the JavaScript at rest on S3.

Again, a tough one to pull off. It would require gaining access not just to S3, but also to the database where checksums are stored.
But as an extra security measure, we have enabled encryption & versioning so we can detect any changes (the files are supposed to be immutable). And shortly we will build monitoring for any files being overwritten, just in case.
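
As a rough sketch of what that check could look like, assuming SHA-256 checksums and files that should only ever have a single version (bucket, key, and expected checksum are supplied by the caller; this is illustrative, not our production code):

    import * as crypto from "crypto";
    import { S3 } from "aws-sdk";

    const s3 = new S3();

    // Check that a deployed file still matches the checksum recorded at deploy time,
    // and flag any object that has picked up extra versions (the files are supposed
    // to be written once and never touched again).
    async function verifyObject(bucket: string, key: string, expectedSha256: string) {
      const { Body } = await s3.getObject({ Bucket: bucket, Key: key }).promise();
      const actual = crypto.createHash("sha256").update(Body as Buffer).digest("hex");
      if (actual !== expectedSha256) {
        throw new Error(`${key}: checksum mismatch (expected ${expectedSha256}, got ${actual})`);
      }

      const { Versions = [] } = await s3
        .listObjectVersions({ Bucket: bucket, Prefix: key })
        .promise();
      const versionsForKey = Versions.filter((v) => v.Key === key);
      if (versionsForKey.length > 1) {
        throw new Error(`${key}: ${versionsForKey.length} versions found, file has been overwritten`);
      }
    }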

Modify the lookup table from domain to code

This one is trickier. Linc works by looking up the domain of the request and matching that with code. Under normal circumstances, the executing customer code does not have access to any IP addresses or passwords for any of the databases, and for the primary database those credentials would give read-only access anyway. But for our caching layer (Redis) there is no easy way to grant read-only access, and we have relied on just the password until now.
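
To make the lookup concrete, here is a simplified sketch of that flow in TypeScript; the table name, cache key format, TTL, and field names are illustrative, not our real schema:

    import Redis from "ioredis";
    import { DynamoDB } from "aws-sdk";

    const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");
    const dynamo = new DynamoDB.DocumentClient();

    // Resolve the Host header of an incoming request to the code bundle that should
    // serve it: check the Redis cache first, fall back to the primary database.
    async function resolveSite(host: string): Promise<{ bundleKey: string } | null> {
      const cached = await redis.get(`site:${host}`);
      if (cached) return JSON.parse(cached);

      const { Item } = await dynamo
        .get({ TableName: "sites", Key: { domain: host } })
        .promise();
      if (!Item) return null;

      const site = { bundleKey: Item.bundleKey as string };
      await redis.set(`site:${host}`, JSON.stringify(site), "EX", 300); // cache for 5 minutes
      return site;
    }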

Interestingly enough, during a routine security review just before Christmas this was already identified as the weakest link. The long-term solution is to split the lookup code and the rendering code onto different ECS clusters in different subnets, so that the rendering code can’t connect to the Redis machines at all.
In the short term, we will rename all the destructive commands in Redis to an unguessable string and monitor all commands for weird behavior.
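
As an illustration of both measures: the rename-command lines in the comment below are standard redis.conf configuration, while the watcher is a simplified sketch with a placeholder allow-list and alerting:

    import Redis from "ioredis";

    // In redis.conf the destructive commands get renamed to unguessable strings, e.g.
    //   rename-command FLUSHALL "<long-random-string>"
    //   rename-command CONFIG   "<long-random-string>"
    // On top of that, a small watcher flags anything outside the handful of commands
    // the lookup path actually needs.
    const ALLOWED = new Set(["get", "set", "expire", "ping", "info"]);

    const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");

    async function watchCommands() {
      const monitor = await redis.monitor();
      monitor.on("monitor", (_time: string, args: string[]) => {
        const command = String(args[0]).toLowerCase();
        if (!ALLOWED.has(command)) {
          // In production this would page on-call / post to Slack instead of logging.
          console.warn(`Unexpected Redis command: ${args.join(" ")}`);
        }
      });
    }

    watchCommands().catch(console.error);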

Conclusion

To use the most overused and useless phrase you’ll hear over the next few days: “We take your security very seriously.” But I want to absolutely stress that we do take your security very seriously. For realz.
We are a growing startup and won’t always get it right. But we are committed to balancing new features with improving security. For example, it is still too hard to set CSP (Content-Security-Policy) headers, which is high on our list. Splitting our code and customer code is a pretty big investment in both development resources and extra infrastructure cost, so it will probably be done later in the year.
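
For reference, a CSP is delivered as an ordinary response header; here is a minimal, illustrative example of setting one on a plain Node server (the policy value is only a starting point, not a recommendation for any particular app):

    import * as http from "http";

    http
      .createServer((_req, res) => {
        // A conservative starting policy; real applications will need to tailor it.
        res.setHeader(
          "Content-Security-Policy",
          "default-src 'self'; script-src 'self'; object-src 'none'; frame-ancestors 'none'"
        );
        res.end("<!doctype html><html><body>Hello</body></html>");
      })
      .listen(8080);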

But I promise we will never willingly leave open a vulnerability we are aware of, and we are always open to discussions/questions/suggestions.

Want to host your Single Page Application/Progressive Web App on the most progressive hosting platform? (See what I did there?)
Head over to https://bitgenics.io for more information or hit us up at hello at bitgenics io