What can we learn from the matrix.org compromise?
On April 11th, 2019, matrix.org announced (https://matrix.org/blog/2019/04/11/we-have-discovered-and-addressed-a-security-breach-updated-2019-04-12/) that it had been deeply compromised by an outside attacker. Matrix describes itself as “An open network for secure, decentralized communication”, so it’s pretty reasonable to ask why this happened. The TLDR is that even really smart people make mistakes. Sometimes it’s hubris, but in this case it appears to have been a lack of some security best practices. The other thing to note is that the attacker went undetected inside Matrix’s systems for quite some time and developed a detailed understanding of their operations.
What happened?
The attacker was nice enough to file GitHub issues for them so they could track and close the problems, and one of the project members recapped them in this issue (https://github.com/matrix-org/matrix.org/issues/371). To boil down what happened: some poorly configured and vulnerable software allowed SSH access from the outside. Once the attacker had SSH access, they were able to compromise other pieces of the infrastructure and then use those to get into pretty much everything.
How you can avoid this kind of compromise
There are a few key pieces that allowed this compromise to happen, but I think the biggest problem was the lack of a properly secured perimeter. If Matrix had been using a VPN to protect their systems, the attacker would not have been able to reach the vulnerable software from the outside in the first place. When you’re running internet-facing systems, your attack surface is basically all of the internet, and a VPN cuts that down to just your authorized users. Another big problem was their use of GitHub to store secrets and other potentially important pieces of data. This is a fairly common problem and one of the big reasons I tell people to get GitHub Enterprise or self-host GitLab and require VPN access to it. Again, really smart people make boneheaded mistakes, and the best way to protect your data is to reduce the cost of those mistakes.
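To make the "secrets in the repo" problem concrete, here is a minimal sketch of a check you could run over files before they are committed. The pattern list and the `scan_text` helper are my own illustrative assumptions, not anything Matrix used, and a real secret scanner covers far more cases than these three.

```python
import re

# Illustrative patterns only (an assumption, not an exhaustive list).
PATTERNS = {
    "private key block": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "hard-coded password": re.compile(
        r"(?:password|passwd|secret|api_key)\s*[:=]\s*['\"]?[^\s'\"]{8,}",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, label) for every line that looks like a secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    sample = "user: deploy\npassword: hunter2hunter2\n"
    for lineno, label in scan_text(sample):
        print(f"line {lineno}: possible {label}")
```

Wiring something like this into a pre-commit hook or CI job will not catch everything, but it lowers the cost of the boneheaded mistake.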
One of the more interesting pieces of this was how Ansible ended up keeping the attacker in the system. The attacker had access to the config repo and knew about a bad sshd configuration. Frankly, one of the reasons I dislike Ansible is that it requires user accounts with interactive login permissions on the systems it manages, something people should be moving away from. I started using Salt (master+minion, no SSH) a couple of years ago, and it has a somewhat more advanced security model that basically eliminates that problem. Salt even has a new security product, so you can configure, manage, and secure your systems with Salt. Pretty cool!
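A bad sshd configuration is the kind of thing a small automated check can catch. The sketch below is a rough idea of what that could look like. I am not claiming the options it flags are the ones that were wrong at Matrix; the `RISKY` table and the `audit_sshd_config` helper are assumptions chosen for illustration. It only looks at explicitly set options in the global section (it stops at the first `Match` block) and does not know about compiled-in defaults.

```python
# Option/value pairs treated as risky here are an illustrative assumption.
RISKY = {
    "permitrootlogin": {"yes"},
    "passwordauthentication": {"yes"},
    "permitemptypasswords": {"yes"},
    "allowagentforwarding": {"yes"},
}


def audit_sshd_config(text):
    """Return (line_number, option, value) for risky global settings."""
    findings = []
    seen = set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop anything after '#'
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        option, value = parts[0], parts[1].strip().lower()
        key = option.lower()  # sshd keywords are case-insensitive
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        if key in seen:
            continue  # sshd uses the first value it sees for a keyword
        seen.add(key)
        if value in RISKY.get(key, ()):
            findings.append((lineno, option, value))
    return findings


if __name__ == "__main__":
    sample = "PermitRootLogin yes\nPasswordAuthentication no\n"
    for lineno, option, value in audit_sshd_config(sample):
        print(f"line {lineno}: {option} {value}")
```

Running a check like this against the config in your repo, and against what is actually deployed, helps you notice when the two have drifted apart.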
Two more important things to consider are that you should have good logging in place and you should use two-factor authentication. Both are pretty easy now, and TOTP (Time-based One-Time Password) is basically free. OpenLDAP is easy to set up with this and lets you securely manage system user credentials without doing something terrible like linking them to your email system. If you want to use SaaS, take a look at FoxPass, or if you’re vendor-locked to Amazon, check out their Directory Service. For logs, my go-to choice right now is LogDNA. They’re super easy to integrate and very reasonably priced. They have an on-prem version too, so you can scale up really easily.
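To show how little is involved in the TOTP piece mentioned above, here is a minimal sketch of the algorithm (RFC 6238, built on HOTP from RFC 4226) using only the Python standard library. The function names and the one-step `window` tolerance are my own choices for illustration. In production you would normally rely on a maintained library or your identity provider rather than rolling your own.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(key, counter, digits=6):
    """HOTP (RFC 4226): HMAC-SHA1 over the 8-byte big-endian counter."""
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32, at=None, step=30, digits=6):
    """TOTP (RFC 6238): HOTP with the counter derived from Unix time."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at is None else at
    return hotp(key, int(now // step), digits)


def verify(secret_b32, candidate, at=None, step=30, digits=6, window=1):
    """Accept codes from the current step and `window` steps either side."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at is None else at
    current = int(now // step)
    for counter in range(max(0, current - window), current + window + 1):
        # compare_digest avoids leaking how many digits matched
        if hmac.compare_digest(hotp(key, counter, digits), candidate):
            return True
    return False


if __name__ == "__main__":
    # Shared secret from the RFC test vectors, Base32-encoded
    secret = base64.b32encode(b"12345678901234567890").decode()
    print(totp(secret))
```

The server and the user’s authenticator app share the Base32 secret once, and after that both sides can compute the same short-lived code independently.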
Security doesn’t have to be hard
It really doesn’t. For the most part, you just have to apply common sense. Hopefully this breakdown gives you a sense of what to avoid doing in the future. I’m going to try to blog more regularly about this topic, and I’ll publish some more checklists and tools to make security easy.
Feel free to ping me if you have comments or questions.