The encryption fallacy
In my prior posts on Cybersecurity for Executives, I explained how to protect data by blocking access from the Internet and on your internal network, and via authentication and authorization strategies like segregation of duties and MFA. Chances are that at some point you may still experience unwanted data exposure, whether someone causes it inadvertently or maliciously. Here is where encryption may or may not help you.
Encryption is a mechanism for turning sensitive data into something unreadable and unusable. Even if someone can see or access the data, it will be of no use to that person. They can’t read it or know what it says. The phrase encryption at rest refers to encrypting data before storing it, wherever that may be. You can encrypt data stored in a database, in a file, or in newer types of cloud storage like AWS S3 buckets or Azure Blob Storage. Typically, some software will pass your data into an encryption algorithm along with a bunch of letters and numbers called a key. The output will be your data in an encrypted format. Let’s see what this looks like via a simple example.
Let’s say I want to encrypt this piece of data:
Cybersecurity all the things
That combination of characters in computer-speak is called a string. It’s a set of characters. (A string is treated differently from pure numbers by programming languages.) If I want to encrypt that data, I have some options. Here are a few:
- I could encrypt only the string.
- I could save the string in a file using a program like Microsoft Word, Notepad, or TextEdit and encrypt the whole file.
- I could store the file on my operating system (Windows, Linux, or Macintosh, for example) and then enable the feature to encrypt my entire hard drive. Some operating systems and laptops have the ability to do this but you have to turn it on. Now every file on my hard drive is protected. Maybe.
- I could store the file on a cloud service that encrypts everything for me, so I don’t have to think about it. Maybe.
Let’s look at these options in a bit more detail.
The following example is for demonstration purposes only, not a recommendation of how you should encrypt files. I am attempting to show readers who don’t usually do the encryption themselves what it looks like via a basic example. I’m using a program called OpenSSL, which comes pre-installed on a Macintosh computer. You can run these commands yourself using the Terminal app.
First, I type this command to create an encryption key.
openssl genrsa -out key.txt 2048
After typing that command, I will get output like this:
The command generates a key and stores it in the file key.txt. If I open and look at the file here’s what I see:
Note that you should never, ever, share your private key like I just did!
As you can see, a key is simply a long string of random characters in a specific format that works with a particular type of encryption method, called an algorithm.
Let’s say I want to encrypt my string using my key. To encrypt my string above, I type this command:
echo "Cybersecurity all the things" | openssl rsautl -inkey key.txt -encrypt >encrypted-string.bin
Now I have a new file in my directory called encrypted-string.bin. If I try to open that file, I’ll see a lot of meaningless nonsense. The output will look different depending on what you use to open it. This encrypted file is in a format called binary. A text editor can’t decipher it and will produce something like this if you try to open it. That’s not exactly readable, is it? And that’s what we want when we encrypt files! If I encrypt this file with a secret key you don’t know, then in theory you can never read my secret file.
If I have the secret key and I want to get the data back, I can pass the key and my encrypted file to the encryption algorithm using this command:
openssl rsautl -inkey key.txt -decrypt <encrypted-string.bin
Magic! And I get back my text. Actually, it has to do with math, not magic, but I’m not going to bother with all that right now.
So if I encrypt my data, no one else can read it, right?
Think about that for a minute…I’ll wait. Is it true that no one can read your encrypted data under any circumstances?
No one else can see my data, unless they have my key. What if I publish my key on a blog anyone can read? What if I put it in a public source code repository for anyone to download? What if I store my key right next to the encrypted file on my laptop? A lot of good that encryption is doing me, right? If malware gets onto my computer, the attacker can run the command I just ran and decrypt my data.
What do I need to do to protect my encrypted data? I’d need to store my key somewhere separate and secure. I’d also need to make sure the attacker couldn’t get into the memory on my machine while the data was being encrypted or decrypted. Otherwise, the attacker might get the key or the data at that point. That’s how attackers stole credit card data from Target point of sale systems. I also need to make sure no logs are storing my data or key in a manner that can be useful to attackers. Additionally, I’ll need to use an appropriate encryption algorithm and ensure whatever software I’m using is using proper techniques to encrypt the data. If you use an insecure algorithm, an attacker may be able to crack your encryption and get the data that way. Improper programming leads to flaws that can leak data as well.
For option two in my list above — encrypting a file — the same complications apply. I could copy my string and paste it into a file and then save the file. Then I run a command to encrypt the entire file instead of my one string. If I have a lot of strings, I can put them all in the file and then I can encrypt them all at once. The output is longer so it will be harder to guess for reasons in my post on hashing where I introduced encryption. A new problem results from storing all the data in one file with a single key. If an attacker gets that one key and file, he or she can access all my data. In security, it seems like we are always trading one problem for another and trying to choose the best option for the scenario at hand to reduce the risk.
At any rate, you can see that although encryption may be hard to break, other factors may still expose the data if we are not careful. Encrypting information is one thing. Leveraging encryption in a way that truly protects all your data is another. Someone can tell you all your data is encrypted, but if they are not doing it carefully, that encryption could be meaningless.
How about option number three? Operating systems and the hardware they run on have improved in ways that allow you to encrypt data on your hard drive automatically. You can enable encryption options within your operating system to try to prevent someone from reading the data on a stolen laptop. In some implementations, your key may be stored very securely by the hardware in your computer so no one can get at it. That sounds perfect, right? Now all your data is protected. You don’t need to worry about managing an encryption key, assuming you trust the hardware and software vendors to build it all securely. Problem solved, right?
Sort of. If your hard drive is encrypted and your stolen computer is not running, everything is fine. But what happens when you are working in the coffee shop, and your computer is running? You are editing one of your encrypted documents when you step away to get your favorite half-caff-mochachina-macchiato-soy-vanilla-extra-frothy-fru-fru drink. Someone swipes your running laptop while you are not looking. If you didn’t lock the screen with a screen saver that requires your password to unlock, anyone who has access to the machine could open and read that file, including malware.
Here you can see that encryption helps, but it’s essential to understand in which scenarios. It’s also critical to train your employees to lock their computers when they step away. Better yet, they should take their laptops with them when they get up if working at a coffee shop, to avoid theft altogether!
If you walked away from your computer without locking it at one place I worked, you would end up with emails sent on your behalf as a little reminder. Lock your laptop when you step away. The office was full of pranksters. “I” once invited our entire development team out for drinks at happy hour — on me. Needless to say, I had to renege. The best one I heard was the guy who changed someone’s spell checker to type the word “bananas” every time the person typed the word “the.” She didn’t know what was going on and had to call the help desk to try to explain. I can only imagine how that call sounded. It still makes me laugh out loud as I’m typing this. That’s evil.
How about cloud magic to do your encrypting? Some cloud providers tell you that they will encrypt all your data for you. They use the best encryption algorithms and go to great lengths to protect encryption keys. Their best practices in the cloud make it easier for you: one less thing to worry about. What could possibly go wrong in this scenario? It’s perfect! Or is it…
Let’s look at the Capital One breach I wrote about previously. Capital One has a lot of people doing the right things. They had policies that enforced encryption on S3 buckets (used to store data in the cloud). AWS policies rejected file uploads that contained unencrypted files. I don’t know if they still do, but that’s an excellent policy, and I always recommend it. Why spend a lot of time figuring out what you should and should not encrypt? It’s possible to encrypt everything in the cloud. The cost of encryption is almost certainly less than the expense of being out of compliance with regulations or having data at rest stolen in an unencrypted format. But in the case of Capital One, the encrypted data was still accessible. What went wrong?
The attacker essentially had access to the encryption key. Although the attacker may not have had the key material (the data making up the key itself), the attacker could use the key to decrypt the data. Effectively that’s the same as having the encryption key. The attacker presumably got onto a machine that had permission to decrypt the data. Then the attacker could run commands on the virtual computer that had these permissions to obtain the unencrypted data even though the attacker didn’t have the key. AWS has mechanisms for creating more restrictive policies on the use of encryption and decryption. Other architectural changes would have helped as well, as I wrote about in another blog post.
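To make the upload rule mentioned above concrete, here is a rough sketch of a bucket policy that refuses uploads which do not request server-side encryption with a KMS key. This follows a pattern AWS documents for S3 bucket policies. It is not taken from Capital One, and the bucket name is a made-up placeholder. The sketch builds the policy as a Python dictionary and prints it as JSON.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name, for illustration only

# Deny any upload that does not ask S3 to encrypt the object with a KMS key.
deny_unencrypted_uploads = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedObjectUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            },
        }
    ],
}

print(json.dumps(deny_unencrypted_uploads, indent=2))
```

Notice what this policy does not say: it controls how data is written, not who is allowed to decrypt it afterward. A policy like this can be fully enforced while a machine with decrypt permission still hands the data to an attacker.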
One small glimmer of light is that Capital One used another technique to protect US Social Security Numbers. Tokenization replaces a sensitive piece of information with placeholder data called a token. When the data is processed, if the original value is required, it can be retrieved using that token. (It might have also just been an irreversible hash.) The Social Security Numbers that were tokenized were not exposed. Unfortunately for our friends to the north, the system didn’t tokenize the Canadian IDs.
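For readers who want to see the idea, here is a minimal sketch of tokenization in Python. It assumes a simple in-memory lookup table. The class name TokenVault, the tok_ prefix, and the sample number are all invented for illustration, and a real token vault would be a separate, tightly controlled service rather than a few lines of code.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault (hypothetical, for illustration only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it reveals nothing about the original value.
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a caller with access to the vault can recover the original.
        return self._token_to_value[token]


vault = TokenVault()
ssn = "000-00-0000"  # placeholder value, not a real number
token = vault.tokenize(ssn)

# The application database stores the token instead of the real number.
record = {"name": "Example Customer", "ssn": token}
print(record)
```

If an attacker steals the record, all they get is the random token. The real value lives only in the vault, which is why protecting the vault becomes the important problem.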
Ok, enough gloom and doom. What did we learn? Is encryption useless? No! At least not yet. Hold that thought. The issue here is that a lot of compliance frameworks and cybersecurity best practices tell organizations to encrypt data — but often when best practices say you should encrypt data, they don’t say how. They don’t emphasize the need to protect encryption keys adequately or how to architect systems to prevent data loss even when encryption is in use.
The fundamental problem is that encryption needs to be done correctly to be useful, and either people aren’t adequately trained or don’t understand the risk associated with implementation choices. Companies do it without really understanding all the details or performing adequate threat modeling. When organizations encrypt data, they need to know how to do it correctly and consider the big picture of the overall security architecture.
I think it’s much easier to manage encryption in the cloud, but you have to trust the cloud provider. Establish trust by looking at the cloud provider’s track record in terms of data breaches and cybersecurity practices. Assessments and audits help with the things we can’t see when trusting a cloud provider. Even if you believe the cloud provider will do their part correctly, you still must understand how to implement your part of the security architecture to avoid a breach.
AWS and other cloud providers help you protect your encryption keys by tying the ability to access them to authorization by way of authentication. Major cloud platforms allow you to create technical policies that require people or resources to have appropriate permissions to use a particular key to encrypt or decrypt data. It’s better than having a key that anyone on the planet can use, but you’ll still need to think through the entire threat model. (Threat modeling involves thinking about how the system may be vulnerable to attack.) You have shifted the problem from protecting the key to safeguarding the credentials and thinking about what they can access, should they be exposed.
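The idea of a policy that gates who can use a key can be sketched in a few lines of Python. This is not how any cloud provider implements it. The KeyPolicy class, the key name, and the identity names are all hypothetical, and the sketch only shows the decision being made, not any real encryption.

```python
from dataclasses import dataclass, field


@dataclass
class KeyPolicy:
    """Hypothetical key policy: which identities may do what with a key."""
    key_id: str
    grants: dict = field(default_factory=dict)  # identity -> set of allowed actions

    def is_allowed(self, identity: str, action: str) -> bool:
        # Deny by default: anything not explicitly granted is refused.
        return action in self.grants.get(identity, set())


policy = KeyPolicy(
    key_id="customer-data-key",
    grants={
        "upload-service": {"encrypt"},           # can write data, never read it back
        "billing-app": {"encrypt", "decrypt"},   # needs to read customer data
    },
)

for identity, action in [
    ("upload-service", "decrypt"),
    ("billing-app", "decrypt"),
    ("unknown-user", "decrypt"),
]:
    verdict = "allowed" if policy.is_allowed(identity, action) else "denied"
    print(f"{identity} -> {action}: {verdict}")
```

The sketch also shows the remaining risk. Whoever obtains the credentials of billing-app inherits its permission to decrypt, so those credentials now deserve the same care as the key itself.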
In an on-premises environment, companies use something called a Hardware Security Module (HSM) to protect keys. A key stored in a file can be copied and transferred off the network. When the key is embedded in a tamper-resistant hardware device, it is a bit harder to access. Hardware-based encryption may also be faster and offer performance benefits. Most security professionals consider HSMs one of the most secure ways to protect encryption keys. However, HSMs come with a whole slew of complexity and management issues that may make them hard to implement in some environments, so other factors sometimes make them a less than ideal choice. I’m not going to get into that here.
You can get an HSM in the cloud from the three major cloud providers: AWS, Azure, and Google. The drawbacks involve limited scalability, latency, and disaster recovery. As a result, cloud providers came up with alternative approaches that are more automated and work better with cloud architectures. I discuss all this in my cloud security class, but many companies do use these automated methods, and they are, in some cases, FIPS-compliant for those who require it. When thinking about how encryption keys are stored and used, it’s important to look at the big picture to determine what the actual threat vectors are and how to prevent breaches. So far, these automated mechanisms for key storage have proven to be robust if appropriately used within the broader security architecture.
The other element that affects an attacker’s ability to decrypt the data is the selected algorithm. As explained in a previous post, you can think of an algorithm as a formula in a larger computer program. You pass in some inputs: the data and a key. The output is encrypted data. If the attacker doesn’t have the key, he or she still may be able to crack the encryption if the algorithm is weak. When implementing encryption, developers need to use the right mode with some algorithms. I go into more details on this in my class. The main point is to make sure you and your vendors are using the correct and recommended algorithm and key size for whatever type of encryption you are implementing. Some algorithms have been cracked or demonstrated to have weaknesses and are no longer recommended, such as the encryption algorithms DES and Triple DES and the hashing algorithm MD5.
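A little arithmetic shows why key size matters. The sketch below estimates how long it would take to try every possible key for a 56-bit key (the size DES uses) compared with 128-bit and 256-bit keys (sizes used with AES, which I bring in here only as a point of comparison). The attacker speed of one trillion guesses per second is an assumption picked for illustration, not a measurement.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
GUESSES_PER_SECOND = 10**12  # assumed attacker speed, chosen only for illustration


def worst_case_years(key_bits: int) -> float:
    """Years needed to try every possible key of the given size."""
    return 2**key_bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for name, bits in [("DES", 56), ("AES-128", 128), ("AES-256", 256)]:
    years = worst_case_years(bits)
    print(f"{name}: {bits}-bit key, about {years:.3g} years to try every key")
```

Under that assumption, the 56-bit key space can be searched in well under a year, while the 128-bit key space would take vastly longer than anyone can wait. Brute force is only one kind of attack, so a long key does not rescue an algorithm with other weaknesses.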
The Imitation Game is a fascinating movie, although it might not be precisely accurate. It presents the story of how Alan Turing and his team cracked the German encryption system called Enigma, which is widely credited with helping to shorten the Second World War. Why would breaking the code have such an effect on the war?
What if your enemy knows everything you are going to do before you do it?
We count on encryption in many ways to protect our data and communications. (More on that later.) As computers have gotten faster and faster, encryption algorithms that used to take years to break may take minutes. Scientists are warning that with the rise of quantum computing, today’s encryption algorithms may become obsolete. Forward-looking individuals working on or with cryptography may want to think about new ways to protect our data before that time comes. Organizations protecting communications might want to invest research dollars in this area. Some people are already working on quantum-resistant encryption algorithms.
Encryption algorithms are hard to create correctly, and the general rule is don’t try to create your own. Usually, an algorithm goes through review by many different people over many years before the industry deems it secure. Various attempts to protect data using encryption failed due to logic errors such as splitting a piece of data in half and encrypting each half separately. One longer value is harder to crack than two short values. Another company failed to randomize the values used in the encryption process. Additionally, developers store keys in places that are easy to reverse engineer in software licenses, DRM, and IoT devices.
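The point about splitting a value in half comes down to simple counting. The sketch below uses a hypothetical eight-digit secret and assumes the attacker can check guesses for each half independently. It counts guesses only and does not perform any encryption.

```python
ALPHABET_SIZE = 10  # digits 0-9, assumed for illustration
SECRET_LENGTH = 8   # hypothetical eight-digit secret

# Protected as one value: the attacker must search every 8-digit combination.
whole = ALPHABET_SIZE ** SECRET_LENGTH

# Split in half and protected separately: each 4-digit half can be attacked
# on its own, so the work adds up instead of multiplying.
half = ALPHABET_SIZE ** (SECRET_LENGTH // 2)
split = 2 * half

print(f"whole value: {whole:,} guesses at most")
print(f"two halves:  {split:,} guesses at most")
print(f"splitting makes the search {whole // split:,} times smaller")
```

With these numbers the attacker faces 100,000,000 guesses for the whole value but only 20,000 for the two halves, which is 5,000 times less work.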
All these complications and considerations lead me to my final points about encryption at rest. What should you do?
Encrypt everything. It still helps. That’s why one of the items on my list of questions to ask security teams is: what is the percentage of data encrypted at rest? Even better would be to ensure proper encryption implementation, but we need to start somewhere.
Think very carefully about key management — you’ll likely want to include some form of authentication. Refer to my prior posts on credentials and authentication. Segregation applies to encryption key management just as it applies to people and credentials.
Train staff using encryption to use it properly. They need to understand the threat vectors, perform threat modeling, and implement all aspects of encryption correctly.
Leverage segregation of duties. Separate the people who manage encryption keys from the people and systems who use them. Amazon CTO Werner Vogels says to separate the people from the data. All the things I wrote about in my blog post about The Aftermath of Stolen Credentials apply to encryption key management. How much data is exposed when a single key is stolen or accessed?
Encryption keys are passwords that provide access to your data. Consider how much data will be exposed if someone obtains access to a key. The longer the key, the better, just like a password. However, different algorithms work effectively with varying lengths, so the size of the key is relative to the length typically used with a specific algorithm. Key length and algorithm used will depend on many factors and explaining all of that is beyond the scope of this post. Just like a password, it’s a good idea to change your encryption keys regularly, a practice known as key rotation. That is sometimes easier said than done, but if an attacker is trying to guess or obtain your key, frequent changes make it harder.
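Part of what makes key rotation easier said than done is the bookkeeping: data encrypted with an old key still needs that old key until it is re-encrypted. Here is a minimal sketch of that bookkeeping in Python. The KeyRing class and its method names are hypothetical, and the sketch only tracks key versions. It does not encrypt anything.

```python
import secrets


class KeyRing:
    """Hypothetical key ring that tracks key versions (no encryption here)."""

    def __init__(self):
        self._keys = {}
        self.current_version = 0
        self.rotate()  # create version 1

    def rotate(self) -> int:
        # Generate a fresh random 256-bit key and make it the current version.
        self.current_version += 1
        self._keys[self.current_version] = secrets.token_bytes(32)
        return self.current_version

    def key_for(self, version: int) -> bytes:
        # Older versions stay available so existing data can still be decrypted.
        return self._keys[version]

    def retire(self, version: int) -> None:
        # Only safe once all data under this version has been re-encrypted.
        if version == self.current_version:
            raise ValueError("cannot retire the key currently in use")
        del self._keys[version]


ring = KeyRing()
old_record = {"key_version": ring.current_version}  # data written before rotation
ring.rotate()
new_record = {"key_version": ring.current_version}  # data written after rotation

print(old_record, new_record)
```

Each stored record remembers which key version protected it. New data uses the newest key, and an old key can only be retired after everything it protected has been re-encrypted. Rotation also limits how much data any single key exposes.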
Finally, remember: if you lose your key, you lose your data. For those bitcoin owners out there, hopefully, you understand that bitcoin relies on cryptography. Your private key is effectively the password to your money. If anyone gets your key, they get your money. If you lose your key, you can’t get your money back. The same is true in corporate environments. How will you prevent your encryption key from being lost? If it is, do you have a secure fall-back mechanism to access the data?
Hopefully, that helps explain encryption at a very high level. If you want to protect your data, encryption is one of the many tools that can help you. We architect security to use all the tools together in ways that make it harder to get at the data. When leveraging encryption, system developers and architects need to consider many factors, and it has to be done carefully and correctly to do its job. But in the end, encryption is still a critical part of your overall security plan, and you should use it as much as possible — properly.
© 2nd Sight Lab 2020