Your Hard Outer Shell Is BS
There’s an oft-used analogy that data centers are architected with a hard outer shell that protects them from external threats, except through a very limited set of vectors designed into the architecture. This is a giant load of BS.
This idea has been perpetuated into internal architectures for areas that are supposed to be even more secure than the data center in general. We’ll just save the world through network partitioning! Yay! Network partitioning is actually a valuable tool for multiple reasons, but it isn’t much of a security solution.
The entire concept of a walled garden has proven to be completely unreliable over and over again. All networks should be assumed compromised and we should prepare as such. We can still use network partitioning, but there are a lot more strategies needed to combat malicious actors.
Encrypt All The Things
Everything needs to be encrypted if you don’t want someone to see it. This means encryption in flight as well as at rest. A lot of companies terminate TLS ahead of the application on an F5 or some other piece of equipment, while some PaaS systems like Cloud Foundry terminate at the edge of the cluster on the routing layer (this is changing soon, if not already). These are unacceptable practices, and the industry trend is to terminate at the application or to use a sidecar pattern, where a partner application like nginx terminates the TLS and routes the traffic locally to the application. With containers, this can be done through the pod abstraction in Kubernetes.
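Here is a minimal sketch of what terminating TLS in the application process itself can look like, using only Python's standard `ssl` module. The function name `build_server_tls_context` and the certificate paths are my own placeholders, not anything prescribed by a particular platform, and a real deployment would pull the certificate and key from a secret store.

```python
import ssl


def build_server_tls_context(certfile=None, keyfile=None):
    """Return a server-side TLS context for terminating TLS inside the app."""
    # Purpose.CLIENT_AUTH is the stdlib's name for "a context used by a server".
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile is not None:
        # Hypothetical paths; load the real chain and key from your secret store.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

With a context like this, a standard-library `http.server.HTTPServer` can wrap its listening socket via `context.wrap_socket(server.socket, server_side=True)`, so traffic stays encrypted all the way to the process that handles it.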
All data at rest must also be encrypted. Preferably, this will be done using very fast and efficient hardware encryption. However, it can be done within applications. The most important point about encryption is that you shouldn’t build the tooling yourself. It is wildly hard to get right and there are plenty of tools available.
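Even with hardware-level encryption for your data at rest, you’ll still need to ensure PII is protected at the field level. There are many ways to do this. One way is to mask the data with a Secure Hash Algorithm (SHA). Strictly speaking, this is hashing rather than encryption: it is one-way, so the original value can’t be recovered, which also means the data can’t easily be used for analysis. Low-entropy values such as phone numbers can be guessed by hashing every candidate, so a keyed hash (HMAC) is the safer choice for them. With that caveat, it’s likely the strongest solution available…until it’s finally broken. There are multiple digest sizes (SHA-256, SHA-512), so it should survive for quite a while.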
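As a rough sketch of the masking idea, here is a one-way pseudonymization helper. Note that I’ve swapped in a keyed hash (HMAC-SHA-256) rather than a bare SHA digest, because bare hashes of guessable values like emails or phone numbers can be reversed by brute force. The function name `pseudonymize` and the normalization rules are assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable, one-way token for a PII value (hex, 64 characters)."""
    # Normalize so "Alice@Example.com " and "alice@example.com" map to one token.
    normalized = value.strip().lower()
    # The secret key stops an attacker from hashing guesses and comparing.
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same input and key always produce the same token, so you can still join and count on the masked column, but you can’t get the original value back.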
Another method of encryption would be Format-Preserving Encryption (FPE). This allows data to be encrypted in the same format as the original data. Analysis, manipulation, and validation become much easier using this system. This is also an easy way to have testing data mimic production data without giving production data to everyone. Reducing the number of people in your company with access to real production data is probably the number one way to decrease your attack vectors. It will be far more effective in your data loss prevention strategy than creating walled gardens. Walled gardens are a cure for a symptom and not a cure for the problem. Using encryption for your data that can be translated will also make it easier to move to the public cloud with your data. One downside to FPE is that it’s vulnerable to frequency attacks. However, this is mostly a concern in small domains, and even then there are protections against it.
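To make "same format in, same format out" concrete, here is a toy Feistel construction over decimal digit strings. This is purely an illustration of the idea and is NOT a standardized or vetted FPE mode; in keeping with the advice not to build your own tooling, use a reviewed library for anything real. All names (`toy_fpe_encrypt`, `toy_fpe_decrypt`, `ROUNDS`) are hypothetical.

```python
import hashlib
import hmac

ROUNDS = 10  # even, so both halves return to their original positions


def _round_value(key: bytes, round_index: int, half: str, modulus: int) -> int:
    """Pseudo-random round function: HMAC of the round number and one half."""
    message = bytes([round_index]) + half.encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits: str) -> None:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")


def toy_fpe_encrypt(digits: str, key: bytes) -> str:
    """Toy only: map an n-digit string to another n-digit string."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # length of the half being replaced
        modulus = 10 ** m
        c = (int(a) + _round_value(key, i, b, modulus)) % modulus
        a, b = b, str(c).zfill(m)
    return a + b


def toy_fpe_decrypt(digits: str, key: bytes) -> str:
    """Undo toy_fpe_encrypt by running the rounds in reverse."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        modulus = 10 ** m
        c = (int(b) - _round_value(key, i, a, modulus)) % modulus
        a, b = str(c).zfill(m), a
    return a + b
```

A nine-digit identifier goes in and a nine-digit string comes out, so downstream validation and test fixtures keep working, and only holders of the key can translate it back.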
One of the big advances we’ve made in intrusion detection is around behavior analytics. This involves using the behaviors of users and applications to produce a baseline and then alerting when aberrations occur. In a mature environment, detection of these anomalies would automatically trigger remediations and notify the parties concerned so they can confirm whether the behavior should be considered normal. Slack (the company) actually does this with users on their systems. If a user enters particular commands or is performing uncommon commands, they will get a message in Slack (the app) where they must verify it was them. They will also get a message on their phone which they must respond to. This means two factors are needed to confirm that the user hasn’t been compromised.
This requires that you can analyze your system in real time. If you’re shipping your logs out for analysis after the fact, then your data is already gone by the time anyone notices. Log aggregation systems with built-in security and behavioral analytics are a necessity. These often integrate with modern tooling and automations to provide real-time alerting and remediation.
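The baseline-and-alert idea can be sketched with something as simple as a z-score over a user’s recent activity counts. This is a deliberately naive stand-in for a real behavioral analytics product; the function name `is_anomalous`, the threshold of 3.0, and the choice of "privileged commands per hour" as the metric are all assumptions on my part.

```python
from statistics import mean, pstdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it sits far outside the baseline built from `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is a deviation.
        return observed != baseline
    # Distance from the baseline, measured in standard deviations.
    return abs(observed - baseline) / spread > threshold
```

For a user who normally runs four to six privileged commands an hour, `is_anomalous([4, 5, 6, 5, 4, 6, 5], 40)` flags the spike, and that flag is what would kick off the "was this really you?" prompt.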
Authentication and Authorization
Every system must have authentication and authorization built in. It’s a good idea to centralize the system which manages this, but federate the responsibility and authority for access. The problem with centralizing authority and responsibility is that it becomes a bottleneck where not much value can be provided. These systems often require an authority on the side of the affected system, and then the centralized authority mostly becomes data entry with minor validation. Automating the validation and creation of accounts with reasonable guardrails lets the authority on the affected system, who is actually giving the approval, implement the change directly once it passes the automated validations.
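Here is one way the automated guardrails might look: the central system only checks policy, and the approver on the affected system stays the real authority. Everything in this sketch is hypothetical, including the `AccessRequest` fields, the allowed roles, and the 90-day cap.

```python
from dataclasses import dataclass

ALLOWED_ROLES = {"read-only", "operator"}  # hypothetical policy
MAX_DURATION_DAYS = 90                     # hypothetical policy


@dataclass(frozen=True)
class AccessRequest:
    requester: str
    approver: str       # the authority on the affected system
    role: str
    duration_days: int


def guardrail_violations(request: AccessRequest) -> list:
    """Return the list of policy violations; an empty list means auto-apply."""
    violations = []
    if request.requester == request.approver:
        violations.append("requester cannot approve their own request")
    if request.role not in ALLOWED_ROLES:
        violations.append("role is not grantable through self-service")
    if not 1 <= request.duration_days <= MAX_DURATION_DAYS:
        violations.append("duration is outside the allowed window")
    return violations
```

If the list comes back empty, the approved change is applied with no central data-entry step in the middle. Otherwise the request is bounced back with the reasons.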
Having a centralized system also provides for a common interface for all internal users and customers. The customer and internal user systems may be separate, but there should only be one for each of those groups. Consistency goes a long way in security and compliance.
Passwords. That’s a dirty word to me. If you have a single sign-on system, then it’s probably not a huge deal to require password changes. However, I believe it is actually a greater risk to force someone to change a password when they have ten different passwords for your company. They are going to write it down somewhere. I know that password managers help, but they aren’t perfect yet. Always implement multi-factor authentication (MFA) rather than just a password. If you use MFA to get into your network, then you should also use it within your network.
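One common second factor is a time-based one-time password (TOTP, RFC 6238), which is what most authenticator apps generate. The sketch below uses only the standard library to show how little machinery is involved. The function names are mine, and a production system should rely on a maintained library and also handle clock drift, rate limiting, and secret storage.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the current 30-second window."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_totp(secret: bytes, submitted: str, at=None) -> bool:
    """Compare in constant time so the check doesn't leak matching digits."""
    expected = totp(secret, at=at)
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

The user’s phone and your server share the secret once at enrollment. After that, a stolen password alone isn’t enough, because the attacker also needs the code for the current time window.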
When it comes to security, I much prefer having no passwords. I’d rather do all communication using some type of key. We don’t type passwords into our homes to get in. This can be done by embedding special rotating keys into the browsers of your associates and customers and rotating those keys regularly through a configuration management system. SSH also has the ability to use keys for connections, but SSH should be disabled for nearly everyone in your company.
An authorization system will likely use the authentication system, but authorizations should be controlled at the domain level. This may be a domain like all Unix systems or a subset of Unix systems or a particular application or a particular region. It just depends on the particular situation. This would be far too much for a central system to maintain.
I’m guessing most people would agree with a lot of what I’ve already said, but this section might cause some controversy. Moving fast can actually make you more secure. This is a multi-faceted observation. One aspect is that moving fast allows systems to be patched more quickly. This is true from the OS up to the app. If a development team is deploying multiple times a day, then the system can be updated with an OS CVE fix within hours of its release. The fix will go through the dev cycle with the new application code, then the testing cycle, and finally the rollout to production. This is all handled using a configuration management platform and a deployment pipeline. I think this is a rather obvious illustration of how moving fast makes your system more secure.
A more obscure observation is that moving fast also reduces the amount of time a hacker has once a piece of the system has been compromised. If you replace servers every hour by bringing up a new server that is identical to the current server except with new keys, then transfer traffic to the new server, and then destroy the old server, then a hacker will only have a maximum of one hour to find additional vulnerabilities. This observation applies to all components of a system. Access tokens, encryption keys, servers, switches, applications, and any other components should be refreshed regularly. This also makes you better at doing all of these things, which increases consistency and reliability.
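A rotation policy like this boils down to giving every kind of component a maximum age and regularly asking "what has outlived its budget?" The sketch below shows just that check. The one-hour server lifetime comes from the example above, while the other lifetimes, the tuple layout, and the name `due_for_rotation` are assumptions. Actually replacing the component is left to your deployment tooling.

```python
from datetime import datetime, timedelta, timezone

# Maximum lifetime per component kind. Only the server value comes from the
# text; the others are made-up examples.
MAX_AGE = {
    "server": timedelta(hours=1),
    "access_token": timedelta(minutes=15),
    "encryption_key": timedelta(days=1),
}


def due_for_rotation(components, now):
    """Return names of components whose age has reached their budget.

    `components` is an iterable of (name, kind, created_at) tuples.
    """
    return [
        name
        for name, kind, created_at in components
        if now - created_at >= MAX_AGE[kind]
    ]
```

Run on a schedule, the returned list feeds the pipeline that stands up the replacement, shifts traffic, and destroys the old instance, which caps how long any stolen credential or foothold stays useful.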
Walled gardens and hard outer shells aren’t security; they’re wishful thinking with unrealistic expectations. Security is an incredibly difficult domain, but there are several key components you can attack today. I would focus primarily on analytics, encryption, authentication and authorization, and moving fast. The order depends on your company’s needs, but I’d focus on encryption and moving fast early in the process. Moving fast will likely be something development teams will want to push, and information security will likely want to ensure that if anyone does steal data, they won’t be able to use it. It’s time we start solving problems rather than symptoms.