Agile methods and DevOps have increased the speed of software development. This has challenged traditional approaches to software security work. Today we need to make sure that we also increase the pace of security by means of DevSecOps. In this article, I’ll present a framework of key DevSecOps domains and activities and explain how to get started with them.
Back in the day, developers, testers, networks, platforms, and operations were separate IT teams. Between the teams there were formal procedures often described as gates. Once, those gates were considered a security best practice. Their function was to make sure that security compliance was met at certain points of the IT processes. Compliance was audited with, for example, reviews and penetration tests. Those often either delayed a project or were performed against configurations and software versions that were already out of date by the time the audits finished. Clearly, there was a need to shift security activities left from the after-the-fact audit phase and away from the audit culture.
DevOps is a set of practices that speed up IT processes, including software development. The big idea of DevOps is to remove barriers between the various IT teams, most importantly development and operations. In DevOps teams, the developers, administrators, and operators work together, sharing a common concern and ownership of the work methods. This has enabled faster development and deployment of new products, features, and fixes. One consequence of this thinking is that security activities should be included in the development and operations work. Compliance gates that create obstacles and delays are inherently not compatible with the agile DevOps work concepts.
Many organizations are getting the DevOps benefits today. This is a positive development as good DevOps culture also supports security. Let’s have a look at how security concerns, objectives, and requirements fit in.
Domains of DevSecOps
Before discussing the practicalities, we need to define the scope of DevSecOps. Like DevOps, DevSecOps has many process and cultural aspects to it. The goal of the following framework is to describe security activities across the whole DevOps scope.
There are three essential security domains within DevSecOps thinking:
- Development security where the goal is to make a software product or architecture that meets the security goals.
- Pipeline security where the concern is making sure the build and deployment automation toolchain supports security.
- Production security where the aim is to have a stable and resilient environment for production and for development tools.
Let’s have a closer look at each domain, what kind of security activities it consists of, and how those support DevOps and the security of the end product.
A key security concern for any product owner is to have a product with as few vulnerabilities as possible. What kind of activities can the owner and the team take to reduce them?
Development security deals with the design and technology of the product. Security is an aspect of software quality. Poor software quality is essentially a consequence of architecture and technology choices, mistakes in design, and errors in implementation. The same reasons will also lead to security vulnerabilities. It can be argued that activities to improve software quality will lead to better security.
The following activities should be considered to improve development security.
Planning: At the planning stage, the idea of what kind of a product we are about to build is documented. In an agile project, this means creating a prioritized backlog of tasks. At first, tasks include design and architecture activities, and later on more specific programming tasks. The project team will follow the tasks that will eventually take them to a minimum viable product. Clearly, identification and documentation of the tasks that support security features and functionality is essential at this point. Security teams often become frustrated because their policies are not effective at steering development activities. Unless those policies are translated into tasks on the backlog, the DevOps team is likely to disregard them.
To start populating the backlog with security tasks, first agree on a way to identify the tasks that have important security implications. An example of such a task could be the implementation of encryption. Then, perform threat modeling to identify the security subtasks connected to that task. Those could include, for example, design adjustments, building test cases, or adding specific functionality. In the case of encryption, for example, a subtask could cover the design of how the secrets needed for encryption are managed. By documenting those subtasks, security work becomes visible and can be prioritized like other project tasks.
One popular threat model is STRIDE. Although it does not cover the modeling process itself, it does provide a repeatable framework for what kind of weaknesses to look for while modeling threats.
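As a minimal sketch of how threat modeling results could end up on the backlog, the following example attaches one subtask per STRIDE finding to a task. The `BacklogItem` type, the `add_threat_subtasks` helper, and the example findings are hypothetical and only illustrate the idea. Real backlog tools have their own data models.

```python
from dataclasses import dataclass, field

# The six STRIDE categories, each with the question it prompts during modeling.
STRIDE = {
    "Spoofing": "Can someone pretend to be another user or component?",
    "Tampering": "Can data or code be modified without detection?",
    "Repudiation": "Can an action be denied because no record exists?",
    "Information disclosure": "Can data leak to someone not authorized?",
    "Denial of service": "Can the feature be made unavailable?",
    "Elevation of privilege": "Can someone gain rights they should not have?",
}


@dataclass
class BacklogItem:
    """Hypothetical backlog task with its security subtasks."""
    title: str
    subtasks: list[str] = field(default_factory=list)


def add_threat_subtasks(item: BacklogItem, findings: dict[str, str]) -> BacklogItem:
    """Attach one subtask per STRIDE finding recorded by the team."""
    for category, mitigation in findings.items():
        if category not in STRIDE:
            raise ValueError(f"Unknown STRIDE category: {category}")
        item.subtasks.append(f"[{category}] {mitigation}")
    return item


# Hypothetical outcome of a threat modeling session for an encryption task.
item = add_threat_subtasks(
    BacklogItem("Encrypt customer data at rest"),
    {
        "Information disclosure": "Design key storage and rotation",
        "Tampering": "Add test case for modified ciphertext",
    },
)
```

Tagging each subtask with its STRIDE category keeps the link between the threat and the work item visible when the backlog is prioritized.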
In addition to threat modeling, OWASP documents are a great source for practical advice on identifying security requirements, functionality, and best practices for web applications.
Identification and documentation of the tasks that support security features and functionality is essential during planning.
Implementation: At this stage, the team converts work items from the backlog into source code. This includes the application code, but also the implementation of test automation, infrastructure as code (IaC), and other supporting tooling. In practice, this also means implementing the security features and functionality identified and prioritized during design and threat modeling.
Bugs are introduced in the software during implementation and some of those bugs are security vulnerabilities. The rate at which vulnerabilities are created and how dangerous they are is tied to the programming language, libraries, and frameworks that are used. The programmers’ experience is also a factor.
A key quality and security objective during implementation is to reduce the number of bugs. Any bugs left hidden inside the code might be discovered and exploited later by attackers. The first step of addressing this area is to make sure secure coding practices are known by the team. What that means for each project is different and tied to the programming language, libraries, and frameworks that are used.
Secure coding practices can also be enforced automatically by integrating tests into the pipeline. We’ll take a closer look at that as part of pipeline security.
During implementation, we need to mind the security features and functionality identified and prioritized during design and threat modeling.
A key concept in product development and DevOps is the pipeline. Could we take advantage of pipeline automation to improve security without slowing development down?
The pipeline is a collection of tools integrated together for automating the steps from source code to production deployment. We consider this area with two distinct subdomains, continuous integration process security and continuous delivery process security.
Continuous integration process security
Continuous integration (CI) refers to a practice of hosting a single source code repository where changes are committed typically several times a day. After a change is merged, a build is automatically compiled and tested.
Merging: When code changes are committed to the repository by team members, they need to be merged to the main code. Depending on the security risks of the product, it might be a good idea to facilitate code reviews at this point. Code reviews are conducted by other team members to verify that the submitted changes adhere to the secure coding standard and contain no mistakes. By setting a master branch policy to require peer reviews, there is a better chance of catching bugs before merging them into the product.
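In practice the source code hosting platform enforces the branch policy, but the rule itself is simple. The sketch below is one way to express it, assuming a hypothetical `MergeRequest` type and a policy of at least one approval from someone other than the author.

```python
from dataclasses import dataclass

# Assumed policy values for illustration.
PROTECTED_BRANCHES = {"master"}
REQUIRED_APPROVALS = 1


@dataclass(frozen=True)
class MergeRequest:
    """Hypothetical merge request as a policy check might see it."""
    author: str
    target_branch: str
    approvals: frozenset[str]


def may_merge(request: MergeRequest) -> bool:
    """Allow merging to a protected branch only after independent peer review."""
    if request.target_branch not in PROTECTED_BRANCHES:
        return True
    # Authors cannot approve their own changes.
    independent = request.approvals - {request.author}
    return len(independent) >= REQUIRED_APPROVALS


self_approved = MergeRequest("alice", "master", frozenset({"alice"}))
peer_reviewed = MergeRequest("alice", "master", frozenset({"bob"}))
```

Here `may_merge(self_approved)` is `False` and `may_merge(peer_reviewed)` is `True`.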
Code reviews and branch policies are used to catch bugs before they end up in the product.
Building: The build phase builds and packages applications and services into executable formats. This is the right phase to integrate automated activities to analyze code. Static application security testing (SAST) procedures and tools can be used to discover programming errors that might manifest as vulnerabilities. SAST is white box testing, performed with the benefit of having access to source code and other internals.
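To give a feel for what a SAST check does, here is a deliberately tiny sketch that uses Python's standard `ast` module to flag direct calls to `eval` and `exec` in source code. The rule set is an assumption made for illustration. Real SAST tools ship with far broader rules and data flow analysis.

```python
import ast

# Assumed rule set: builtins whose use often indicates an injection risk.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a risky builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


# Sample code under analysis. It is parsed as text and never executed.
sample = "x = input()\nresult = eval(x)\nprint(result)\n"
findings = find_risky_calls(sample)
```

A build step could fail when `findings` is not empty, which turns a secure coding rule into an automatic check.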
Another important security activity during build is to verify the security of the dependencies. All products use libraries and other external components, and we need to make sure we are not linking with known insecure versions. In some cases the product risks might warrant extending testing activities to third-party libraries.
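The dependency check can be sketched as a comparison between pinned versions and a list of versions known to be insecure. The package names and the `KNOWN_INSECURE` data below are made up for illustration. A real pipeline would pull advisories from a vulnerability database through a dedicated scanning tool.

```python
# Hypothetical advisory data: package name -> versions known to be insecure.
KNOWN_INSECURE = {
    "examplelib": {"1.0.0", "1.0.1"},
    "otherlib": {"2.3.0"},
}


def parse_pins(requirements: str) -> dict[str, str]:
    """Parse 'name==version' lines, skipping blank lines and comments."""
    pins = {}
    for line in requirements.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"Dependency is not pinned to an exact version: {line}")
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def insecure_dependencies(requirements: str) -> list[str]:
    """List pinned dependencies that match a known insecure version."""
    pins = parse_pins(requirements)
    return sorted(
        f"{name}=={version}"
        for name, version in pins.items()
        if version in KNOWN_INSECURE.get(name, set())
    )


requirements = """
examplelib==1.0.1   # pinned long ago
otherlib==2.4.0
"""
flagged = insecure_dependencies(requirements)
```

Rejecting unpinned dependencies is a design choice in this sketch: without an exact version there is nothing reliable to compare against the advisory list.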
One security decision to make is selecting and configuring a container format that supports our security goals. In some cases workload security compliance objectives require us to include security guardrail components inside the container. The purpose of these components is to monitor and report about workload security metrics, such as configurations, during run time.
We should implement tools and automation to verify security of our code and dependencies during building.
Testing: In CI a key practice is to automate build testing. Automated security testing tools can be implemented to make sure a security baseline is being followed. Testing basic security configurations and functionality in manual penetration tests is typically inefficient. Those tests should be automated instead, negating the need to explore security basics manually. For example, when the product includes a database system there should be an automated testing tool to verify that known secure settings for that database are always being followed. When as much of the security testing as possible is automated, manual exploratory testing is needed to cover only the functionality outside automated tests. This increases the value of traditional penetration tests.
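The database example could look roughly like the following sketch, which compares the settings read from a deployed database with an agreed baseline. The setting names and values are illustrative assumptions, and how the actual settings are collected depends on the database product.

```python
from typing import Optional

# Assumed security baseline: setting name -> required value.
BASELINE = {
    "ssl": "on",
    "password_encryption": "scram-sha-256",
    "log_connections": "on",
}


def baseline_violations(actual: dict[str, str]) -> dict[str, tuple[str, Optional[str]]]:
    """Return {setting: (expected, found)} for every setting that deviates."""
    return {
        name: (expected, actual.get(name))
        for name, expected in BASELINE.items()
        if actual.get(name) != expected
    }


# Hypothetical settings as read from a freshly built test database.
deployed = {"ssl": "on", "password_encryption": "md5"}
violations = baseline_violations(deployed)
```

A missing setting is reported with `None` as the found value, so a setting that silently falls back to a default is treated as a violation too.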
Dynamic application security testing (DAST) procedures and tools can be used to discover basic security weaknesses in a web application. DAST is essentially black box testing, executed at the interfaces of the application without knowledge of the code or other internals.
Automated security testing is used to verify the security baseline and discover weaknesses from the build.
Continuous delivery process security
Continuous delivery (CD) refers to the procedure of packaging a version of the product and typically deploying it automatically to a staging environment. When product versions are pushed automatically to production as they become available, the process is referred to as continuous deployment instead.
Releasing: When a product is prepared for release, the built and tested components are collected and packaged together. From a security point of view, we want to make sure this package contains the versions we expect and nothing else. All build artifacts should be signed so that we can verify they have passed the pipeline and have not been tampered with along the way. They are then pushed to an artifact repository from which they can be fetched to production. Modern examples of these are container registries such as Docker Hub or Azure Container Registry. Whichever repository is used, the security of the product depends on the security of that repository. Artifact signing mitigates this risk, provided that the signing keys are managed properly.
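The sign-then-verify idea can be illustrated with the sketch below. Note the substitution: it uses an HMAC from Python's standard library as a stand-in, whereas real artifact signing normally relies on asymmetric key pairs and dedicated signing tools. The key and artifact content are placeholders.

```python
import hashlib
import hmac


def sign_artifact(data: bytes, key: bytes) -> str:
    """Compute a keyed SHA-256 digest of the artifact at the end of the pipeline."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_artifact(data: bytes, key: bytes, signature: str) -> bool:
    """Check before deployment that the artifact still matches its signature."""
    expected = sign_artifact(data, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


# Placeholder values. A real key would come from a secrets manager.
pipeline_key = b"example-key-not-for-production"
artifact = b"contents of the built package"
signature = sign_artifact(artifact, pipeline_key)

untouched_ok = verify_artifact(artifact, pipeline_key, signature)
tampered_ok = verify_artifact(artifact + b"!", pipeline_key, signature)
```

With these values `untouched_ok` is `True` and `tampered_ok` is `False`. The check is only as strong as the protection of the key, which is why key management matters as much as the signing step.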
Build artifacts are protected with signing and stored securely in a registry.
Deployment: When deploying to staging or production, we need to make sure that the platform fulfills the security requirements we have. A modern way to approach this is to deploy the infrastructure as code (IaC) as well. This way we can create, maintain, and version the infrastructure just like the application code. Platform and security feature deployments become repeatable.
Regardless of whether infrastructure is deployed or reused, we have to test the infrastructure security and verify the integrity of the platforms. Post-deployment security tests should be used to automate verification that the deployment was successful. For example, it is important to make sure security groups and network configurations are what we expect, and the logging pipeline is available to capture and store events, including security ones. Furthermore, we should test that the required hardening options on platforms and secrets, like encryption keys, were deployed correctly.
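One post-deployment test could check the deployed network rules, as in the sketch below. The rule format and the choice of sensitive ports are assumptions for illustration. In a real environment the rules would be read through the cloud provider's API or from the IaC tool's state.

```python
import ipaddress

# Assumed policy: administrative ports that must never be open to everyone.
SENSITIVE_PORTS = frozenset({22, 3389})  # SSH and RDP


def world_open_rules(rules: list[dict]) -> list[dict]:
    """Return inbound rules that expose a sensitive port to any source address."""
    findings = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"])
        # A prefix length of 0 means the rule matches every address.
        if network.prefixlen == 0 and rule["port"] in SENSITIVE_PORTS:
            findings.append(rule)
    return findings


# Hypothetical inbound rules found after deployment.
deployed_rules = [
    {"port": 443, "source": "0.0.0.0/0"},
    {"port": 22, "source": "10.0.0.0/8"},
    {"port": 22, "source": "0.0.0.0/0"},
]
findings = world_open_rules(deployed_rules)
```

Only the last rule is reported. Public HTTPS and SSH from an internal range pass the check.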
A large part of practical IT security is how we operate our platforms and applications. What are the main activities to support security?
Operations security deals with the security of a deployed software product and of the development tools. This domain combines security prevention, detection, and response elements. A key goal is to prevent attackers from breaching source code, pipeline, platforms, workloads, and data.
A common oversight in operations security is to focus only on the production environment. In a DevSecOps context, we must extend these practices to the entire flow. More specifically, the following tools and systems should be covered at a minimum:
- Source code repository
- Pipeline tools
- Staging environment
- Production environment
In cyber security it is safe to assume that we cannot prevent all breaches. Therefore, it is vital to have visibility and response capability to security events in production. This enables effective mitigation of attacks and misuse of functionality and data.
In practice, implementing secure operations involves the following areas of activity.
Hardening: Hardening refers to securing the platform and workload settings to prevent misuse. Many vendors provide their own guidance for recommended security settings. There are also independent sources such as CIS that are worth looking at. The understanding of which configurations can be considered secure enough evolves over time. It is therefore important to keep track of a baseline and update it as necessary.
Hardening activities can be manual, especially when related to existing or reused platforms. For efficiency and consistency, it is advisable to implement settings automatically during deployment through IaC definitions or platform policy automation.
We must harden platform and workload settings to reduce attack surface.
Backups: Making sure that important data is backed up is basic security. A backup policy should be created to set the baseline for what data will be backed up and how the backups are archived and eventually discarded. An important and often neglected process is the regular testing of backups. There are few things as useless as backups that cannot be recovered when needed.
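An automated restore test can be sketched as follows: record a checksum manifest when the backup is taken, restore the archive into a scratch directory, and compare. The file names and the tar-based backup format are assumptions for illustration. Production backups of databases or volumes need the restore procedure of the respective tool.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its content."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def make_backup(source: Path, archive: Path) -> dict[str, str]:
    """Archive every file under source and return the manifest of digests."""
    with tarfile.open(archive, "w:gz") as tar:
        for p in sorted(source.rglob("*")):
            if p.is_file():
                tar.add(p, arcname=p.relative_to(source).as_posix())
    return file_digests(source)


def backup_restores_cleanly(archive: Path, manifest: dict[str, str]) -> bool:
    """Restore into a scratch directory and compare against the manifest."""
    with tempfile.TemporaryDirectory() as restore_dir:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(restore_dir)
        return file_digests(Path(restore_dir)) == manifest


def run_restore_test() -> tuple[bool, bool]:
    """Back up a throwaway directory, then verify with a good and a bad manifest."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "data"
        source.mkdir()
        (source / "customers.csv").write_text("id,name\n1,Example\n")
        archive = Path(tmp) / "backup.tar.gz"
        manifest = make_backup(source, archive)
        good = backup_restores_cleanly(archive, manifest)
        # Simulate a mismatch by altering the expected digest.
        wrong_manifest = {**manifest, "customers.csv": "0" * 64}
        bad = backup_restores_cleanly(archive, wrong_manifest)
        return good, bad


restore_ok, mismatch_ok = run_restore_test()
```

Comparing against a manifest taken at backup time, rather than against the live data, keeps the test meaningful after the source has changed.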
Testing our backups should be regular and automated.
Updating: Software vendors publish security updates as vulnerabilities become known. Keeping the components of a production environment up to date is basic security. The solution is called a patch management process. A typical process is set up to periodically update production workloads with available updates. These include, for example, operating system, library, third-party component, and application updates.
A modern approach is to deploy immutable production workloads. Those will not be updated in a traditional sense in production. Instead, a master image is updated and automatically deployed to replace production workloads.
Our workloads should be updated in production or redeployed when security patches become available.
Operating: Securing the privileged access of operators and administrators is a key operational security aspect. Most vendors have their own guidance on how to do that. For example, Azure documentation on the topic is very practical and easy to follow. Common recommendations include using different levels of administrators for authorizations, using MFA for authentication, logging all administrative operations, and implementing bastion hosts to access the production environment.
Securing the privileged access of our operators and administrators is a key operational security aspect.
Exploratory testing: The purpose of exploratory testing is to discover vulnerabilities by examining the finished target as a whole. In essence this is manual testing to verify the security of functionality that could not be tested automatically. Examples of exploratory testing include penetration tests, bug bounties, and red team exercises. Each team should find an appropriate combination of testing types based on expected value.
Manual security testing is appropriate where testing cannot be automated.
Monitoring: Security event monitoring is an essential part of a good security posture. Without monitoring, attacks and misuse of functionality and data can go undetected for prolonged periods of time. Building this capability starts with building visibility by having applications and platforms log security events. A logging policy should define what events to log, what information to log, and in what format.
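A logging policy becomes easier to follow when it is expressed in code. The sketch below formats security events as JSON lines and checks that a line carries all required fields. The field names are an assumed policy, not a standard.

```python
import json
from datetime import datetime, timezone

# Assumed logging policy: every security event carries these fields.
REQUIRED_FIELDS = ("timestamp", "event", "actor", "source_ip", "outcome")


def format_security_event(event: str, actor: str, source_ip: str, outcome: str) -> str:
    """Serialize one security event as a single JSON line with a UTC timestamp."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "actor": actor,
        "source_ip": source_ip,
        "outcome": outcome,
    }
    return json.dumps(record, sort_keys=True)


def conforms_to_policy(line: str) -> bool:
    """Check that a log line is a JSON object with all required fields filled in."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    return isinstance(record, dict) and all(record.get(f) for f in REQUIRED_FIELDS)


# Example event. The address is from a range reserved for documentation.
line = format_security_event("login", "alice", "203.0.113.7", "failure")
```

The same `conforms_to_policy` check can run in automated tests, so that an application which stops logging a required field is caught before release.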
A logging pipeline is the system used to collect logs from systems and applications and stream them to interested agents for consumption. Agents include, for example, monitoring tools, dashboards, and long term storage.
Monitoring for interesting security patterns should be tied back to threat modeling and risk assessment work. Platform default rules and alerts are seldom enough in real use cases.
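As an example of a rule derived from a threat model, suppose the modeling work identified password guessing as a relevant threat. A minimal detection sketch could count failed logins per source address in a sliding time window. The event format, threshold, and window length are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def brute_force_alerts(events, threshold=3, window_seconds=60):
    """Return (timestamp, source_ip) whenever one source reaches the threshold
    of failed logins within the window. Events must be sorted by timestamp and
    have the form (timestamp_seconds, source_ip, outcome)."""
    recent = defaultdict(deque)
    alerts = []
    for timestamp, source_ip, outcome in events:
        if outcome != "failure":
            continue
        failures = recent[source_ip]
        failures.append(timestamp)
        # Drop failures that have fallen out of the time window.
        while failures and timestamp - failures[0] > window_seconds:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((timestamp, source_ip))
            failures.clear()  # start counting again after alerting
    return alerts


# Hypothetical login events, using addresses reserved for documentation.
events = [
    (0, "203.0.113.7", "failure"),
    (10, "203.0.113.7", "failure"),
    (15, "198.51.100.2", "success"),
    (20, "203.0.113.7", "failure"),
    (25, "198.51.100.9", "failure"),
    (300, "203.0.113.7", "failure"),
]
alerts = brute_force_alerts(events)
```

For these events the function returns a single alert, `(20, "203.0.113.7")`. The late failure at 300 seconds falls outside the window and starts a new count.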
Security events will not go unnoticed when we have good visibility into production and proper automated monitoring.
Responding: Planning and rehearsing a good response process will help you to remove attackers from the product faster. A fast and controlled response saves money by mitigating, for example, productivity and reputational losses.
The work starts from identifying the most likely incident types and developing response playbooks for them. Those include both technical and process aspects. Technical aspects include, for example, what information to gather and investigate during an incident and what tools should be used. A key activity in preparing for incidents is to make sure there are log sources to support quick and efficient analysis of the matter. Process aspects include, for example, defining roles for who is responsible for what, guidelines for incident prioritization, and escalation procedures.
Being prepared for responding to incidents is usually the most overlooked cyber security activity. One reason for this is that it is not a technology matter. Instead it involves preparation and practicing, creating processes and maintaining skills. A sign of a mature DevSecOps organization is that it has invested in this area in particular.
Investment in response planning and rehearsals is a sign of a mature DevSecOps organization.
Getting started with a DevSecOps initiative can be difficult without clear priorities. Which of the domains should you start fixing first, development, pipeline, or operations?
I hope you have found new inspiration to begin improving your DevSecOps. I recommend a risk-based approach where you take a look at your current set of tools and practices together as a team.
DevSecOps activities aim to mitigate a wide range of threats. The right balance of security activities must be selected to support each specific project and product. A clear sign of risk is a lack of activity in any one of the activity areas described above.
When faced with a limited budget, try to prioritize improving the activities that support detecting and responding to security problems, namely Monitoring and Responding. That way at least you can detect and deal with security events in production as they happen. The second step should be focusing on how to prevent vulnerabilities from being introduced in the first place.
We will dive deeper into DevSecOps practicalities and implementation advice in our future stories on medium.com/fraktal. I hope to see you back!