Threat Modeling Handbook #5: Convert your threat model into an automated pentest using DevSecOps tools (with Examples)

Mohamed AboElKheir
Published in AppSec Untangled
13 min read · Oct 12, 2023

Now it is time to make use of what we completed in phase 1 of the threat modeling process (threat and mitigation identification) in the real world! In this story, we will discuss how to use the threat model to run a pentest (phase 2: Verification), and even better, an automated pentest that can run in the CI/CD pipeline, on new PRs, or as a scheduled job using DevSecOps tools (phase 3: Continuous tests).

For more details about the threat modeling process and its first phase, you can check the previous stories of this series.

To test “Security” you need to first define “Security”

As we discussed in story 2, the best time to complete phase 1 of threat modeling (threat and mitigation identification) is the “Design” phase of the SDLC (Software Development Life Cycle). After completing phase 1, the “Implementation” phase should include implementing all the mitigations identified in it.

The “Implementation” phase should complete the mitigations identified in phase 1 of the threat modeling process

As you can imagine, more often than not there are discrepancies or misconfigurations between the “Design” and the “Implementation”. That is why a “Testing” phase is needed in the SDLC to verify that what has been implemented aligns with the “Design” requirements. This usually means unit tests and/or integration tests covering the functional and non-functional requirements identified in the “Design” phase.

Of course, discrepancies and misconfigurations could also affect the “Security” of the service, but unfortunately most of the time “Security” is not in the scope of this testing as it is hard to define “what” makes the service secure. Luckily, this problem is solved by phase 1 of the threat modeling process, as this gives us a list of threats and their expected mitigations, and we can consider these mitigations as the list of things that need to be true for the service to be “secure”. In other words, phase 1 of the threat modeling process defines “Security” for a service.

With that in mind, we can now include “Security” in the scope of the “Testing” phase so that we can verify that the threat mitigations were implemented correctly.

Micro Pentests: The threat mitigations are the pentest scope

In a sense, verifying the threat mitigations can be considered a pentest; let’s call it a “micro pentest” or a “per-project pentest”. This is a different kind of pentest from the ones many organizations schedule on an annual or biannual basis, which usually cover a large number of services and features, have ambiguous scopes, take a lot of time and effort, and are performed by external pentesters.

While these holistic pentests are very useful, as they help expose blind spots in your security coverage, I would argue that per-project micro pentests with well-defined scopes add much more value. This is because micro pentests:

  1. Have more context and less ambiguity, and hence can generate findings that are more accurate, relevant, and useful.
  2. Can be performed by your internal security team, or even your developers if you have the right tooling.
  3. Take less time and effort (also if you have the right tooling).
  4. Can be part of the SDLC process.
  5. Can be used to generate automated tests that can be scheduled to run as we will see later.

The cool thing is that, as mentioned above, if we have completed phase 1 of the threat modeling process, we already have a well-defined pentest scope written for us. The pentest scope is simply to go through the mitigations one by one and verify they are working as expected.

NOTE: To clarify, I am not suggesting micro pentests should fully replace the holistic ones. Ideally, you should have both, as our process could always have blind spots that a holistic pentest can help us identify.

Holistic vs Micro pentests

Create a testing and automation plan

Okay, so now we have a scope for our micro pentest; the next step is to create the testing plan. For that, we need to go through the mitigations we identified and decide on the best way to verify that each is working as expected.

While that provides point-in-time verification that the mitigations are working as expected, it is always possible that future code or configuration changes will break them (e.g. removing a single line of code caused a pre-auth RCE vulnerability in Metabase, as shown in this write-up). Hence, we should also add automation to our plan: for each mitigation, besides deciding on the best way to verify it, we will also decide on the best way to automate this verification using DevSecOps tools that can run in the CI/CD pipeline, on new PRs, or as a scheduled job running periodically.

Let’s apply that to our example

Let’s take some examples from the mitigations we identified in story 3 (reading the previous story is advised for context, but it is not a must). Going through these mitigations, we can see that they fall into different types:

1- Business logic/Code mitigations:

These are the mitigations that usually need to be added as part of the code we are writing for our service. For example:

Threat#6: An attacker can bypass authorization of the REST API to be able to read/write/share customer data.
Mitigation 1: All REST API actions expect the session token to be passed as a header and validate it. If it is missing or invalid, the request is redirected to the /signin page.
Mitigation 2: All REST API actions will verify the file is either owned by or shared with the authenticated user.

And

Threat#5: An attacker can use CSRF to trick the user into sharing a file with the attacker’s user.
Mitigation 1: We will use Django’s CSRF middleware for all mutating actions of the REST API.

This kind of mitigation is usually the most likely to break and the hardest to detect, as such mitigations are specific to the application being developed, and hence security tools won’t be able to detect them with their out-of-the-box configuration. Accordingly, these are the ones I recommend prioritizing in your testing and automation plan.
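
To make that concrete, below is a minimal, hypothetical sketch of how Threat#6’s mitigations could look in a Django view (the File model, its owner and shared_with fields, and the download view are assumptions for illustration, not code from the actual service). Note how removing or weakening either check silently breaks the mitigation:

from django.http import HttpResponseForbidden, JsonResponse
from django.views.decorators.http import require_GET

from .models import File  # hypothetical model holding the file metadata


@require_GET
def download(request):
    # Threat#6 / Mitigation 1: reject requests without a valid session
    if not request.user.is_authenticated:
        return JsonResponse({"error": "unauthenticated"}, status=401)

    file = File.objects.get(id=request.GET["file_id"])

    # Threat#6 / Mitigation 2: the file must be owned by or shared with
    # the authenticated user
    if file.owner != request.user and not file.shared_with.filter(
        id=request.user.id
    ).exists():
        return HttpResponseForbidden()

    return JsonResponse({"url": file.generate_presigned_url()})  # hypothetical helper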

To verify business logic/code mitigations we usually have 2 approaches:

  1. Code reviews: we go through the code and verify the logic is working as expected.
  2. Dynamic tests: we test the behavior of the application to verify the mitigation is working as expected. For that, we can use a tool like Postman, Burp Suite, or the browser dev tools depending on the situation (see the sketch below).
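
For example, a quick dynamic check of Threat#6’s Mitigation 1 can be scripted with Python’s requests library (the base URL and path are placeholders standing in for the service from story 3):

import requests

BASE_URL = "https://fileshare.example.com"  # placeholder for the deployed service

# Threat#6 / Mitigation 1: a request without a session token must not be
# served; per the mitigation, it should be redirected to the /signin page
resp = requests.get(f"{BASE_URL}/api/share/", allow_redirects=False)
assert resp.status_code != 200, "unauthenticated request was served!"
print(resp.status_code, resp.headers.get("Location"))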

For automation, there are also 2 approaches I can suggest:

1. Custom SAST rules: Some SAST tools support custom rules (e.g. Semgrep), which is a very good way of automating checks for business logic mitigations. For example, the below Semgrep custom rule generates a finding if any of the mutating paths mentioned in story 3 is missing the expected Django CSRF decorator @csrf_protect. By adding this to the CI/CD or the PR submission process, you can verify new code doesn’t break the CSRF mitigation.

rules:
  - id: ensure-csrf-decorator
    languages: [python]
    message: "Ensure the CSRF decorator is used for sensitive views"
    severity: ERROR
    patterns:
      - pattern: |
          def $FUNC(...):
            ...
      - pattern-not-inside: |
          @csrf_protect
          def $FUNC(...):
            ...
      - metavariable-regex:
          metavariable: "$FUNC"
          regex: "(upload|download|share|unshare|delete)"

NOTE: For more details about the syntax of this rule check Semgrep’s documentation, and if you are interested in a dedicated story about Semgrep’s custom rules let me know in the comments.

2. Writing unit or integration tests to test the behavior: For example, we can test the authentication and authorization behavior by writing integration tests using the test client included with Django, as in the two examples shown below (one verifies that a 401 response is returned if the session cookie is missing, and the other that a 403 response is returned if a user tries to download a file not shared with them), then adding them to the CI/CD (e.g. running python manage.py test) or as a scheduled job. Other tools like Robot Framework can also be used to create similar tests.

from django.test import TestCase, Client


class ShareAPIViewTestCase(TestCase):
    def setUp(self):
        self.client = Client()

    def test_missing_session_cookie(self):
        """Test that a 401 response is returned when the session cookie is missing"""
        response = self.client.get('/api/share/')
        self.assertEqual(response.status_code, 401)

    def test_valid_cookie_different_user_file(self):
        """Test that a 403 response is returned when a valid cookie for user 1 is provided and the file_id input parameter is a valid id of a file owned by a different user"""
        # Assuming you have a function or method to generate a valid session for a user
        self.client.login(username='user1', password='password1')

        # Assuming you have a function or method to create a file owned by a different user
        file_id = create_file_for_different_user()

        response = self.client.get('/api/share/', {'file_id': file_id})
        self.assertEqual(response.status_code, 403)

2- Use of library/tool mitigations:

Some other threats could be mitigated indirectly through the use of a library or a tool that has a built-in security control. For example:

Threat#3: An attacker can use an XSS vulnerability in the web application to steal the session cookie or the pre-signed url used to upload/download the data, allowing the attacker to read/write customer data.
Mitigation 1: We are using React, which performs HTML encoding by default, mitigating XSS unless a function like dangerouslySetInnerHTML is used.

And

Threat#16: An attacker can use an SQL injection vulnerability to gain access to customer data or credentials used to gain access to customer data.
Mitigation 1: Django’s documentation mentions QuerySets are protected against SQL injection as they use parameterization, which is also what OWASP recommends on their SQL injection prevention page.

In the above examples, we are using React, which has a built-in mitigation for XSS, and Django’s QuerySets to access the DB, which have a built-in mitigation for SQL injection.

This type of mitigation is less complex than the ones related to our own code, which is why verification usually only needs a code review to confirm these libraries are used properly and that there are no exceptions in the code (e.g. a function directly querying the DB instead of using Django’s QuerySets).
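
As a hypothetical illustration of such an exception, the first function below bypasses the QuerySet parameterization that mitigates Threat#16, while the second keeps the built-in mitigation intact:

from django.db import connection

from .models import File  # hypothetical model


def find_files_unsafe(name):
    # Exception a code review should catch: raw SQL built with string
    # formatting bypasses QuerySet parameterization (SQL injection risk)
    with connection.cursor() as cursor:
        cursor.execute(f"SELECT id FROM files WHERE name = '{name}'")
        return cursor.fetchall()


def find_files_safe(name):
    # QuerySets parameterize the query, keeping the built-in mitigation intact
    return File.objects.filter(name=name)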

Of course, we still need automation to make sure future code changes don’t introduce such exceptions. My suggestion here is to rely on a SAST tool that either has out-of-the-box coverage of the type of threat being mitigated (XSS and SQL injection in the above two examples are covered by most SAST tools) or supports custom rules that we can create as shown earlier.

3- Configuration mitigations:

Other threats are mitigated by the way we configure the building blocks used by the application (e.g. platform, instance, etc.). For example:

Threat#1: An attacker that can read from/write to the S3 bucket can expose/tamper with customer data by directly accessing the bucket.
Mitigation 2: Bucket is not set to public.

Also

Threat#2: An attacker with access to the physical storage where the S3 objects are stored can read/tamper with customer data.
Mitigation: Encryption at rest by default is enabled on the S3 bucket.

And

Threat#12: Remote access with no or weak authentication could allow an attacker to remotely login to the instance and access the credentials on the instance (session cookie, pre-signed URL, instance profile role credentials, or DB credentials) leading to customer data exposure/tampering.
Mitigation: All remote access is disabled, OR only SSH is enabled, protected by two-factor authentication and allowed only to the team through a bastion host.

The above examples are about configuring the S3 bucket and the EC2 instances we are using for the application in a specific way to protect the customer data on these resources; the same could apply to any other building blocks like cloud services, containers, etc.

To verify this kind of mitigation, we usually need to check the configuration of the related resource. This can be done manually, using a script, or using a tool. For the above mitigations, we can manually verify the S3 configuration through the AWS console or the AWS CLI, and the EC2 instance configuration can be checked by logging into the server and checking the running services and listening ports.
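
For example, the S3 checks can be scripted with boto3 so they don’t have to be repeated manually. The below is a minimal sketch (the bucket name is a placeholder) that assumes both configurations exist; if one is missing, the corresponding call raises a ClientError, which is itself a failed check:

import boto3

s3 = boto3.client("s3")
BUCKET = "customer-data-bucket"  # placeholder for the bucket from story 3

# Threat#1 / Mitigation 2: all public access blocks should be enabled
config = s3.get_public_access_block(Bucket=BUCKET)["PublicAccessBlockConfiguration"]
assert all(config.values()), f"{BUCKET} public access blocks incomplete: {config}"

# Threat#2 / Mitigation: default encryption at rest should be configured
encryption = s3.get_bucket_encryption(Bucket=BUCKET)
print(encryption["ServerSideEncryptionConfiguration"]["Rules"])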

For automation, we could either use a tool specialized in the type of resource related to the mitigation, or write our own scripts and integrate them into our automation.

For example, for AWS there are tools and services that can be used to check the configuration of AWS resources, like ScoutSuite and AWS Config. Also, if you are using an IaC (Infrastructure as Code) tool like CloudFormation or Terraform, there are tools that can check the configuration in the IaC templates, like cfn_nag for CloudFormation, or checkov, which supports both CloudFormation and Terraform.

NOTE: I recommend choosing a tool that supports custom rules or checks so that we can adapt the rules to the specific checks and resources relevant to our threat model. For example, the service in story 3 had 2 buckets: one meant to be public (used for the web application assets) and one that must be private (used for customer data), so we only want to scan the second one for whether it is private.

Here is an example of a custom checkov rule that checks if the customer data bucket (identified by the tag type:customer_data) is public. For more info on how to write and use checkov custom rules (e.g. loading them with checkov’s --external-checks-dir option), you can check checkov’s documentation.

metadata:
  id: "CKV_CUSTOM_1"
  name: "Ensure customer_data S3 bucket is not public"
  category: "S3"
definition:
  and:
    - cond_type: attribute
      resource_types:
        - "AWS::S3::Bucket"
      attribute: "Properties.AccessControl"
      operator: equals
      value: "PublicRead"
    - cond_type: attribute
      resource_types:
        - "AWS::S3::Bucket"
      attribute: "Properties.Tags.type"
      operator: equals
      value: "customer_data"

For the configuration of the instance, you can use other tools like Amazon Inspector, AWS SSM, or InSpec, depending on what you want to verify. For example, if the SSM agent is configured on the instances we are using, we can use the below SSM command to check if the SSH service is running, and we can integrate that into our automation to run periodically.

aws ssm send-command --document-name "AWS-RunShellScript" \
--document-version "\$DEFAULT" \
--targets "Key=instanceids,Values=instance-id-1,instance-id-2" \
--parameters 'commands=["sudo service ssh status"]'
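
If you would rather drive this from a scheduled script than from the CLI, a minimal boto3 sketch could look like the below (the instance ID is a placeholder, and the fixed sleep is a simplification; a real job should poll until the command completes):

import time

import boto3

ssm = boto3.client("ssm")
INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

response = ssm.send_command(
    InstanceIds=[INSTANCE_ID],
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["sudo service ssh status"]},
)
command_id = response["Command"]["CommandId"]

time.sleep(5)  # simplification: wait for the command to finish
result = ssm.get_command_invocation(CommandId=command_id, InstanceId=INSTANCE_ID)
print(result["Status"], result["StandardOutputContent"])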

4- Process mitigations:

Lastly, some threats are mitigated by processes that need to be enforced. For example:

Threat#10: A vulnerability affecting the OS or the packages installed on the Application Servers that is reachable by user input could allow an attacker to access the credentials on the instance (session cookie, pre-signed URL, instance profile role credentials, or DB credentials) leading to customer data exposure/tampering.
Mitigation 1: The OS and the packages are periodically patched and upgraded to the latest versions.

And

Threat#13: A vulnerability affecting a library used by the application that is reachable by user input could allow an attacker to access the credentials on the instance (session cookie, pre-signed URL, instance profile role credentials, or DB credentials) leading to customer data exposure/tampering.
Mitigation 1: All libraries used by the application are periodically being upgraded to the latest version.

The above threats are mitigated by the fact that we have a process for patching the OS packages and the open-source libraries used in our code. Verifying a process is a different kind of task from the previous types of mitigations because it is more about governance.

However, what can be verified and automatically tested is the outcome of this process. In the above 2 examples, we can use Amazon Inspector to continuously monitor for outdated OS packages, and we can use any SCA tool (e.g. safety or pip-audit, which are open-source SCA tools for Python) to periodically scan the open-source libraries and dependencies used by our code. This way, we can detect if these processes are not being followed.
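
As a minimal sketch of the scheduled dependency scan (assuming the project pins its dependencies in requirements.txt):

import subprocess
import sys

# pip-audit exits non-zero when a dependency has a known vulnerability,
# so surfacing its exit code is enough to fail a scheduled CI job
result = subprocess.run(["pip-audit", "-r", "requirements.txt"])
if result.returncode != 0:
    print("Patching process not followed: vulnerable dependencies found")
sys.exit(result.returncode)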

Let’s wrap it up

Below is a mind map that summarizes the different types of mitigations and the suggested ways and tools mentioned in this story to verify them and to create automated checks to do the verification continuously.

The different types of mitigations and the ways and tools mentioned in this story to verify them and to create automated checks to do the verification continuously

This is not meant to be a complete list and is only relevant to the service we discussed in story 3, but the goal is to show the thought process behind creating the testing and automation plan based on the mitigations we identified. You can follow a similar approach for the service you are reviewing and choose the right tools to create the tests needed for the mitigations you have identified.

Once you have created the testing and automation plans, all that is left is executing them, which can be a joint effort between the security and engineering teams.

Suggested Exercise: Go through the rest of the mitigations in story 3 that we haven’t discussed here, identify the type of each mitigation, and create a testing and automation plan for verifying them. Let me know in the comments if you face any issues.

Conclusion

Now we have completed all 3 phases of the threat modeling process

This has been a long journey, but we have now completed all 3 phases of the threat modeling process. We identified the threats and their corresponding mitigations, and for each mitigation we created a plan to verify it and an automated test to verify it continuously; then we executed this plan.

This is a much better approach than starting by running the security tools and diving into their findings. Identifying the threats and their mitigations first gives us clear context and helps us define what we need to focus on; this way, we can still use the security tools, but in a focused way that produces much more useful output.

Stay tuned for the next story where we discuss how to optimize this threat modeling process so that it can be scalable and feasible to add to our SDLC.
