MetaHub integration with Prowler as a local scanner for context enrichment

Gabriel
13 min read · Mar 24, 2023

Welcome to this series on practical and common use cases of MetaHub. This article serves as an introduction to the series. The articles will focus on the following topics:

  • Part I, MetaHub integration with Prowler as a local scanner for context enrichment
  • Part II, Automatic Security Hub findings suppression using MetaTags
  • Part III, Utilizing MetaHub as an AWS Security Hub Custom Action
  • Part IV, Generating Custom Enriched Dashboards
  • Part V, AWS Security Hub Insights based on MetaChecks and MetaTags.

Table of Contents:

  • About MetaHub
  • About Prowler
  • Solution Overview
  • Running Prowler
  • Running MetaHub
  • Checks vs. Affected Resources
  • Context Enrichment

About MetaHub

MetaHub is a tool I created to enhance AWS Security Hub, a fully closed AWS service, by leveraging its three main features: ASFF Security Findings Standardization, Security Findings Workflow Status Management, and Security Scanners Integrations (using API). MetaHub is a contextualization tool, not a scanner.

In addition to these features, MetaHub adds:

  • Security Findings Context Enrichment based on Custom Checks (MetaChecks), CloudTrail (MetaTrails), and Tagging (MetaTags)
  • Enriched Filtering on top of MetaChecks, MetaTrails, and MetaTags
  • Automation using MetaChecks, MetaTrails, and MetaTags
  • Enriched and Customizable Dashboarding
  • Ability to work with ASFF outputs without AWS Security Hub.

The main objective of MetaHub is to extend the functionality of AWS Security Hub and provide users with more control over their security findings.

About Prowler

We will use Prowler as the scanner for our solution. It is widely recognized as one of the most comprehensive tools for detecting AWS security misconfigurations and compliance issues. Prowler is open source, has a large community of contributors, and is regularly updated. Recently, the full source code was refactored from Shell to Python 3, making it even faster and more powerful. Prowler follows a “keep-it-simple” approach, excelling at its primary function of providing over 200 different checks. Cloning Prowler is all it takes to get started. The output can be obtained in JSON format and easily integrated with other tools for visualization, alerting, and more. Additionally, Prowler supports the ASFF output format, making it an excellent choice for integration with AWS Security Hub and MetaHub.

While vulnerabilities are a concern, misconfigurations are still the biggest player in cloud security incidents and, therefore, should be one of the greatest causes for concern in organizations. By 2023, 75% of security failures will result from inadequate management of identities, access, and privileges, up from 50% in 2020, according to Gartner.

Solution Overview

MetaHub can read ASFF security findings in different ways:

Locally as an ASFF file input:

Providing one or more ASFF files as input to the tool.

Reading/Writing from AWS Security Hub:

Using AWS Security Hub API and interacting with it for reading, filtering, updating, and enriching findings.

Combined:

Combining AWS Security Hub and ASFF as file inputs together

For this article, we will follow the first approach: we will run Prowler locally (without AWS Security Hub) and provide its ASFF output to MetaHub as a file input. This approach is the easiest to implement because it doesn’t require any additional service, but it is also limited: you can’t rely on AWS Security Hub functionality such as filters (--sh-filters) or updates (--update-findings).

Running Prowler

To scan our infrastructure using Prowler and generate a list of security findings, we need to follow a few steps.

Since this article is focused on a simple setup, we will only scan one AWS account without assuming any roles or using AWS Organizations configuration. However, it’s worth mentioning that Prowler and MetaHub support different multi-account setups through assuming roles or fetching from AWS Organizations. Check their documentation or use the --help option for more information.

Here’s how to run Prowler:

1. Clone and install Prowler and its dependencies:

git clone https://github.com/prowler-cloud/prowler
cd prowler
poetry shell
poetry install

2. Log in to AWS by either exporting your keys as environment variables or running aws configure.

3. Run Prowler:

For our solution, we need to set two extra parameters to Prowler:

  • -M json-asff: We want Prowler to generate the non-default output mode json-asff. If we don't provide this parameter, Prowler will generate only json, csv, and html.
  • -q: Prowler can report either all results (including PASSED checks) or only the FAILED (non-compliant) ones. We just need the FAILED checks, which is what this flag selects.

python3 prowler.py -M json-asff json html csv -q

After it finishes, you will end up with four output files: HTML, CSV, JSON, and ASFF. These files contain your security findings in four different formats. You can explore these outputs right away. Take note of the file paths.
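Before handing the ASFF file to MetaHub, a quick sanity check of its contents can help. The sketch below assumes the file holds a JSON array of findings carrying the standard ASFF fields Compliance.Status and Severity.Label; the function name and the inline sample are illustrative and not part of Prowler or MetaHub.

```python
import json
from collections import Counter


def summarize_asff(findings):
    """Count findings by compliance status and by severity label."""
    by_status = Counter(
        f.get("Compliance", {}).get("Status", "UNKNOWN") for f in findings
    )
    by_severity = Counter(
        f.get("Severity", {}).get("Label", "UNKNOWN") for f in findings
    )
    return by_status, by_severity


# Inline sample standing in for the real file (hypothetical data). With actual
# Prowler output you would load the file at <asff_file_path> with json.load.
sample = json.loads("""
[
  {"Title": "Hypothetical check A",
   "Severity": {"Label": "HIGH"}, "Compliance": {"Status": "FAILED"}},
  {"Title": "Hypothetical check B",
   "Severity": {"Label": "MEDIUM"}, "Compliance": {"Status": "FAILED"}}
]
""")

by_status, by_severity = summarize_asff(sample)
print(dict(by_status))    # with -q, only FAILED entries are expected
print(dict(by_severity))
```

If statuses other than FAILED show up in the first counter, the scan was probably run without -q.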

Running MetaHub

The next step is to run MetaHub to process the security findings generated by Prowler. Follow the steps below:

1. Clone and Install MetaHub and dependencies:

Clone the MetaHub repository and install its dependencies in a Python virtual environment using the commands below:

git clone git@github.com:gabrielsoltz/metahub.git
cd metahub
python3 -m venv venv/metahub
source venv/metahub/bin/activate
pip3 install -r requirements.txt

2. Optional (for now): Log in to AWS. Although we don’t need to be logged in to AWS at this stage, we’ll need it later. You can log in by exporting your keys to the environment or using aws configure.

3. Run MetaHub:

By default, MetaHub fetches security findings from AWS Security Hub, but in this case, we want to use the Prowler ASFF output file as the source. To do this, use the following parameters:

  • --inputs file-asff: This specifies the input source as ASFF file.
  • --input-asff <asff_file_path>: This specifies the path to the Prowler ASFF output file.

MetaHub offers several options to customize its output, including:

  • --list-findings: This option generates a JSON output in the console.
  • --outputs {short,full,inventory,statistics}: This option specifies how to organize the data. You can combine more than one using spaces.
  • --output-modes {json,html,csv}: This option specifies the output format. By default, it is set to JSON.

./metahub --inputs file-asff --input-asff /Users/gabriel/repos/prowler/output/prowler-output-01234567890-20230210163021.asff.json --output-modes csv html json

Checks vs. Affected Resources

MetaHub provides a helpful feature to contextualize security findings by analyzing duplication, shadowing, and affected resources. Prowler and other security scanners generate a list of failed/passed checks executed one by one, which often results in multiple findings for the same problem.

This is because each check/vulnerability has its own logic, and you might want to know only about a specific one. For instance, Prowler can produce two security findings for a Security Group that is open on port 22/TCP (SSH) and not attached to anything, even though the affected resource is a single Security Group.

Similarly, Web Scanners might report more than one finding for TLS Certificates not using the most secure configurations, with the affected resource being the Web Certificate or the Web Server. A Server Patching scanner could report multiple vulnerabilities for the same outdated software version, where the affected resource is the software version or the Server.

To address this issue, MetaHub deduplicates findings by the affected resource, presenting them together and allowing them to be treated as a single finding, thereby simplifying management. This becomes especially useful when combining multiple scanners, where findings related to a single resource can be analyzed together. For example, you might have 2 Prowler findings related to an EC2 instance, along with 10 Tenable Nessus findings about software vulnerabilities on that instance. MetaHub can analyze these twelve security findings together, facilitating the creation of a comprehensive security ticket.
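The grouping idea can be sketched in a few lines. This is a simplified illustration, not MetaHub's actual code: it assumes each finding follows the ASFF layout, with a Title and a Resources list whose entries carry an Id, and the two finding titles are made up for the example.

```python
from collections import defaultdict


def group_by_resource(findings):
    """Map each affected resource ID to the titles of the findings that report it."""
    grouped = defaultdict(list)
    for finding in findings:
        for resource in finding.get("Resources", []):
            grouped[resource["Id"]].append(finding["Title"])
    return dict(grouped)


# Hypothetical findings: two checks failing on the same Security Group.
sg = "arn:aws:ec2:eu-west-1:1234567890:security-group/sg-020cc749a58678e05"
findings = [
    {"Title": "Security group open to 0.0.0.0/0 on port 22", "Resources": [{"Id": sg}]},
    {"Title": "Security group not attached to any resource", "Resources": [{"Id": sg}]},
]

grouped = group_by_resource(findings)
print(len(findings), "findings on", len(grouped), "affected resource(s)")
# prints: 2 findings on 1 affected resource(s)
```

Because the resource ID is the key, findings from different scanners end up side by side as long as they report the same ARN.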

You can see this clearly in the main output of each tool. Prowler reported 695 failed checks. From that input, MetaHub reported the same 695 failed checks, grouped under 254 affected resources.

JSON (or CLI) Output

When you examine the MetaHub JSON output file, you can more easily see how the grouping of the findings occurs.

To enable CLI output in MetaHub, use the option --list-findings, which will display the JSON output in the console.

The output for a single affected resource (in this case, an EC2 instance) with four findings is grouped together under the “findings” section and displayed beneath the ARN of the affected resource. The output also includes the AwsAccountId, AwsAccountAlias, Region, and ResourceType.

Using the option --outputs, you can experiment with other MetaHub outputs, such as full, inventory, and statistics.

"arn:aws:ec2:eu-west-1:1234567890:instance/i-0f3ac8dbb261aabeb": {
  "findings": [
    "Check if EC2 Instance Metadata Service Version 2 (IMDSv2) is Enabled and Required.",
    "Check for internet facing EC2 instances with Instance Profiles attached.",
    "Check if EC2 instances are managed by Systems Manager.",
    "Check for EC2 Instances with Public IP."
  ],
  "AwsAccountId": "1234567890",
  "AwsAccountAlias": "MyAWSAccount",
  "Region": "eu-west-1",
  "ResourceType": "AwsEc2Instance"
},
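Since the JSON output is keyed by the affected resource, it is easy to post-process. The following sketch assumes the output file is a single JSON object whose keys are resource ARNs, as the excerpt suggests; the S3 entry and the function name are hypothetical and only there to make the example self-contained.

```python
import json

# Shape copied from the excerpt above, trimmed, plus one made-up resource.
metahub_output = json.loads("""
{
  "arn:aws:ec2:eu-west-1:1234567890:instance/i-0f3ac8dbb261aabeb": {
    "findings": ["Check for EC2 Instances with Public IP.",
                 "Check if EC2 instances are managed by Systems Manager."],
    "ResourceType": "AwsEc2Instance"
  },
  "arn:aws:s3:::example-bucket": {
    "findings": ["Hypothetical S3 finding."],
    "ResourceType": "AwsS3Bucket"
  }
}
""")


def rank_by_findings(resources):
    """Return (arn, finding_count) pairs, most findings first."""
    counts = [(arn, len(data.get("findings", []))) for arn, data in resources.items()]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


for arn, count in rank_by_findings(metahub_output):
    print(count, arn)
```

Ranking by finding count is one simple way to decide which resources to look at first.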

HTML Output

The same approach is applied to the HTML output, where findings are grouped under the affected resources.

Context Enrichment

Context enrichment is a powerful feature of MetaHub that provides valuable information about affected resources beyond what the scanners can detect. When dealing with an affected resource, it’s essential to answer questions such as:

  • How can it be identified? What is the resource’s name, IP, or DNS?
  • Who created the resource, and when was it created?
  • Which service is the resource part of, and is it in staging or development?
  • Is the resource in production, or is it connected to production resources?
  • Is the finding a false or true positive in the context of the affected resource?
  • Does the reported severity level from Prowler make sense in the context of the affected resource?
  • Is the resource running, or is it stopped?
  • Is the resource effectively public?
  • Is the resource effectively encrypted?

MetaHub queries the affected resources directly in the affected account to provide additional context using the following options:

  • MetaTags (--meta-tags): Queries tagging from affected resources
  • MetaTrails (--meta-trails): Queries CloudTrail in the affected account to identify who created the resource and when, as well as any other related critical events
  • MetaChecks (--meta-checks): Fetches extra information from the affected resource, such as whether it is public, encrypted, associated with, or referenced by other resources.

To use MetaHub’s context enrichment feature, you must be logged in to the affected account with a role or user that has the appropriate privileges. If you are running MetaHub across multiple accounts, you can use the --mh-assume-role option to assume a role in the affected AWS account. The MetaHub documentation covers this in more detail.

Analyzing Context

Let’s run MetaHub again, this time with MetaTags, MetaTrails, and MetaChecks enabled, and also with the --list-findings option so we can see the results in the CLI:

./metahub --meta-checks --meta-tags --meta-trails --inputs file-asff --input-asff /Users/gabriel/repos/prowler/output/prowler-output-01234567890-20230210163021.asff.json --output-modes html json csv --list-findings

Let’s check the same affected EC2 instance again:

"arn:aws:ec2:eu-west-1:1234567890:instance/i-0f3ac8dbb261aabeb": {
  "findings": [
    "Check if EC2 Instance Metadata Service Version 2 (IMDSv2) is Enabled and Required.",
    "Check for internet facing EC2 instances with Instance Profiles attached.",
    "Check if EC2 instances are managed by Systems Manager.",
    "Check for EC2 Instances with Public IP."
  ],
  "AwsAccountId": "1234567890",
  "AwsAccountAlias": "MyAWSAccount",
  "Region": "eu-west-1",
  "ResourceType": "AwsEc2Instance"
},

In addition to the previous fields, you will now see three new sections:

MetaTags

Tags are a powerful tool for understanding your context, as they provide a way to label and organize your AWS resources. This section lists all available tags for the affected resource. Let’s take a closer look at the tags for the EC2 instance we’re analyzing:

"metatags": {
  "aws:autoscaling:groupName": "stg-payments",
  "environment": "stg",
  "terraform": "true",
  "aws:ec2launchtemplate:version": "1",
  "aws:ec2launchtemplate:id": "lt-06b73d2e77f10446f",
  "Name": "stg-payments",
  "Owner": "MalditOps",
  "Service": "Payments"
}

As you can see, the tags provide valuable information about the resource. Tagging strategies often include the following categories:

  • The Name: In AWS, the name is typically defined as a tag for resources, providing an easy way to identify resources at a glance.
  • The Service name: This tag helps to categorize resources by the service they belong to.
  • Environment: This tag is useful for differentiating between resources in different environments, such as Production, Staging, and Development.
  • Data classification: This tag labels resources based on the sensitivity of the data they handle, such as Confidential or Restricted.
  • Owner: This tag can help identify the team or business unit responsible for a resource.
  • Compliance: This tag is useful for identifying resources subject to specific compliance requirements, such as PCI.
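Tags like these can drive simple routing decisions. The sketch below is one possible approach under stated assumptions: the tag keys (Owner, Service, environment) come from the example above, tag names are matched case-insensitively, and the fallback owner "security-team" is a hypothetical default.

```python
def route_from_tags(metatags, default_owner="security-team"):
    """Derive owner, service, and environment from a resource's tags."""
    # Tag key casing varies between teams, so normalize keys before lookup.
    tags = {key.lower(): value for key, value in metatags.items()}
    return {
        "owner": tags.get("owner", default_owner),
        "service": tags.get("service", "unknown"),
        "environment": tags.get("environment", "unknown"),
    }


metatags = {
    "environment": "stg",
    "terraform": "true",
    "Name": "stg-payments",
    "Owner": "MalditOps",
    "Service": "Payments",
}

routing = route_from_tags(metatags)
print(routing)
# prints: {'owner': 'MalditOps', 'service': 'Payments', 'environment': 'stg'}
```

Untagged resources fall back to the default owner, which keeps findings from being dropped when tagging is incomplete.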

MetaTrails

MetaTrails is a MetaHub feature that leverages CloudTrail to retrieve critical events related to the affected resource, such as creation events.

Critical events are defined by ResourceType under config/resources.py.

The MetaTrails section of the output provides details about these events, such as the username that triggered them and the timestamp of the event.

"metatrails": {
  "RunInstances": {
    "Username": "root",
    "EventTime": "2023-02-25 15:35:21-03:00"
  }
}

By analyzing the events captured in CloudTrail, you can gain valuable insights into the history and usage of the affected resource.
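As an illustration, the creator and creation time can be pulled out of this section with a few lines of Python. The sketch assumes the metatrails layout shown above (event name mapped to Username and EventTime) and that EventTime is an ISO 8601 string with a UTC offset; the function name is hypothetical.

```python
from datetime import datetime, timezone


def creation_event(metatrails, event_name="RunInstances"):
    """Return (username, event time in UTC) for the given event, or None."""
    event = metatrails.get(event_name)
    if not event:
        return None
    # fromisoformat accepts the "YYYY-MM-DD HH:MM:SS-03:00" form used above.
    when = datetime.fromisoformat(event["EventTime"])
    return event["Username"], when.astimezone(timezone.utc)


metatrails = {
    "RunInstances": {
        "Username": "root",
        "EventTime": "2023-02-25 15:35:21-03:00",
    }
}

user, when = creation_event(metatrails)
print(user, when.isoformat())
# prints: root 2023-02-25T18:35:21+00:00
```

Normalizing to UTC makes it easier to compare events coming from accounts in different regions.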

MetaChecks

Lastly, MetaHub is equipped with additional checks called MetaChecks, which can be run directly on affected resources. At present, MetaHub includes 106 pre-defined checks, but custom checks can also be added. By running MetaChecks, you can retrieve further context from your resources. For instance, you can examine the resources to which a particular resource is attached and perform checks on those resources to provide a comprehensive view.

What sets MetaChecks apart is that they answer with either True or False. When the answer is True, MetaChecks also return the underlying data. For instance, if the MetaCheck it_has_public_ip is True, its value is the public IP itself. MetaChecks can be filtered: to display only the resources with a public IP, use the filter --mh-filters-checks it_has_public_ip=True.

  "metachecks": {
    "it_has_public_ip": "55.55.55.55",
    "it_has_private_ip": "172.1.1.1",
    "it_has_key": "eu-west",
    "it_has_private_dns": "ip-172-1-1-1.eu-west-1.compute.internal",
    "it_has_public_dns": "ec2-55-55-55-55.eu-west-1.compute.amazonaws.com",
    "it_has_instance_profile": "arn:aws:iam::1234567890:instance-profile/eu-west-1-stg-payments-iam-profile",
    "it_has_instance_profile_roles": "arn:aws:iam::1234567890:role/eu-west-1-stg-payments-iam-role",
    "its_associated_with_security_groups": [
      "sg-020cc749a58678e05",
      "sg-0654072398ccc9c1b"
    ],
    "its_associated_with_security_group_rules_ingress_unrestricted": [
      {
        "SecurityGroupRuleId": "sgr-0e6cd39169dc137ab",
        "GroupId": "sg-020cc749a58678e05",
        "GroupOwnerId": "1234567890",
        "IsEgress": false,
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "CidrIpv4": "0.0.0.0/0",
        "Tags": []
      }
    ],
    "its_associated_with_security_group_rules_egress_unrestricted": [],
    "is_instance_metadata_v2": false,
    "is_instance_metadata_hop_limit_1": true,
    "its_associated_with_ebs": [
      "vol-05b789518e476522e",
      "vol-0a45e1ac1a2ca73c0"
    ],
    "its_associated_with_ebs_unencrypted": [
      "vol-0a45e1ac1a2ca73c0"
    ],
    "its_associated_with_an_asg": "stg-payments-20201205160228428400000002",
    "its_associated_with_an_asg_launch_configuration": false,
    "its_associated_with_an_asg_launch_template": {
      "LaunchTemplateId": "lt-06b73d2e77f10446f",
      "LaunchTemplateName": "stg-payments-20221125162630246900000001",
      "Version": "1"
    },
    "is_public": "55.55.55.55",
    "is_encrypted": false,
    "is_running": true
  },

For this EC2 instance, we know it is associated with two Security Groups, and MetaHub goes deeper by inspecting their rules. As a result, we now know that this instance is exposed to the Internet on port 22/tcp.

"its_associated_with_security_groups": [
  "sg-020cc749a58678e05",
  "sg-0654072398ccc9c1b"
],
"its_associated_with_security_group_rules_ingress_unrestricted": [
  {
    "SecurityGroupRuleId": "sgr-0e6cd39169dc137ab",
    "GroupId": "sg-020cc749a58678e05",
    "GroupOwnerId": "1234567890",
    "IsEgress": false,
    "IpProtocol": "tcp",
    "FromPort": 22,
    "ToPort": 22,
    "CidrIpv4": "0.0.0.0/0",
    "Tags": []
  }
],
"its_associated_with_security_group_rules_egress_unrestricted": [],
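To turn those rules into a short exposure summary, something like the following could be used. It is a sketch that assumes each rule has the fields shown in the excerpt (IsEgress, IpProtocol, FromPort, ToPort, CidrIpv4) and optionally CidrIpv6; the function name is hypothetical.

```python
def exposed_port_ranges(rules):
    """Return (protocol, from_port, to_port) for ingress rules open to any address."""
    exposed = []
    for rule in rules:
        open_to_world = (
            rule.get("CidrIpv4") == "0.0.0.0/0" or rule.get("CidrIpv6") == "::/0"
        )
        # Skip egress rules and rules restricted to specific address ranges.
        if rule.get("IsEgress") or not open_to_world:
            continue
        exposed.append((rule["IpProtocol"], rule["FromPort"], rule["ToPort"]))
    return exposed


rules = [
    {
        "SecurityGroupRuleId": "sgr-0e6cd39169dc137ab",
        "GroupId": "sg-020cc749a58678e05",
        "IsEgress": False,
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "CidrIpv4": "0.0.0.0/0",
    }
]

print(exposed_port_ranges(rules))
# prints: [('tcp', 22, 22)]
```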

We are also checking the EBS volumes this instance is associated with to see whether any of them are unencrypted:

"its_associated_with_ebs": [
  "vol-05b789518e476522e",
  "vol-0a45e1ac1a2ca73c0"
],
"its_associated_with_ebs_unencrypted": [
  "vol-0a45e1ac1a2ca73c0"
],

You can also see whether the instance is part of an Auto Scaling group:

"its_associated_with_an_asg": "stg-payments-20201205160228428400000002",
"its_associated_with_an_asg_launch_configuration": false,
"its_associated_with_an_asg_launch_template": {
  "LaunchTemplateId": "lt-06b73d2e77f10446f",
  "LaunchTemplateName": "stg-payments-20221125162630246900000001",
  "Version": "1"
},

Or what IAM roles it’s associated with:

"it_has_instance_profile": "arn:aws:iam::1234567890:instance-profile/eu-west-1-stg-payments-iam-profile",
"it_has_instance_profile_roles": "arn:aws:iam::1234567890:role/eu-west-1-stg-payments-iam-role",

And a lot of other useful information:

"it_has_public_ip": "55.55.55.55",
"it_has_private_ip": "172.1.1.1",
"it_has_key": "eu-west",
"it_has_private_dns": "ip-172-1-1-1.eu-west-1.compute.internal",
"it_has_public_dns": "ec2-55-55-55-55.eu-west-1.compute.amazonaws.com",

Lastly, two magic MetaChecks are defined for all Resources: is_public and is_encrypted. These MetaChecks are populated from other MetaChecks but exist for every ResourceType, so it’s possible to filter by them across any AWS ResourceType.

To obtain a comprehensive list of all publicly accessible resources that are unencrypted, a MetaHub query can be executed in the following manner:

./metahub --list-findings --meta-checks --mh-filters-checks is_public=True is_encrypted=False
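Conceptually, a filter such as is_public=True matches when the MetaCheck holds any truthy value (an IP, a non-empty list), and is_encrypted=False matches when it is false or empty. The sketch below illustrates that idea only; it is not MetaHub's implementation, and both function names are hypothetical.

```python
def parse_filters(args):
    """Turn ['is_public=True', 'is_encrypted=False'] into a name -> bool mapping."""
    filters = {}
    for arg in args:
        name, _, value = arg.partition("=")
        filters[name] = value.strip().lower() == "true"
    return filters


def matches(metachecks, filters):
    """True when every filter agrees with the truthiness of the MetaCheck value."""
    return all(
        bool(metachecks.get(name)) == expected for name, expected in filters.items()
    )


filters = parse_filters(["is_public=True", "is_encrypted=False"])
instance = {"is_public": "55.55.55.55", "is_encrypted": False}
print(matches(instance, filters))
# prints: True
```

Treating any non-empty value as True is what lets a MetaCheck carry data and still be filterable as a boolean.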

For example, for EC2 Instances,

  • is_public: It’s True if the MetaChecks it_has_public_ip and its_associated_with_security_group_rules_ingress_unrestricted are both True.
  • is_encrypted: It’s True if the MetaCheck its_associated_with_ebs_unencrypted is False.

"is_public": "55.55.55.55",
"is_encrypted": false,
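The two rules for EC2 instances can be written down as a small function. This is a sketch of the logic as described here, not MetaHub's source: the key names are taken from the excerpts above, and the function name is hypothetical.

```python
def derive_ec2_summary(metachecks):
    """Combine individual MetaChecks into is_public and is_encrypted for an EC2 instance."""
    public_ip = metachecks.get("it_has_public_ip")
    open_rules = metachecks.get(
        "its_associated_with_security_group_rules_ingress_unrestricted"
    )
    unencrypted = metachecks.get("its_associated_with_ebs_unencrypted")
    return {
        # Public only if it has a public IP AND an unrestricted ingress rule.
        "is_public": public_ip if public_ip and open_rules else False,
        # Encrypted only if no attached EBS volume is unencrypted.
        "is_encrypted": not unencrypted,
    }


metachecks = {
    "it_has_public_ip": "55.55.55.55",
    "its_associated_with_security_group_rules_ingress_unrestricted": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "CidrIpv4": "0.0.0.0/0"}
    ],
    "its_associated_with_ebs_unencrypted": ["vol-0a45e1ac1a2ca73c0"],
}

print(derive_ec2_summary(metachecks))
# prints: {'is_public': '55.55.55.55', 'is_encrypted': False}
```

An instance with a public IP but no unrestricted ingress rule would come out as not public under this logic, which is the point of combining the two checks.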

With all this information, you can better understand what this affected resource is doing in your context. This instance is not only affected by the four findings reported by Prowler, but we also went further. We found that it is associated with open Security Groups and unencrypted EBS volumes (different affected resources connected to each other), which makes the findings much more important in context.

Once you’ve grasped the potential of MetaHub to enhance your Security Findings, you can leverage its capabilities to automate actions based on various fields. Check these examples:

  • Combine Prowler findings with other scanners in the same output
  • Automatically create security tickets with context and eliminate duplicates for Prowler findings, generating one ticket per problem instead of per check.
  • Assign Prowler findings owners based on MetaTags or MetaTrails
  • Set up alerts for Prowler findings on specific services or environments based on MetaChecks or MetaTags
  • Send Prowler security findings directly to the person who created the resource via Slack Messages using MetaTrails.
  • Filter Prowler findings by Service, Owner, Team, or any MetaCheck or MetaTag, and generate HTML or CSV reports.
  • Use the is_public MetaCheck output to feed publicly exposed resources from Prowler findings into network scanners like Nmap.
  • Create a customized dashboard using the HTML output mode and the options --output-meta-tags-columns and --output-meta-checks-columns.
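As an example of the Nmap idea, the sketch below collects public IPs from a MetaHub JSON output into a target list. It assumes the output is an object keyed by resource ARN with a "metachecks" section per resource, as in the excerpts above; the function name, the second resource, and the file name targets.txt are hypothetical.

```python
def nmap_targets(resources):
    """Collect the public IPs of resources flagged as public, sorted and deduplicated."""
    targets = set()
    for details in resources.values():
        checks = details.get("metachecks", {})
        if checks.get("is_public") and checks.get("it_has_public_ip"):
            targets.add(checks["it_has_public_ip"])
    return sorted(targets)


resources = {
    "arn:aws:ec2:eu-west-1:1234567890:instance/i-0f3ac8dbb261aabeb": {
        "metachecks": {"is_public": "55.55.55.55", "it_has_public_ip": "55.55.55.55"}
    },
    # Made-up second instance that is not public, so it is skipped.
    "arn:aws:ec2:eu-west-1:1234567890:instance/i-00000000000000000": {
        "metachecks": {"is_public": False, "it_has_public_ip": False}
    },
}

targets = nmap_targets(resources)
print("\n".join(targets))
# prints: 55.55.55.55
```

Writing that list to a file such as targets.txt, one address per line, gives Nmap an input it can read with its -iL option.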

Happy Hunting!
