Steampipe: Multi-Cloud Compliance as Code

Alex Anto
6 min read · Aug 29, 2022

Cloud compliance is the general principle that cloud-hosted systems must comply with the standards and regulations that apply to their customers.

  • Cloud adoption has grown rapidly, and most workloads have already been moved to the cloud. The well-known cloud providers offer a range of built-in services to improve availability, scalability, performance and security.
  • A well-protected cloud infrastructure needs to follow established best practices and security standards.
  • If you do not follow them, the infrastructure is not secured, which leaves it open to intruders.
  • Each cloud provider's built-in compliance tool is helpful for tracking compliance on that provider. For a multi-cloud setup, however, a centralized tool gives better visibility and keeps better track of the entire infrastructure.

What is Steampipe?

Steampipe is open source software for interrogating your cloud. Run SQL queries, compliance controls and full governance benchmarks from the comfort of the CLI.

Advantages:

  • Multi-cloud compliance
  • 300+ data sources
  • Open source
  • SQL queries that are easy to write
  • Multiple output formats such as table, JSON, HTML and CSV
  • Custom benchmarks
  • Compliance checks across many services, including Kubernetes
  • Strong community support

How Steampipe works:

Steampipe leverages PostgreSQL Foreign Data Wrappers to provide a SQL interface to external services and systems. It uses an embedded PostgreSQL database (currently, version 14.2.0), and you can use standard Postgres syntax to query Steampipe. By default, when you run steampipe query, it will start the database and shut it down at the end of the query command or session. The database only listens on the loopback address.

1. Installation and configuration

##Linux
sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)"
Version check
$ steampipe -v
steampipe version 0.15.4
##Windows
Steampipe runs on Windows 10 via Windows Subsystem for Linux (WSL 2.0).

2. Authn & Authz

Steampipe uses each provider's default methods of getting credentials from the credential file and/or environment variables. It picks up the keys present at the default locations (.aws/config, .oci/config etc.) or the environment variables set in the current user's account.
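Beyond the default credential locations, a plugin connection can also be pinned to a named profile in a connection file, which Steampipe reads from ~/.steampipe/config/. The following is a minimal sketch, assuming the AWS plugin's profile and regions connection arguments. The profile name audit-readonly and the region list are hypothetical placeholders, and the file is written to a scratch directory so that no real configuration is overwritten:

```shell
# Sketch only: write an example AWS connection file to a scratch directory.
# "audit-readonly" and the region list are hypothetical placeholders.
scratch_dir="./steampipe-config-example"
mkdir -p "$scratch_dir"

cat > "$scratch_dir/aws.spc" <<'EOF'
connection "aws" {
  plugin  = "aws"
  profile = "audit-readonly"            # hypothetical profile from ~/.aws/config
  regions = ["us-east-1", "eu-west-1"]  # hypothetical region list
}
EOF

echo "Review $scratch_dir/aws.spc, then copy it to ~/.steampipe/config/aws.spc"
```

Keeping the example in a scratch directory makes it safe to run as-is; the copy into ~/.steampipe/config/ is left as a deliberate manual step.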

  • Some Steampipe terminology
locals - Locals are internal, module-level variables.
mod - The mod block contains metadata, documentation, and dependency data for the mod.
query - Queries define common SQL statements that may be used alone or referenced by arguments in other blocks like reports and actions.
control - Controls provide a defined structure and interface for queries that draw a specific conclusion (e.g. 'ok', 'alarm') about each row.
benchmark - Benchmarks provide a mechanism for organizing controls into hierarchical structures.
variable - Variables are module-level objects that essentially act as parameters for a module.

3. Plugins

  • Plugins extend Steampipe to work with many different services and providers. A wide variety of plugins is available at https://hub.steampipe.io/plugins
  • Examples include aws, azure, gcp, oci, alicloud, github and terraform.
  • To install a plugin, run steampipe plugin install followed by the plugin name:
$ steampipe plugin install aws
aws                  [====================================================================] Done
Installed plugin: aws@latest v0.74.0
Documentation: https://hub.steampipe.io/plugins/turbot/aws

4. Access Steampipe and run SQL queries

  • Interactive query shell: run steampipe query on the machine where Steampipe is installed.
  • Each plugin provides a set of SQL tables. To list all the available tables, use .tables in the query shell.
  • To inspect the fields in a table, use .inspect <tablename>
  • To query a specific table: select * from oci_core_instance;
  • To query with where conditions: select * from aws_ec2_instance where instance_type='t2.micro' and instance_state='running'
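The same queries can also be run without the interactive shell by passing a .sql file to steampipe query. The following is a minimal sketch, assuming the aws plugin is installed and credentials are configured. The file name running_micro_instances.sql is a hypothetical choice, and the columns are the ones used in the aws_ec2_instance examples in this article:

```shell
# Save the query to a file (the file name is a hypothetical choice).
cat > running_micro_instances.sql <<'EOF'
select
  instance_id,
  instance_type,
  region
from
  aws_ec2_instance
where
  instance_type = 't2.micro'
  and instance_state = 'running';
EOF

# Run it non-interactively if Steampipe is available on this machine.
if command -v steampipe >/dev/null 2>&1; then
  steampipe query running_micro_instances.sql --output csv
else
  echo "steampipe not found; query saved to running_micro_instances.sql"
fi
```

Keeping queries in files makes them easy to version and to reuse from scripts.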

5. Queries with CASE expression

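  • Steampipe controls expect each query to return resource, status and reason columns, and a CASE expression is the usual way to derive status and reason. This is required to get the final report in the proper format.
  • For example, to check that detailed monitoring is enabled for all running instances: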
select
  -- Required Columns
  i.arn as resource,
  case
    when i.monitoring_state = 'enabled' then 'ok'
    else 'alarm'
  end status,
  case
    when i.monitoring_state = 'enabled' then i.instance_id || ' detailed monitoring enabled.'
    else i.instance_id || ' detailed monitoring disabled.'
  end reason,
  -- Additional Dimensions
  i.region,
  i.account_id
from
  aws_ec2_instance as i
where
  i.instance_state = 'running'

6. Controls

  • Controls use queries to gather data. Each control wraps a query, so related checks can be collected and run together.
  • To work with controls, create a directory and initialize a mod in it. Mod initialization creates a mod.sp file, which lets Steampipe run the mod's queries and controls from the terminal.

Mod: a portable, versioned collection of related Steampipe resources such as dashboards, benchmarks, queries, and controls. Steampipe mods and mod resources are defined in HCL, and distributed as simple text files.

  • Take the previous monitoring query as an example and turn it into a control.
1. Create a directory in the terminal and initialize the mod
$ mkdir steampipe-aws
$ cd steampipe-aws/
$ steampipe mod init
Created mod definition file '/home/ec2-user/steampipe-aws/mod.sp'
$ cat mod.sp
mod "local" {
title = "steampipe-aws"
}
2. Create the control file with the name ec2_monitoring.sp. The file extension must be .sp. Use the content below, and make sure the SQL query is correctly formatted.
control "ec2_monitoring" {
  title = "EC2 Monitoring is Enabled/Disabled Alert"
  sql = <<EOT
    select
      -- Required Columns
      i.arn as resource,
      case
        when i.monitoring_state = 'enabled' then 'ok'
        else 'alarm'
      end status,
      case
        when i.monitoring_state = 'enabled' then i.instance_id || ' detailed monitoring enabled.'
        else i.instance_id || ' detailed monitoring disabled.'
      end reason,
      -- Additional Dimensions
      i.region,
      i.account_id
    from
      aws_ec2_instance as i
    where
      i.instance_state = 'running'
EOT
}
  • To run the control:
$ steampipe check control.ec2_monitoring
  • We can also keep all the queries in a single folder and reference the required query from the control. Create a folder called query under the current directory and save the SQL query inside it with the .sql extension.
[ec2-user@ip-172-31-87-235 steampipe-aws]$ cat ec2_monitoring.sp
control "ec2_monitoring" {
  title = "EC2 Monitoring is Enabled/Disabled Alert"
  sql = query.ec2_monitoring.sql
}
[ec2-user@ip-172-31-87-235 steampipe-aws]$
[ec2-user@ip-172-31-87-235 steampipe-aws]$ tree
.
├── ec2_monitoring.sp
├── mod.sp
└── query
    └── ec2_monitoring.sql

1 directory, 3 files
  • Run the command steampipe check control.ec2_monitoring again to get the same output as before.
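The tree listing shows where the query file lives but not what it contains. The following is a sketch of creating it, assuming the file simply holds the same SQL that was previously inlined in the control (the article does not show the file itself):

```shell
# Create the query folder and save the monitoring query as a .sql file.
# Assumption: the file holds the same SQL that was inlined in the control earlier.
mkdir -p query

cat > query/ec2_monitoring.sql <<'EOF'
select
  -- Required Columns
  i.arn as resource,
  case
    when i.monitoring_state = 'enabled' then 'ok'
    else 'alarm'
  end status,
  case
    when i.monitoring_state = 'enabled' then i.instance_id || ' detailed monitoring enabled.'
    else i.instance_id || ' detailed monitoring disabled.'
  end reason,
  -- Additional Dimensions
  i.region,
  i.account_id
from
  aws_ec2_instance as i
where
  i.instance_state = 'running'
EOF

echo "Wrote query/ec2_monitoring.sql"
```

The file name without its extension, ec2_monitoring, is what the control refers to as query.ec2_monitoring.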

7. Benchmarks

  • The benchmark block provides a mechanism for grouping controls into control benchmarks, and into sections within a control benchmark.
  • Continuing the previous example, create a file called benchmark.sp and move the controls into it. Then list each control in the benchmark's children argument.
  • The children list can contain one or more controls.
$ cat benchmark.sp
benchmark "cis_ec2_instances" {
  title = "1 EC2 instances compliance checks"
  children = [
    control.ec2_monitoring
  ]
}
control "ec2_monitoring" {
  title = "EC2 Monitoring is Enabled/Disabled Alert"
  sql = query.ec2_monitoring.sql
}

Run the benchmark:

$ steampipe check benchmark.cis_ec2_instances
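Because a benchmark's children can also reference other benchmarks, benchmarks can be nested into sections. The following is a minimal sketch, assuming the benchmark.sp file from this section is already in the mod directory. The file name benchmark_parent.sp, the benchmark name aws_compliance and its title are hypothetical:

```shell
# Sketch only: add a parent benchmark that groups the existing EC2 benchmark.
# "aws_compliance" and benchmark_parent.sp are hypothetical names.
cat > benchmark_parent.sp <<'EOF'
benchmark "aws_compliance" {
  title = "AWS compliance checks"
  children = [
    benchmark.cis_ec2_instances
  ]
}
EOF

echo "Wrote benchmark_parent.sp"
```

The parent can then be run in the same way as any other benchmark, with steampipe check benchmark.aws_compliance.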

8. Other formats

  • By default, the output is formatted as a table. Other formats such as html, json and csv can be selected with --output to generate a report for stakeholders. When integrated into a CI job, the report can also be sent by email or to other notification channels.
$ steampipe check benchmark.cis_ec2_instances --output json
$ steampipe check benchmark.cis_ec2_instances --output html
$ steampipe check benchmark.cis_ec2_instances --output csv
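For the CI use case, the report can be written to a file by redirecting the command's output. The following is a minimal sketch, assuming it runs from the mod directory. The reports directory and the file name are hypothetical, and the exit status is only printed rather than interpreted, since its meaning should be checked against the Steampipe version in use:

```shell
# Sketch only: save a JSON compliance report for a later CI step to pick up.
report_dir="reports"                                # hypothetical output directory
report_file="$report_dir/cis_ec2_instances.json"    # hypothetical file name
mkdir -p "$report_dir"

if command -v steampipe >/dev/null 2>&1; then
  steampipe check benchmark.cis_ec2_instances --output json > "$report_file"
  status=$?
  echo "steampipe check exited with status $status; report saved to $report_file"
else
  echo "steampipe not found; skipping report generation"
fi
```

A later CI step can attach the saved file to an email or post it to a notification channel.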

9. Community Support
