0 to 1 Billion with StreamAlert
Update: For Enterprise-grade security monitoring, check out the successor to StreamAlert, called Panther!
Since the announcement post in January of 2017, StreamAlert has accumulated over 1700 commits, 1200 stars, 500 PRs merged, and contributions from 15 members of the open-source community.
The project has matured significantly since then, and today I am here to explain how you can go from 0 to 1 Billion (logs) with StreamAlert.
Disclaimer: This guide assumes basic knowledge/experience with Python development and Amazon Web Services.
What is StreamAlert?
StreamAlert is a serverless, real-time data analysis engine built on AWS services such as Lambda and Kinesis. StreamAlert excels at analyzing a high throughput of data, classifying it into defined types, evaluating rules, and routing alerts to their proper destinations.
StreamAlert can analyze data from any part of an environment: the cloud, the network, an operating system, or an application. If data can be sent to Kinesis, S3, or SNS, it can be consumed by StreamAlert.
This guide will focus on Kinesis Streams, as it scales trivially when paired with Lambda.
Getting Started
First things first, let's get Python set up. We will need Python 2.7 and pip. I recommend virtualenv, or a package management tool like virtualenvwrapper or conda. The purpose of using a virtual environment is that you can isolate dependencies to use within specific projects.
Once your Python environment is set up to your liking, download and install Terraform, an infrastructure management tool by HashiCorp. At the time of this writing, the current version is 0.11.7. Terraform is used to create all the necessary AWS infrastructure to support billions of logs per day.
Clone StreamAlert:
$ git clone https://github.com/airbnb/streamalert.git
or
$ git clone --branch stable https://github.com/airbnb/streamalert.git
We are going to live on the edge and use the master branch. If you want to ensure a completely stable environment, use the stable branch.
Install dependencies and activate your environment (w/ virtualenv):
$ cd streamalert
$ pip install --user virtualenv
$ virtualenv -p python2.7 venv
$ source venv/bin/activate
$ pip install -r requirements.txt
To verify things are working as expected, try running unit tests locally with ./tests/scripts/unit_tests.sh.
Planning Your Deployment
Before anything is deployed, make sure you have an AWS account set up and the preliminary config options, such as prefix and account_id, filled in. Once that's squared away, it's time to think about where your data is.
Does your data originate from other AWS managed services like CloudTrail or CloudWatch? If so, they can be directly streamed to Kinesis.
Or maybe your data exists on hundreds/thousands of virtual machines or physical hosts. If so, I would recommend learning about how tools like Fluentd or the Kinesis Agent work.
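If a full shipping agent is overkill, logs can also be pushed to Kinesis directly with boto3. Below is a minimal, hypothetical producer sketch; the stream name and region are placeholders, and AWS credentials are assumed to be configured in the environment:

```python
import json


def build_entries(log_lines, partition_key='streamalert'):
    """Convert raw log lines into Kinesis PutRecords entries."""
    return [
        {'Data': json.dumps(line), 'PartitionKey': partition_key}
        for line in log_lines
    ]


def send_logs(stream_name, log_lines, region='us-east-1'):
    """Send up to 500 log lines to a Kinesis stream in one PutRecords call."""
    import boto3  # imported here; requires AWS credentials to actually send
    client = boto3.client('kinesis', region_name=region)
    return client.put_records(
        StreamName=stream_name,
        Records=build_entries(log_lines),
    )
```

A real deployment would add batching beyond 500 records, retry handling for throttled entries, and a partition key that spreads writes across shards.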
Once you have an understanding of where the data is, we can design a cluster and set of schemas for your deployment.
Clusters
StreamAlert can be segmented into clusters, which are sets of ingestion and processing infrastructure. Each module in a cluster represents a piece of the overall infrastructure. Modules can also be mixed and matched to meet your specific needs.
For this example, let’s use a single cluster with a Kinesis stream as the main delivery mechanism for data processing:
# conf/clusters/prod.json
{
"id": "prod",
"modules": {
"cloudwatch_monitoring": {
"enabled": true
},
"kinesis": {
"streams": {
"create_user": true,
"retention": 24,
"shards": 12
}
},
"kinesis_events": {
"batch_size": 500,
"enabled": true
},
"stream_alert": {
"rule_processor": {
"current_version": "$LATEST",
"enable_metrics": true,
"log_level": "info",
"memory": 128,
"timeout": 60
}
}
},
"region": "us-east-1"
}
Let’s break down this file:
- A Kinesis-based cluster called prod.
- Located in the us-east-1 region.
- The Kinesis Stream has 12 shards with a retention period of 24 hours.
- The Stream has a max throughput of 1+ billion logs per day: 1,000 records/sec * 12 shards * 86,400 sec/day = 1,036,800,000 records/day.
- Kinesis Events are enabled, which means any records sent into this Stream will be processed by the StreamAlert rule processor.
- The rule processor Lambda function will run for a maximum of 60 seconds, use 128 MB of memory, and process up to 500 records from Kinesis in a single invocation.
- CloudWatch Monitoring is enabled on Lambda and Kinesis, which ensures that an alarm will be created if anything goes wrong.
Build
To create this cluster and initialize your StreamAlert deployment, run:
$ python manage.py terraform init
At each prompt, type yes. Terraform will create all the necessary infrastructure to ingest, analyze, and alert upon data sent to this cluster's Kinesis Stream.
Sending Data
To keep things simple, let's use a built-in StreamAlert module to capture AWS account activity. Built-in StreamAlert rules can then detect suspicious activity within the AWS account.
Open the conf/sources.json file, and modify it to account for incoming CloudWatch/CloudTrail data:
{
"kinesis": {
"<YOUR-PREFIX>_prod_stream_alert_kinesis": {
"logs": [
"cloudwatch"
]
}
}
}
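A typo in this file will silently break classification, so a quick parse check is worthwhile. The snippet below is an illustrative sketch: the inline string stands in for reading your real conf/sources.json, and the prefix is a placeholder:

```python
import json

# Illustrative stand-in for open('conf/sources.json').read()
sample = '''
{
  "kinesis": {
    "example_prod_stream_alert_kinesis": {
      "logs": ["cloudwatch"]
    }
  }
}
'''


def kinesis_log_types(config_text):
    """Return the set of log types declared across all Kinesis sources."""
    config = json.loads(config_text)
    logs = set()
    for source in config.get('kinesis', {}).values():
        logs.update(source.get('logs', []))
    return logs


print(kinesis_log_types(sample))  # {'cloudwatch'}
```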
Add the following to the conf/clusters/prod.json file (as seen above) under the modules section:
"cloudtrail": {
"enable_logging": true,
"enable_kinesis": true
},
And then run the following commands to set it up:
$ python manage.py lambda deploy --processor rule
$ python manage.py terraform build --target cloudtrail
To verify data is being sent to Kinesis, open the CloudWatch metrics browser and load the IncomingRecords metric for your Kinesis Stream:
The rule processor Lambda function can also be observed to verify invocations are occurring without error:
Alerting on Activity
Now that StreamAlert is bootstrapped and receiving AWS API events from the Kinesis Stream, the end-to-end flow can be verified by triggering certain actions within the account.
By default, StreamAlert ships with built-in rules for CloudTrail. Trigger an alert by configuring a Security Group to allow SSH from anywhere in the world (please undo this after):
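Under the hood, the built-in rule's logic amounts to checking the CloudTrail event for an ingress rule open to 0.0.0.0/0. A standalone, simplified sketch of that predicate (the real rule uses StreamAlert's rule decorator and schema; the field layout below follows CloudTrail's AuthorizeSecurityGroupIngress event):

```python
def security_group_ingress_anywhere(record):
    """Return True if a CloudTrail event opens a security group to the world."""
    detail = record.get('detail', {})
    if detail.get('eventName') != 'AuthorizeSecurityGroupIngress':
        return False
    permissions = (detail.get('requestParameters', {})
                         .get('ipPermissions', {})
                         .get('items', []))
    for permission in permissions:
        for ip_range in permission.get('ipRanges', {}).get('items', []):
            if ip_range.get('cidrIp') == '0.0.0.0/0':
                return True
    return False


# Trimmed-down example event, as CloudWatch would deliver it
sample_event = {
    'detail': {
        'eventName': 'AuthorizeSecurityGroupIngress',
        'requestParameters': {
            'ipPermissions': {
                'items': [{'ipRanges': {'items': [{'cidrIp': '0.0.0.0/0'}]}}]
            }
        }
    }
}
print(security_group_ingress_anywhere(sample_event))  # True
```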
And a couple of minutes later, the alert will be searchable in the Athena Console:
SELECT id, rule_name, log_source,
  json_extract_scalar(record['detail'], '$.eventname') AS event_name,
  json_extract_scalar(record['detail'], '$.eventsource') AS event_source,
  json_extract_scalar(record['detail'], '$.useridentity.username') AS username
FROM "<YOUR-PREFIX>_streamalert"."alerts"
WHERE rule_name = 'cloudtrail_security_group_ingress_anywhere';
What’s Next
Congratulations! If you have made it this far, it means you have a functional and working setup of StreamAlert, and are on your way to analyzing a billion logs per day.
This current implementation can be scaled even further by adding additional log sources into Kinesis, and more rules to add value to your monitoring pipeline.
This guide has only scratched the surface of what is possible with StreamAlert. In upcoming posts, I will go deeper on many of the concepts introduced here.
To stay connected with the StreamAlert community, use #streamalert on Twitter, and join us on Slack. If development of StreamAlert is interesting and appealing to you, Airbnb is hiring security engineers!