Easy analytics with AWS Quicksight, Kinesis and Terraform (Part 1)

Jack Mahoney
Feb 19
Streaming data events from your application to an analytics dashboard is easy with Quicksight.

Having a way to view and analyse metrics from your application is essential for monitoring product health and engagement. There are tons of solutions out there, from Google Analytics to MixPanel, but for my latest project I chose AWS Quicksight. It lets you manage your own data (in S3) and plays well with existing AWS services. It is also essentially free compared with other offerings.

The scenario

What my application does to illustrate the events I want to analyse

I recently started rebuilding a SaaS product called MailSlurp. It’s a REST API that lets developers send and receive emails in code from ephemeral inboxes. It is written as a series of Kotlin Spring microservices and uses AWS and Terraform extensively: namely SQS, SNS, SES, Elastic Beanstalk and S3.

I wanted to chart user engagement and key events that occur in my application’s life cycle — such as emails sent, emails received, user signups, and user interactions. I also wanted to do complex analysis on this data at a later point. This requirement meant platforms like Google Analytics and MixPanel weren’t appropriate.

Because I’m quite familiar with AWS, I decided to give Quicksight a try.

How does Quicksight work?

An example of what AWS Quicksight dashboards look like

Essentially, Quicksight is an AWS webapp that lets you build charts and dashboards that query JSON stored in S3 in a particular structure. At first this sounds a little odd — no database? — but it actually gives you a lot of flexibility. As long as you use the correct folder structure (folders representing date and time), you can simply save JSON objects to S3 for each application event and query them later from Quicksight.

Luckily the event → S3 process is handled out of the box by AWS Kinesis Firehose. Once we have this set up we can post events to the Kinesis endpoint and Kinesis will store them in S3 in a format that Quicksight understands.
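By default Firehose delivers objects under a UTC date-based prefix, which is exactly the folder structure Quicksight expects. Assuming var.name (defined below) is mailslurp, a delivered object key would look something like this (the timestamp and random suffix here are illustrative):

mailslurp-event-stream/2020/02/19/10/mailslurp-s3-stream-1-2020-02-19-10-15-00-<random-id>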

So here is the plan:

(App:sendEvent) --> (KinesisFirehose) --> (S3) --> (Quicksight) 

Setting it all up

Now we could just go and create all this infrastructure in the AWS console. That’s totally fine, but personally I find writing infrastructure as code is a much more reliable and manageable solution. If you haven’t tried Terraform before I highly recommend you do, even for personal projects.

First let’s define a few variables:

variable "name" {}

Next, let’s set up an S3 bucket for our events to be stored in:

resource "aws_s3_bucket" "bucket" {
bucket = "${var.name}-event-stream"
acl = "private"
}

Now we need to create a Kinesis Firehose for delivering the events to that bucket. Notice how the s3_configuration block has a role_arn and CloudWatch logging options. We’ll come to those next!

resource "aws_kinesis_firehose_delivery_stream" "s3_stream" {
name = "${var.name}-s3-stream"
destination = "s3"
s3_configuration {
role_arn = "${aws_iam_role.firehose_role.arn}"
bucket_arn = "${aws_s3_bucket.bucket.arn}"
cloudwatch_logging_options {
enabled = true
log_group_name = "${aws_cloudwatch_log_group.firehose.name}"
log_stream_name = "${aws_cloudwatch_log_stream.firehose.name}"
}
}
}

Permissions (IAM not kidding)

So now that we have defined our Firehose and S3 bucket in Terraform we need to give each entity the correct permissions for streaming events. Permissions in AWS are managed by IAM roles and IAM policies.

Let’s define a role for our Firehose endpoint. The role is required if we want to customize the permissions that Firehose has when interacting with other infrastructure. Notice in the prior example how this role was referenced as role_arn. That assignment connects the role with our particular Firehose.

resource "aws_iam_role" "firehose_role" {
name = "${var.name}-firehouse-role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "firehose.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}

Now we want to attach some permissions to the role we created. We want Firehose to access our S3 bucket and also to log errors to CloudWatch. Let’s enable that:

resource "aws_cloudwatch_log_group" "firehose" {
name = "${var.name}-firehose"
}
resource "aws_cloudwatch_log_stream" "firehose" {
name = "${var.name}-firehose-log-stream"
log_group_name = "${aws_cloudwatch_log_group.firehose.name}"
}
resource "aws_iam_policy" "firehose" {
name = "${var.name}-firehouse-policy"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement":
[
{
"Effect": "Allow",
"Action": [
"s3:AbortMultipartUpload",
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListBucket",
"s3:ListBucketMultipartUploads",
"s3:PutObject"
],
"Resource": [
"${aws_s3_bucket.bucket.arn}",
"${aws_s3_bucket.bucket.arn}/*"
]
},
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents"
],
"Resource": [
"${aws_cloudwatch_log_group.firehose.arn}",
"${aws_cloudwatch_log_stream.firehose.arn}"
]
}
]
}
EOF
}

Now, lastly, we need to attach this policy to the role we created. Here goes:

resource "aws_iam_role_policy_attachment" "firehose-attach" {
role = "${aws_iam_role.firehose_role.name}"
policy_arn = "${aws_iam_policy.firehose.arn}"
}

Hey presto — we have the infrastructure we need. Run terraform apply and you should see an S3 bucket and a Kinesis Firehose get created in AWS. If you navigate to the Kinesis dashboard in the AWS Console you can send test events and watch them stream into the bucket we created. Now we just need to replace those test events with the real events from our application.

Sending events to Kinesis Firehose

We can send JSON to our Kinesis endpoint from any application. The JSON can be any format we like.

The great thing about Quicksight is that it doesn’t impose any schema on our events. We can send it any valid JSON and adjust our queries later in Quicksight to reflect the data. The easiest way to get JSON to Kinesis is by using the AWS SDKs in your language of choice, but I’ll show you how I did it in MailSlurp.
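For illustration, an event payload might look something like this (the field names here are hypothetical, not MailSlurp’s actual schema):

{ "name": "email_sent", "inboxId": "abc-123", "timestamp": "2020-02-19T10:15:00Z" }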

As MailSlurp is written in Kotlin and uses events to create new email addresses, send emails, and process incoming emails, I decided to take an asynchronous approach to event tracking. Here is an example of the code I use:

@Service
class KinesisNotifier {

    private val mapper = MsObjectMapper().mapper

    private val log = logger()

    @Autowired
    lateinit var appConfig: AppConfig

    @Autowired
    lateinit var kinesis: KinesisClient

    // this method is invoked whenever a MailSlurp event occurs.
    // I dispatch MailSlurp events after different
    // actions take place in my app
    @Async
    @EventListener
    fun sendAllEventsToFirehose(event: MsEvent) {
        val jsonBytes = mapper.writeValueAsBytes(event)
        val data = ByteBuffer.wrap(jsonBytes)
        kinesis.client.putRecord(PutRecordRequest()
            .withDeliveryStreamName(appConfig.kinesisStreamName)
            .withRecord(Record().withData(data))
        )
        log.info("Kinesis:putRecord ${event.getName()} -> ${appConfig.kinesisStreamName}")
    }
}
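For context, here is a minimal sketch of how such an event might be defined and published so that the listener above receives it. This is illustrative only; MsEvent and EmailSentEvent as written here are assumptions, not MailSlurp’s actual classes:

import org.springframework.context.ApplicationEventPublisher
import org.springframework.stereotype.Service

// hypothetical event types for illustration (not MailSlurp's actual classes)
abstract class MsEvent(private val name: String) {
    fun getName(): String = name
}

class EmailSentEvent(val inboxId: String) : MsEvent("email_sent")

@Service
class EmailService(private val publisher: ApplicationEventPublisher) {

    fun sendEmail(inboxId: String) {
        // ... send the email ...

        // publish the event; Spring delivers it to @EventListener methods
        // like sendAllEventsToFirehose above
        publisher.publishEvent(EmailSentEvent(inboxId))
    }
}

Because the listener is annotated with @Async, publishing is effectively fire-and-forget: provided async processing is enabled (@EnableAsync), the action that triggered the event is not blocked by the call to Kinesis.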

Putting it all together

In Part 2 I’ll show you different ways to graph the data in Quicksight, but the hard part is done. To recap: Quicksight lets you build charts based on schema-less JSON data stored in S3. The best way to get that data into S3 is to use Kinesis Firehose to stream incoming JSON into a directory structure Quicksight understands. The easiest way to set up the infrastructure for this is Terraform. Finally, to send events to Kinesis I recommend using the AWS SDK to call PutRecord with JSON bodies.

Let me know what you think so far, and see you in Part 2 when it’s ready.

Written by Jack Mahoney

My personal programming and development blog. More at https://dev.jackmahoney.me
