Using Streaming and ML on AWS to Track Sentiments on Twitter

Rafael Campana
BRLink
3 min read · May 24, 2020


This article will lead you through using AWS Comprehend and Kinesis Data Firehose to ingest and analyse sentiment from social media.

*This article is based on this, but I’ve tried to simplify the architecture.

Why analyse social media engagement and sentiment? Let's assume you work for a global brand and need to track how people feel about new marketing actions in the media. Wouldn't you like to track how people feel about your brand without manual work? Or be alerted if some action goes wrong?

Ok, let's get our hands dirty:

1. Architecture

1.1 Get_twitter Lambda: gets data from the Twitter API and sends it to Firehose

1.2 Firehose: buffers data according to its configuration and sends it to a Lambda that adds sentiment information. After that, it saves the data to S3

1.3 Sentiment analyser Lambda: receives a batch of records from Firehose, sends them to AWS Comprehend, aggregates the results, and sends them back to Firehose

1.4 Data Catalog and Athena: a Glue crawler will catalog the data and create tables that can be queried with Athena

1.5 QuickSight: create dashboards from Athena queries

2. Create a Lambda Twitter Producer

This Lambda will get data from the Twitter API and send it to Firehose.

Optional: set up this Lambda with a CloudWatch Events trigger to generate data on a schedule. For example, a rate(1 minute) rule will call this Lambda every minute until you disable it.

The source code of the Lambda:
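A minimal sketch of such a producer (the `TWITTER_BEARER_TOKEN`, `TWITTER_QUERY`, and `FIREHOSE_STREAM` environment variable names and the Twitter v2 recent-search endpoint are assumptions, not the article's original code):

```python
import json
import os


def build_firehose_record(tweet: dict) -> bytes:
    """Serialise one tweet as a JSON line (newline-delimited,
    so Glue and Athena can later read the S3 objects)."""
    return (json.dumps({
        "id": tweet.get("id"),
        "text": tweet.get("text"),
        "created_at": tweet.get("created_at"),
    }) + "\n").encode("utf-8")


def lambda_handler(event, context):
    # boto3 and requests are imported lazily so the pure helper
    # above can be tested without the Lambda runtime dependencies
    import boto3
    import requests

    resp = requests.get(
        "https://api.twitter.com/2/tweets/search/recent",
        params={"query": os.environ.get("TWITTER_QUERY", "#aws")},
        headers={"Authorization": "Bearer " + os.environ["TWITTER_BEARER_TOKEN"]},
        timeout=10,
    )
    resp.raise_for_status()

    firehose = boto3.client("firehose")
    for tweet in resp.json().get("data", []):
        firehose.put_record(
            DeliveryStreamName=os.environ["FIREHOSE_STREAM"],
            Record={"Data": build_firehose_record(tweet)},
        )
```

Writing each record as a JSON line with a trailing newline matters: Firehose concatenates records into a single S3 object, and without the newline the Glue crawler cannot split them.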

If you want to use the online Lambda editor, you’ll need to create a layer for the “requests” module. Here I explain how to create a Lambda layer

3. Create a Lambda Sentiment Analyser

This Lambda will receive a batch of records from Firehose, add sentiment analysis, and send them back to Firehose.

Source code:
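A sketch of a Firehose data-transformation handler, assuming each record arrives as base64-encoded JSON with a `text` field (the field names are illustrative). Comprehend's BatchDetectSentiment accepts at most 25 documents per call, so the records are chunked:

```python
import base64
import json


def enrich(payload: dict, sentiment: dict) -> dict:
    """Merge one Comprehend result into the original tweet payload."""
    payload["sentiment"] = sentiment["Sentiment"]
    payload["sentiment_score"] = sentiment["SentimentScore"]
    return payload


def lambda_handler(event, context):
    import boto3  # lazy import keeps the pure helper testable outside Lambda

    comprehend = boto3.client("comprehend")
    records = event["records"]
    output = []

    # BatchDetectSentiment takes at most 25 documents per call
    for i in range(0, len(records), 25):
        batch = records[i:i + 25]
        payloads = [json.loads(base64.b64decode(r["data"])) for r in batch]
        result = comprehend.batch_detect_sentiment(
            # truncate to stay under Comprehend's per-document size limit
            TextList=[p["text"][:4000] for p in payloads],
            LanguageCode="en",
        )
        by_index = {item["Index"]: item for item in result["ResultList"]}
        for j, (record, payload) in enumerate(zip(batch, payloads)):
            enriched = enrich(payload, by_index[j])
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(
                    (json.dumps(enriched) + "\n").encode("utf-8")
                ).decode("utf-8"),
            })

    return {"records": output}
```

The response shape (`recordId`, `result`, base64 `data`) is what Firehose requires from a transformation Lambda; every incoming record must appear in the output, marked `Ok`, `Dropped`, or `ProcessingFailed`.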

Don’t forget to set up the IAM role with access to AWS Comprehend.
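A minimal policy sketch for that role (Comprehend's sentiment actions don't support resource-level permissions, so the resource is `*`):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "comprehend:DetectSentiment",
        "comprehend:BatchDetectSentiment"
      ],
      "Resource": "*"
    }
  ]
}
```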

4. Setting up Firehose

Now we need to set up Firehose to store data according to our needs. Create a new delivery stream.

Set up your Firehose with data transformation enabled and configure it to invoke your sentiment analyser Lambda

Prefix: raw/year=!{timestamp:YYYY}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/ (objects land under date-partitioned keys such as raw/year=2020/month=05/day=24/hour=13/)

Error prefix: fherroroutputbase/!{firehose:random-string}/!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}/

5. Create Glue Crawler and Athena Query

Run the crawler; a new table will be created, and you can query your data in Athena.
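For example, a query like the following counts tweets per sentiment (the database and table names are illustrative; use whatever the crawler actually created):

```sql
SELECT sentiment,
       COUNT(*) AS tweet_count
FROM "twitter"."raw"
GROUP BY sentiment
ORDER BY tweet_count DESC;
```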

6. QuickSight Dashboard

Create a data source pointing to your previously created Athena table, then create a new analysis and play with your data.



Tech Manager @ BRLink — AWS Certified Developer & Big Data