Using Streaming and ML on AWS to Track Sentiments on Twitter
This article walks you through using AWS Comprehend and Kinesis Data Firehose to ingest and analyse sentiment from social media
*This article is based on this one, but I’ve tried to simplify the architecture.
Why analyse social media engagement and sentiment? Let's assume you work for a global brand and need to track how people feel about news, marketing actions, and media coverage. Wouldn't you like to track how people feel about your brand without manual work? Or be alerted when some action goes wrong?
Ok, let's get our hands dirty:
1. Architecture
1.1 Get_twitter Lambda: fetches data from the Twitter API and sends it to Firehose
1.2 Firehose: buffers data according to its configuration and sends each batch to a Lambda that adds sentiment information. After that, it saves the result to S3
1.3 Sentiment-analysis Lambda: receives a batch of records from Firehose, sends the text to AWS Comprehend, merges the results into the records, and returns them to Firehose
1.4 Data Catalog and Athena: a Glue crawler catalogs the data and creates tables that can be queried with Athena
1.5 QuickSight: builds dashboards from Athena queries
2. Create a Lambda Twitter Producer
This Lambda fetches data from the Twitter API and sends it to Firehose.
Optional: set up a CloudWatch Events trigger for this Lambda to generate data on a schedule. For example: invoke this Lambda every minute until I disable the rule.
The source code of lambda:
If you want to use the online Lambda editor, you’ll need to create a layer for the “requests” module. Here I explain how to create a Lambda layer.
3. Create lambda sentiment analyser
This Lambda receives a batch of records from Firehose, adds sentiment analysis, and sends them back to Firehose.
Source code:
Don’t forget to set up the IAM role with access to AWS Comprehend.
4. Setting up Firehose
Now we need to set up Firehose to store data according to our needs. Create a new delivery stream.
Enable data transformation on your Firehose stream and configure it to send data to your sentiment-analysis Lambda.
Prefix: raw/year=!{timestamp:YYYY}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/
Error prefix: fherroroutputbase/!{firehose:random-string}/!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}/
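With this prefix, Firehose expands the `!{timestamp:...}` expressions at delivery time, so objects land in Hive-style `key=value` folders that the Glue crawler can pick up as partitions. A record delivered in May 2020 would end up under a key shaped roughly like this (the object-name suffix is generated by Firehose; the date values here are just an example):

```
raw/year=2020/month=05/day=17/hour=13/<delivery-stream-name>-<version>-<timestamp>-<random-suffix>
```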
5. Create Glue Crawler and Athena Query
Run the crawler; a new table will be created and you can query your data in Athena.
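Once the table exists, you can query the sentiment distribution directly. A hypothetical example: the database, table, and column names below depend entirely on what your crawler infers from your S3 layout and JSON fields, so adjust them to match your own catalog.

```sql
-- Hypothetical query: table and column names are assumptions.
SELECT sentiment, COUNT(*) AS tweets
FROM "twitter_db"."raw"
WHERE year = '2020' AND month = '05'
GROUP BY sentiment
ORDER BY tweets DESC;
```

Filtering on the `year`/`month` partition columns keeps Athena from scanning the whole bucket, which keeps queries fast and cheap.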
6. Quicksight Dashboard
Create a data source pointing to your previously created Athena table, then create a new analysis and play with your data.