Day 6: Realtime Tweets Analysis using Spark Streaming with Scala
Project (1 Hour): Create a Twitter app and use its API to stream a real-time Twitter feed using Spark Streaming with Scala.
All the code for this project can be found on my GitHub.
Step 1: Download and Set Up Spark and Scala IDE
Ensure you have a JDK set up and verify it using the commands below. If not, download one and set your JAVA_HOME environment variable.
$ java -version
java version "1.8.0_91"
$ echo $JAVA_HOME
Download Spark from: http://spark.apache.org/downloads.html
Run a quick Scala test from the downloaded directory using the Spark shell (bin/spark-shell).
scala> sc.parallelize(1 to 1000).count()
res1: Long = 1000
You can also test using the example Python code.
Finally, install the Scala IDE, built on top of Eclipse, from: http://scala-ide.org/download/sdk.html
Step 2: Create the project with Twitter App credentials
Create a Twitter app using https://api.twitter.com/ and then save its credentials (consumer key, consumer secret, access token and access token secret) in a text file.
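One way to get those credentials into the JVM is sketched below. The file layout (one "name value" pair per line), the object name TwitterCredentials and the file name twitter.txt are assumptions made for illustration, not necessarily what the original project used. The twitter4j.oauth.* system property names are the ones Twitter4J reads its OAuth settings from.

```scala
import scala.io.Source

// Minimal sketch, assuming a hypothetical credentials file such as:
//   consumerKey YOUR_CONSUMER_KEY
//   consumerSecret YOUR_CONSUMER_SECRET
//   accessToken YOUR_ACCESS_TOKEN
//   accessTokenSecret YOUR_ACCESS_TOKEN_SECRET
object TwitterCredentials {

  // Parse "name value" lines, skipping blank lines, '#' comments and malformed lines.
  def parse(lines: Seq[String]): Map[String, String] =
    lines
      .map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .map(_.split("\\s+", 2))
      .collect { case Array(name, value) => name -> value.trim }
      .toMap

  // Expose each credential as a twitter4j.oauth.<name> system property.
  def applyToSystem(credentials: Map[String, String]): Unit =
    credentials.foreach { case (name, value) =>
      System.setProperty(s"twitter4j.oauth.$name", value)
    }

  // Read the file at `path` and apply its credentials in one step.
  def loadFrom(path: String): Unit = {
    val source = Source.fromFile(path)
    try applyToSystem(parse(source.getLines().toList))
    finally source.close()
  }
}
```

Calling TwitterCredentials.loadFrom("twitter.txt") before creating the stream would then make the keys visible to Twitter4J. Keep this file out of version control, since it holds secrets.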
Set up a Scala project in the IDE and write Scala code that prints out live tweets as they stream in through Spark Streaming.
Building and running the project should continuously stream tweets to your console. English doesn’t seem to be the popular language at this hour!
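A minimal sketch of such a program might look like the following. It assumes the spark-streaming-twitter artifact (shipped with Spark 1.x, and published by Apache Bahir for Spark 2.x) is on the classpath, and that the four twitter4j.oauth.* system properties have already been set from your credentials. The object name PrintTweets, the one-second batch interval and the local[2] master are illustrative choices, not taken from the original project.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object PrintTweets {
  def main(args: Array[String]): Unit = {
    // Fail early if any OAuth credential is missing from the system properties.
    val required = Seq("consumerKey", "consumerSecret", "accessToken", "accessTokenSecret")
    val missing = required.filter(name => sys.props.get(s"twitter4j.oauth.$name").isEmpty)
    require(missing.isEmpty, s"Missing twitter4j.oauth properties: ${missing.mkString(", ")}")

    // local[2]: one thread for the Twitter receiver, one for processing batches.
    val conf = new SparkConf().setAppName("PrintTweets").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Passing None makes Twitter4J build its authorization from the system properties.
    val tweets = TwitterUtils.createStream(ssc, None)

    // Keep only the text of each status and print the first few of every batch.
    tweets.map(_.getText).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

By default print() shows the first ten elements of each batch, so with a one-second interval the console gets a small sample of tweets every second rather than the full stream.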
Day 6 of #100DaysOfCode DONE
If you enjoyed this, please click 👏 so that others can enjoy it as well. Follow me on Twitter @HariniLabs to get the latest updates or just to say Hi :)