How to collect data like a spy — Part 6

Creating Maps of the Data in RStudio

In RStudio we need to ensure all the required libraries are installed as follows.

install.packages("rJava")
install.packages("RJDBC")
install.packages("plyr")
install.packages("dplyr")
install.packages("png")
install.packages("RgoogleMaps")
install.packages("ggmap")
install.packages("maps")
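
If some of these packages are already on your machine, reinstalling all of them can be slow. Here is a minimal sketch that installs only what is missing. The helper names `missing_pkgs` and `install_missing` are made up for this example and are not part of any package.

```r
# Packages used in this part of the series.
required <- c("rJava", "RJDBC", "plyr", "dplyr", "png",
              "RgoogleMaps", "ggmap", "maps")

# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only the missing packages, if there are any.
install_missing <- function(pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}
```

Calling `install_missing(required)` should then download only the packages that are not yet in your library.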

We then need to load the libraries.

library(rJava)
library(RJDBC)
library(plyr)
library(dplyr)
library(png)
library(RgoogleMaps)
library(ggmap)
library(maps)

We now connect to the Athena database.

URL <- 'https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1.1.0.jar'
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
drv <- JDBC(driverClass="com.amazonaws.athena.jdbc.AthenaDriver", fil, identifier.quote="'")
con <- jdbcConnection <- dbConnect(drv, 'jdbc:awsathena://athena.eu-west-1.amazonaws.com:443/',
 s3_staging_dir="s3://athena-nifi-bucket",
 user=Sys.getenv("ATHENA_USER"),
 password=Sys.getenv("ATHENA_PASSWORD"))

Now we can retrieve the data from the database.

sd=dbGetQuery(con, "with locations AS (select twitter.geo.geo.coordinates from twitter.geo) select coordinates[1], coordinates[2] from locations")

Let's rename the columns returned by the query.

names(sd)[names(sd)=="_col0"]="lat"
names(sd)[names(sd)=="_col1"]="long"
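
Tweets without a location may come back with missing coordinates, and such rows cannot be plotted. Here is a minimal sketch for dropping them, assuming `sd` has numeric `lat` and `long` columns as named above. The helper `clean_coords` and the sample data frame are hypothetical and exist only for illustration.

```r
# Hypothetical helper: keep only rows with usable latitude/longitude values.
clean_coords <- function(df) {
  ok <- !is.na(df$lat) & !is.na(df$long) &
    df$lat >= -90 & df$lat <= 90 &
    df$long >= -180 & df$long <= 180
  df[ok, , drop = FALSE]
}

# Made-up sample standing in for the query result.
sample_sd <- data.frame(
  lat  = c(51.5, NA, 95.0, 53.4),
  long = c(-0.12, -1.5, 10.0, -2.2)
)

# The row with a missing latitude and the row with latitude 95 are dropped.
cleaned <- clean_coords(sample_sd)
print(nrow(cleaned))
```

You could then plot `clean_coords(sd)` in place of `sd`.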

Let's create the map.

map=qmap('UK',zoom=3)
map + geom_point(data = sd, aes(x = long, y = lat), color="blue", size=0.5, alpha=0.5)

If all is well, the map should look something like this, with a few blue dots marking the tweet locations.

The Series

Part One — How to collect data like a spy
Part Two — Getting NiFi up and running
Part Three — How to collect social media data like a pro
Part Four — Creating a database with AWS Athena
Part Five — Connecting RStudio to Athena
Part Six — Creating Maps of the Data in RStudio
Part Seven — Creating an interactive dashboard for your data
