Playing with Databases in R

Databases are a common requirement for software development and can be accessed with help of many programming languages like R, Java, Python, JavaScript, and many more. Here, we would be specific with R.

Why R?

According to the 2107 Burtch Works Survey, 40% prefer R, 34% prefer SAS, and 26% Python out of all surveyed data scientists. According to KDNuggets’ 18th annual poll of data science software usage, R is the second most popular language in data science. This shows how popular R programming is.

How to get connected with databases in R?

R has inbuilt packages for each database, here are a few examples,

  • RPostgreSQL
  • RSQLite
  • RODBC
  • RMySQL
  • RMariaDB
  • and many more.

R also has a Database Interface package for communication between R and RDBMS named ‘DBI’. DBI makes all the API calls from all the database packages. A simple calling format image is shared below.

Basic Functions in DBI

The details and usage of each function have been discussed below with the code.

Let’s Connect…

Installing the packages required to connect PostgreSQL.

#install using R terminal
install.packages("DBI")
install.packages("RPostgreSQL")

NOTE: All further codes are worked in the R Notebook.

Importing/Calling the packages

library("DBI")
library("RPostgreSQL")

Making a connection with the database by initializing the driver

pSQLDriver <- dbDriver(drvName = "PostgreSQL")
database <- "demo"
host <- "localhost"
port <- 5432 #predefined port no is 5432 (for PostgreSQL), can be changed accordingly
user <- "postgres"
password <- "root"
conn <- dbConnect(pSQLDriver, dbname = database, host = host, port = port, user = user, password = password)
PostgreSQL Dashboard

Creating a table

#This will give output as TRUE or FALSE to denote whether the table was created or not
#Syntax: dbWriteTable(connection, name of the table, data)
dbWriteTable(conn, "iris", iris) #iris is a predefined dataset in R
Iris Data in PostgreSQL Dashboard

Listing all the tables

#We use dbListTables() to list the tables
#Syntax: dbListTables(connection)
dbListTables(conn)

Listing all the fields of a particular table

# We use dbListFields() to list all the fields or column of a table
# Syntax: dbListFields(connection, name of the table)
dbListFields(conn, "iris")

To read data from the table

#We use dbReadTable() to read/select all the data
# Syntax: dbReadTable(connection, name of the table)
dbReadTable(conn, "iris")

Working with SQL queries

#to use a SQL query
#syntax: var <- dbSendQuery(connection, statement) this will return query object
# dbFetch(var)
res <- dbSendQuery(conn, "SELECT * FROM iris")
dbFetch(res)

Ending the fetched query

#Always remember to end the fetched query so that there is no clash between different queries while working with multiple databases.
#Syntax: dbClearResult(var)
dbClearResult(res)

Disconnecting the driver

#Always remember to disconnect the driver after you're done working with the database
#Syntax: dbDisconnect(connection)
dbDisconnect(conn)
Code in a glimpse

Conclusion

So, these were the basic steps to connect PostgreSQL with R and this could lead to effective data management. With good data management, R could be even more productive for data-related fields like Data Science/ Analysis/ Visualization, etc.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Gagan Chordia

Gagan Chordia

29 Followers

Frontend Developer | R & Java Programmer | PHP Developer | UI/UX Designer