R libraries to help you learn data science in 2018

2018 is already here! What a year 2017 has been! For someone who started learning data science late in the year, it feels like the year was short. The R learning curve may seem steep; however, continuous exposure to different tools and libraries/packages can make the experience simpler.

In this article, I share with you R packages under different branches of data science that have made my learning journey worthwhile so far.

Data Visualisation

This is a very instrumental part of data science. For a data science newbie, the ability to create great visualisations gives you hope that you are on the right track. With great data visualisations comes a sense of appreciation for your work, especially from non-data scientists.

The following packages will come in handy while visualising in R.

  1. ggplot2 
    This is an R package that makes the work of visualisation much easier. It implements the grammar of graphics: it takes care of plotting details, offers many graphical options and handles the layering of graphs well.
    It is available on CRAN.


Here is a great ggplot2 cheat sheet to get you started: ggplot2 cheat sheet
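
To give a feel for the layering idea, here is a minimal sketch that uses the mtcars data set bundled with R. The choice of variables and labels is mine, purely for illustration.

```r
library(ggplot2)

# mtcars ships with R, so nothing needs to be downloaded
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl))) +                    # layer 1: one point per car
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: linear trend line
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# Printing the plot object draws it
print(p)
```

Each `+` adds another layer or setting on top of the same data, which is what makes it easy to build a plot up step by step.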

2. shiny 
This is an R package that gives users the power to build and explore interactive dashboards and web apps. Shiny helps a lot with data collection and manipulation in real time, as it handles reactivity well. Shiny apps can make use of HTML widgets, CSS themes and JavaScript actions to interface with R scripts. It is an awesome library for someone interested in data storytelling on their website.
shiny is available on CRAN.
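
As a rough sketch of what reactivity looks like, the small app below redraws a histogram whenever a slider moves. The input and output ids (`bins`, `hist`) and the choice of data are my own assumptions for illustration.

```r
library(shiny)

# UI: a slider and a placeholder for the plot
ui <- fluidPage(
  titlePanel("Histogram of mtcars mpg"),
  sliderInput("bins", "Number of bins:", min = 5, max = 30, value = 10),
  plotOutput("hist")
)

# Server: renderPlot() re-runs automatically when input$bins changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(mtcars$mpg, breaks = input$bins, main = NULL, xlab = "Miles per gallon")
  })
}

# Build the app object; it does not start until it is run
app <- shinyApp(ui = ui, server = server)
```

In an interactive R session, `runApp(app)` should launch it in the browser.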


Data Wrangling

One of the goals of every data scientist should be maximising the time spent on actual data analysis. To achieve this, one needs to ensure the data they are working with is as clean as possible and can be manipulated easily. Data wrangling is the process of cleaning up data, removing redundancy and organising it in a way that makes analysis much easier. The following packages are great and simple data wrangling tools.

  1. tidyr
    On the tidyr website, tidy data is defined as data where:
    - Each variable is in a column.
    - Each observation is a row.
    - Each value is a cell.
    tidyr exposes simple verbs as R functions, such as gather(), to carry out quick data tidying operations on large datasets.
    tidyr is available on CRAN.
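
Here is a small sketch of gather() on a made-up table (the countries and numbers are invented purely for illustration). The wide table has one column per year, and gather() stacks those columns so that each row becomes one observation.

```r
library(tidyr)

# Made-up wide table: one column per year
wide <- data.frame(
  country = c("Uganda", "Kenya"),
  `2016` = c(10, 20),
  `2017` = c(12, 25),
  check.names = FALSE  # keep "2016"/"2017" as column names
)

# Stack the year columns into key/value pairs
long <- gather(wide, key = "year", value = "cases", `2016`, `2017`)

print(long)
```

Newer releases of tidyr also provide pivot_longer(), which covers the same job; gather() still works, so use whichever your installed version supports.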


2. dplyr
While dealing with data, there are common manipulations that have to be carried out, and dplyr helps with these by providing a verb function for each one. This helps you filter your data and carry out operations that group the data for deeper meaning.
dplyr is available on CRAN.
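
A minimal sketch of how the verbs chain together, again on the built-in mtcars data. The threshold and column names are my own choices for illustration.

```r
library(dplyr)

summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%                            # keep the more powerful cars
  group_by(cyl) %>%                               # one group per cylinder count
  summarise(mean_mpg = mean(mpg), cars = n()) %>% # one summary row per group
  arrange(desc(mean_mpg))                         # most efficient group first

print(summary_by_cyl)
```

Reading the pipeline top to bottom gives you the analysis in plain words: filter, group, summarise, arrange.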


Data Mining

This is one of the biggest challenges for data science newbies. Although many websites offer free, open data sets, it is also a rewarding feeling for a data science newbie to learn how to extract a data set from the numerous sources of information on and off the web.
The following libraries will do the magic:

  1. httr

This package enables you to access data via modern web APIs. It provides functions named after the HTTP verbs, lets you parse responses (often JSON) into R objects, and supports OAuth. This makes it easy for a newbie to work with APIs in R.
This package is available on CRAN
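
A rough sketch of the usual request pattern follows. The endpoint `https://api.example.com/v1/items` and the helper name `fetch_json` are hypothetical, so swap in a real API before running the request itself.

```r
library(httr)

# Hypothetical endpoint, used only for illustration
url <- modify_url("https://api.example.com/v1/items", query = list(limit = 10))

# Hypothetical helper: GET a URL and parse the JSON body into R objects
fetch_json <- function(url) {
  resp <- GET(url)          # HTTP verb as a function
  stop_for_status(resp)     # turn HTTP errors into R errors
  content(resp, as = "parsed", type = "application/json")
}

# Usage, once url points at a real API:
# items <- fetch_json(url)
```

Checking the status before parsing keeps a failed request from quietly turning into confusing data later on.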


2. rvest 
This is an R package for web scraping. It reads HTML documents from URLs, selects parts of a document using CSS selectors and parses HTML tables into data frames in R.
This package is available on CRAN
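
To try the idea without hitting a live website, the sketch below parses a small made-up HTML snippet instead of a URL (the page content is invented for illustration). It assumes rvest 1.0 or later, where html_element() is available; older versions use html_node() for the same purpose.

```r
library(rvest)

# Made-up page, standing in for a document you would normally read from a URL
html <- '<html><body>
<h1 class="title">Fruit prices</h1>
<table>
<tr><th>fruit</th><th>price</th></tr>
<tr><td>mango</td><td>3</td></tr>
<tr><td>banana</td><td>1</td></tr>
</table>
</body></html>'

page <- read_html(html)

# Select parts of the document with CSS selectors
title <- html_text(html_element(page, "h1.title"))

# Parse the table into a data frame
prices <- html_table(html_element(page, "table"))

print(title)
print(prices)
```

For a real site, pass the page URL to read_html() in place of the string.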



The first days of learning data science can be quite confusing; however, focusing on each of these branches in turn can help you understand data science step by step.
I wish you a great learning experience in 2018. Don’t stop learning.
Feel free to reach out to me via Twitter @lornamariak.

I am happy to help and give some hype or support. Happy coding!

Like what you read? Give Lorna Maria A a round of applause.