How to use R and Big data in Marketing Research

Malgorzata Mleczko
Nebu
Feb 5, 2018

R has gained massive popularity in the past decade as the tool of choice for a wide variety of data analysts. The language is part of the data analysis toolchain at some of the biggest companies in the world. The former Revolution Analytics compiled a list of companies that work with R. Already in 2014 that list looked impressive, including Facebook, Google, Twitter, Monsanto, the FDA, Lloyds, and Credit Suisse. Airbnb joined the club in 2016.

A few organizations regularly monitor and publish reports on trends in the data science world. To fully grasp the importance and potential of R in marketing research, it makes sense to look at a few key metrics from the most influential industry reports.

Rexer Analytics Survey

TIOBE Index

Main benefits of the R language

Statistics and data in the DNA

R allows you to manipulate (e.g., subset, recode, merge) data quickly, and some R packages, such as dplyr, have been designed specifically for this purpose. Typically, the majority of the time on an analysis project is spent not on the analysis itself but on preparing the data. Collected data often needs many processing steps before it is ready for analysis, and R handles this preparation efficiently.
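As a rough illustration of the three operations named above, here is a minimal sketch in base R. The data frames, column names, and values are all made up for this example, and base R is used only so the snippet runs without installing anything; dplyr offers equivalents such as filter(), mutate(), and left_join().

```r
# Hypothetical survey responses; all names and values are invented for illustration.
responses <- data.frame(
  respondent_id = 1:6,
  age           = c(23, 37, 45, 19, 62, 31),
  satisfaction  = c(5, 3, 4, 1, 2, 5),  # 1 = very dissatisfied, 5 = very satisfied
  region_code   = c("N", "S", "N", "E", "S", "E")
)

# Hypothetical lookup table that maps region codes to readable names.
regions <- data.frame(
  region_code = c("N", "S", "E"),
  region_name = c("North", "South", "East")
)

# Subset: keep respondents aged 25 or older.
adults <- responses[responses$age >= 25, ]

# Recode: collapse the 5-point scale into three groups
# (1-2 = low, 3 = neutral, 4-5 = high).
adults$satisfaction_group <- cut(
  adults$satisfaction,
  breaks = c(0, 2, 3, 5),
  labels = c("low", "neutral", "high")
)

# Merge: attach the readable region name via the shared key column.
prepared <- merge(adults, regions, by = "region_code")

print(prepared)
```

Each step is a single line or call, which is what makes long preparation pipelines manageable to read and rerun.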

R Community

Anyone (including you) can contribute packages to the community to extend what R can do. The number of contributed R packages is increasing at a rapid rate. Chances are, if there’s an analysis you need to do, you will find an R package to do it.

Data Visualizations

R has advanced graphics capabilities, and you can create beautiful graphics using R packages. In general, people like to digest and understand statistics visually, and R provides great tools for achieving exactly this.
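To give a feel for how little code a basic chart takes, here is a small sketch using only the graphics functions that ship with R. The brand names and percentages are invented, and the chart is written to a temporary PDF file so the snippet also works in a non-interactive session.

```r
# Hypothetical brand-awareness percentages; values are invented for illustration.
awareness <- c("Brand A" = 62, "Brand B" = 48, "Brand C" = 27)

# Write the chart to a PDF in the session's temporary directory.
out_file <- file.path(tempdir(), "brand_awareness.pdf")
pdf(out_file, width = 6, height = 4)

# barplot() draws the bars and returns the x position of each bar's midpoint.
bar_midpoints <- barplot(
  awareness,
  ylim = c(0, 100),
  ylab = "Aided awareness (%)",
  main = "Brand awareness by brand",
  col  = "steelblue"
)

# Label each bar with its value, placed just above the top of the bar.
text(x = bar_midpoints, y = awareness, labels = paste0(awareness, "%"), pos = 3)

# Close the device so the file is fully written.
invisible(dev.off())
```

Contributed graphics packages build on the same idea with richer defaults, but the base functions are often enough for a quick look at survey results.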

Support for large datasets

Many tools have restrictions on how large your dataset can be. Processing a large dataset, even when it does not technically exceed the maximum size of the tool you’re using, can be a rather slow process (especially after you add tabs, formulas, and references). R handles larger datasets and can be used with big data.

Reproducibility

R has features that make it much easier to reproduce the findings of your analysis, which is important for detecting errors.

  1. It’s easy to add comments to your scripts to make clear what you’re doing.
  2. Data and analysis are separated in R, allowing you to follow the logical progression of the data analysis in the R code.
  3. You can use version control to track (and revert) changes you make over time and to share your scripts with others to collaborate on projects as a community.

Automation

The R scripting language provides an easy way to automate processes. It can save you loads of time, especially when you plan to re-run the same analysis multiple times (e.g., a project being conducted on a recurring basis).
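One common way to set this up, sketched here under assumptions, is to wrap the analysis in a function and apply it to every wave of a recurring study. The function name summarise_wave, the column names, and the data are all hypothetical; in a real project the waves would typically be read from files rather than typed in.

```r
# Hypothetical helper: summarise one survey wave.
# Assumes each wave has a 'wave' label column and a numeric 'satisfaction' column.
summarise_wave <- function(wave_data) {
  data.frame(
    wave              = wave_data$wave[1],
    n_respondents     = nrow(wave_data),
    mean_satisfaction = mean(wave_data$satisfaction, na.rm = TRUE)
  )
}

# Invented data standing in for the files that would arrive each period.
waves <- list(
  data.frame(wave = "2018-01", satisfaction = c(4, 5, 3, NA)),
  data.frame(wave = "2018-02", satisfaction = c(2, 4, 4, 5))
)

# Run the same analysis on every wave and stack the results into one table.
report <- do.call(rbind, lapply(waves, summarise_wave))

print(report)
```

When the next wave arrives, it only needs to be added to the list; the analysis code itself stays unchanged.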
