TIDYVERSE | READR

R Functions: read_csv()

The best way to import CSV data into R

Josh Gonzales, PhD(c)
3 min read · Mar 21, 2020

We talk about R functions that will make your data science journey easier.

Having a bunch of data is nice, but the real fun starts when you load that data into a program that can interpret what’s going on. The most common way to get data into R is the read.csv function. However, I suggest you use read_csv instead.

Here’s why, and how to do it.

What’s the difference?

Sometimes in coding, the difference between a dot and an underscore is little more than a coder’s preference. In this case, however, that subtle change means everything.

The read_csv function imports data into R as a tibble, while read.csv imports a regular old R data frame instead.
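You can check this for yourself. Here is a minimal sketch, assuming readr is already installed; the temporary file and its contents are made up for illustration:

```r
library(readr)

# Write a tiny CSV to a temporary file (hypothetical example data)
path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Ana,90", "Ben,85"), path)

tbl <- read_csv(path)  # readr: returns a tibble
df  <- read.csv(path)  # base R: returns a regular data frame

inherits(tbl, "tbl_df")  # TRUE: tbl is a tibble
inherits(df, "tbl_df")   # FALSE: df is a plain data frame
is.data.frame(tbl)       # TRUE: a tibble is still a data frame underneath
```

Because a tibble is still a data frame underneath, most code written for data frames should keep working with it.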

Tibbles are better than regular data frames because they:

  • typically load faster, since read_csv parses files more quickly than read.csv
  • don’t change input types (e.g. strings are never converted to factors)
  • allow you to have columns as lists
  • allow non-standard variable names (e.g. your variables can start with a number and can contain spaces)
  • never create row names
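The point about variable names is easy to see in a small example. This sketch assumes readr is installed and uses made-up column names and values:

```r
library(readr)

# A CSV whose first column name starts with a number and contains a space
path <- tempfile(fileext = ".csv")
writeLines(c("2020 sales,region", "100,North", "250,South"), path)

tbl <- read_csv(path)  # keeps the column names as written
df  <- read.csv(path)  # rewrites names so they are valid R identifiers

names(tbl)  # "2020 sales" "region"
names(df)   # "X2020.sales" "region"

# Non-standard names need backticks when you refer to them
sum(tbl$`2020 sales`)  # 350
```

read.csv only rewrites names because its check.names argument defaults to TRUE; read_csv leaves them alone by default.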

There are other nuanced reasons why tibbles are better than classic data frames, but for now all you need to know is that:

  1. read_csv creates a tibble
  2. read.csv creates a regular data frame.
  3. you should load a tibble instead of a data frame if you’re a data scientist with better things to do than wait for your data to load into R.

How to load read_csv()

Before you can use the read_csv function, you have to load readr, the R package that houses read_csv.

You have two options to do so.

Option 1: Install and load the readr package

If you know you just want to install readr, use:

install.packages("readr")

If you’d like to install the development version from GitHub instead, then use the line below (it requires the devtools package to be installed):

devtools::install_github("tidyverse/readr")
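Installing only has to happen once, but the package must be loaded in every new R session before read_csv is available. A minimal sketch, assuming readr is installed; the file name my_data.csv is hypothetical and is created here only so the example runs on its own:

```r
library(readr)  # load readr for the current session

# "my_data.csv" stands in for your own file; we create a small one
# in a temporary directory so the example is self-contained
path <- file.path(tempdir(), "my_data.csv")
writeLines(c("id,value", "1,3.5", "2,4.0"), path)

my_data <- read_csv(path)  # import the CSV as a tibble
my_data                    # print it to inspect columns and types
```

With your own data, replace path with the location of your CSV file.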

Josh Gonzales, PhD(c)
R Tutorials

PhD Candidate @ University of Guelph studying representation in entertainment and goal-oriented decision-making