R Functions: read_csv()

The best way to import CSV data into R

We talk about R functions that will make your data science journey easier.

Having a bunch of data is nice, but the real fun starts when you load that data into a program that can interpret what’s going on. The most common way to get data into R is the read.csv function. However, I suggest you use read_csv instead.

Here’s why, and how to do it.

What’s the difference?

Sometimes in coding, the difference between a dot and an underscore is little more than a coder’s preference. In this case, however, that subtle change means everything.

The read_csv function imports data into R as a tibble, while read.csv imports a regular old R data frame instead.

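Reading your data into a tibble with read_csv beats reading it into a regular data frame with read.csv because:

  • read_csv is typically much faster than read.csv, especially on large files
  • tibbles don’t change input types (for example, strings are never converted to factors)
  • tibbles allow you to have columns that are lists
  • tibbles allow non-standard variable names (e.g. your variables can start with a number and can contain spaces)
  • tibbles never create row names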

There are other nuanced reasons why tibbles are better than classic data frames, but for now all you need to know is that:

  1. read_csv creates a tibble.
  2. read.csv creates a regular data frame.
  3. you should load a tibble instead of a data frame if you’re a data scientist with better things to do than wait for your data to load into R.
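
To see the difference for yourself, here is a minimal sketch that reads the same file both ways. The file contents and column names are made up for illustration, and the show_col_types argument assumes readr 2.0 or later.

```r
library(readr)

# Write a tiny CSV to a temporary file (hypothetical data for illustration).
path <- tempfile(fileext = ".csv")
writeLines(c("2024 sales,region name", "10,north", "20,south"), path)

# Read the same file with base R and with readr.
base_df <- read.csv(path)
tbl <- read_csv(path, show_col_types = FALSE)

class(base_df)  # "data.frame"
class(tbl)      # a tibble: the classes include "tbl_df"

names(base_df)  # "X2024.sales" "region.name" (base R rewrites the names)
names(tbl)      # "2024 sales"  "region name" (readr keeps them as written)
```

Note that base R quietly renamed both columns, while the tibble kept the names exactly as they appear in the file.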

How to load read_csv()

Before you can use the read_csv function, you have to install and load readr, the R package that houses read_csv.

You have two options to do so.

If you know you just want to install readr, use:

install.packages("readr")

If you’d like to install the development version from GitHub instead, then use:

devtools::install_github("tidyverse/readr")

Then, load readr using:

library(readr)

Installing readr by itself can be beneficial in some specific cases. But if you know you’re going to use more than just readr from the tidyverse world — which, if you’re reading this, probably holds true — you can install the whole tidyverse package using:

install.packages("tidyverse")
or
devtools::install_github("tidyverse/tidyverse")

Doing so allows you to load readr through:

library(tidyverse)
or
library(readr)
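
If you run the same script on more than one machine, it can help to install readr only when it is missing. The following is one possible sketch of that pattern, not the only way to do it; it assumes a CRAN mirror is already configured for install.packages.

```r
# Install readr only if it is not already available on this machine.
if (!requireNamespace("readr", quietly = TRUE)) {
  install.packages("readr")
}

# Attach the package so read_csv can be called directly.
library(readr)

# Show which version of readr is in use.
packageVersion("readr")
```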

How to use read_csv()

Now that you have readr loaded into R, you can use read_csv to import data for analysis.

To do so, make sure the CSV file is in your working directory (or pass the full path to the file instead) and use:

read_csv("CSV file name.csv")

Of course, typically you’ll want to load the CSV into a variable when using R so you can refer to it whenever that dataset is needed. All that takes is:

variable <- read_csv("CSV file name.csv")

Voila. Now your variable holds a tibble with all your CSV data inside. It’s a straightforward process and one you should become intimately familiar with if you use R regularly.
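
For a fuller picture, here is a small end-to-end sketch. The file name, column names, and values are hypothetical; the example writes its own CSV to a temporary folder so it runs anywhere, and show_col_types assumes readr 2.0 or later.

```r
library(readr)

# Create a small example CSV in a temporary folder (hypothetical data).
path <- file.path(tempdir(), "example_scores.csv")
writeLines(c("id,name,score", "1,Ana,9.5", "2,Ben,7.25"), path)

# Check where R looks for files when you give it a bare file name.
getwd()

# Read the file by its full path and keep the result in a variable.
scores <- read_csv(path, show_col_types = FALSE)

# Inspect the tibble and the column types readr guessed.
print(scores)
spec(scores)

# Optionally state the column types yourself instead of letting readr guess.
scores_typed <- read_csv(
  path,
  col_types = cols(
    id = col_integer(),
    name = col_character(),
    score = col_double()
  )
)
```

Spelling out col_types is optional, but it makes the import fail loudly if a column ever stops looking the way you expect.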

Always remember: having data is great, but getting that data ready for analysis is the key. The read_csv function is one of the quickest and most efficient ways to do that.
