R function of the week: Getting more out of the read.table() function

Statistics Without Borders
Mar 29, 2022

Author: Neha Anwer, Statistics Without Borders

This Article Contains

  • A brief explanation of the read.table() function and a typical use case
  • Using read.table() to copy and paste data into R
  • A working R code example
  • Best practices on using this functionality

Brief Explanation of read.table()

This week at Statistics Without Borders, we share a twist on the frequently used R function read.table(). This function is available in base R, which means that you don’t have to install any additional R packages to use this function!

I recently found an article on Towards Data Science titled “Top 100 most used R functions on GitHub,” where the author conducted some exploratory analysis on the most popular R functions on GitHub. I was surprised to find that read.table() did not make the cut. I use this function practically every time I launch a new R script!

Typical Use Case

The official R documentation describes the read.table() function as follows:

Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file.

Let’s say that we want to read a .csv file into R, called filename.csv, located in the filedir folder.

Typical usage of the function looks something like:

df <- read.table('filedir/filename.csv', sep = ',', header = TRUE)

The result of this command is a dataframe named df that contains the data from filename.csv in a tabular (or table-like) format.
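To try the command without an existing file, here is a minimal sketch that first writes a tiny CSV to a temporary folder and then reads it back. The columns id and score and their values are made up for illustration. It also reads the same file with read.csv(), which is base R's wrapper around read.table() with comma-friendly defaults.

```r
# Create a small example file in a temporary directory (made-up data).
csv_path <- file.path(tempdir(), "filename.csv")
writeLines(c("id,score", "1,90.5", "2,84"), csv_path)

# Read it with read.table(), spelling out the separator and header.
df <- read.table(csv_path, sep = ",", header = TRUE)
str(df)

# read.csv() wraps read.table() with sep = "," and header = TRUE as defaults.
df_csv <- read.csv(csv_path)
```

For plain comma-separated files either call should do; read.table() earns its keep when the separator is something else.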

Use read.table() to copy and paste data into R

While this in itself is helpful, the real power of read.table() comes into play when you pass in “clipboard” in place of a file path. This allows you to literally copy and paste tabular data from almost any source directly into an R object. (The “clipboard” source is available on Windows; on a Mac, pipe('pbpaste') plays the same role.) The revised command would look something like this:

df_2 <- read.table('clipboard', sep = '\t', header = FALSE)

Let’s say that I have copied some data from an Excel spreadsheet or Google Sheet. This command then creates an R object called df_2 that contains everything I copied. You can think of the command as pasting the clipboard contents into R!
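One caveat: as far as I know, the 'clipboard' source works on Windows (and on X11 builds of R), while on a Mac the usual route is a pipe to the pbpaste command. Below is a rough sketch of a wrapper that picks between the two. The name read_clipboard is my own invention, not a base R function, and I have only sketched the Windows and macOS cases.

```r
# Hypothetical helper (not part of base R): read tabular clipboard contents.
read_clipboard <- function(sep = "\t", header = FALSE, ...) {
  on_mac <- Sys.info()[["sysname"]] == "Darwin"
  # macOS has no "clipboard" source, so read from the pbpaste command instead.
  src <- if (on_mac) pipe("pbpaste") else "clipboard"
  read.table(src, sep = sep, header = header, ...)
}

# Usage, after copying a block of cells from a spreadsheet:
# df_2 <- read_clipboard()
```

The defaults mirror the command above: tab-separated values with no header row, which is what a spreadsheet typically puts on the clipboard.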

When I first learned about this, I was genuinely mind-blown, thinking about all the temporary files I had created while trying to quickly process some test data.

That being said, the standard rules of tabular data still apply. This trick won’t accomplish tasks like copying and pasting an entire paragraph of text into a tokenized R dataframe.

However, it works perfectly on a data structure like the one below. The World Population Table can be copied and pasted directly into R as a dataframe.

World Population (as of 3/28/2022)|7,933,659,043

Next UN Estimate (July 1, 2022)|7,953,952,567

Births per Day|382,470

Deaths per Day|166,308

Net Change per Day|216,162

Population Change Since Jan. 1|18,806,094

Step 1: Highlight the text

Step 2: Copy the text by right-clicking and selecting “Copy,” or using Ctrl+C (Command+C if you’re on a Mac)

Step 3: Run the following code:

# Note that the separator here is a pipe character.
# And there are no headers.
my_data <- read.table('clipboard', sep = '|', header = FALSE)

Figure: World Population Table as an R dataframe
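If you want a reproducible version of this step that doesn't depend on the clipboard, one option is to pass the same rows through read.table()'s text argument. This is only a sketch: the column names metric and value are my own labels, not part of the original table, and the gsub() call is one possible way to turn the comma-formatted figures into numbers.

```r
# The same rows as the World Population Table, held in a string.
world_pop_text <- "World Population (as of 3/28/2022)|7,933,659,043
Next UN Estimate (July 1, 2022)|7,953,952,567
Births per Day|382,470
Deaths per Day|166,308
Net Change per Day|216,162
Population Change Since Jan. 1|18,806,094"

# Parse with a pipe separator and no header row.
# col.names are illustrative labels chosen for this sketch.
my_data <- read.table(text = world_pop_text, sep = "|", header = FALSE,
                      col.names = c("metric", "value"),
                      stringsAsFactors = FALSE)

# The figures arrive as text because of the thousands separators;
# strip the commas and convert to numeric.
my_data$value <- as.numeric(gsub(",", "", my_data$value))
str(my_data)
```

The same comma cleanup applies to the clipboard version, since the second column comes in as text there too.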

Best Practices

If you’re like me and are keen to use this trick regularly, below are a few best practices to help you avoid the hair-pulling mistakes I made:

  1. When copying and pasting, it’s always best to remove formatting. This becomes especially important when copying things over from Excel. Make sure to remove % signs so numbers are copied over in numeric format. Sometimes I prefer to create a secondary Excel data range by copying and pasting values only from my original data.
  2. This one is obvious but a good reminder: The data only exists in your clipboard! If you copy something else or turn off your machine, the data will be lost. It will not be available in R the next time you resume a session unless you explicitly save it somewhere. If whatever you copied and pasted ends up being important, my suggestion would be to output the R dataframe to a file that you can access in case you need to come back to it.
  3. I’ve found this functionality to be most helpful when I’m working with large, messy files with a mix of free text and multiple tables. It’s much easier to copy and paste the relevant tables into R and save them separately than to try to process the original file if the rest of the content is irrelevant.
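The first two tips can be sketched in a few lines of R. The dataframe here is made-up stand-in data for something pasted from Excel with % signs still attached, and the file name pasted_data.csv is just a placeholder. If you forget to clean up in Excel, you can strip the signs in R, then write the result to disk so it outlives the clipboard.

```r
# Stand-in for pasted data where percentages arrived as text (made-up values).
pasted <- data.frame(region = c("North", "South"),
                     growth = c("4.5%", "12%"),
                     stringsAsFactors = FALSE)

# Tip 1: drop the % sign and convert to a numeric proportion.
pasted$growth <- as.numeric(sub("%", "", pasted$growth, fixed = TRUE)) / 100

# Tip 2: save the dataframe so it survives beyond the clipboard and the session.
out_file <- file.path(tempdir(), "pasted_data.csv")  # placeholder location
write.csv(pasted, out_file, row.names = FALSE)

# Later, read it back in as usual.
restored <- read.csv(out_file)
```

In real use you would point out_file at a project folder rather than tempdir(), which is cleared when the R session ends.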

With these best practices in mind, go forth and conquer! I hope this helps you out in a jam as it has me. If you are interested in finding out what the most popular R functions are, below is the resulting graph from Towards Data Science.

Figure: Top 100 R Functions

Meet the Author

Neha Anwer

Neha is a data science professional with domain experience in the financial advisory field. She has been a volunteer at Statistics Without Borders for over a year and most recently joined the SWB Marketing & Communications team. In her spare time, Neha enjoys traveling, curling up with a fiction read, and spending time outdoors.

Statistics Without Borders

Statistics Without Borders (SWB) is an apolitical, pro bono organization under the auspices of the American Statistical Association.