A tutorial on how to interface an R Notebook with Overleaf

An R notebook is a tool for data analysis and writing that brings literate programming to R. To use an R notebook, you will need to install Rstudio, an integrated development environment (IDE) for R, either on a server or on your personal computer. Besides R, the notebook also works with Python, Bash, and a few other languages. You can weave code and see its output in the same document, which makes writing your paper or document intuitive. The following resource is an excellent introduction to R notebooks within Rstudio:

Adopting this workflow has several benefits. R is a popular language for data analysis, and many authors have written packages that you can use. Within Rstudio, R notebooks let you weave code and output together in one environment. So, using open source tools, you can write an entire paper or even a book (with tables, references, citations, figures, and data) without leaving the app. While the app is versatile in what it lets you do, if you want to publish the output or share it with others, you will need to think of other channels. One of the best channels for publishing your work is Overleaf.

Overleaf offers a web-based LaTeX writing environment, and you can use it to submit your articles, papers, and theses to a number of different publications and mint DOIs. If you know how to write LaTeX, Overleaf provides a powerful solution for text formatting and publishing in one step. But not all of us are proficient in LaTeX, and it has a steep learning curve. Overleaf therefore also provides a WYSIWYG (rich text) environment where you can do most of your work (the text formatting at least). You can learn more about Overleaf here:

A middle ground is to write and work on your code in an R notebook and publish it through Overleaf. For this to work, you will need a version control system. Git is such a system, and it is worth learning for getting this work done. Overleaf provides a git repo for each of your projects, and Rstudio has git support built in. If you use Github, you can also maintain version control of your code and writing there.

In Rstudio, you write in markdown, a plain-text format that markdown software converts to rich text. Rstudio comes with Pandoc, a universal document converter, so you can write in markdown and have Pandoc convert it to different formats. You can learn more about Pandoc here:

In this post, I am going to describe a workflow that you can use to integrate the two. The steps are:

  1. First, set up a new project in Rstudio.
  2. Within that project, create an R notebook.
  3. Next, set up a new project within Overleaf.
  4. Get the git repo URL of your Overleaf project.
  5. Now connect the project folder in Rstudio to Overleaf through git.
  6. Pull and push resources between Overleaf and Rstudio.

That is the outline of connecting Rstudio and Overleaf. The details of how the pieces fit together, so that you can make use of both systems, need some explanation. Before you go ahead with the steps, make sure that you have installed Rstudio or have an instance of Rstudio available on your server. You will also need a free account on Overleaf. Both Rstudio and Overleaf are free and are available on all platforms. If you can find a hosted instance of Rstudio, then you will not need to install anything on your computer.

Step by Step

Step 1: Set up a new project in Rstudio and R Notebook

Set this up first. In Rstudio, do:

File >> New Project >> New Directory >> Empty Project

Choose a directory and a project name and you are all set.
Then, once you are in the project, click on

File >> New File >> R Markdown

This will set up your R notebook.

Step 2: Configure the R notebook to work with Overleaf

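The R notebook starts with a preamble that sits between two lines of three dashes. Change the preamble to the following, keeping the indentation as shown, because YAML uses indentation to nest the options:

---
title: Sample title of the paper
bibliography: mybiblio.bib
output:
  pdf_document:
    keep_tex: true
---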

Note the two lines under output. We want to create a PDF document using LaTeX and we would like to keep the tex source, hence we set

keep_tex: true
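To see how the preamble sits inside a complete file, here is a minimal sketch that writes a tiny notebook file from the shell. The file name paper.Rmd and the body text are my placeholders, and I have left out the bibliography line so that the sketch does not depend on a .bib file being present.

```shell
# Write a minimal R Markdown file; paper.Rmd is a placeholder name.
# Quoting 'EOF' stops the shell from expanding anything inside the file.
cat > paper.Rmd <<'EOF'
---
title: Sample title of the paper
output:
  pdf_document:
    keep_tex: true
---

Your text and code chunks go below the preamble.
EOF

# Show the result so you can check the indentation.
cat paper.Rmd
```

When you knit a file like this, the tex source should be kept next to it under the same base name (paper.tex in this sketch), and that is the file Overleaf will later compile.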

Step 3: Set up Overleaf to work with Rstudio

In Overleaf, set up a new project and select a blank paper to start with. It will create a new empty Overleaf project with a main.tex file and nothing else. We will change this.

Overleaf creates a git repository (slang: git repo) for your project. You will find the URL of this git repo from Overleaf by visiting the “Share” link and then “Clone with Git”. We will not clone with git here (you will see why when you start sharing your work with others and I am going to cover this in a future post). Note or copy this URL.

Step 4: Connect Rstudio to Overleaf: set up first

Now switch back to Rstudio, and from the “Tools” menu, choose “Shell”. This will open a shell window within the Rstudio environment. The shell window will look as follows:

See the dollar sign? The dollar sign is called “the shell prompt”. Next to the dollar sign, you will issue the following commands in the order I present here:

# Run this first: it initialises a git repository in your folder. Thanks to Patrick for this correction!
git init
# Then register the Overleaf repository as a remote named origin.
git remote add origin <the URL you copied from Overleaf>

The second command calls “git” and asks it to connect to another repository (git calls it a “remote repository”) besides the current one. The keyword “add” tells git to add that repository to its known addresses. The keyword “origin” tells git that this remote repository will be referred to as “origin”. You can test it by issuing the following command next:


git remote -v

The output tells you the address to which you will “push” your contents and from which you will retrieve them. You only need to do this once; after you set it up, things will continue to work.
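If you would like to rehearse this step before touching your real project, the following sketch does it in a throwaway directory. The URL is a made-up placeholder that only shows the general shape of an address; use the one you copied from Overleaf.

```shell
# Work in a throwaway directory so nothing in your real project is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

git init --quiet

# Placeholder URL (an assumption for this sketch): substitute your own.
overleaf_url="https://git.overleaf.com/your-project-id"
git remote add origin "$overleaf_url"

# List the configured remotes. "origin" appears on two lines,
# one marked (fetch) and one marked (push), each with the URL.
git remote -v
```

If you mistype the URL, you do not need to start over: git remote set-url origin followed by the correct URL replaces the stored address.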

Step 5: Pull first

Note that your Overleaf project already has some content (the main.tex file), and it is still sitting there. If you collaborate with other people on Overleaf, it is also possible that they have changed a file, or added or deleted something. Hence you will always need to fetch files from the branch on the remote git server first. To do this, issue the following command at the shell prompt:


git pull origin master


This pulls the files from the Overleaf project into your local folder. You will see that your local Rstudio folder gets populated with files.
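You can watch a pull happen without involving Overleaf at all. The sketch below uses a local bare repository as a stand-in for the Overleaf repo; every directory name in it is hypothetical. I also set the branch name to master explicitly in each repository, because a git installation configured with a different default branch name would otherwise not match the commands in this post.

```shell
# A local stand-in for Overleaf: a bare repository.
work="$(mktemp -d)"
git init --quiet --bare "$work/overleaf.git"

# Seed it with a main.tex, the way a blank Overleaf project starts out.
git init --quiet "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
printf '%s\n' '\documentclass{article}' '\begin{document}' 'Hello' '\end{document}' > main.tex
git add main.tex
git -c user.name="Demo" -c user.email="demo@example.com" commit --quiet -m "Initial main.tex"
git push --quiet "$work/overleaf.git" master

# Now play the role of the Rstudio project folder.
git init --quiet "$work/rstudio-project"
cd "$work/rstudio-project"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/overleaf.git"
git pull --quiet origin master

# main.tex has arrived from the stand-in remote.
ls
```

The same thing happens with the real Overleaf URL, except that git will ask for your Overleaf credentials on the way.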

Step 6: Add, commit, and push

Now you can continue to write your paper. Write your paper using markdown format and use chunks to run your data analysis. Next to the chunks you can see the outputs of your work. When you are ready to send the work over to Overleaf, in Rstudio you use the “Knit” button to first knit your document to LaTeX. Then you will need to do the following in the order I present:


git add --all

Adding everything with git add --all is usually NOT a good idea, but in this case you may have created graphs, tables, and outputs, and may have added a bunch of bibtex entries. I find it handy to add all files because it helps me not to leave anything out. If you do not want to push something to your remote repository, you can list it in the .gitignore file. Read a git manual to learn more about these things, but if you follow what I write here, you can get started. This step places the files in git's staging area, which is why it is called “staging” the files.
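Here is a small sketch of how a .gitignore file keeps clutter out of the staging area. The file names and the ignore patterns are only my guesses at what a project folder might contain; adjust them to what you actually want to keep away from Overleaf.

```shell
# Throwaway repository for the demonstration.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Hypothetical project contents: a notebook, its tex output, a bibliography,
# and two files you would not want to send to Overleaf.
touch paper.Rmd paper.tex mybiblio.bib .Rhistory scratch.RData

# Each line of .gitignore is a pattern that git will skip when staging.
printf '%s\n' '.Rhistory' '*.RData' '.Rproj.user/' > .gitignore

git add --all

# Short status listing: staged files are shown, ignored ones are absent.
git status --short
```

Note that .gitignore itself gets staged, which is what you want: it then travels with the project.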

Next, you will commit those changes with a message and push them to the remote git repository. Commit those changes first:


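git commit -m "add a message stating what got changed"

Here you tell git to commit the staged changes and, with “-m”, attach a meaningful message that describes them. Use plain straight quotes around the message; curly quotes pasted from a web page will not work in the shell.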

Then, finally, you push the changes over to Overleaf. Do:


git push -u origin master

Here you tell git to push the changes to the master branch of origin (your Overleaf repo). The “-u” option also records that remote branch as the upstream of your local master branch. Git will print several lines of output and, if everything goes well, it will let you know that your master branch now tracks the remote master branch.

Step 7: In Overleaf, set the tex file from Rstudio as the main file that Overleaf will compile

You have to do this only once; as far as I know, the choice lives under the “Main document” setting in the Overleaf project menu. Afterwards, Overleaf will remember this choice and you will see your document compiled in Overleaf (unless Overleaf’s LaTeX processor throws some errors or warnings that you may need to fix to get the correct result).

Conclusion

A seven-step process may appear tedious, but you need all seven steps only the first time you connect Rstudio and Overleaf. After the first pass, all you have to do is:

  1. Write your content,
  2. knit it to LaTeX, and do:

git pull origin master
git add .
git commit -m "message"
git push -u origin master
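If you get tired of typing these four commands, you could wrap them in a small shell function. This is only a sketch: sync_overleaf is a name I made up, not a git or Overleaf command, and it assumes you run it inside the project folder with origin already configured.

```shell
# sync_overleaf: hypothetical helper that runs the four commands in order.
# Each step runs only if the previous one succeeded.
# Usage (inside your project folder): sync_overleaf "describe what changed"
sync_overleaf() {
  git pull origin master &&                          # fetch changes made on Overleaf first
    git add . &&                                     # stage everything new or modified
    git commit -m "${1:-Update from Rstudio}" &&     # commit with your message, or a default
    git push -u origin master                        # send the commit to Overleaf
}
```

Because the steps are chained, the function stops at the commit when there is nothing new to commit, and nothing gets pushed. That is harmless; it just means Overleaf already has everything.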

Both Overleaf and Rstudio are powerful tools for your research and writing. In this episode I covered connecting the two tools. In the next episodes, I will cover how you can use Rstudio and R notebooks to write your paper, do data analysis, and generate graphs and tables. I will also cover how you can then fine-tune the writing in Overleaf and publish and distribute your work further with Overleaf.