Git for Economists: A Practical Guide

Integrating already existing projects

Arieda Muço
CodeX
11 min read · Sep 14, 2024

Image from Overleaf’s official webpage

As researchers, we often find ourselves collaborating with others who have strong preferences about the tools we use to manage research projects. Some swear by Dropbox, others by Git. Among Git users, some prefer the command line over a graphical interface (UI).

The command line provides more direct access to Git’s functionalities, allowing users to automate tasks, handle large projects more efficiently, and troubleshoot issues better than most graphical tools. (Though the visual component of the UI is very appealing. My UI of choice used to be SourceTree.)

Recently, researchers around the world have embraced another tool for collaboratively writing academic papers: Overleaf. The plot below shows the global interest in Overleaf over time, based on Google search trends. The trend is clear: there has been a steady increase in searches, reflecting its growing popularity. Researchers rely on Overleaf for its ease of use in real-time collaboration, as it allows multiple authors to write (almost simultaneously) while tracking changes with each contributor’s name, making it easier to manage input from different collaborators.

Typically, Overleaf users are also Dropbox users and rarely advocates of Git in any form. There’s a lot of practicality in both Dropbox and Overleaf. However, I believe that the conviction some colleagues hold — that the Dropbox-Overleaf combo covers all our needs as researchers — may prevent the adoption of more productive tools.

Who is right? If the ultimate goal is seamless, productive research, does it really matter what tools we choose?

I’ve been on all sides, including using a Git UI before becoming a hardcore command-liner. Had I not already been equipped with Git knowledge before the rise of Overleaf, I might have easily become one of its champions. (I started using Git around 2016, marked with the red vertical line in the figure above.)

However, some tools improve the workflow despite the initial cost of learning them, and Git is one of those tools. While Git has a steeper learning curve, it effectively manages many of the challenges that arise in research projects: tracking named changes, resolving conflicts, and enabling seamless collaboration — especially in large, complex projects.

I’m well aware that preferences, and the choice of tools, can sometimes lead to friction. But I also pride myself on knowing when to push for the adoption of new technologies and when to cede to what my coauthors prefer.

This article is for those who are ready to give Git a try — the command-line version, specifically — and want to combine it with Overleaf, particularly if your project is already up and running. I assume you’re working on a project that already has certain folders and subfolders and includes code, data, and outputs.

Note: If you’re looking for an introduction to the command line (Shell) for economists, have a look at this material we wrote a few years back [3]. Frank Pinter’s notes [2] offer a great introduction to the reasoning behind using Git. For a more detailed guide on Git, I recommend Jesús Fernández-Villaverde’s notes [1].

Setting Up Git on Your Local Machine

Before we dive into the integration, let’s make sure Git is properly set up on your local machine.

If Git is not installed on your machine, you should download it from here.

After the installation is complete, go to the Shell, command line, or terminal, and type in the following to configure Git with your name and email.

git config --global user.name "Your Name"
git config --global user.email "youremail@example.com"

Typing git implies that you are interacting with Git. In my case, the commands will look like:

git config --global user.name "Arieda Muço"
git config --global user.email "arieda.muco@gmail.com"
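Before going further, it can help to confirm that Git is installed and that it picked up your details. The following is a minimal check using only read-only commands; nothing here changes your configuration.

```shell
# Print the installed Git version; an error here means Git is not on your PATH.
git --version

# Show the identity Git will attach to your commits.
# "(not set)" is printed when a value has not been configured yet.
echo "name:  $(git config --global --get user.name  || echo '(not set)')"
echo "email: $(git config --global --get user.email || echo '(not set)')"
```

If either line prints "(not set)", rerun the corresponding git config command above.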

Assuming that you already have a project that is up and running, called My-project, with folders like code, data, figures, tables, and so on, I would proceed in the following way:

Open the command line and change the directory to the folder you’re working on. For example, if your project is located in Documents, you will have to type something like:

cd /Users/yourusername/Documents/My-project

By typing cd, you’re asking your machine to change the directory. Make sure to replace "yourusername" with your actual username. If your file paths contain spaces, remember to use quotes around the path, like this:

cd "/Users/yourusername/Documents/My Project"

On a Windows machine, it will look something like this:

cd C:\Users\yourusername\Documents\My-project

The command for changing the directory to your project folder will vary depending on whether you’re using macOS/Linux or Windows. Regardless of your system, I recommend using forward slashes (/) in your paths, as Git handles them consistently across platforms.

I also strongly recommend avoiding spaces in file names because they complicate command-line usage and may cause issues with certain systems.
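To see why quoting matters, here is a small sketch that creates a throwaway folder with a space in its name and changes into it. The folder lives in a temporary directory and is purely hypothetical.

```shell
# Create a hypothetical project folder with a space in its name, inside a temp directory.
base="$(mktemp -d)"
mkdir -p "$base/My Project"

# Quoted: the whole path reaches cd as a single argument, so this works.
cd "$base/My Project"
pwd

# Unquoted, the shell would split the path at the space, and cd would receive
# two arguments (".../My" and "Project") instead of one folder name.
```

Naming the folder My-project avoids the need for quotes altogether, which is why I prefer it.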

Integrating Git with Overleaf

When integrating Git with Overleaf, there are a couple of different ways to approach the setup, depending on your project’s structure and collaboration needs. Below, I’ll walk you through two common scenarios based on how you want to manage your files and the level of integration you prefer. Both scenarios will help streamline your workflow, but they serve slightly different purposes.

Scenario 1: Cloning your Overleaf project as a subfolder in your already existing project

In this scenario, I’ll explore a more straightforward but less flexible approach: cloning your Overleaf project as a subfolder within your existing local project. This method is useful if you want to keep your Overleaf files separate from the rest of your project, but still part of the larger project, or if you’re just getting started with Git and want to link Overleaf without restructuring your project.

However, this setup limits version control to the files within the cloned Overleaf folder, which might not be ideal for managing larger, more complex projects with multiple components (like code, figures, and tables). It’s a good starting point for simpler projects, but for more advanced users, the next scenario will provide better integration.

Since I assume that your Overleaf project is already up and running, you’ll need to navigate to the Overleaf project and go to the project settings. Click “Git” and find the repository URL. This URL will allow you to link your local Git repository with Overleaf.

Here’s how it will look in my case:

At this point, you should type in the terminal:

git clone https://git.overleaf.com/overleaf-project-id my-preferred-name

In my case it will be:

git clone https://git.overleaf.com/66e57b4c9d90d977495f167d Latex-files

It’s important to add your preferred name rather than rely on your assigned project ID. The reason is that Overleaf assigns a unique, random ID to each project, and when cloning, Git simply uses that ID as the folder name by default. The ID will look something like 66e57b4c9d90d977495f167d.

However, there’s no functional difference if you rename it to something more readable! I don’t like for my project to be called 66e57b4c9d90d977495f167d as it’s not user-friendly. You can still change the name afterward.

Cloning will request a password

Here you will have to create a Git token for authentication. A Git token is a more secure way of connecting Overleaf to Git. To generate a token, follow these steps:
→ Click the green button “Go to settings” as in the image above
→ Go to ‘Git Integration’ Settings and click to create Git Token.
→ Copy the generated token, as it will only be shown once.
Now, when prompted for a password during the Git cloning process, paste this token instead.

This process creates a local copy of your Overleaf project inside whatever directory you ran the command from (in this case, the Downloads folder), in a folder with the name you chose.

Now you can use git add ., git push, and git pull.

Note that if you take this approach, tracking is limited to everything inside the my-preferred-name folder.
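The limitation can be seen in a small dry run. This is only a sketch: a local bare repository (overleaf-standin.git, a name I made up) stands in for the real Overleaf URL so that it runs without a network connection or a token, and the demo identity exists only so the commit works on a machine with no Git configuration.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"

# Stand-in for the Overleaf remote: a bare repository seeded with one file.
git init -q "$work/seed"
echo "Draft of the paper" > "$work/seed/main.tex"
git -C "$work/seed" add main.tex
git -C "$work/seed" commit -q -m "Paper draft"
git clone -q --bare "$work/seed" "$work/overleaf-standin.git"

# An existing project with a code folder, into which the paper is cloned as a subfolder.
mkdir -p "$work/My-project/code"
cd "$work/My-project"
git clone -q "$work/overleaf-standin.git" Latex-files

# Inside Latex-files, Git reports the clone itself as the top of the repository.
git -C Latex-files rev-parse --show-toplevel

# The parent project is not a repository, so nothing tracks the code folder.
git rev-parse --is-inside-work-tree 2>/dev/null || echo "My-project is not under version control"
```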

Making Changes Locally
After cloning, you can add or update files (e.g., figures, tables) in your local directory. Once you’ve made changes, here are some essential Git commands:

Add files: Stage all changes using git add . (this stages all changes in your directory) or specify particular files with git add figure1.png

Commit changes: After staging, commit your changes with a descriptive message:

git commit -m "Figure 1 but with residualized output, excluding GDP"

Commit messages are very important: they let us recall the changes we made at each step, and if we need to go back in time, we can do so by pinpointing the relevant commit and the commit hash associated with it.

Push changes to Overleaf: After committing, push the changes back to Overleaf using: git push

Pushing makes sure that all your collaborators have the same version of the files. In this case, they will get the same version of the figure, together with whatever code and changes you committed alongside it.

To get your collaborators’ changes, type git pull instead.

It’s good practice to always pull before making new changes; otherwise, conflicts may arise.
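The whole add, commit, push, pull cycle can be rehearsed locally before trying it on a real paper. The sketch below makes the same assumptions as a real setup would not: a local bare repository (overleaf-standin.git, a hypothetical name) plays the role of Overleaf, a second clone plays the role of a coauthor, and the identity is a demo one.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"

# Stand-in for the Overleaf remote: a bare repository seeded with one file.
git init -q "$work/seed"
echo "Draft of the paper" > "$work/seed/main.tex"
git -C "$work/seed" add main.tex
git -C "$work/seed" commit -q -m "Initial paper draft"
git clone -q --bare "$work/seed" "$work/overleaf-standin.git"

# You and a coauthor each clone the project under a readable name.
git clone -q "$work/overleaf-standin.git" "$work/Latex-files"
git clone -q "$work/overleaf-standin.git" "$work/coauthor-copy"

# You add a (placeholder) figure, stage it, commit it, and push it.
cd "$work/Latex-files"
echo "placeholder figure" > figure1.png
git add figure1.png
git commit -q -m "Figure 1 but with residualized output, excluding GDP"
git push -q

# Your coauthor pulls and now has the same file.
cd "$work/coauthor-copy"
git pull -q
ls
```

With the real Overleaf URL in place of the stand-in, the commands in the second half are the ones you would type day to day.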

Scenario 2: Integrating your full project with Overleaf

If you want to track the entire project — something I strongly recommend — this second scenario is the better approach. It focuses on fully integrating your local project with Overleaf.

This method allows you to track changes across the entire project, including your code, data, figures, tables, etc., not just the Overleaf files. By synchronizing everything into one Git repository, you ensure that every part of your research project is version-controlled and accessible to your collaborators.

This approach is more efficient and scalable, especially for complex or data-heavy projects where code and results are frequently updated. It also simplifies collaboration by keeping all components of the project in sync across platforms.

Step 1: Initialize Your Local Git Repository
If you haven’t already initialized Git in your local project folder, follow the steps below. If Git is already set up, skip to Step 2.

Navigate to your local project directory and type:

cd /Users/ariedamuco/Documents/My-project

Initialize Git:

git init

This creates a new Git repository in your project folder. Once initialized, you can start tracking your files.

Stage all files to Git:

git add .

However, I recommend not adding data, only code and the output relevant to the paper (figures and tables). In that case, instead of git add ., the commands would be:

git add code/ figures/ tables/
git commit -m "Initial commit with code, figures and tables"
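To check what ended up under version control after a selective add, git ls-files and git status are useful. The sketch below builds a throwaway copy of the folder layout in a temporary directory; the file names and the demo identity are assumptions made only for illustration.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

# Hypothetical project with the four folders used in this article.
proj="$(mktemp -d)/My-project"
mkdir -p "$proj/code" "$proj/data" "$proj/figures" "$proj/tables"
echo "print('hello')" > "$proj/code/analysis.py"
echo "id,value" > "$proj/data/raw.csv"
echo "placeholder" > "$proj/figures/figure1.png"
echo "placeholder" > "$proj/tables/table1.tex"

cd "$proj"
git init -q
git add code/ figures/ tables/
git commit -q -m "Initial commit with code, figures and tables"

# Files now under version control: nothing from data/ appears in this list.
git ls-files

# data/ is reported as untracked ("??") because it was never staged.
git status --short
```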

Step 2: Link Local Repository to Overleaf
You need to link your local repository to the Overleaf Git repository as a remote if it’s not already connected.

Get the Overleaf Git URL from your Overleaf project settings (it will look like https://git.overleaf.com/your-project-id).

Add Overleaf as a remote:

git remote add overleaf https://git.overleaf.com/your-project-id

The project ID is the same as before. In my case, it is 66e57b4c9d90d977495f167d, so the full command would look like this:

git remote add overleaf https://git.overleaf.com/66e57b4c9d90d977495f167d
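Adding a remote only records the URL; Git does not contact Overleaf at this point, so it is worth confirming that the URL was stored as intended. A minimal sketch in a throwaway repository, with your-project-id left as a placeholder:

```shell
# Throwaway repository in a temp directory, so your real project is untouched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Register the remote under the name "overleaf" (placeholder project ID).
git remote add overleaf https://git.overleaf.com/your-project-id

# List the configured remotes with their fetch and push URLs.
git remote -v
```

If the URL has a typo, git remote set-url overleaf followed by the corrected URL replaces it.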

Before you push any new files (like code, figures, or tables), it’s important to check the status of your repository and pull any existing changes from Overleaf to avoid conflicts. This ensures you’re working with the latest version of the project.

First, check the status of your files:

git status

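Then, if Git does not prompt you to commit or stage any changes, you can proceed and pull the latest updates:

git pull overleaf master

Because your local repository and the Overleaf repository were created independently, recent versions of Git may stop this first pull, either asking how to reconcile divergent branches or refusing to merge unrelated histories. If that happens, rerunning it as git pull --no-rebase --allow-unrelated-histories overleaf master tells Git to merge the two histories.

These steps are crucial if collaborators have made changes directly in Overleaf, as they ensure your local repository is up-to-date.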

  • If there are no conflicts, the changes will merge seamlessly with your local project.
  • If there are conflicts (e.g., if both Overleaf and your local repo have changes in the same file), Git will prompt you to resolve them. In those cases, I recommend using tools like ChatGPT to guide you through the conflict resolution process.

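When you’re good to go, you can push your changes, so that your collaborators have access to them, by typing:

git push overleaf master

This assumes your local branch is called master, like the one on Overleaf. If git status reports a different branch name (newer Git setups often default to main), push with git push overleaf main:master instead.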
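Steps 1 and 2 can be rehearsed end to end without touching the real paper. In the sketch below, a local bare repository (overleaf-standin.git, a hypothetical name) stands in for Overleaf, the file contents are placeholders, and the identity is a demo one. It uses HEAD:master when pushing so that it works whatever the local branch happens to be called.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"

# Stand-in for Overleaf: a bare repository whose only branch is "master".
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "Draft of the paper" > "$work/seed/main.tex"
git -C "$work/seed" add main.tex
git -C "$work/seed" commit -q -m "Paper draft written in Overleaf"
git clone -q --bare "$work/seed" "$work/overleaf-standin.git"

# Existing local project with its own, unrelated first commit.
mkdir -p "$work/My-project/code"
cd "$work/My-project"
git init -q
echo "print('hello')" > code/analysis.py
git add code/
git commit -q -m "Initial commit with code"

# Link the stand-in as the remote named "overleaf".
git remote add overleaf "$work/overleaf-standin.git"

# First pull: the two histories share no commits, so the merge is allowed explicitly.
# --no-rebase asks for a merge; --no-edit accepts the default merge message.
git pull -q --no-rebase --no-edit --allow-unrelated-histories overleaf master

# Push the current branch to the remote's master branch.
git push -q overleaf HEAD:master

# The project now holds both the paper and the code.
git ls-files
```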

Note that with this second approach, everything — including your code and output — will be loaded into your Overleaf workspace. However, you should use a .gitignore file to specify which files you don't want to upload to the cloud (Overleaf project). Typically, the data folder is the first thing we add to the .gitignore file, as you usually only need the code to transform the data.

Elaborating on “git conflict” and “.gitignore”

If you work with Git, especially on a collaborative project, you will need to handle conflict… Git conflict.

Conflict Resolution:

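ChatGPT can help you through a conflict, but Git also has built-in ways of dealing with one. When a pull or merge cannot combine two sets of edits automatically, Git stops, lists the affected files as conflicted in git status, and marks the competing versions inside each file with <<<<<<<, =======, and >>>>>>> lines. You can either open the file, keep the text you want, delete the markers, and then stage and commit the result, or run git mergetool to step through the conflicts in a merge tool.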
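A conflict is easier to handle once you have seen one in a safe place. The following sketch provokes a conflict on purpose in a throwaway repository and then resolves it by hand. The branch name coauthor, the file contents, and the demo identity are all made up for illustration.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Abstract: first draft" > main.tex
git add main.tex
git commit -q -m "First draft"
base="$(git symbolic-ref --short HEAD)"   # name of the starting branch (master or main)

# A coauthor's edit lives on a separate branch.
git checkout -q -b coauthor
echo "Abstract: coauthor's version" > main.tex
git commit -q -am "Coauthor rewrites abstract"

# Your own edit to the same line, on the original branch.
git checkout -q "$base"
echo "Abstract: my version" > main.tex
git commit -q -am "I rewrite abstract"

# The merge stops with a conflict; "|| true" lets the script carry on.
git merge coauthor || true

# The file now shows both versions between <<<<<<<, =======, and >>>>>>> markers.
cat main.tex

# Resolve by writing the text you want to keep, then stage and commit.
echo "Abstract: agreed version" > main.tex
git add main.tex
git commit -q -m "Resolve conflict in abstract"
```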

Ignoring files with .gitignore

The .gitignore file is a useful and essential part of working with Git, especially when dealing with large projects that contain many files not meant to be tracked by version control.

Using .gitignore to Keep Your Repository Clean

One of the most powerful tools in Git for managing complex projects is the .gitignore file. This file tells Git which files or directories to ignore, preventing them from being added to the repository. This is particularly useful when you’re dealing with large datasets, intermediate files, or sensitive information that shouldn’t be shared or versioned.

A typical .gitignore file might include entries like:

# Ignore data files
data/

# Ignore compiled LaTeX files
*.aux
*.log
*.pdf

# Ignore system-specific files
.DS_Store

Once the .gitignore file is set up, Git will automatically ignore the specified files or directories. It’s important to note that if files were already tracked by Git, adding them to .gitignore will not automatically remove them from the repository—you’ll need to untrack them manually using:

git rm --cached filename
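For a whole folder, the same command needs the -r flag. The sketch below replays the common mistake of committing everything before a .gitignore exists and then cleaning up; the file names and the demo identity are assumptions for illustration only.

```shell
# Demo identity (an assumption, used only inside this sketch).
export GIT_AUTHOR_NAME="Demo Author" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Author" GIT_COMMITTER_EMAIL="demo@example.com"

proj="$(mktemp -d)"
cd "$proj"
git init -q
mkdir data
echo "id,value" > data/raw.csv
echo "scratch" > paper.log
echo "Draft" > main.tex

# Mistake: everything is committed before a .gitignore exists.
git add .
git commit -q -m "Initial commit (accidentally includes data and logs)"

# Add the ignore rules afterwards.
printf '%s\n' 'data/' '*.log' '.DS_Store' > .gitignore

# Already-tracked files stay tracked, so untrack them explicitly.
# --cached removes them from Git's index only; the files stay on disk.
git rm -r -q --cached data/ paper.log
git add .gitignore
git commit -q -m "Stop tracking data and log files"

# Only .gitignore and main.tex remain tracked; the data file is still on disk.
git ls-files
ls data
```

Note that the data and log files remain in earlier commits; this only stops Git from tracking them going forward.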

Best Practices for .gitignore

  • Customize it for your project: Every project is different, so tailor your .gitignore file based on the types of files you generate and share.
  • Add .gitignore early: It’s best to add a .gitignore file at the beginning of your project to avoid accidentally tracking files that should be excluded.

Also, make sure everyone on your team is aware of what’s being ignored and why, so there’s no confusion about missing files in the repository.

Conclusion

I like Overleaf for its ease of use, collaborative features, and built-in change tracking, but I strongly recommend investing some time to learn Git as well. The benefits far outweigh the initial learning curve, and you don’t need to master everything to get started. In fact, with just a few essential commands, you’ll be well on your way:

  • git clone → To create a local copy of an Overleaf project.
  • git add . → To stage all changes in your working directory.
  • git commit -m "your message" → To commit your changes with a descriptive message.
  • git push → To send your changes to the cloud (e.g., GitHub or Overleaf).
  • git pull → To update your local copy with any changes from collaborators.

Once you feel comfortable with these basics, you can rely on tools like ChatGPT to help you navigate more complex Git operations as needed. Trust me — your workflow will thank you!

References

[1] Jesús Fernández-Villaverde. “Chapter HPC 5: Git.” University of Pennsylvania. Available at: https://www.sas.upenn.edu/~jesusfv/Chapter_HPC_5_Git.pdf

[2] Frank Pinter. “Introduction to Git.” Available at: https://www.frankpinter.com/git/

[3] Data Carpentry for Economics. “The Unix Shell for Economists.” Available at: https://datacarpentry.org/shell-economics/

[4] CEU Threads. “Coding-What-Not-to-Do: Best Practices for Data Scraping and Conversion.” Medium. Available at: https://medium.com/ceu-economic-threads/coding-what-not-to-do-best-practices-for-data-scraping-and-conversion-e7a07a7bddf8

Notes

Unless otherwise noted, all images are by the author.

Thank you for taking the time to read my thoughts. If you enjoyed the article, feel free to reach out at arieda.muco@gmail.com or on Twitter, LinkedIn, or Instagram. Feel free to also share this article with others.
