Using fmriprep for fMRI data preprocessing

fMRI data preprocessing is an important step in fMRI data analysis. If you want an introduction to some of the tools used for this, you have come to the right place.

Dec 21, 2019

In this tutorial you will:

go through the installation steps for Docker and fmriprep, two useful tools for fMRI data analysis;
explore how fmriprep can be used on fMRI data and make sense of its output.

You will use command-line commands, all of which are provided step by step. Please note that the focus of this tutorial is on macOS machines, but the tools used are also compatible with other operating systems, and we will direct you to relevant resources when appropriate.

Although it might seem like a short post, the two steps outlined take many (think at least 8) hours to complete. If you just want to follow along without running anything yourself, you can find the referenced output here.

Prerequisites: Some experience with Python (you should have it installed, but you do not need to be an expert). If you have limited or no experience with fMRI data and have never used fmriprep before, then you are the ideal person for this tutorial. We do assume some background knowledge of fMRI; please see this Nature article for some fMRI basics if you need an introduction.

Part 0: Dataset & Objectives

In this two-part tutorial, we will look at what data preprocessing is and how it can be done. We will rely on the data (accessible from OpenfMRI here) from the Poldrack et al. (2001) study on how memory systems compete in the human brain during classification tasks.

fMRI data preprocessing is a necessary step before any analysis can be done. The goal is to limit physiological noise and movement artifacts in the data, and, as such, this step involves applying different filters and masks to limit confounding. Many studies have emphasized the importance of preprocessing techniques, arguing, for example, for minimizing covariance with movement parameters (Power et al., 2012) or removing unrecognized brain signals (Macey et al., 2004).

However, preprocessing techniques seem to lack consistency across studies, even those on the same topic (Andronache et al., 2013). fmriprep aims to provide standardized preprocessing that performs only the necessary minimum of processing and filtering of raw data and, thus, encourages reproducibility and open science. We will use fmriprep to preprocess the Poldrack et al. (2001) data to see what it actually does and how to interpret its output.

Since we are dealing with raw fMRI data on a consumer-grade computer, with the computational constraints that come with it, we will only look at one subject's data. Before we can use fmriprep and examine its output, however, we need to install it. For this, in Part 1, we first deploy Docker, a popular container technology, and then use it to run fmriprep. We also discuss some of the hurdles common during the installation process and how to overcome them.

After getting the tools working and the data preprocessed, in Part 2, we will examine the fmriprep output. If all goes well, you will be able to tell what preprocessing steps fmriprep took and what they mean. Understanding this is useful for any fMRI data analysis you do next.

The tutorial builds on a number of already available resources you can find in the References section. Our hope is that it will provide a comprehensive, all-in-one-place guide to all the necessary steps for getting through fMRI data preprocessing.

Part 1: Dataset & Installations

Here we install fmriprep using Docker. Feel free to skip to the next section on Output & Analysis if you have everything working already.

Docker

Docker is a tool that essentially virtualizes your operating system and allows you to run applications using containers without having to clutter your machine with all the libraries and dependencies. Docker is the recommended way of executing fmriprep if you are using your own computer.

If you are a Mac user like me, follow the steps below to install it. Everyone else — sorry, other systems are beyond me, but check out the official documentation here.

1. Make sure your macOS is updated (at the time of writing, this means version 10.13 at the very least). You also need at least 4 GB of RAM, so plan your other memory-consuming work ahead.
2. Before you can download Docker, you need to create a free account on Docker Hub. Once you have your Docker ID, follow this link to download Docker Desktop for your Mac.
3. You should now have Docker.dmg in your Downloads. Follow the regular “installing applications on Mac” process and drag Docker to your Applications. After installation is complete, open the app and log in with your Docker ID. You now have Docker running!
4. To verify that your Docker is operating properly, open the Terminal and run a simple Docker command to test your installation.

$ docker run hello-world

5. Your output should say “Hello from Docker!”. If that’s the case, you are all good to go on to fmriprep. If not, consider reinstalling or debugging using the official troubleshooting resources.
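If you would rather script this check, here is a minimal Python sketch. It assumes the docker executable is on your PATH. The function names are ours, made up for illustration, and the only Docker command used is the hello-world call from step 4.

```python
import shutil
import subprocess


def hello_world_succeeded(output: str) -> bool:
    """Return True if the text contains Docker's hello-world greeting."""
    return "Hello from Docker!" in output


def check_docker() -> bool:
    """Run the hello-world container and report whether it greeted us."""
    if shutil.which("docker") is None:
        # The docker executable is not on PATH, so there is nothing to run.
        return False
    result = subprocess.run(
        ["docker", "run", "hello-world"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and hello_world_succeeded(result.stdout)
```

Calling check_docker() should return True on a working installation and False if Docker is missing or the container did not print the greeting.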

fmriprep

As we said, fmriprep is used for preprocessing data; the main idea behind using fmriprep is that it ensures reproducible results by not processing the data excessively. We will explore what steps fmriprep does perform once we have some output to show; for now, let’s make sure we can run it.

1. fmriprep relies on FreeSurfer, open-source software for structural MRI image processing, to do its anatomical brain reconstructions. This requires a free FreeSurfer license which you can get here. You will need to register using a valid email to which the license will then be sent. After receiving the license file, save it in the same directory where you want all your fmriprep-related output to live. This step is fairly quick, so do not let it discourage you from using fmriprep.
2. Use the following command to install fmriprep-docker, the wrapper that runs fmriprep using Docker.

pip install --user --upgrade fmriprep-docker

3. Now we are ready to get our fmriprep hands on the data. All you need is just one command (that takes at least 8 hours to execute).

fmriprep-docker /Users/gelanatostaeva/Desktop/sub_1 /Users/gelanatostaeva/Desktop/sub_1/output participant --participant-label 01 --fs-license-file /Users/gelanatostaeva/Desktop/sub_1/license.txt

In this command, we call fmriprep-docker, then provide the path to our single-subject data /Users/gelanatostaeva/Desktop/sub_1, followed by the path to where we want the output to be stored /Users/gelanatostaeva/Desktop/sub_1/output. The keyword participant is there to indicate the level of analysis we want; we also specify which participant we want: --participant-label 01. Finally, we have to point to the FreeSurfer license file by entering its path: --fs-license-file /Users/gelanatostaeva/Desktop/sub_1/license.txt.

4. Wait for fmriprep to finish running to see this output. Patience is key here.
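The fmriprep-docker call from step 3 can also be assembled in Python, which makes it harder to mistype one of the three paths. This is only a sketch: build_fmriprep_command is a hypothetical helper of ours, and it uses nothing beyond the arguments shown in the command above.

```python
def build_fmriprep_command(bids_dir, output_dir, participant_label, license_file):
    """Assemble the fmriprep-docker call as a list of arguments."""
    return [
        "fmriprep-docker",
        bids_dir,            # input: folder with the single-subject data
        output_dir,          # where fmriprep writes its results
        "participant",       # level of analysis
        "--participant-label", participant_label,
        "--fs-license-file", license_file,
    ]


# Example paths taken from the tutorial; replace them with your own folders.
data_dir = "/Users/gelanatostaeva/Desktop/sub_1"
command = build_fmriprep_command(
    data_dir, f"{data_dir}/output", "01", f"{data_dir}/license.txt"
)
print(" ".join(command))
```

To actually launch the run, you could pass the list to subprocess.run(command, check=True) from the standard library.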

You got through all the installations without any problems? Good for you — you can move to the next Part on Output & Analysis.

Otherwise, fear not. We, too, faced a couple of hurdles which we summarize below:

1. You have Docker, fmriprep, and Python installed and ran the pip install command for fmriprep-docker, yet your computer is telling you the fmriprep-docker command is not found.

This is probably a PATH problem, not you. First, figure out which path fmriprep-docker is in:

which fmriprep-docker

You could also try this other command. Don’t worry: the “yes n” part of this command ensures it will not delete fmriprep-docker:

yes n | pip uninstall fmriprep-docker | grep bin

Now that you know the path it should be in, you should change your PATH to include it. You can do so by running:

export PATH=$PATH:/….
echo $PATH

For example, which fmriprep-docker yields /Users/gelanatostaeva/.local/bin//fmriprep-docker. The following command should then fix the problem:

export PATH=$PATH:/Users/gelanatostaeva/.local/bin/
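To confirm that the export took effect, you can check whether the folder is now one of the PATH entries. Below is a small sketch using only the Python standard library; directory_on_path is a hypothetical helper, and the example PATH string is made up for the demonstration.

```python
import os


def directory_on_path(directory, path_value=None):
    """Return True if the directory is one of the entries in a PATH string."""
    if path_value is None:
        # Fall back to the PATH of the current process.
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return wanted in entries


# A made-up PATH that already contains the folder from the example above.
example_path = os.pathsep.join(
    ["/usr/bin", "/bin", "/Users/gelanatostaeva/.local/bin/"]
)
found = directory_on_path("/Users/gelanatostaeva/.local/bin", example_path)
missing = directory_on_path("/opt/missing/bin", example_path)
print(found, missing)
```

Called with only the directory, the helper inspects the PATH of the running Python process, which is what you want right after the export command in the same Terminal session.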

2. fmriprep-docker is running but exits with memory errors

If you are running into memory issues, you might want to consider increasing memory resources available to Docker. Go to the Desktop app -> Preferences -> Advanced, set the maximum limit to 8 GiB and make sure to press Apply & Restart.

Note that this will slow down your computer significantly which might or might not be a good excuse to avoid all other work.

3. fmriprep-docker is running but exits with errors about something named BIDS.

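BIDS (Brain Imaging Data Structure) is a simple way of organizing neuroimaging data designed to maximize adoption. In a BIDS dataset, each subject has its own folder (sub-01, sub-02, and so on) containing subfolders such as anat for structural scans and func for functional scans, with file names that encode the subject and the task.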

The data we are using should be BIDS-compliant. If you are following this tutorial and using the same dataset, this should not be a problem. If it is (or if you are using some other data), you can try to convert your data. You can do so using this package for datasets, like ours, taken from the OpenfMRI website.
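Before converting anything, a quick look at the folder layout can tell you whether the problem is obvious. The sketch below is a rough sanity check, not a substitute for a proper BIDS validator: it only looks for dataset_description.json, sub-<label> folders, and a BOLD file directly under func, and it ignores optional session folders. The function name and the demo file names are our own.

```python
import tempfile
from pathlib import Path


def find_layout_problems(bids_root):
    """List obvious deviations from a BIDS-style folder layout."""
    root = Path(bids_root)
    problems = []
    if not (root / "dataset_description.json").is_file():
        problems.append("missing dataset_description.json")
    subjects = sorted(p for p in root.glob("sub-*") if p.is_dir())
    if not subjects:
        problems.append("no sub-<label> folders found")
    for subject in subjects:
        # Functional scans are expected as NIfTI files under sub-<label>/func.
        if not list((subject / "func").glob("*_bold.nii*")):
            problems.append(f"{subject.name}: no *_bold.nii(.gz) file in func/")
    return problems


# Build a tiny fake dataset to show the check in action.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dataset_description.json").write_text("{}")
    (root / "sub-01" / "func").mkdir(parents=True)
    (root / "sub-02" / "func").mkdir(parents=True)
    (root / "sub-01" / "func" / "sub-01_task-demo_bold.nii.gz").touch()
    problems = find_layout_problems(root)
    print(problems)
```

In the fake dataset only sub-02 is reported, because its func folder is empty. On real data you would call find_layout_problems with the same folder you pass to fmriprep-docker.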

Hopefully, you have fmriprep-docker running now. If you are still experiencing issues not covered here, consider giving Google search a try. For the purposes of this guide, we will now move on to the next big step — making sense of the output and doing some basic analysis.

Part 2: Output & Analysis

Hooray! After many hours, fmriprep-docker executed properly and made a nice summary HTML file you can find in the generated “fmriprep” folder. If you were not able to get the output, you can still follow along by viewing the pdf of our output file here.
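If you are not sure where the report ended up, a few lines of Python can list the HTML files in the output folder. This is a sketch under assumptions: the file name sub-01.html in the demo is an illustrative guess at what the report is called, and find_html_reports is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_html_reports(output_dir):
    """Return the HTML files directly inside the given output folder."""
    return sorted(Path(output_dir).glob("*.html"))


# Demonstrate on a throwaway folder that mimics the output layout.
with tempfile.TemporaryDirectory() as tmp:
    fmriprep_dir = Path(tmp) / "fmriprep"
    fmriprep_dir.mkdir()
    (fmriprep_dir / "sub-01.html").touch()  # assumed report name
    report_names = [p.name for p in find_html_reports(fmriprep_dir)]
    print(report_names)
```

With a real output folder, each returned path can be opened in your default browser using webbrowser.open(path.as_uri()) from the standard library.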

All looks good and cool, but perhaps too fancy to understand just yet? Let’s break it down. In this section, we will first go over the steps fmriprep took to preprocess our data and then see how we can use the generated output to perform the independent component analysis. By the end of this section, we should be able to tell if our single-subject analysis is in line with the findings of Poldrack et al (2001).

summary HTML file

If you followed our instructions, your output file will have anatomical and functional reports for two versions of the task on declarative vs non-declarative memory, each with two trial runs.

Take your time looking through generated images for these two versions. Do not let all the jargon get you stuck. Most of it is not that crucial to understand the first time, but let’s define some things you should know:

structural vs functional: structural refers to imaging the brain structure, i.e., its anatomical properties; functional — the brain function, i.e., areas associated with a particular task.
white vs pial surface: the white surface outlines the border between white and gray matter, represented by the blue line, while pial — the border between gray matter and the cerebrospinal fluid, represented by the red line; it follows that we use the white and pial surface to quantify white matter volume and gray matter volume respectively.
all figures have three “rows” corresponding to the three anatomical planes: transversal, sagittal, and coronal (in that order, from top to bottom).

As we’ve said, fmriprep does only the necessary preprocessing, and what you see in the file is the result of that. What did fmriprep do exactly? We define the steps one-by-one.

Brain mask and tissue segmentation to locate the brain and its tissue types. This means separating the brain from unnecessary tissue around it (brain mask) and dividing the brain into parts by tissue types (tissue segmentation).
Normalizing the scans to be used as map reference for surface reconstruction (this is essentially the “computerized” representation of the brain we are working with for analysis) which concludes the anatomical preprocessing step. Spatial normalization makes all different brain scans map onto one another: since human brains are different, we need to normalize so that the locations are consistent.
The functional part starts with the obvious step, coregistration, to align the anatomical and functional images. This involves overlaying the functional (referred to as BOLD) data over the structural data.
Finally, the last corrections are made to the functional images. These include a head motion correction, an important step to limit confounding from movement artifacts.

Alignment of functional and anatomical MRI data (surface driven).

You might be wondering how all of these steps could be considered the necessary minimum. In fact, there are several steps fmriprep left out. Some examples include cortex inflation or flattening which is mostly for better visualization since all the surface folding can be too much, especially if we want to look at the functional activity “inside”, i.e., behind all the folding. It also skipped slice timing correction, a trick to make all slices seem like they were done at the same time — a step potentially leading to interpolation errors.

Once you feel like you have a general understanding of your now preprocessed data, you are ready to move on to the fun part of data analysis. We will not explore this here, but check out this post on using neural networks for functional connectivity analysis of (other, preprocessed!) fMRI data.

For a sneak peek into data analysis, we can use Nilearn, a Python module for neuroimaging data, to visualize our preprocessed data. You can then go on to the data analysis using these images in a computer-friendly format of numbers and arrays.

Plotting the BOLD signal from one of the tasks in Poldrack et al. (2001) using one subject.

Here’s the code snippet to get there:

##### Importing the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from nilearn import image, plotting, datasets
import nibabel as nib

##### Loading the 4D BOLD data using nibabel
data = nib.load('sub-01_task1_bold.nii.gz')

##### Getting the first 3D volume from the 4D data
first_data = image.index_img(data, 0)

##### Using Nilearn's built-in function to get the echo-planar imaging (EPI) plot
##### (EPI is a fast acquisition, supposed to minimize the effect of the patient's motion)
plotting.plot_epi(first_data)
plotting.show()

This tutorial focused on data preprocessing, a crucial step before fMRI data analysis. We ran fmriprep, a minimal preprocessing pipeline, using a Docker container and tried to make sense of the output. Understanding the preprocessing techniques used is helpful in interpreting the results of the analysis. For one, if you know you applied too many filters, you will not be surprised to find brain activity in a dead salmon fish.

Hopefully, you now feel more confident in the preprocessing tools and concepts we explored and excited to do some analysis. Until next time!

References

Andronache, A., Rosazza, C., Sattin, D. P., Leonardi, M., D’Incerti, L., & Minati, L. (2013). Impact of functional MRI data preprocessing pipeline on default-mode network detectability in patients with disorders of consciousness. Frontiers in neuroinformatics, 7, 16.

De Beeck, H. P. O., Haushofer, J., & Kanwisher, N. G. (2008). Interpreting fMRI data: maps, modules and dimensions. Nature Reviews Neuroscience, 9(2), 123.

Esteban, O., Markiewicz, C. J., Blair, R. W., Moodie, C. A., Isik, A. I., Erramuzpe, A., … & Oya, H. (2019). fMRIPrep: a robust preprocessing pipeline for functional MRI. Nature methods, 16(1), 111.

fMRIPrep tutorial: Running the docker image | Stanford Center for Reproducible Neuroscience. (2019). Retrieved 25 October 2019, from http://reproducibility.stanford.edu/fmriprep-tutorial-running-the-docker-image/

Macey, P. M., Macey, K. E., Kumar, R., & Harper, R. M. (2004). A method for removal of global effects from fMRI time series. Neuroimage, 22(1), 360–366.

NMR Phenomenon. Retrieved 26 October 2019, from http://mriquestions.com/hellipthe-nmr-phenomenon.html

Poldrack, R.A., Clark, J., Paré-Blagoev, E.J., Shohamy, D., Creso Moyano, J., Myers, C., Gluck, M.A. (2001). Interactive memory systems in the human brain. Nature, 414(6863):546–50

Pooley, R. A. (2005). Fundamental physics of MR imaging. Radiographics, 25(4), 1087–1099.

Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2012). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage, 59(3), 2142–2154.