Finding The Connection Between Memory Encoding and Retrieval

Tamnapat Wongtanutam
10 min read · Nov 5, 2023


Hi, I’m Tamnapat Wongtanutam, a student from class D of BrainCodeCamp, an online hands-on workshop that develops students’ programming skills for studying the relationship between behavior and brain signals. Throughout the 12 weeks of this camp, I learned a lot about computational neuroscience and had the opportunity to complete a mini-project. In this blog post, I’ll take you through my journey behind this project and reveal the final outcome.

Table of Contents

Background
Introduction
Dataset
Objective
Analysis Process
Creating Result Model
Result / Conclusion
Future Directions
Reflection

Background

The reason I joined BrainCodeCamp is that I was exploring new ways to expand my programming knowledge, and computational neuroscience seemed like a very interesting subject to me. Even though I have plenty of programming experience, neuroscience was completely new to me, so I didn’t have much background knowledge in this area. When I was asked to choose a topic for my mini-project, I was quite lost because I didn’t have a specific area of interest in this field. Fortunately, the camp provided a list of suggested project ideas, which helped me a lot. I picked this topic because it was the most eye-catching one on that list, and I knew I would enjoy researching it.

Introduction

Every day, we create new memories from what we see, store them inside our minds, and sometimes share them with others. When people collect memories and later share them, is there a relationship between the neural representations of these two processes? To answer this question, I analyzed a set of fMRI (functional magnetic resonance imaging) data from a memory encoding and retrieval experiment.

Dataset

The dataset used in this project comes from the following study: Chen, J., Leong, Y. C., Honey, C. J., Yong, C. H., Norman, K. A., & Hasson, U. (2017). Shared memories reveal shared structure in neural activity across individuals. Nature Neuroscience, 20(1), 115–125. https://doi.org/10.1038/nn.4450

To understand how people’s brains store and recall memories, researchers conducted a study that involved humans engaging in memory-related activities while undergoing functional MRI scans. In this study, participants were required to complete two specific tasks: watching a movie and verbally recalling details from it.

Participants
22 participants were recruited from the Princeton community (12 male, 10 female, ages 18–26, mean age = 20.8). All participants were right-handed native English speakers, reported normal or corrected-to-normal vision, and had not watched any episodes of Sherlock prior to the experiment.

Task 1: Movie viewing
Participants watched the first 50 minutes of Episode 1 of BBC’s Sherlock in an MRI scanner (a Siemens Skyra 3T full-body scanner with a 20-channel head coil). The movie was projected with an LCD projector onto a rear-projection screen located in the magnet bore and viewed through an angled mirror. The audio was delivered via in-ear headphones.

BBC’s Sherlock (Source: https://gem.cbc.ca/sherlock/s01)

Task 2: Verbal recollection
In the scanner, immediately after the movie-viewing session ended, participants were instructed to describe aloud what they recalled of the movie in as much detail as they could, to try to recount events in the order they were originally viewed, and to speak for at least 10 minutes if possible (longer was better). There was no interaction between the participant and the experimenter until the end of the scan.

Data preprocessing
The dataset was preprocessed by the publishers before release. The preprocessing was conducted using FSL and included slice-time correction, motion correction, linear detrending, high-pass filtering, and coregistration of the functional volumes to a standard template brain (MNI standard). Functional images were resampled to 3 mm isotropic voxels. Motion was minimized by instructing participants to stay still during speech and by using foam padding to stabilize their heads. Data were z-scored across time at every voxel, files were cropped so that all movie-viewing data were aligned across subjects, and all recall data were aligned to the scene timestamps provided.

Objective

Once I understood the purpose of the dataset, I set a goal that by the end of this project I would answer my research question: is there a relationship between how our brains function when we create memories and when we later recall those memories? I also wanted the results to be presented in a model that is simple and easy to understand. To reach this objective, I had to learn the techniques used to analyze fMRI data and to create data models.

Analysis Process

As mentioned above, to reach my objective I would have to analyze fMRI data. However, since I didn’t have any experience with neuroimaging data, this was a very challenging task for me, and it took a lot of trial and error. In this section, I will take you through my process of analyzing the fMRI data to find the answer to my research question.

Step 0: Downloading data
Downloading the dataset seems like such a basic prerequisite for accessing the data that I thought it shouldn’t even count as a step (so I labeled it “step 0”). However, my process of downloading this dataset was quite long and painful, so I wanted to talk about it.

At first, I thought that downloading the dataset would be a really easy task. However, it was a lot more challenging than I expected. The preprocessed data from the scans were separated into two folders: movie_files and recall_files. These folders were distributed as gzip-compressed tar archives (.tgz) with a total size of 16.22 GB. I didn’t think it would take long to download these files; however, my browser estimated it would take 15 hours, so I downloaded them overnight using Amphetamine to keep my laptop awake. In the morning, the download still wasn’t complete, so I left it for another day. That evening, the download finished and I started unzipping the files. While unzipping, I found out that there wasn’t enough space on my laptop, so I cleared out lots of files on my disk. After a long and painful process of deleting files, I was finally able to unzip a total of 69.3 GB of data onto my personal laptop. To access these data on Google Colab, I uploaded the files to Google Drive, and I was then ready to analyze them.

Data were split into two folders

Step 1: Visualizing data
After downloading the dataset, I wanted to visualize the data to see what they looked like before analyzing them. I barely knew how to visualize 4D neuroimaging data, so it took a lot of help from my class’s TAs for me to understand. First, I loaded the data using the NiBabel library, converted them into a NumPy array, and plotted them using Matplotlib.
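A minimal sketch of this step is shown below (this is not my exact notebook code; data_path is a placeholder for the path to one of the NIfTI files):

# Import libraries
import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt

# Load one run and convert it to a 4D NumPy array (x, y, z, time)
img = nib.load(data_path)
data_np = np.asanyarray(img.dataobj)

# Show a single axial slice from the first timepoint
z_mid = data_np.shape[2] // 2
plt.imshow(data_np[:, :, z_mid, 0].T, cmap='gray', origin='lower')
plt.axis('off')
plt.show()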

Example of the visualized data

Step 2: Creating one image data for each scene
The fMRI data from the movie-viewing and verbal-recollection sessions were organized by scene, with many 3D images (timepoints) per scene. Since I wanted to compare data between scenes later, I needed to consolidate the data into a single 3D image per scene. To do this, I used the ‘numpy.mean’ function to average the data within each scene’s time range, based on the provided timestamps.

# Import libraries
import nibabel as nib
import numpy as np

# Load neuroimaging data using NiBabel
data = nib.load(data_path)

# Convert data to NumPy array using np.asanyarray
data_np = np.asanyarray(data.dataobj)

# Average data for each scene's time range (tr1, tr2) using np.mean
data_avg = np.mean(data_np[:, :, :, tr1:tr2+1], axis=3)

Step 3: Finding similarities between matching scenes
To find the answer to my research question, I needed to compare the data from when a participant was watching the movie and when that participant was recalling that specific part. To do this, I paired the 3D image data from movie_files with the 3D image data from recall_files for the same scene and the same participant. Then, I calculated the Pearson correlation coefficient using ‘numpy.corrcoef’. This function expects 1D NumPy arrays as input, so I had to flatten the data using ‘ndarray.flatten’.

# Convert 3D data to 1D using np.ndarray.flatten
movie_1d = movie_data.flatten()
recall_1d = recall_data.flatten()

# Find the correlation between 1D movie and recall data using np.corrcoef
correlation = np.corrcoef(movie_1d, recall_1d)[0, 1]

I only computed the correlations for every other participant because this process takes a lot of time and computing power. After finding the correlation for every matching scene, I gathered the results in a spreadsheet.
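The per-participant loop that produced the spreadsheet looked roughly like the sketch below (simplified; movie_scenes and recall_scenes are hypothetical lists holding one averaged 3D array per scene, as produced in Step 2):

# Import libraries
import numpy as np
import pandas as pd

# Correlate each recall scene with the matching movie scene
rows = []
for scene_idx, (movie_data, recall_data) in enumerate(zip(movie_scenes, recall_scenes)):
    r = np.corrcoef(movie_data.flatten(), recall_data.flatten())[0, 1]
    rows.append({'scene': scene_idx, 'correlation': r})

# Save the matched-scene correlations as a spreadsheet
pd.DataFrame(rows).to_csv('matched_scene_correlations.csv', index=False)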

Spreadsheet of the correlation between matching scenes

Step 4: Finding similarities between unmatched scenes
As you can see from the results above, the correlations between the data from watching a movie and recalling the same scene seemed very low, so I initially thought there was no relationship between them. However, one of the professors in this camp pointed out that a low correlation doesn’t mean there is no relationship. To find out whether there is a relationship, I needed to determine whether these correlations could simply be coincidental.

To determine whether these correlations occur by coincidence, I calculated the correlations for every “unmatched scene” within a participant by pairing the 3D image data from recall_files with the 3D image data from movie_files of every scene that the participant was not talking about.
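In code, this pairing is roughly the nested loop below (again a simplified sketch using the same hypothetical movie_scenes and recall_scenes lists as above):

# Import library
import numpy as np

# Pair each recall scene with every movie scene except its own
unmatched_corrs = []
for recall_idx, recall_data in enumerate(recall_scenes):
    recall_1d = recall_data.flatten()
    for movie_idx, movie_data in enumerate(movie_scenes):
        if movie_idx == recall_idx:
            continue  # skip the matching scene
        unmatched_corrs.append(np.corrcoef(movie_data.flatten(), recall_1d)[0, 1])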

Spreadsheet of the correlation between unmatched scenes

Creating Result Model

To complete my objective, I had to present the results in a model that is simple and easy to understand. To do this, I chose to highlight only the average correlations for matching and unmatched scenes for each participant.

Average correlation for matching and unmatched scenes for each participant

I decided to show the average correlations in a scatter plot because I feel it clearly shows the difference between the two sets of results I’m highlighting. At first, I plotted the data using Google Sheets, but my TA recommended that I include error bars on my plot, so I switched to Matplotlib. The error bars were determined from the standard error values of all the data for that participant.
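The averages and error-bar values that go into the plot were computed along these lines (a minimal sketch; matched_corrs stands for one participant’s list of matched-scene correlations):

# Import library
import numpy as np

# Mean correlation and its standard error (SEM) for one participant
matched = np.array(matched_corrs)
mean_matched = matched.mean()
sem_matched = matched.std(ddof=1) / np.sqrt(len(matched))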

# Import library
import matplotlib.pyplot as plt

# Data for x-axis
x = [1, 3, 5, 7, 9, 11, 13, 15, 17]
x_error = [1.006, 3.006, 5.006, 6.99, 9.014, 11.002, 13.014, 15.006, 17.006]

# Data for two sets of y-values
y1 = [0.1534, 0.1019, 0.1356, 0.1042, 0.1546, 0.1221, 0.1458, 0.1449, 0.1483]
y2 = [0.0156, 0.013, 0.0099, 0.0121, 0.0119, 0.0114, 0.0122, 0.0245, 0.0104]

# Standard Error Values
y1_error = [0.0188, 0.0135, 0.0169, 0.0139, 0.0169, 0.0147, 0.0145, 0.0207, 0.0198]

# Set the width and height of the figure
fig = plt.figure(figsize=(8, 4))

# Create a scatter plot for the first set of data
plt.scatter(x, y1, label='Matching Scenes', color='red', marker='o', s=100)

# Add error bars to the scatter plot
plt.errorbar(x_error, y1, yerr=y1_error, fmt='none', ecolor='red', capsize=4)

# Create a scatter plot for the second set of data
plt.scatter(x, y2, label='Unmatched Scenes', color='black', marker='s', s=100)

# Customize the x-axis tick locations
x_ticks = [1, 3, 5, 7, 9, 11, 13, 15, 17]
plt.xticks(x_ticks)

# Customize the y-axis tick locations
y_ticks = [0.00, 0.05, 0.10, 0.15, 0.20]
plt.yticks(y_ticks)

# Add horizontal gridlines
plt.grid(axis='y', linestyle='--', linewidth=0.5)

# Customize the plot (optional)
plt.xlabel('Subject')
plt.ylabel('Pearson Correlation Coefficient')

# Add the legend outside the plot
plt.legend(loc='upper center', bbox_to_anchor=(0.1, 1.2))

# Display the plot
plt.show()

Result / Conclusion

This scatter plot shows the difference between the correlations for matching scenes and unmatched scenes. Model key:

  • Red dots: Average Pearson correlation coefficient of all matching movie and recall scenes in that participant
  • Black squares: Average Pearson correlation coefficient of all unmatched movie and recall scenes in that participant
  • Error Bars: Standard Error values showing the variability of data

The model above shows that there is indeed a similarity between neural activity when our brain encodes a memory of an event and when we later think back to that specific event. Even though the similarity is not high, the consistent gap between matched and unmatched correlations across participants shows that this similarity is not coincidental.

Future Directions

This project concludes that there is a relationship between neural activity when our brain encodes a memory of an event and when we later retrieve that same memory. This conclusion can lead to further research, and here are some of my ideas for extending the project:

  • Use techniques such as Searchlight Analysis to locate areas where neural activity is similar when watching a movie and when recalling it. Then, use those areas as regions of interest for the analysis.
  • Compare the data “across participants” to see if neural activities are also similar when different people watch and recall the exact same event.
  • Identify the pattern of how neural activity changes when encoding memories and recalling them. This can lead to a better understanding of how our brain works.

Reflection

During the time I spent working on this project, I faced many setbacks and challenges that pushed me to my limit. However, the TAs and professors at BrainCodeCamp were always supportive and gave me advice, which significantly helped me reach this point. This camp gave me the opportunity to learn many valuable lessons in the field of computational neuroscience. All the obstacles I encountered also helped develop my problem-solving skills and broadened my experience. I’m truly grateful for the opportunities this camp has offered and will surely continue to make use of the things I’ve learned.

Special Thanks: p’Joe, p’Gift, p’Punch, p’Pop, Prof. Gig, and everyone at BrainCodeCamp
