fMRI Stats Exploration

By Maximilien Le Clei
Published on June 19, 2025

"This project aimed to further my intuitive understanding of fMRI data. Around 20 interactive/static figures of various statistics of raw fMRI data, confounds and atlased data were produced. Special efforts have been made to make the analysis highly and easily reproducible."

Deliverables

Project description

This project represents my first personal experience handling neuroimaging data.

Coming from a Deep Learning background, this project represents the type of analysis I would undertake in order to get a better feel for what exactly I am trying to train models on.

One extremely important element of training DL models is making sure that the distribution of the data is well suited for modeling. This generally means having the data take the shape of a standard normal or uniform distribution, with values concentrated between -1 and 1. I observed that the original fMRI data does not necessarily fit those criteria and that the distribution can also vary substantially between subjects (Figure a). This distribution looks quite unnatural to fit back into one of the mentioned distributions. However, Figure q demonstrates that it is possible.
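As a rough illustration of the kind of rescaling discussed above (not necessarily the one used to produce the figures), here is a minimal sketch that z-scores each voxel's time series. The array `raw` is synthetic stand-in data and `standardize_over_time` is a hypothetical helper name; NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for raw fMRI with shape (x, y, z, time).
# Large, skewed positive values, loosely mimicking raw scanner intensities.
raw = rng.gamma(shape=2.0, scale=300.0, size=(4, 4, 4, 50))


def standardize_over_time(data, eps=1e-8):
    """Z-score each voxel's time series (last axis): mean 0, std 1."""
    mean = data.mean(axis=-1, keepdims=True)
    std = data.std(axis=-1, keepdims=True)
    # eps guards against division by zero for constant voxels.
    return (data - mean) / (std + eps)


standardized = standardize_over_time(raw)

print("raw:          mean=%.1f std=%.1f" % (raw.mean(), raw.std()))
print("standardized: mean=%.3f std=%.3f" % (standardized.mean(), standardized.std()))
```

Standardizing per voxel is only one option. Standardizing per subject or per scan would be equally easy to write, and which one suits a given model is a separate question.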

I then wanted to get a better feel for what that original data looks like in 3D (Figure b & c) before taking a look at how the data varies across time. In Figure d, I take a look at the overlapping differences between two collected timepoints in order to get a better feel for what are supposedly movements by the subject in the scanner. In Figure e, I generalize that to the entire series of timepoints to get a feel for where variance is most present. We notice it to be at the edges of the brain, which seems to correlate with those same brain movements (Note: I did not account for the fact that the data is still in its original distribution at that point, which could have biased my observations).
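The two quantities described above, a difference between two timepoints and a per-voxel variance across the whole series, can be sketched on a 4D array as follows. This is a minimal example on synthetic data, assuming a (x, y, z, time) layout; the variable names are illustrative and not taken from the project code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 4D volume (x, y, z, time) with a roughly constant baseline.
data = rng.normal(loc=500.0, scale=10.0, size=(6, 6, 6, 40))

# Make one outer slab fluctuate much more, as a stand-in for an
# "edge of the brain" region affected by motion.
data[0] += rng.normal(scale=100.0, size=(6, 6, 40))

# Difference between two consecutive timepoints: one 3D volume.
frame_diff = data[..., 1] - data[..., 0]

# Variance across the whole time series: one value per voxel.
variance_map = data.var(axis=-1)

print(frame_diff.shape, variance_map.shape)
print("noisy slab variance:", variance_map[0].mean())
print("rest of the volume: ", variance_map[1:].mean())
```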

Next, one important element that I wanted to understand better is the set of transformations applied on top of the raw fMRI data. Indeed, AI practitioners are often wary of transformations being applied to data prior to feeding it to AI models, as those transformations can greatly impact the model’s ability to learn something valuable. I thus took a closer look at the application of confounds. I first extracted the data that gets added to the signal as a result of applying those very confounds and visualized it in 3D (Figure f & g). It very much looks like noise, though with a bit of a pattern of lower values around the edges of the brain.
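One common way to "apply" confounds is to regress them out of each voxel's time series with ordinary least squares and keep the residual. The sketch below shows that idea on synthetic data, together with the difference between the cleaned and the original signal. It is an assumption that this matches the processing behind the figures; real pipelines typically add steps such as detrending or filtering. `regress_out` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
n_timepoints, n_voxels, n_confounds = 100, 20, 3

# Synthetic confounds (e.g. motion parameters) and a synthetic signal
# that contains a linear mixture of them.
confounds = rng.normal(size=(n_timepoints, n_confounds))
underlying = rng.normal(size=(n_timepoints, n_voxels))
mixing = rng.normal(size=(n_confounds, n_voxels))
signal = underlying + confounds @ mixing


def regress_out(signal, confounds):
    """Remove the least-squares fit of the confounds (plus intercept)."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return signal - design @ beta


cleaned = regress_out(signal, confounds)

# What applying the confounds changed, per timepoint and voxel.
added = cleaned - signal

# After regression the residual is orthogonal to every confound.
print(np.abs(confounds.T @ cleaned).max())
```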

I then moved on to taking a closer look at the 3D difference between the original data (Figure a), the data with confounds applied (Figure h) and the data with confounds and a threshold applied, as demonstrated in the course (Figure i). Visibly, applying the confounds looks like it simply gets rid of the data where no brain data is ever collected, while the thresholding seen in class is more aggressive and appears to chunk out a lot of the data (some seemingly relevant? not sure, too new at this hehe).

I then went back to observing variance across time as done in Figure e, but this time with the confounds and threshold applied (Figure j). In that visualization, the largest variance no longer occurs only at the edge of the brain but also where the white matter is supposedly located, which seems to indicate that the head movement was somewhat corrected. (Also note, by dragging the cursor, that the variance is much lower overall: 2 vs 3 orders of magnitude.)

At that point I wanted to quickly go back to connectivity matrices to see the effects of the confounds (Figure k: applied, Figure l: not applied). While applying the confounds does not seem to change much when visualizing in 3D, it does appear very important when converting to atlases and computing connectivity matrices.
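To see why confounds can matter so much for connectivity matrices, here is a small synthetic example: five parcels with independent signals all pick up the same nuisance time course, which makes them look strongly correlated until that nuisance is regressed out. The data, the strength of the contamination and the helper names (`connectivity`, `mean_offdiag`) are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_timepoints, n_parcels = 200, 5

# Independent parcel signals plus one shared nuisance time course.
independent = rng.normal(size=(n_timepoints, n_parcels))
shared_confound = rng.normal(size=(n_timepoints, 1))
contaminated = independent + 3.0 * shared_confound


def connectivity(time_series):
    """Pearson correlation between parcels; input is (time, parcels)."""
    return np.corrcoef(time_series, rowvar=False)


def mean_offdiag(matrix):
    """Mean absolute off-diagonal value of a square matrix."""
    mask = ~np.eye(len(matrix), dtype=bool)
    return np.abs(matrix[mask]).mean()


# Regress the shared confound (plus intercept) out of every parcel.
design = np.column_stack([np.ones(n_timepoints), shared_confound])
beta, *_ = np.linalg.lstsq(design, contaminated, rcond=None)
cleaned = contaminated - design @ beta

print("without confound regression:", mean_offdiag(connectivity(contaminated)))
print("with confound regression:   ", mean_offdiag(connectivity(cleaned)))
```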

From then on, I somewhat concluded that the intuition that I was building in the original 3D space was not fully transferable to the parcel space.

I thus moved on to the Algonauts data and started with taking a look at their 3D -> parcel space transformation, orchestrated through the Schaefer 2018 Atlas [3] (Figure m).
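Conceptually, a 3D -> parcel transformation with a label atlas averages all voxels sharing a label, which yields one time series per parcel. The sketch below does this with a toy atlas of three parcels. It is a simplified assumption of what such a transformation does, not the Algonauts preprocessing itself, and `to_parcel_space` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 4D data (x, y, z, time) and a toy label atlas of the same
# spatial shape: 0 is background, 1 to 3 are parcels.
data = rng.normal(size=(4, 4, 4, 30))
atlas = np.arange(64).reshape(4, 4, 4) % 4


def to_parcel_space(data, atlas):
    """Average the voxels of each non-zero label: returns (time, parcels)."""
    labels = np.unique(atlas)
    labels = labels[labels != 0]  # drop the background label
    series = np.stack(
        [data[atlas == label].mean(axis=0) for label in labels], axis=1
    )
    return labels, series


labels, parcel_series = to_parcel_space(data, atlas)
print(labels, parcel_series.shape)
```

With a real atlas the same idea applies, only with many more parcels and with the atlas resampled to the resolution of the functional data.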

I then went on to selecting a given subject, picking a given scene (Figure n) and taking a look at the resulting fMRI data projected back into 3D (Figure o & p). This looks completely different from what I had been working on so far. The distribution of the data looks amenable to training models (Figure p & q) and the different brain regions have clear differences in activation values. A correlation matrix computed on that series of data (Figure q) seals the deal with clear positive and negative correlations between brain regions, which is to be expected.

On my end, a simple question remains unanswered: how did we go from such a noisy signal to such a clear-cut signal? It seems like decades of work from the scientific community have gone into those transformations, and I look forward to learning more about them as I get more familiar with fMRI brain data :)

Data Used

  • Nilearn’s brain development dataset [1]
  • Algonauts 2025 dataset [2]

Tools Used

Development

  • Visual Studio Code
  • Python
  • Jupyter notebooks
  • Git/GitHub

Containerization

  • Docker
  • Dev Containers
  • Docker Hub

Plotting

  • Matplotlib
  • Bokeh
  • Plotly

Neuroimaging

  • Nilearn
  • Nibabel

References

[1] Hilary Richardson, Grace Lisandrelli, Alexa Riobueno-Naylor, and Rebecca Saxe. Development of the social brain from age three to twelve years. Nature Communications, 9(1):1–12, 2018.

[2] Gifford, Alessandro T., et al. “The Algonauts Project 2025 Challenge: How the Human Brain Makes Sense of Multimodal Movies.” arXiv preprint arXiv:2501.00504 (2024).

[3] Alexander Schaefer, Ru Kong, Evan M Gordon, Timothy O Laumann, Xi-Nian Zuo, Avram J Holmes, Simon B Eickhoff, and B T Thomas Yeo. Local-Global Parcellation of the Human Cerebral Cortex from Intrinsic Functional Connectivity MRI. Cerebral Cortex, 28(9):3095–3114, July 2017.
