Boosting Brain-to-Image Decoding with TRIBE v2 Data Augmentation

Key Takeaways

  • Brain decoding is limited by the availability of labeled neural data, and remains challenging in low-data regimes.
  • To address this issue, we investigate whether and when brain decoding can be boosted by augmenting small fMRI datasets with synthetic data generated by a pretrained model of fMRI responses to stimuli.
  • We use TRIBE v2, a large encoding model pretrained on more than 1000 hours of fMRI responses to video, audio and language.
  • For each dataset, we evaluate systematic grids that show how the performance of image decoders varies with the amount of synthetic data used for training.
Paper Abstract

Brain decoding is limited by the availability of labeled neural data, and remains challenging in low-data regimes. To address this issue, we investigate whether and when brain decoding can be boosted by augmenting small fMRI datasets with synthetic data generated by a pretrained model of fMRI responses to stimuli. We use TRIBE v2, a large encoding model pretrained on more than 1000 hours of fMRI responses to video, audio and language. For each dataset, we evaluate systematic grids that show how the performance of image decoders varies with the amount of synthetic data used for training. Our results, based on two datasets (the 7T fMRI Natural Scenes Dataset and 3T fMRI BOLD5000), show up to 68% improvement in Top-10 image-retrieval accuracy compared to decoders trained only on real data. Importantly, the proportion of augmented data required to reach a given image decoding performance needs to be adjusted depending on the data source. Surprisingly, image decoders trained exclusively on synthetic fMRI can perform above chance in some settings, suggesting that TRIBE v2 can support zero-shot brain-to-image decoding. Together, these results show how large-scale models of the fMRI responses to sight, sound and language may provide a foundation to improve the data efficiency for image decoding.

Boosting Brain-to-Image Decoding with TRIBE v2 Data Augmentation
Brain-to-image decoding, the reconstruction or identification of images from a person’s brain activity, is limited by the cost and time needed to collect large amounts of neural data. Modern decoders require thousands of stimulus-response pairs, which puts them out of reach for most research laboratories. This paper investigates whether synthetic data can ease that bottleneck. Using TRIBE v2, a large encoding model pretrained on more than 1,000 hours of fMRI responses to video, audio, and language, the researchers generate synthetic brain responses to new images and use them to supplement limited real datasets.

How the Approach Works

The researchers use TRIBE v2 as a "synthetic fMRI generator." Although TRIBE v2 was not originally trained on static images, the team adapted it by treating each image as a short, static video. They then built an "operating grid" that tests different mixtures of real and synthetic data. For each cell of the grid, they kept a fraction of the real fMRI data, added a set amount of synthetic data generated by TRIBE v2, and trained a decoder to map brain activity to image representations. The decoder thus learns from a much larger pool of stimuli than was recorded in the scanner.
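The grid of real/synthetic mixtures described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_training_set`, the parameters `real_fraction` and `synth_per_real`, the grid values, and the placeholder data are all assumptions made for the example, and the summary does not state the grid values actually used in the paper.

```python
import random
from itertools import product


def build_training_set(real_pairs, synthetic_pairs, real_fraction,
                       synth_per_real, seed=0):
    """Mix a subsample of real pairs with synthetic pairs (hypothetical helper).

    real_fraction: share of the real (fMRI, image) pairs to keep.
    synth_per_real: number of synthetic pairs added per kept real pair.
    """
    rng = random.Random(seed)
    n_real = max(1, round(real_fraction * len(real_pairs)))
    real_subset = rng.sample(real_pairs, n_real)
    # Never request more synthetic pairs than were generated.
    n_synth = min(len(synthetic_pairs), round(synth_per_real * n_real))
    synth_subset = rng.sample(synthetic_pairs, n_synth)
    return real_subset + synth_subset


# Placeholder data: tuples stand in for (fMRI response, image embedding) pairs.
real_pairs = [("real", i) for i in range(100)]
synthetic_pairs = [("synthetic", i) for i in range(1000)]

# Illustrative grid axes (assumed values, not taken from the paper).
real_fractions = (0.1, 0.5, 1.0)
synth_ratios = (0, 1, 4)

grid_sizes = {}
for real_fraction, synth_per_real in product(real_fractions, synth_ratios):
    train = build_training_set(real_pairs, synthetic_pairs,
                               real_fraction, synth_per_real)
    # A real experiment would fit a decoder on `train` here and score it
    # on held-out real fMRI; this sketch only records the set size.
    grid_sizes[(real_fraction, synth_per_real)] = len(train)

for cell, size in sorted(grid_sizes.items()):
    print(cell, size)
```

Each grid cell fixes how much real data is kept and how much synthetic data is added, so decoder performance can be compared across cells that share the same amount of real scanning.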

Key Results

The study tested the method on two fMRI datasets: the 7T Natural Scenes Dataset (NSD) and the 3T BOLD5000 dataset. Adding synthetic data improves performance in low-to-medium data regimes, with up to a 68% improvement in Top-10 image-retrieval accuracy over decoders trained only on real data. In some cases, the approach reached 90% of the performance of a full-data model with only a fraction of the real scanning time, potentially saving hours of expensive fMRI data collection per subject.
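To make the metric concrete, here is a minimal sketch of Top-k image-retrieval accuracy and of a relative improvement between two decoders. It assumes retrieval by cosine similarity between decoded and candidate image embeddings, and it assumes the reported 68% is a relative gain; the summary specifies neither. The helper names and all numbers are hypothetical and are not results from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_accuracy(predicted, candidates, k=10):
    """Fraction of trials whose true image ranks among the k most similar.

    predicted[i] is the embedding decoded from brain activity on trial i,
    and candidates[i] is the embedding of the image actually shown.
    """
    hits = 0
    for i, pred in enumerate(predicted):
        sims = [cosine(pred, cand) for cand in candidates]
        # Rank = number of candidates strictly more similar than the target
        # (ties are resolved in favor of the target).
        rank = sum(1 for s in sims if s > sims[i])
        if rank < k:
            hits += 1
    return hits / len(predicted)


def relative_improvement(baseline, augmented):
    """Relative gain of the augmented score over the baseline score."""
    return (augmented - baseline) / baseline


# Toy example with three candidate images and three decoded embeddings.
candidates = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
predicted = [[0.9, 0.1, 0.0], [0.2, 0.1, 0.9], [0.0, 0.1, 0.8]]
print(top_k_accuracy(predicted, candidates, k=1))
print(top_k_accuracy(predicted, candidates, k=3))

# Hypothetical accuracies chosen only to illustrate the arithmetic.
print(relative_improvement(0.25, 0.42))
```

Under the relative reading, a 68% improvement means the augmented decoder's accuracy is 1.68 times the real-only accuracy, not 68 percentage points higher.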

Zero-Shot Potential and Limitations

A surprising finding is that, in some settings, decoders trained exclusively on synthetic fMRI data performed above chance. This suggests that TRIBE v2 can support "zero-shot" brain-to-image decoding: a decoder can be trained without any real recordings, using only the brain-like responses that TRIBE v2 predicts for the training images.
However, the authors emphasize that augmentation is not a "plug-and-play" solution. The benefits depend on the dataset and on the type of decoder used, and the proportion of synthetic data needed to reach a given performance must be adjusted to the data source. The gains also saturate: adding too much synthetic data can stop helping or even hurt the decoder. Careful calibration of the ratio between real and synthetic data is therefore essential.
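One simple way to calibrate the ratio, given that gains saturate, is to pick the smallest amount of synthetic data whose validation score is close to the best one observed. The sketch below is an assumed selection rule for illustration, not the procedure used in the paper; the function name `pick_synthetic_ratio`, the `tolerance` parameter, and the scores are hypothetical.

```python
def pick_synthetic_ratio(val_scores, tolerance=0.01):
    """Return the smallest synthetic ratio within `tolerance` of the best score.

    val_scores maps a synthetic-to-real ratio to a validation accuracy
    measured on held-out real fMRI.
    """
    best = max(val_scores.values())
    return min(ratio for ratio, score in val_scores.items()
               if score >= best - tolerance)


# Hypothetical validation accuracies: gains flatten, then decline.
val_scores = {0: 0.20, 1: 0.27, 2: 0.30, 4: 0.305, 8: 0.28}
chosen = pick_synthetic_ratio(val_scores)
print(chosen)
```

Because the useful ratio depends on the dataset, a rule like this would need to be rerun for each data source rather than reusing a ratio tuned elsewhere.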
