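[Paper Close Reading] BrainLM: A foundation model for brain activity recordings

Paper link: pdf (openreview.net)

The English here is typed entirely by hand! It is a summary and paraphrase of the original paper. Spelling and grammar mistakes may be unavoidable; if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so read with caution.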

1. TL;DR

1.1. Takeaways

1.2. Paper summary figure

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Related work

2.4. Methods

2.4.1. Datasets and preprocessing

2.4.2. Model architecture & training procedure

2.4.3. Clinical variable prediction

2.5. Results

2.5.1. Model generalization

2.5.2. Prediction of clinical variables

2.5.3. Prediction of future brain states

2.5.4. Interpretability via attention analysis

2.5.5. Functional network prediction

2.6. Discussion

3. Reference

1. TL;DR

1.1. Takeaways

(1) What a simple model...

1.2. Paper summary figure

2. Section-by-section close reading

2.1. Abstract

①Model name: Brain Language Model (BrainLM)

②Recording: 6,700 hours of fMRI

③Supervision method: self-supervised

④⭐Task: extracting functional connectivity (FC) without any network-based supervision

2.2. Introduction

①⭐Previous work focuses on specific, narrow tasks

②Plight: a large amount of fMRI data is unlabeled

③Method of BrainLM: Transformer-based

④Ability of BrainLM: prediction of future brain states, decoding cognitive variables, and discovery of functional networks

⑤Overview of BrainLM:

With 77,298 samples totaling 6,700 hours, they pretrained BrainLM via spatiotemporal masking and reconstruction

myriad n. a countless or very large number; (mostly in classical historical plays) ten thousand. adj. countless, very many

2.3. Related work

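The authors argue that other work either focuses too narrowly on a specific task or uses too small a sample. As for large language models, other work mainly looks for representational similarity with brain recordings (I don't know why one would look for representational similarity; this is not my field).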

2.4. Methods

2.4.1. Datasets and preprocessing

①Datasets: the UK Biobank (UKB) with 76,296 rs-fMRI recordings and the Human Connectome Project (HCP) with 1,002 fMRI recordings

②80% of the UKB data is used for training. The remaining 20% of the UKB data and all of the HCP data are used for testing.

③Preprocessing: standard

④Atlas: AAL-424
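The split described above can be sketched in a few lines. This is only an illustration: the recording IDs are hypothetical placeholders, and splitting by recording with a fixed random seed is my assumption (these notes do not record whether the paper splits by recording or by subject).

```python
import random

def split_recordings(recording_ids, train_fraction=0.8, seed=0):
    """Shuffle recording IDs and split them into train and held-out lists."""
    ids = list(recording_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical IDs standing in for the 76,296 UKB recordings.
ukb_ids = [f"ukb_{i}" for i in range(76296)]
# Hypothetical IDs standing in for the 1,002 HCP recordings (test only).
hcp_ids = [f"hcp_{i}" for i in range(1002)]

train_ids, ukb_test_ids = split_recordings(ukb_ids)
# The test set is the held-out UKB part plus the whole HCP cohort.
test_ids = ukb_test_ids + hcp_ids
print(len(train_ids), len(ukb_test_ids), len(test_ids))
```

Keeping HCP entirely out of training is what makes it an external cohort for checking generalization.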

2.4.2. Model architecture & training procedure

①Task: predict the original signal of masked patches

②BrainLM:

③Training: randomly select 200 time points from each fMRI recording and divide them into 10 sections of 20 time points each. Each section is converted to a 512-dimensional vector, and the sections are masked at a ratio of 20%, 75%, or 90% (my guess is that 20%, 75%, or 90% of the N*10 sections are masked at random)

④ROI order: the ROIs are reordered by their actual Y coordinate in the brain

⑤Model framework: 4 self-attention layers with 4 heads that process the unmasked data, and a 2-layer Transformer decoder that predicts both masked and unmasked vectors

⑥Batch: 512

⑦Optimizer: Adam

⑧Epoch: 100

⑨Goal: minimizing the MSE between the original signal and the reconstructed signal (computed on the masked part only)
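To make the patching, masking, and loss steps above concrete, here is a minimal sketch using only the Python standard library. Several things are my assumptions rather than details from the paper: the 200 time points are taken as one contiguous window, masking is applied at random over all N*10 sections (as guessed above), the recording is synthetic Gaussian noise with an arbitrary length of 490 time points, and the "reconstruction" is an all-zero placeholder instead of a real model. The 512-dimensional projection and the Transformer itself are left out.

```python
import random

N_ROIS = 424     # parcels in the atlas
WINDOW = 200     # time points sampled per recording
PATCH_LEN = 20   # time points per section (patch)

def sample_window(recording, rng, window=WINDOW):
    """Pick one random contiguous window of `window` time points for all ROIs."""
    n_time = len(recording[0])
    start = rng.randrange(n_time - window + 1)
    return [roi[start:start + window] for roi in recording]

def to_patches(window_data, patch_len=PATCH_LEN):
    """Cut every ROI time series into consecutive patches; returns a flat list."""
    patches = []
    for roi in window_data:
        for t in range(0, len(roi), patch_len):
            patches.append(roi[t:t + patch_len])
    return patches

def random_mask(n_patches, mask_ratio, rng):
    """Return a list of booleans, True where the patch is masked."""
    n_masked = round(n_patches * mask_ratio)
    masked_idx = set(rng.sample(range(n_patches), n_masked))
    return [i in masked_idx for i in range(n_patches)]

def masked_mse(original, reconstructed, mask):
    """Mean squared error computed over the masked patches only."""
    total, count = 0.0, 0
    for orig, recon, is_masked in zip(original, reconstructed, mask):
        if is_masked:
            total += sum((o - r) ** 2 for o, r in zip(orig, recon))
            count += len(orig)
    return total / count

rng = random.Random(0)
# Synthetic placeholder recording: 424 ROIs x 490 time points of noise.
recording = [[rng.gauss(0.0, 1.0) for _ in range(490)] for _ in range(N_ROIS)]
patches = to_patches(sample_window(recording, rng))
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)
# Placeholder "reconstruction": all zeros, not the output of a trained model.
zeros = [[0.0] * PATCH_LEN for _ in patches]
print(len(patches), sum(mask), masked_mse(patches, zeros, mask))
```

With 424 ROIs and 10 sections each, one sample becomes 4,240 tokens, and at a 75% ratio only a quarter of them are passed through the self-attention layers. A real pipeline would use array libraries such as NumPy or PyTorch; plain lists are used here only to keep the sketch dependency-free.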

2.4.3. Clinical variable prediction

①Enhancement for prediction: adding a 3-layer MLP head to the encoder

②Regression task: age, neuroticism, PTSD, and anxiety disorder scores

③Approach:

Variable	Normalization
age	Z-score normalization
neuroticism	min-max scaling to [0, 1]
PTSD (PCL-5) and anxiety disorder (GAD-7) scores	log transformation, because these scores are exponentially distributed

④Dropout rate: 10% for encoder and MLP head
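The three target normalizations above can be sketched as follows. The sample values are made up for illustration, and using log(1 + x) for the log transformation is my assumption so that a score of 0 stays finite; these notes do not record the exact form used in the paper.

```python
import math
import statistics

def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """log(1 + x); the +1 is an assumption so that zero scores stay finite."""
    return [math.log1p(v) for v in values]

# Made-up example values, not data from the paper.
ages = [45, 52, 60, 67, 71]
neuroticism = [0, 3, 6, 9, 12]
gad7 = [0, 1, 2, 7, 21]

age_z = z_score(ages)
neuro_scaled = min_max(neuroticism)
gad7_log = log_transform(gad7)
print(age_z, neuro_scaled, gad7_log)
```

The log transformation compresses the long right tail of the PCL-5 and GAD-7 scores, so that a few very high scores do not dominate the regression loss.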