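Paper link: pdf (openreview.net)
The English here was typed entirely by hand! It is a summary and paraphrase of the original paper. Spelling and grammar mistakes are hard to avoid, so if you spot any, please point them out in the comments! This post leans toward personal notes, so read it with caution.
Contents
1. TL;DR
1.1. Thoughts
1.2. Paper summary figure
2. Section-by-section close reading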
2.1. Abstract
2.2. Introduction
2.3. Related work
2.4. Methods
2.4.1. Datasets and preprocessing
2.4.2. Model architecture & training procedure
2.4.3. Clinical variable prediction
2.5. Results
2.5.1. Model generalization
2.5.2. Prediction of clinical variables
2.5.3. Prediction of future brain states
2.5.4. Interpretability via attention analysis
2.5.5. Functional network prediction
2.6. Discussion
3. Reference
1. TL;DR
1.1. Thoughts
(1) What a simple model...
1.2. Paper summary figure
2. Section-by-section close reading
2.1. Abstract
①Model name: Brain Language Model (BrainLM)
②Recording: 6700 hours fMRI
③Supervision method: self-supervised
④⭐Task: extracting functional connectivity (FC) without network-based supervision
2.2. Introduction
①⭐Previous work focuses on specific, narrow tasks
②Plight: large amounts of fMRI data are unlabeled
③Method of BrainLM: Transformer-based
④Ability of BrainLM: prediction of future brain states, decoding cognitive variables, and discovery of functional networks
⑤Overview of BrainLM:
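With 77,298 samples and 6,700 hours of recordings, they pretrained BrainLM by spatiotemporal masking and reconstruction
myriad n. a countless or extremely large number; (mostly in classical or historical drama) ten thousand adj. countless, innumerable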
2.3. Related work
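The authors argue that other work either focuses too much on a specific task or uses too small a sample. As for large language models, other work mainly looks for representational similarity with brain recordings (I don't know why one would look for representational similarity; this is not my field).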
2.4. Methods
2.4.1. Datasets and preprocessing
①Datasets: the UK Biobank (UKB) with 76,296 rs-fMRI recordings and the Human Connectome Project (HCP) with 1,002 fMRI recordings
②80% of the UKB data is used for training. The remaining 20% of the UKB data and all of the HCP data are used for testing.
③Preprocessing: standard
④Atlas: AAL-424
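The split in ② can be sketched as below. This is only a minimal illustration: the helper `split_indices`, the seed, and the choice to shuffle at the recording level are my own assumptions, not details taken from the paper.

```python
import numpy as np

def split_indices(n_recordings, train_fraction=0.8, seed=0):
    """Shuffle recording indices and split them into train and test parts.

    Hypothetical helper: the paper's exact splitting procedure is not
    described in these notes.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_recordings)
    n_train = int(round(train_fraction * n_recordings))
    return order[:n_train], order[n_train:]

N_UKB = 76_296  # UKB recordings, as listed above
N_HCP = 1_002   # HCP recordings, as listed above

ukb_train, ukb_test = split_indices(N_UKB)
hcp_test = np.arange(N_HCP)  # every HCP recording is held out for testing

print(len(ukb_train), len(ukb_test), len(hcp_test))
```

Keeping HCP entirely out of training is what lets it serve as an out-of-cohort check later on.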
2.4.2. Model architecture & training procedure
①Task: predict the original signal of masked patches
②BrainLM:
③Training: randomly select 200 time points from each fMRI recording and divide them into 10 sections of 20 time points each. Each section is converted to a 512-dimensional vector, and the sections are masked at a ratio of 20%, 75%, or 90% (my guess is that 20%, 75%, or 90% of the N*10 sections are masked at random)
④ROI order: ROIs are reordered according to their actual y-axis coordinate in the brain
⑤Model framework: an encoder with 4 self-attention layers and 4 heads that processes the unmasked sections, and a 2-layer Transformer decoder that predicts both masked and unmasked vectors
⑥Batch: 512
⑦Optimizer: Adam
⑧Epoch: 100
⑨Goal: minimizing the MSE between the original signal and the reconstructed signal (only the masked parts are compared)
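To make the patching, masking, and loss steps above concrete, here is a rough NumPy sketch. Several things are my assumptions rather than facts from the paper: the 200 time points are taken as one contiguous window, the 512-dimensional projection is a random matrix standing in for a learned layer, the recording length of 490 is made up, and all function names are hypothetical. No Transformer is included; a zero array stands in for the model output.

```python
import numpy as np

N_ROIS = 424                      # parcels in the atlas
WINDOW = 200                      # time points sampled per recording
PATCH_LEN = 20                    # time points per section (patch)
N_PATCHES = WINDOW // PATCH_LEN   # 10 sections per ROI
EMBED_DIM = 512                   # dimension of each section vector

def sample_window(recording, rng):
    """Pick a random contiguous window (assumption) from (n_rois, n_timepoints)."""
    start = rng.integers(0, recording.shape[1] - WINDOW + 1)
    return recording[:, start:start + WINDOW]

def to_patches(window):
    """Reshape (n_rois, WINDOW) into (n_rois * N_PATCHES, PATCH_LEN), ROI-major."""
    return window.reshape(window.shape[0] * N_PATCHES, PATCH_LEN)

def random_mask(n_tokens, mask_ratio, rng):
    """Boolean mask with round(mask_ratio * n_tokens) randomly chosen True entries."""
    n_masked = int(round(mask_ratio * n_tokens))
    mask = np.zeros(n_tokens, dtype=bool)
    mask[rng.choice(n_tokens, size=n_masked, replace=False)] = True
    return mask

def masked_mse(original, reconstructed, mask):
    """Mean squared error computed over the masked sections only."""
    diff = original[mask] - reconstructed[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
recording = rng.standard_normal((N_ROIS, 490))   # synthetic stand-in for one scan
patches = to_patches(sample_window(recording, rng))

# Random projection as a placeholder for the learned 20 -> 512 embedding.
projection = rng.standard_normal((PATCH_LEN, EMBED_DIM)) * 0.02
tokens = patches @ projection

mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)
reconstruction = np.zeros_like(patches)          # placeholder for model output
loss = masked_mse(patches, reconstruction, mask)
print(patches.shape, tokens.shape, int(mask.sum()), loss)
```

With 424 ROIs and 10 sections each, one sample yields 4,240 tokens, so a 75% ratio masks 3,180 of them under my reading of ③.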
2.4.3. Clinical variable prediction
①Enhancement of prediction: adding a 3-layer MLP head to the encoder
②Regression task: age, neuroticism, PTSD, and anxiety disorder scores
③Approach:
age | Z-score normalization |
neuroticism | min-max scaling to [0, 1] |
PTSD (PCL-5) and anxiety disorder (GAD-7) scores | log transformation, to account for their exponential distribution |
④Dropout rate: 10% for encoder and MLP head
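The three target normalizations listed in ③ could look roughly like this. The sample values are made up, and using `log1p` (that is, log(1 + x)) so that zero scores stay finite is my assumption; the notes do not say which exact log transform the authors used.

```python
import numpy as np

def zscore(x):
    """Subtract the mean and divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def min_max(x):
    """Rescale linearly so the smallest value is 0 and the largest is 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def log_transform(x):
    """log(1 + x); the +1 offset is an assumption to handle scores of 0."""
    return np.log1p(np.asarray(x, dtype=float))

# Made-up example values, purely for illustration.
age = np.array([45.0, 52.0, 61.0, 70.0])
neuroticism = np.array([0.0, 3.0, 6.0, 12.0])
pcl5 = np.array([0.0, 2.0, 10.0, 40.0])

age_target = zscore(age)
neuroticism_target = min_max(neuroticism)
pcl5_target = log_transform(pcl5)
print(age_target, neuroticism_target, pcl5_target)
```

In practice the statistics (mean, std, min, max) would presumably be computed on the training split only and reused for the test split.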
2.5. Results
2.5.1. Model generalization
①Reconstruction performance on UKB and HCP. The red lines denote the predicted results and the black points are the real recordings:
(HCP is used to demonstrate generalization ability)
2.5.2. Prediction of clinical variables
①Reconstruction performance:
②Latent encoding learning:
③Performance table:
delve vi. to research; to probe; to dig vt. to research; to probe; to dig n. a pit; a hole
2.5.3. Prediction of future brain states
①They used 180 time steps for training and the following 20 for testing
②MSE on each time step:
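A per-time-step MSE like the one in ② can be computed as in the sketch below. The data is synthetic, and the "repeat the last observed value" predictor is only a naive baseline I picked for illustration; it is not BrainLM and not necessarily a baseline from the paper.

```python
import numpy as np

CONTEXT_STEPS = 180   # observed time steps
FUTURE_STEPS = 20     # time steps to predict

def per_step_mse(true_future, predicted_future):
    """MSE at each future step, averaged over samples and ROIs.

    Both inputs have shape (n_samples, n_rois, n_steps); output is (n_steps,).
    """
    err = (np.asarray(true_future) - np.asarray(predicted_future)) ** 2
    return err.mean(axis=(0, 1))

rng = np.random.default_rng(0)
series = rng.standard_normal((8, 424, CONTEXT_STEPS + FUTURE_STEPS))  # synthetic
context = series[..., :CONTEXT_STEPS]
future = series[..., CONTEXT_STEPS:]

# Naive baseline (assumption): carry the last observed value forward.
baseline = np.repeat(context[..., -1:], FUTURE_STEPS, axis=-1)
mse = per_step_mse(future, baseline)
print(mse.shape)
```

Plotting `mse` against the step index gives the kind of curve the figure shows, one value per predicted time step.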
2.5.4. Interpretability via attention analysis
①Mean attention score on each ROI:
glean vt. to gather (information); to pick up (fallen ears of grain) vi. to gather; to pick up fallen grain
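One way a mean attention score per ROI might be computed is sketched here. I am assuming that the attention of a single summary (CLS-style) token over all section tokens is available with shape (heads, ROIs x sections), that tokens are ordered ROI-major, and that no masking is applied during this analysis. None of this is confirmed by the notes above, and `mean_attention_per_roi` is a hypothetical name.

```python
import numpy as np

def mean_attention_per_roi(cls_attention, n_rois, n_patches):
    """Average attention over heads, then over the sections of each ROI.

    cls_attention: (n_heads, n_rois * n_patches), tokens ordered ROI-major.
    Returns an array of shape (n_rois,).
    """
    per_token = np.asarray(cls_attention, dtype=float).mean(axis=0)
    return per_token.reshape(n_rois, n_patches).mean(axis=1)

rng = np.random.default_rng(0)
n_heads, n_rois, n_patches = 4, 424, 10

# Synthetic attention: each head's weights are positive and sum to 1.
logits = rng.standard_normal((n_heads, n_rois * n_patches))
attention = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

roi_scores = mean_attention_per_roi(attention, n_rois, n_patches)
top_rois = np.argsort(roi_scores)[::-1][:5]   # indices of the 5 highest-scoring ROIs
print(roi_scores.shape, top_rois)
```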
2.5.5. Functional network prediction
①7 subnetworks: visual, somatomotor, dorsal attention, ventral attention, limbic, frontoparietal, and default mode networks
②Region segmentation comparison table:
2.6. Discussion
①Predicting masked distribution
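②Predicting mental disorders (?) (are the decoded data fed into a convolution? or does the result come out directly?)
③Recognizing FC (where? it feels more like brain region segmentation to me)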
3. Reference
Caro, J. O. et al. (2024) 'BrainLM: A foundation model for brain activity recordings', ICLR.