Title
Deep learning-based phenotyping reclassifies combined hepatocellular-cholangiocarcinoma
01 Literature Digest: Introduction
Primary liver cancer arises from either hepatocytic or biliary lineage cells, giving rise to hepatocellular carcinoma (HCC) or intrahepatic cholangiocarcinoma (ICCA). Combined hepatocellular-cholangiocarcinomas (cHCC-CCA) exhibit equivocal or mixed features of both, causing diagnostic uncertainty and difficulty in determining proper management. Here, we perform a comprehensive deep learning-based phenotyping of multiple cohorts of patients. We show that deep learning can reproduce the diagnosis of HCC vs. CCA with high performance. We analyze a series of 405 cHCC-CCA patients and demonstrate that the model can reclassify the tumors as HCC or ICCA, and that the predictions are consistent with clinical outcomes, genetic alterations and in situ spatial gene expression profiling. This type of approach could improve treatment decisions and ultimately clinical outcomes for patients with rare and biphenotypic cancers such as cHCC-CCA.
Results
AI model performance in differentiating HCC and ICCA
To investigate whether an AI model can reclassify cHCC-CCA tumors into “pure” HCC or ICCA categories, we trained an AI pipeline based on a self-supervised feature extractor [11] with an attention-MIL aggregation model [12–14] (Fig. 1B) to distinguish pure HCCs (785 WSIs from n = 424 patients) from pure ICCAs (239 WSIs from n = 167 patients) (Methods, Supplemental Tables 1 and 2). In this cohort (“Discovery cohort”, Fig. 1C), the model achieved a cross-validated area under the receiver operating characteristic curve (AUROC) of 0.99 [± 0.01], corresponding to almost perfect separability of the classes (Fig. 2A), with a sensitivity of 97.9% and a specificity of 97.6%. As further evidence for the plausibility of the model’s predictions, we subsequently evaluated the model on another patient cohort, the publicly available TCGA cohort, which was composed of n = 333 HCCs (TCGA-LIHC) and n = 27 ICCAs (TCGA-CHOL). The labels of the TCGA cohort were not seen by the model during training; however, the pipeline was exposed to some TCGA image data during self-supervised pretraining, which might affect this intermediary result but not the subsequent results. We found that the model reached an AUROC of 0.94 [± 0.05], representing very good generalizability to this additional dataset (Fig. 2B). Next, we asked which tissue structures the model used to make its prediction and found that the model paid high attention to areas with an ICCA-like phenotype (glandular structures and fibrous stroma) (Supplemental Fig. 1). Together, these data show that the AI model can robustly distinguish pure HCC from pure ICCA tumors (Fig. 2C). We used this model as the starting point for our subsequent experiments.
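The attention-MIL aggregation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the common attention-pooling form in which each tile receives a score `w · tanh(V h)`, the scores are normalized with a softmax, and the slide embedding is the attention-weighted sum of tile features. All parameter names, shapes, and the random inputs are hypothetical, and the weights are untrained.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def attention_mil_forward(features, V, w, W_cls, b_cls):
    """Aggregate a bag of tile features into one slide-level prediction.

    features: (n_tiles, d) vectors from a frozen feature extractor (M1).
    V: (d, h) and w: (h,) are attention parameters (assumed form).
    W_cls: (d, 2) and b_cls: (2,) form a linear two-class head.
    Returns (class_probs, attention); attention sums to 1 over tiles.
    """
    scores = np.tanh(features @ V) @ w       # one unnormalized score per tile
    attention = softmax(scores)              # shape (n_tiles,)
    slide_embedding = attention @ features   # attention-weighted sum, shape (d,)
    logits = slide_embedding @ W_cls + b_cls
    return softmax(logits), attention

# Hypothetical sizes and random, untrained parameters, for illustration only.
rng = np.random.default_rng(0)
n_tiles, d, h = 50, 16, 8
features = rng.normal(size=(n_tiles, d))
V = rng.normal(size=(d, h)) * 0.1
w = rng.normal(size=h)
W_cls = rng.normal(size=(d, 2)) * 0.1
b_cls = np.zeros(2)

probs, attention = attention_mil_forward(features, V, w, W_cls, b_cls)
print("class probabilities:", probs)
print("most attended tile index:", int(attention.argmax()))
```

Because the attention weights are defined per tile, they can be mapped back to tile coordinates to draw an attention heatmap over the slide.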
Method
This study reports a retrospective analysis of archival tissue samples of primary liver tumors collected at multiple centers. The protocol was approved by the review board of Université Paris Est Creteil, France (ID n° APHP22012), and the study was conducted in accordance with the Declaration of Helsinki and the legislation of each participating center. In this international multicentric cohort, informed consent was obtained from patients when required by local regulations.
Centers with informed written consent obtained: Hamburg, Barcelona, Mondor, Chinese University of Hong Kong, Beaujon, Paul Brousse.
Centers with waiver of consent after IRB approval: University of Texas Southwestern, Stanford, Aachen, Pitié-Salpêtrière, Michigan University, Chennai, Rouen, Saint Antoine, Lille, Angers, Milano, Amiens, Hong Kong, Poitiers, St Louis University, Seoul National University College of Medicine, Prince of Songkhla, Montpellier, Brest, Reims, Yale School of Medicine, Bachmai Hospital, Mayo Clinic Rochester, Regensburg.
Figure
Fig. 1 | Deep learning-based classification of HCC versus ICCA. A Clinical problem: combined HCC-CCA tumors are a diagnostic dilemma and only poor evidence is available to guide treatment in these patients. We hypothesize that an AI system can reclassify all cHCC-CCA cases into either category. B Technical approach: a two-step pipeline is used to transform image tiles into feature vectors by model 1 (M1), a pre-trained feature extraction model. A bag of feature vectors is subsequently aggregated into a call for a given patient by model 2 (M2), the aggregation model. C Experimental approach: a binary prediction model was trained to distinguish HCC from CCA and was evaluated via cross-validation and by external validation. Subsequently, the trained model was applied to a multicentric cohort of cHCC-CCA tumors and its predictions were comprehensively validated with a multimodal approach.
Fig. 2 | Development of a deep-learning model for HCC/ICCA classification. A Receiver operating characteristic (ROC) curve for the internal validation of binary classification of HCC and ICCA cases. B ROC curve for the external validation of the binary classification (HCC vs. ICCA) task on the TCGA dataset. The error band shows the 1000-fold bootstrapped 95% confidence interval. C H&E slides of two randomly selected cases for HCC and ICCA. The attention map of the model and the class prediction scores are used as explainability methods to check the capability of the trained model to detect the correct features within the WSI. The class prediction heatmap is weighted by the attention. Source data are provided as a Source Data file. This analysis was repeated independently five times with similar results.
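The bootstrapped confidence interval mentioned in the Fig. 2 caption can be sketched as follows. This is a minimal illustration, assuming a percentile bootstrap over patients and an AUROC computed from pairwise score comparisons; the paper's exact resampling procedure is not described in this excerpt, and the labels and scores below are made-up toy values.

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]   # every positive vs. every negative
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def bootstrap_auroc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)   # resample cases with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue                       # AUROC undefined with one class only
        stats.append(auroc(y_true[idx], y_score[idx]))
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

# Toy data: 1 = ICCA, 0 = HCC (the label encoding is an assumption).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = auroc(y_true, y_score)               # 15 of 16 pairs correct -> 0.9375
ci_low, ci_high = bootstrap_auroc_ci(y_true, y_score)
print(f"AUROC = {auc:.4f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With real data, one row per patient (rather than per slide) would be resampled so that slides from the same patient stay together.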
Fig. 3 | Reclassification of combined hepatocellular-cholangiocarcinomas. A Example of an H&E slide of a cHCC-CCA and its associated attention and prediction heatmaps. This case features relatively distinct HCC and ICCA components, both of which are identified on the attention maps (attention is, however, higher in ICCA areas). The class predictions match the HCC and ICCA morphological contingents. This analysis was repeated independently five times with similar results. B Distribution of the raw outputs/predictions from the model: the scores follow a bimodal distribution, with a majority of cases peaking at a high HCC or ICCA score. C Reclassification of cHCC-CCA as HCC or CCA has an impact on overall survival of patients treated by surgical resection. D Importantly, the prognostic value of the reclassification is validated in patients who underwent liver transplantation. E The model predictions match the underlying alterations identified in cHCC-CCA (p = 0.0009, Fisher’s exact test). Statistical tests were two-sided and not adjusted for multiple testing. Source data are provided as a Source Data file.
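The two-sided Fisher's exact test cited in panel E compares two categorical variables in a 2x2 table, for example predicted class against the presence of a given alteration. The sketch below computes the p-value from the hypergeometric distribution using only the standard library. The table is a made-up toy example and does not reproduce the paper's counts or its p-value.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_obs * (1 + 1e-7)   # small tolerance for float ties
    )
    return min(1.0, total)

# Toy table (hypothetical counts):
#                 alteration present   alteration absent
# predicted HCC           3                    1
# predicted ICCA          1                    3
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")   # 34/70, about 0.4857
```

In practice, `scipy.stats.fisher_exact` provides the same test and would usually be preferred over a hand-written version.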
Fig. 4 | Combination of deep-learning heatmaps with spatial transcriptomics unravels the gene expression profile of areas that markedly impact the predictions. A Example of a case processed by spatial transcriptomics: the H&E section and its corresponding prediction heatmap are presented, with the upper left area being considered as ICCA-like. This analysis was repeated independently with similar results for two other cases, as mentioned in the section “Spatial Transcriptomics Analysis and AI Predictions”. B Predictions match the in situ gene expression profile, with the ICCA-like area showing upregulation of biliary/cholangiocytic genes (EPCAM, HNF1B and KRT7) and downregulation of hepatocytic genes (ALB, FABP1 and APOB). Raw data are available online as described in the “Data Availability” section.
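One simple way to express the up- and downregulation described in panel B is a per-gene log2 fold change between spots inside the ICCA-like area and the remaining spots. The sketch below is only an illustration of that comparison: the count matrices are made up, the pseudocount is an assumption, and the paper's actual spatial transcriptomics analysis is not described in this excerpt.

```python
import numpy as np

def log2_fold_change(expr_region, expr_rest, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression, region vs. rest.

    expr_region, expr_rest: (n_spots, n_genes) count matrices.
    The pseudocount avoids division by zero for unexpressed genes.
    """
    mean_region = np.mean(expr_region, axis=0)
    mean_rest = np.mean(expr_rest, axis=0)
    return np.log2((mean_region + pseudocount) / (mean_rest + pseudocount))

genes = ["EPCAM", "HNF1B", "KRT7", "ALB", "FABP1", "APOB"]

# Made-up counts for three spots per area, for illustration only.
icca_like_spots = np.array([
    [30, 12, 25, 5, 3, 2],
    [28, 10, 30, 4, 2, 3],
    [35, 15, 20, 6, 4, 1],
])
other_spots = np.array([
    [3, 2, 4, 60, 40, 35],
    [2, 1, 5, 55, 45, 30],
    [4, 3, 3, 65, 38, 32],
])

lfc = dict(zip(genes, log2_fold_change(icca_like_spots, other_spots)))
for gene, value in lfc.items():
    direction = "up" if value > 0 else "down"
    print(f"{gene}: log2FC = {value:+.2f} ({direction} in ICCA-like area)")
```

A positive value means higher mean expression in the ICCA-like area. A real analysis would also normalize for sequencing depth per spot and test for statistical significance.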