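Literature Express: Deep Learning for Disease Prognosis -- Accurate diagnosis and prognosis prediction of gastric cancer using deep learning on digital pathological images: a retrospective multicentre study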

Title 

Accurate diagnosis and prognosis prediction of gastric cancer using deep learning on digital pathological images: A retrospective multicentre study

01

Introduction

Gastric cancer (GC) is the fifth most common malignant disease and the third leading cause of cancer-related death worldwide. For patients with early GC, the 5-year survival rate can exceed 90%. However, approximately half of patients with GC have already progressed to the advanced stage at the time of diagnosis, when the 5-year survival rate drops below 30%. To reduce GC mortality, early detection and appropriate treatment are crucial, and precise and efficient pathology services are indispensable to achieving this goal.

Pathological evaluation remains the gold standard for the diagnosis of GC. Conventionally carried out by pathologists, this method is labor-intensive, tedious, and time-consuming. A severe shortage of pathologists and a heavy diagnostic workload are widespread problems globally, and they negatively affect diagnostic accuracy.

Accordingly, it is necessary to design a new method to conveniently and accurately diagnose GC using pathological pictures.

Surgery is the main treatment for GC, followed by adjuvant treatments including chemoradiotherapy and molecular targeted therapy.

Results

Patient characteristics. A total of 871 GC patients were initially screened from the RHWU cohort, and 588 with tumour tissue blocks were eligible for the study. In the TCGA cohort, 449 GC patients with digital H&E-stained pathological images were eligible, as were 91 in the NHGRP cohort. A total of 1276 images from the RHWU cohort and 1057 images from the TCGA cohort were obtained for the development of the GastroMIL model. After data augmentation, 3221 pictures (malignant : normal = 1574 : 1647) were finally enrolled in the GastroMIL model; 70% (N = 2261) were randomly assigned to the training set and the remaining 30% (N = 960) to the internal validation set. The 175 pictures from the independent NHGRP cohort were used as the external validation set. The detailed data distribution is shown in Supplementary Table 1.
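The random assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the excerpt does not say how the randomization was done (for example, whether it was per picture or per patient), so the function name `split_indices`, the fixed seed, and the per-picture shuffling are all hypothetical. Only the set sizes (3221, 2261, 960) come from the text.

```python
import random

def split_indices(n_items, n_train, seed=0):
    """Shuffle item indices and cut them into a training part and a validation part."""
    indices = list(range(n_items))
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]

# Sizes reported in the text: 3221 pictures, 2261 for training, 960 for internal validation.
train_idx, val_idx = split_indices(n_items=3221, n_train=2261)
print(len(train_idx), len(val_idx))      # 2261 960
print(round(len(train_idx) / 3221, 3))   # 0.702, i.e. roughly the stated 70%
```

Note that 2261 of 3221 is about 70.2%, so the reported 70/30 split is approximate rather than exact.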

Methods

A total of 2333 hematoxylin and eosin-stained pathological pictures of 1037 GC patients were collected from two cohorts, Renmin Hospital of Wuhan University (RHWU) and The Cancer Genome Atlas (TCGA), to develop our algorithms. Additionally, we obtained 175 digital pictures of 91 GC patients from the National Human Genetic Resources Sharing Service Platform (NHGRP), which served as the independent external validation set. Two models were developed using artificial intelligence (AI): GastroMIL for diagnosing GC and MIL-GC for predicting the outcome of GC.

Fig

Fig. 1. Flow chart of the developed models. The framework of GastroMIL is shown in a-b, and that of MIL-GC is shown in a-c. Pathological images are input, and tiles of 224 × 224 pixels are generated from each image (a). The CNN classifier of the MIL model outputs the probability that each tile is malignant. A heat map visualizes the ROIs identified by the model. Feature vectors of dimension 608 are extracted from the most suspicious tiles. The feature vectors of the K most suspicious tiles are input to the second layer of MIL and aggregated by an RNN, which generates the final diagnosis prediction for the input image. In this study, K was set to 32 (b). The feature vectors of the S most suspicious tiles are input to the prognosis model (in this study, S = 128). In the MIL-GC model, each feature vector yields a probability value through an MLP algorithm. The probability values of the 128 most suspicious tiles of the input picture are averaged to give the output risk score (c). CNN, convolutional neural network; RNN, recurrent neural network; MIL, multiple instance learning; MLP, multilayer perceptron; ROI, region of interest.
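The tile handling in the caption of Fig. 1 (cut the image into 224 × 224 tiles, rank the tiles by malignancy probability, keep the most suspicious ones, and average their per-tile values into a risk score) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the CNN, RNN and MLP are not reproduced, the function names are hypothetical, non-overlapping tiling is an assumption, and `tile_probs` and `tile_risks` are made-up stand-ins for the CNN and MLP outputs.

```python
def tile_origins(width, height, tile=224):
    """Top-left corners of non-overlapping tile x tile patches that fit fully inside the image."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]

def top_k_indices(probs, k):
    """Indices of the k tiles with the highest malignancy probability, most suspicious first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]

def risk_score(tile_risks, tile_probs, s=128):
    """Average the per-tile risk values over the s most suspicious tiles.

    Assumes at least one tile; if fewer than s tiles exist, all of them are used.
    """
    chosen = top_k_indices(tile_probs, s)
    return sum(tile_risks[i] for i in chosen) / len(chosen)

# Toy example: a 500 x 450 pixel image yields a 2 x 2 grid of tiles.
origins = tile_origins(500, 450)
print(origins)  # [(0, 0), (224, 0), (0, 224), (224, 224)]

tile_probs = [0.1, 0.9, 0.4, 0.8]   # stand-in for CNN malignancy probabilities
tile_risks = [0.2, 0.6, 0.1, 0.4]   # stand-in for MLP outputs per tile
print(top_k_indices(tile_probs, 2))  # [1, 3]
score = risk_score(tile_risks, tile_probs, s=2)  # mean of 0.6 and 0.4
```

With the values reported in the caption, the diagnosis branch would call `top_k_indices(probs, 32)` and the prognosis branch `risk_score(..., s=128)`.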

Fig. 2. Diagnostic abilities of GastroMIL at different magnifications in the training and internal validation sets. a-c, ROC curves in the training set for images at 5×, 10× and 20× magnification, respectively; e-g, ROC curves in the internal validation set for images at 5×, 10× and 20× magnification, respectively. The AUC, Acc, Sen and Spe of the training and internal validation sets are shown in d and h, respectively. ROC, receiver operating characteristic; AUC, area under the curve; Acc, accuracy; Sen, sensitivity; Spe, specificity.
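For readers unfamiliar with the four metrics reported in Fig. 2, the sketch below shows how they are conventionally computed from binary labels and model scores. It uses the standard textbook definitions, not code from the paper. The function names and toy data are hypothetical, and the 0/1 coding (1 = malignant, 0 = normal) is an assumption.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = malignant, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "Acc": (tp + tn) / len(pairs),   # share of all cases classified correctly
        "Sen": tp / (tp + fn),           # share of malignant cases detected
        "Spe": tn / (tn + fp),           # share of normal cases correctly cleared
    }

def auc(y_true, scores):
    """Area under the ROC curve as the probability that a malignant case
    scores higher than a normal one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

metrics = diagnostic_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
print(metrics)  # {'Acc': 0.75, 'Sen': 0.75, 'Spe': 0.75}
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise AUC is quadratic in the number of cases, which is fine for illustration but slow for large validation sets.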

Fig. 3. Heat maps of the RHWU cohort. a-d, pathological images and corresponding heat maps for pathological TNM stages I, II, III, and IV from the RHWU cohort, respectively. The actual tumor regions annotated by expert pathologists are shown with yellow lines.

Fig. 4. Prognostic significance of the risk score generated by MIL-GC in the internal validation set. HRs for prediction of survival by the MIL-GC model and other clinicopathological indexes based on univariate (a) and multivariate (b) analyses. The output score was converted into a binary score (high or low risk), using the median value of the training set as the threshold. KM survival curves for the internal validation set (c) and some other subgroups: age ≤ 60 (d); age > 60 (e); histologic grade 1-2 (f); histologic grade 3-4 (g); pT stage 3-4 (h); pN stage 0-1 (i); pN stage 2-3 (j); pTNM stage 1-2 (k) and pTNM stage 3-4 (l). ***, P < 0.0001; **, P < 0.01; *, P < 0.05. The P-value of the Kaplan-Meier survival curve was evaluated by the log-rank test. The P-value of the HR was calculated by Cox analysis.
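The caption of Fig. 4 describes two steps that are easy to illustrate: converting the continuous risk score into a high/low label with the training-set median as the threshold, and estimating a Kaplan-Meier survival curve per group. The sketch below covers both under stated assumptions. The function names and all numbers are made up, whether a score exactly equal to the median counts as high or low is not stated in the excerpt (here it counts as low), and the log-rank test and Cox analysis are not implemented.

```python
from statistics import median

def dichotomize(scores, threshold):
    """Label a score 'high' if it exceeds the threshold, otherwise 'low'."""
    return ["high" if s > threshold else "low" for s in scores]

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability) at each event time.

    events[i] is 1 if death was observed at times[i] and 0 if the patient was censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# The threshold comes from the training set only, then is applied to validation scores.
threshold = median([0.2, 0.5, 0.7, 0.9, 0.4])
groups = dichotomize([0.3, 0.6, 0.5], threshold)
print(threshold, groups)  # 0.5 ['low', 'high', 'low']

curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(5, 0.8), (8, 0.6), (12, 0.3)]
```

Fixing the threshold on the training set, as the caption states, keeps the validation data out of the choice of cut-off.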

Fig. 5. Diagnostic and prognostic prediction performance in the external validation set. The ROC curve (a) and HRs based on univariate (b) and multivariate (d) analyses are shown. KM survival curves for the external validation set (c) and some other subgroups: age > 60 (e); tumour size ≤ 5 (f); histologic grade 3 (g); pT stage 3 (h); pN stage 0 (i); pN stage 3 (j); pM stage 0 (k); pTNM stage 2 (l) and pTNM stage 3 (m). ***, P < 0.001; **, P < 0.01; *, P < 0.05. ROC, receiver operating characteristic; AUC, area under the curve. The P-value of the Kaplan-Meier survival curve was evaluated by the log-rank test. The P-value of the HR was calculated by Cox analysis.

Fig. 6. Representative predictive tiles produced by our model. These tiles showed obvious tumour heterogeneity, including necrosis (a), nerve invasion (b), signet ring cells (c), intravasated cancer cells (d), muscularis propria invasion (e), and mucous secretion (f).

Table

Table 1. Baseline characteristics in the prognostic model (MIL-GC).

Table 2. Accuracy, sensitivity and specificity of the diagnostic model (GastroMIL) and human pathologists.
