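Development of a predictive model and application for spontaneous passage of common bile duct stones based on automated machine learning

CHEN Jian; XIA Kaijian; GAO Fuli; LIU Luojie; WANG Ganhong; XU Xiaodan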

doi:10.12449/JCH250319



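Ethics statement: The study protocol was approved by the Ethics Committee of Changshu First People's Hospital on August 8, 2024 (approval number: 2024L022).

Conflict of interest statement: This article has no conflicts of interest.

Author contributions: CHEN Jian designed the study, analyzed the data, and wrote the manuscript; GAO Fuli, LIU Luojie, and WANG Ganhong participated in data collection and revised the manuscript; XIA Kaijian was responsible for code interpretation and debugging; XU Xiaodan formulated the writing approach, supervised the writing, and finalized the manuscript.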


Publication history
- Received: 2024-08-16
- Accepted: 2024-09-06
- Published: 2025-03-25

Development of a predictive model and application for spontaneous passage of common bile duct stones based on automated machine learning

1. Department of Gastroenterology, Changshu First People’s Hospital, Changshu, Jiangsu 215500, China
2. Changshu Key Laboratory of Medical Artificial Intelligence and Big Data, Changshu, Jiangsu 215500, China
3. Department of Gastroenterology, Changshu Traditional Chinese Medicine Hospital, Changshu, Jiangsu 215500, China

Research funding:

Gusu Health Talent Training Project (GSWS2020109);

Suzhou 23rd Science and Technology Development Plan Project (SLT2023006);

Suzhou Clinical Key Disease Diagnosis and Treatment Technology Special Project (LCZX202334);

Changshu Science and Technology Development Plan Project (CS202019);

Changshu Science and Technology Development Plan Project (CSWS202316)

Corresponding author: XU Xiaodan, xxddocter@gmail.com (ORCID: 0009-0005-1947-3339)

Abstract

Objective: To develop a predictive model and application for spontaneous passage of common bile duct stones using automated machine learning algorithms, given the complexity of treatment decision-making for patients with common bile duct stones, and thereby to reduce unnecessary endoscopic retrograde cholangiopancreatography (ERCP) procedures.

Methods: A retrospective analysis was performed on data from 835 patients who were scheduled for ERCP stone extraction after an imaging-confirmed diagnosis of common bile duct stones between January 2022 and June 2024 at Changshu First People’s Hospital (dataset 1) and Changshu Traditional Chinese Medicine Hospital (dataset 2). Dataset 1 was used for training and internal validation of the machine learning models and for development of the application, and dataset 2 was used for external testing. A total of 22 potential predictive variables were included to build and internally validate a LASSO regression model and automated machine learning models. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy were used to assess model performance and select the best model. Feature importance plots, force plots, and SHAP plots were used to interpret the model. The Python Dash library and the best model were used to build a web application, which was externally tested on dataset 2. The Kolmogorov-Smirnov test was used to examine whether the data were normally distributed; for continuous variables that were not normally distributed, the Mann-Whitney U test was used for comparison between two groups; the chi-square test or Fisher’s exact test was used for comparison of categorical variables between groups.

Results: Among the 835 patients included in the study, 152 (18.20%) experienced spontaneous stone passage. The LASSO model achieved an AUC of 0.875 in the training set (n=588) and 0.864 in the validation set (n=171), and the top five predictive factors by importance were solitary common bile duct stone, non-dilated common bile duct, diameter of common bile duct stones, a reduction in serum alkaline phosphatase (ALP), and a reduction in gamma-glutamyl transpeptidase (GGT). A total of 55 models were built using automated machine learning, among which the gradient boosting machine (GBM) model performed best, with an AUC of 0.891 (95% confidence interval: 0.859-0.927), outperforming the extremely randomized trees (XRT) model, the deep learning (DL) model, the generalized linear model (GLM), and the distributed random forest (DRF) model. In the test set (n=76), the GBM model had an accuracy of 0.855, a sensitivity of 0.846, and a specificity of 0.857. The variable importance analysis showed that five factors had an important influence on the prediction of spontaneous stone passage: solitary common bile duct stone, non-dilated common bile duct, a stone diameter of <8 mm, a reduction in serum ALP, and a reduction in GGT. The SHAP analysis of the GBM model showed a marked increase in the probability of spontaneous stone passage in patients with a solitary common bile duct stone, a non-dilated common bile duct, a stone diameter of <8 mm, and a reduction in serum ALP and GGT.

Conclusion: The GBM model and application developed using automated machine learning algorithms show good predictive performance and ease of use in predicting spontaneous stone passage in patients with common bile duct stones. The application can help avoid unnecessary ERCP procedures, thereby reducing surgical risks and healthcare costs.

- Choledocholithiasis
- Cholangiopancreatography, Endoscopic Retrograde
- Machine Learning
- Predictive Model
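The accuracy, sensitivity, and specificity quoted for the test set all derive from one confusion matrix. The sketch below shows the arithmetic. The counts (11 true positives, 2 false negatives, 54 true negatives, 9 false positives) are an assumption: they are not reported in the source and were back-calculated as one integer split of n=76 that is consistent with the three reported figures. The function name `classification_metrics` is illustrative.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),    # correctly flagged positives / all positives
        "specificity": tn / (tn + fp),    # correctly flagged negatives / all negatives
        "accuracy": (tp + tn) / total,    # all correct predictions / all cases
    }


# Assumed counts (not given in the source), chosen so that they sum to n=76.
metrics = classification_metrics(tp=11, fn=2, tn=54, fp=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the three values round to 0.846, 0.857, and 0.855, matching the figures in the abstract.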

Figure 1. Research flowchart

Note: a, regression coefficients; the absolute values of the coefficients decrease as λ increases; b, the optimal λ in the LASSO regression analysis was determined by 10-fold cross-validation.

Figure 2. Penalty plot of predictors for spontaneous stone passage in patients with common bile duct stones based on LASSO regression
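The shrinkage described in the Figure 2 note can be illustrated with the soft-thresholding operator S(z, λ) = sign(z)·max(|z| − λ, 0). For linear regression with orthonormal predictors this operator gives the LASSO solution exactly. For a logistic LASSO such as the one in this study there is no closed form, so the sketch below only illustrates the mechanism and is not the authors' fitting procedure. The coefficient values and names (`x1`, `x2`, `x3`) are hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Soft-thresholding operator: shrink z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients, for illustration only.
coefs = {"x1": 1.2, "x2": -0.5, "x3": 0.1}

# As lam grows, every coefficient moves toward zero and small ones drop out.
for lam in (0.0, 0.2, 0.6):
    shrunk = {name: round(soft_threshold(value, lam), 2) for name, value in coefs.items()}
    print(f"lambda={lam}: {shrunk}")
```

Larger λ values remove more predictors, which is why cross-validation is used to pick a λ that balances sparsity against predictive error.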

Note: a, calibration curve of the model in the training set; b, calibration curve of the model in the validation set.

Figure 3. Calibration curves of the LASSO regression model in the training and validation sets

Note: a, ROC curve of the model in the training set; b, ROC curve of the model in the validation set.

Figure 4. ROC curves of the LASSO regression model in the training and validation sets

Figure 5. Comparison of ROC curves among different machine learning models

Note: a, variable importance plot; b, learning curve.

Figure 6. Variable importance and learning curve of the GBM model in the validation set

Figure 7. User interface of the Web application based on the GBM model

Figure 8. SHAP feature analysis of the GBM model in the test set

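Note: a, predicted probability of spontaneous stone passage of 72%; b, predicted probability of spontaneous stone passage of 9%. CBDSd=1, common bile duct stone diameter ≤5 mm; SCBDS=1, solitary common bile duct stone; CBD.Dilation=0, no common bile duct dilation; IE ERCP.interval=2, interval of 2 days between imaging and ERCP; ICS=0, no improvement in clinical symptoms; Distal.CBDSs=0, non-distal common bile duct stones; sex=0, female.

Figure 9. Force plot analysis of the GBM model in the test set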
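A force plot of this kind rests on the additivity of SHAP values: the model output for one patient equals a base value plus the sum of the per-feature contributions. The sketch below assumes the contributions are expressed in log-odds units, which is common for tree-based classifiers but is not stated in the source. All numbers are hypothetical and are not read from Figure 9; only the feature names come from the note above.

```python
import math


def sigmoid(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def force_plot_probability(base_value: float, contributions: dict) -> float:
    """Add the base value and all SHAP contributions (log-odds), then convert to probability."""
    return sigmoid(base_value + sum(contributions.values()))


# Hypothetical values for illustration only; positive pushes toward spontaneous passage.
base_value = -1.5
contributions = {"CBDSd": 0.9, "SCBDS": 0.8, "CBD.Dilation": 0.6, "ICS": -0.3, "sex": -0.1}

prob = force_plot_probability(base_value, contributions)
print(f"predicted probability: {prob:.2f}")
```

Features with positive contributions push the prediction above the base value and features with negative contributions pull it below, which is what the opposing arrows in a force plot depict.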

Table 1. Comparison of baseline data between training and validation sets

Variable	Training set (n=588)	Validation set (n=171)	Statistic	P value
Sex [n (%)]			χ²=0.110	0.740
Female	327 (55.6)	92 (53.8)
Male	261 (44.4)	79 (46.2)
Age [n (%)]			χ²=0.378	0.539
<60 years	293 (49.8)	80 (46.8)
≥60 years	295 (50.2)	91 (53.2)
BMI [n (%)]			χ²=2.308	0.315
<18.5 kg/m²	117 (19.9)	28 (16.4)
18.5-24.0 kg/m²	262 (44.6)	87 (50.9)
>24.0 kg/m²	209 (35.5)	56 (32.7)
Resting systolic blood pressure [n (%)]			χ²=1.918	0.166
<140 mmHg	436 (74.1)	117 (68.4)
≥140 mmHg	152 (25.9)	54 (31.6)
Distal common bile duct stones [n (%)]			χ²=1.212	0.271
No	488 (83.0)	135 (78.9)
Yes	100 (17.0)	36 (21.1)
Reduction in ALP [n (%)]			χ²=0.786	0.375
No	505 (85.9)	152 (88.9)
Yes	83 (14.1)	19 (11.1)
Reduction in GGT [n (%)]			χ²=0.001	0.979
No	489 (83.2)	143 (83.6)
Yes	99 (16.8)	28 (16.4)
Common bile duct stone diameter [n (%)]			χ²=0.028	0.866
≥8 mm	514 (87.4)	148 (86.5)
<8 mm	74 (12.6)	23 (13.5)
Solitary common bile duct stone [n (%)]			χ²=0.109	0.742
No	368 (62.6)	104 (60.8)
Yes	220 (37.4)	67 (39.2)
Common bile duct dilation [n (%)]			χ²=0.023	0.878
No	497 (84.5)	146 (85.4)
Yes	91 (15.5)	25 (14.6)
Preoperative antibiotic use [n (%)]			χ²=2.661	0.103
No	454 (77.2)	121 (70.8)
Yes	134 (22.8)	50 (29.2)
Preoperative antispasmodic use [n (%)]			χ²=1.760	0.185
No	402 (68.4)	107 (62.6)
Yes	186 (31.6)	64 (37.4)
Improvement in clinical symptoms [n (%)]			χ²=0.363	0.547
No	596 (89.1)	163 (87.2)
Yes	73 (10.9)	24 (12.8)
Amylase [n (%)]			χ²=0.001	0.995
<300 U/L	415 (70.6)	120 (70.2)
≥300 U/L	173 (29.4)	51 (29.8)
Interval between imaging and ERCP (days)	4.00 (3.00-5.00)	4.00 (2.00-6.00)	Z=0.089	0.960
WBC (×10⁹/L)	6.28 (4.80-8.51)	6.03 (4.58-7.98)	Z=0.843	0.407
CRP (mg/L)	12.26 (2.71-47.23)	7.39 (2.52-53.09)	Z=0.353	0.727
TBil (μmol/L)	28.15 (17.88-56.50)	24.40 (16.00-49.90)	Z=1.489	0.080
DBil (μmol/L)	12.80 (6.90-34.07)	13.70 (7.35-39.90)	Z=0.856	0.396
GGT (U/L)	246.45 (129.85-439.38)	270.10 (154.55-484.40)	Z=1.434	0.115
ALP (U/L)	177.40 (122.77-262.00)	168.40 (122.70-249.15)	Z=0.984	0.309
ALT (U/L)	91.90 (33.58-200.25)	92.50 (41.35-217.15)	Z=1.142	0.285
AST (U/L)	61.85 (28.90-131.58)	72.80 (34.55-141.85)	Z=1.323	0.166
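The χ² values for the two-level categorical rows of Table 1 can be checked by hand. The sketch below computes a chi-square test for a 2×2 table using only the Python standard library, with the counts of the sex row as input. The use of Yates' continuity correction is an assumption: the source does not say whether a correction was applied. The function name `chi2_2x2_yates` is illustrative.

```python
import math
from typing import Tuple


def chi2_2x2_yates(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = (a, b, c, d)
    # Expected count of each cell under independence: row total * column total / n.
    expected = (row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n)
    stat = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Sex row of Table 1: female 327 (training) vs 92 (validation), male 261 vs 79.
stat, p = chi2_2x2_yates(327, 92, 261, 79)
print(f"chi2 = {stat:.3f}, P = {p:.3f}")
```

For the sex row this gives χ²=0.110 and P=0.740, the values listed in the table, which suggests that the continuity correction was applied to the 2×2 comparisons.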

Table 2. Performance comparison of different machine learning models in the validation set

Model	AUC (95% CI)	Sensitivity	Specificity	Accuracy	PPV	NPV
GBM	0.891 (0.859-0.927)	0.894	0.742	0.888	0.883	0.786
GLM	0.882 (0.783-0.889)	0.860	0.742	0.860	0.880	0.680
DL	0.882 (0.839-0.912)	0.877	0.742	0.874	0.881	0.729
XRT	0.865 (0.841-0.902)	0.837	0.821	0.856	0.895	0.654
DRF	0.864 (0.835-0.917)	0.899	0.742	0.893	0.883	0.807
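The AUC column of Table 2 can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes the AUC that way and attaches a percentile bootstrap confidence interval. The bootstrap is an assumption, since the source does not state how the 95% CIs were obtained, and the labels and scores are toy values, not study data.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels: Sequence[int], scores: Sequence[float],
                 n_boot: int = 2000, alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    indices = range(len(labels))
    stats: List[float] = []
    while len(stats) < n_boot:
        sample = [rng.choice(indices) for _ in indices]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.6]

point = auc(toy_labels, toy_scores)
low, high = bootstrap_ci(toy_labels, toy_scores)
print(f"AUC = {point:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

On a sample this small the interval is very wide; with several hundred cases, as in the study, the same procedure yields intervals of a width comparable to those in Table 2.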
