This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

The Benefit of Artificial Intelligence-Based Diagnosis in Gastroenterology and Hepatology Is Highly Variable: A Diagnostic Need and Burden Analysis

2023 · 2 citations · Gastroenterology · Open Access

Abstract

Artificial intelligence (AI) for diagnostic purposes represents a potential technological breakthrough in medicine. Its diagnostic application to gastroenterology and hepatology conditions has grown quickly and robustly, with its first United States Food and Drug Administration authorization for use in 2021 [1: U.S. Food & Drug Administration. https://www.fda.gov/news-events/press-announcements/fda-authorizes-marketing-first-device-uses-artificial-intelligence-help-detect-potential-signs-colon (accessed April 9, 2021)]. As research funding and commercial development continue at a fast pace [2: Bohr A, et al. Artificial Intelligence in Healthcare. Academic Press, 2020:25-60], the question of where AI-based diagnosis can have the biggest impact has not been sufficiently addressed. Although improvement in the diagnosis of any condition has benefits, there are also real costs associated with implementing AI-based solutions [3: Wolff J, et al. J Med Internet Res 2020;22:e16866] [4: Landi H. Fierce Healthcare 2020. https://www.fiercehealthcare.com/tech/artificial-intelligence-increasing-patient-access-to-care-but-it-s-also-driving-up-cost (accessed February 19, 2020)] and opportunity costs associated with prioritizing funding and development decisions. Our aim was to identify the spectrum of diagnostic gaps within gastroenterology and hepatology conditions and to analyze their relationship with major drivers of burden, including prevalence, mortality, and cost, to determine where AI-based diagnostics will have the greatest benefit. We also wished to highlight existing examples of AI-guided diagnosis to illustrate the relative state of progress across these conditions.
We conducted a survey of 18 gastroenterology and hepatology conditions, identifying the diagnostic characteristics, including sensitivity, specificity, and accuracy (calculated when not available [5: Alberg AJ, et al. J Gen Intern Med 2004;19:460-464]), of the conventional diagnostic approach for each condition. The selection of conditions was based on a literature review of the current burden of gastrointestinal and hepatology diseases in the United States. From there, conditions were included based on the availability of consistent data across prevalence, annual mortality rate (per 100,000), and cost (in $ billions). Given that there is no validated measure that encompasses all of these variables, a composite burden metric was constructed and defined as prevalence × mortality rate × cost. Because all individual burden metrics equated higher numbers with greater burden, we believed their product would provide a directionally plausible estimate. In addition, an adjusted composite burden metric was constructed by multiplying the composite burden by a condition's diagnostic gap, defined as 100 minus the accuracy (%) of the most common mode of diagnosis. As a result, the adjusted composite burden reflected a weighted measure of diagnostic need as well as composite burden. For each of the burden categories, conditions were subsequently ranked from 1 to 18, with 18 representing the greatest intensity or severity. Lastly, we used a PubMed search to collect data on diagnostic performance characteristics for existing published AI applications for the 18 condition areas. Meta-analyses that reported values for accuracy were prioritized for inclusion. If no meta-analysis was available, the range of estimates from systematic reviews and individual studies was included. Given the broad reliability gap between prospective and retrospective studies, we separately indicated in the figure when values were derived solely from the latter.
The rank-ordered diagnostic and burden characteristics of the 18 gastroenterology and hepatology conditions are presented in Figure 1.
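The composite and adjusted composite burden metrics described in the methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Condition` class, the function names, and the three example conditions with their values are hypothetical, and `accuracy_from_sens_spec` uses the standard prevalence-weighted identity for overall accuracy, which is assumed (not confirmed by the text) to be the calculation applied when accuracy was not reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """Hypothetical container for one condition's burden and accuracy data."""
    name: str
    prevalence_pct: float      # prevalence (%)
    mortality_per_100k: float  # annual mortality rate (per 100,000)
    cost_billion: float        # annual cost ($ billions)
    accuracy_pct: float        # accuracy (%) of the most common diagnostic mode


def accuracy_from_sens_spec(sensitivity, specificity, prevalence):
    """Overall accuracy from sensitivity, specificity, and prevalence.

    All arguments are fractions in [0, 1]. Assumption: this standard identity
    is how accuracy was derived when a source did not report it.
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def composite_burden(c):
    """Composite burden = prevalence x mortality rate x cost."""
    return c.prevalence_pct * c.mortality_per_100k * c.cost_billion


def diagnostic_gap(c):
    """Diagnostic gap = 100 minus conventional accuracy (%)."""
    return 100.0 - c.accuracy_pct


def adjusted_composite_burden(c):
    """Composite burden weighted by the diagnostic gap."""
    return composite_burden(c) * diagnostic_gap(c)


def rank_ascending(conditions, metric):
    """Rank 1 = lowest value; the highest rank marks the greatest burden."""
    ordered = sorted(conditions, key=metric)
    return {c.name: rank for rank, c in enumerate(ordered, start=1)}


# Made-up values chosen only to show that the two metrics can rank differently.
EXAMPLE = [
    Condition("A", 10.0, 2.0, 5.0, 90.0),  # burden 100, gap 10, adjusted 1000
    Condition("B", 1.0, 20.0, 4.0, 60.0),  # burden 80, gap 40, adjusted 3200
    Condition("C", 5.0, 1.0, 2.0, 75.0),   # burden 10, gap 25, adjusted 250
]

if __name__ == "__main__":
    print(rank_ascending(EXAMPLE, composite_burden))
    print(rank_ascending(EXAMPLE, adjusted_composite_burden))
```

In this toy example, condition A has the highest composite burden, but condition B moves to the top once the diagnostic gap is applied, which mirrors the kind of reordering reported in the results.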
The underlying data supporting the rankings and their references are detailed in Supplementary Table 1. The rank order of conditions varied considerably depending on the metric used to assign priority. For example, the top 3 conditions based on size of diagnostic gap alone were esophageal squamous cell carcinoma, gastric cancer, and reflux/gastroesophageal reflux disease (GERD). The top 3 opportunities when evaluated using individual burden metrics (prevalence, mortality, and cost) were all different conditions. The top conditions by composite burden were nonalcoholic fatty liver disease (NAFLD), cholecystitis, and celiac disease. However, when composite burden was adjusted by diagnostic gap, the order changed to cholecystitis, irritable bowel syndrome, and NAFLD. Colorectal cancer ranked eighth by composite burden and seventh by adjusted composite burden.
When the performance of existing AI-based diagnostics was compared with that of conventional methods, there was also marked heterogeneity as well as greater uncertainty. The net gain or loss in diagnostic accuracy (ie, the difference between AI diagnostic accuracy and conventional diagnostic accuracy) is depicted in Supplementary Figure 1. Few meta-analyses examining AI diagnostic accuracy were available. If a meta-analysis value was not available, systematic reviews or multiple single studies were used. The conditions where net benefit was positive and the accuracy interval estimate was narrow were esophageal squamous cell carcinoma, gastric cancer, reflux/GERD, NAFLD, and adenomatous polyps. Diagnosis of hepatocellular carcinoma also appeared to benefit, although the interval of this estimate was larger. The conditions where there was a net loss of accuracy and a narrow interval estimate included cirrhosis, colorectal cancer, pancreatic cancer, and peptic ulcer disease.
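The net gain or loss described above is a simple difference, and it can be reproduced for the conditions that have single point estimates for both accuracies in Supplementary Table 1. The sketch below is illustrative only: the accuracy pairs are copied from that table (which labels adenomatous polyps as "Colorectal polyps"), the function names are hypothetical, and conditions reported as ranges are left out because the text does not define a threshold for a "narrow" interval.

```python
# (conventional accuracy %, AI accuracy %) for conditions with single point
# estimates for both methods in Supplementary Table 1.
POINT_ESTIMATES = {
    "Reflux/GERD": (73.03, 99.00),
    "Colorectal cancer": (86.04, 82.62),
    "Nonalcoholic fatty liver disease": (93.00, 98.00),
    "Gastric cancer": (62.00, 93.00),
    "Pancreatic cancer": (96.00, 90.00),
    "Cirrhosis": (87.96, 87.00),
    "Esophageal squamous cell carcinoma": (59.00, 92.50),
    "Colorectal polyps": (85.94, 87.00),
    "Peptic ulcer disease": (98.95, 92.01),
}


def net_gain(conventional_pct, ai_pct):
    """Net gain (positive) or loss (negative) in accuracy, in percentage points."""
    return round(ai_pct - conventional_pct, 2)


def split_by_sign(estimates):
    """Return (gains, losses) as alphabetically sorted lists of condition names."""
    gains, losses = [], []
    for name, (conventional_pct, ai_pct) in estimates.items():
        delta = net_gain(conventional_pct, ai_pct)
        if delta > 0:
            gains.append(name)
        elif delta < 0:
            losses.append(name)
    return sorted(gains), sorted(losses)


if __name__ == "__main__":
    for name, (conventional_pct, ai_pct) in POINT_ESTIMATES.items():
        print(f"{name}: {net_gain(conventional_pct, ai_pct):+.2f}")
```

Splitting these nine conditions by the sign of the difference gives the same two groups named in the text: five conditions with a net gain and four with a net loss.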
The remaining conditions were indeterminate for benefit or loss because of a broad range of accuracies, suggesting a lack of sufficient study and evidence. Of the top 10 conditions with the highest composite and adjusted composite burden, only NAFLD and GERD showed a diagnostic benefit from AI with a narrow accuracy interval estimate. These data suggest that implementing AI for diagnosing these conditions could decrease burden with greater confidence. However, when considering where AI may be of benefit based on the largest diagnostic gaps today and a narrow accuracy interval estimate, applications for diagnosing esophageal squamous cell carcinoma, gastric cancer, and reflux/GERD ranked highest.
This analysis has several limitations, including the selection bias associated with the burden metrics chosen and the decision to weight them equally when creating a composite measure. However, we aimed to select metrics that have been commonly used to characterize disease broadly and wanted to demonstrate the impact on priority when different criteria were used. In addition, there is no universal or validated measure that represents all the variables of interest. Another limitation was depicting the conditions in rank order instead of by their absolute values across each of the metrics. This was done deliberately because appreciating the differences across multiple conditions was challenging when the units of measure across each metric were different and the composite units were not intuitive (eg, % prevalence × deaths per 100,000 × $). Because the goal of the analysis was to understand the high-level relationships across major conditions, we thought this relative approach was more straightforward. The inclusion of retrospective studies for the AI estimates was another limitation. This decreased the overall reliability of these estimates and likely led to overestimation of performance.
Because AI research is still relatively new, higher-quality study designs, including meta-analyses, were not available in many areas. As a result, the AI estimates presented provide a representative "snapshot" of the present state of the field and should not be interpreted as definitive. Lastly, the lack of financial data associated with diagnostic methods limited the ability to comment on relative cost-effectiveness. However, future studies will be able to target conditions based on the existing analysis to deliver a more nuanced perspective.
Our analysis demonstrated that diagnostic gaps remain disparate across the field of gastroenterology and hepatology. In addition, we demonstrated how choosing individual metrics of burden can alter which conditions are prioritized, leading to potential bias. We believe these findings support early adoption of AI diagnosis for NAFLD and reflux/GERD, where net benefit was positive, burden was high, and the accuracy interval estimate was narrow, followed by the conditions where net benefit was positive and the accuracy interval estimate was narrow but burden was low. These recommendations should also be balanced by cost considerations. In addition, given the large number of studies with lower-quality study designs, we believe that much greater investigation is needed, especially for the conditions where net benefit was positive and accuracy intervals were wide and where net benefit or loss was indeterminate. These data also support not using AI for colorectal cancer alone, pancreatic cancer, and peptic ulcer disease. Note that this analysis did not examine AI's ability to diagnose conditions at an earlier stage or severity, or its ability to predict the development, treatment, or outcome of these conditions, which are all areas of active investigation.
The financial reimbursement opportunity and ease of technical implementation associated with an AI-based diagnostic approach were intentionally excluded as factors in this analysis so as to highlight the areas of greatest clinical need. As a result, we believe that this analysis provides a comprehensive, clinical needs-driven perspective on where implementation and further research should be considered for AI-based diagnosis. In addition, it highlights the need to consider the multiple dimensions of burden when determining priorities. Although we cannot escape the financial implications of new technology, we believe that a patient- and disease-centric approach will ultimately have the greatest impact on the patients we serve.

Supplementary Table 1. Burden and Diagnostic Performance Characteristics by Condition

| Condition | Prevalence (%), annual mortality (per 100,000), annual cost ($ billion) | Common diagnostic approach | Conventional diagnostic accuracy (%) | Source type for conventional diagnosis | AI diagnostic accuracy (%) | Source type for AI diagnosis |
| --- | --- | --- | --- | --- | --- | --- |
| Reflux/GERD | 22.950.2317.5 | Clinical history | 73.03 | Prospective systematic review | 99.00 | Subgroup of mixed meta-analysis |
| Colorectal cancer | 0.4213.414.1 | Colonoscopy | 86.04 | Guideline | 82.62 | Subgroup of mixed meta-analysis |
| Irritable bowel syndrome | 12.509401.05 | Rome III | 78.63 | Prospective meta-analysis | 51.25–98.50 | Mixed systematic review |
| Nonalcoholic fatty liver disease | 22.8514.9100 | Ultrasound | 93.00 | Mixed meta-analysis | 98.00 | Mixed meta-analysis |
| Acute pancreatitis | 0.040.92.15 | CT | 74.00 | Mixed meta-analysis | 70.00–89.30 | 3 studies |
| Hepatocellular carcinoma | 0.036.60.092 | CT | 80.00 | Mixed meta-analysis | 85.00–98.60 | Retrospective systematic review |
| Gastric cancer | 0.042.91.62 | Endoscopy | 62.00 | Prospective meta-analysis | 93.00 | Retrospective meta-analysis |
| Choledocholithiasis | 16.000.356.2 | US | 88.12 | Mixed meta-analysis | 72.28–86.56 | 4 studies |
| Pancreatic cancer | 0.0314.458.35 | EUS-FNA | 96.00 | Mixed meta-analysis | 90.00 | Mixed meta-analysis |
| Inflammatory bowel disease | 0.4217102.5 | Endoscopy | 89.00 | Journal article | 64.00–100.00 | Mixed systematic review |
| Acute cholecystitis | 0.3794006 | Ultrasound | 81.00 | Mixed meta-analysis | 80.60–89.00 | 3 studies |
| Primary sclerosing cholangitis | 0.0149500.125 | MRCP | 91.00 | Prospective meta-analysis | 74.00–90.90 | 1 study |
| Celiac disease | 1.0014009.6 | Serology | 87.04 | Mixed meta-analysis | 76.70–99.60 | Mixed review |
| Acute diverticulitis | 0.0575382.6 | CT | 99.00 | Prospective meta-analysis | 81.00–100.00 | 2 studies |
| Cirrhosis | 0.2710.497.37 | Ultrasound | 87.96 | Prospective meta-analysis | 87.00 | Mixed meta-analysis |
| Esophageal squamous cell carcinoma | 0.015.080.007937 | Endoscopy | 59.00 | Retrospective meta-analysis | 92.50 | Subgroup of retrospective meta-analysis |
| Colorectal polyps | 6.000.02622.27759747 | Colonoscopy | 85.94 | Guideline | 87.00 | Retrospective meta-analysis |
| Peptic ulcer disease | 0.81110 | Endoscopy | 98.95 | Society technical review (mixed) | 92.01 | Mixed meta-analysis |

AI, artificial intelligence; CT, computed tomography; EUS-FNA, endoscopic ultrasound fine-needle aspiration; GERD, gastroesophageal reflux disease; MRCP, magnetic resonance cholangiopancreatography; US, ultrasound.
The prevalence, mortality, and cost values are run together in this copy of the table and are reproduced here unsplit.

References for Supplementary Table 1 (Burden and Diagnostic Performance Characteristics by Condition)

| Condition | Common diagnostic approach | AI approach |
| --- | --- | --- |
| Gastroesophageal reflux disease | Moayyedi P, et al. JAMA 2006;295:1566–1576 | Visaggi P, et al. Aliment Pharmacol Ther 2022;55:528–540 |
| Colorectal cancer | Knudsen AB, et al. JAMA 2021;325:1998–2011 | Parkash O, et al. Front Med (Lausanne) 2022;9 |
| Irritable bowel syndrome | Sood R, et al. Aliment Pharmacol Ther 2015;42:491–503 | Kordi M, et al. Inform Med Unlocked 2022;29:100891 |
| Nonalcoholic fatty liver disease | Hernaez R, et al. Hepatology 2011;54:1082–1090 | Decharatanachart P, et al. Therap Adv Gastroenterol 2021;14 |
| Acute pancreatitis | Sun H, et al. Ann Transl Med 2022;10:410 | Gorris M, et al. Dig Endosc 2021;33:231–241 |
| Hepatocellular carcinoma | Roberts LR, et al. Hepatology 2018;67:401–421 | Martinino A, et al. J Clin Med 2022;11:6368 |
| Early gastric cancer | Zhang Q, et al. Gastric Cancer 2016;19:543–552 | Jiang K, et al. Front Med (Lausanne) 2021;8 |
| Choledocholithiasis | Gurusamy KS, et al. Cochrane Database Syst Rev 2015;2:CD011548 | Dalai C, et al. Liver Res 2021;5:224–231; Akshintala V, et al. Gastrointest Endosc 2019;89:246–247; Jovanovic P, et al. Gastrointest Endosc 2014;80:260–268; Tranter-Entwistle I, et al. World J Surg 2021;45:420–428 |
| Pancreatic cancer | Treadwell JR, et al. Pancreas 2016;45:789 | Dumitrescu EA, et al. Diagnostics (Basel) 2022;12:309 |
| Inflammatory bowel disease | Hommes DW, et al. Gastroenterology 2004;126:1561–1573 | Stafford IS, et al. Inflamm Bowel Dis 2022;28:1573–1583 |
| Acute cholecystitis | Kiewiet JJ, et al. Centre for Reviews and Dissemination (UK) 2012 | Yu C, et al. Comput Methods Programs Biomed 2021;211; Okuda Y, et al. Acute Med Surg 2022;9:783; Lazarenko VA, et al. Res Pract Med J 2017;4:67–72 |
| Primary sclerosing cholangitis | Dave M, et al. Radiology 2010;256:387–396 | Ringe KI, et al. Eur Radiol 2021;31:2482–2489; Iwasawa K, et al. Sci Rep 2018;8:5480 |
| Celiac disease | Sheppard AL, et al. Aliment Pharmacol Ther 2022;55:514–527 | Sana MK, et al. Comput Biol Med 2020;125:103996 |
| Acute diverticulitis | Laméris W, et al. Eur Radiol 2008;18:2498–2511 | Kim SW, et al. Sci Rep 2021;11:20390; Fred A, et al. Commun Com Inf Sci 2009;52:1–416 |
| Cirrhosis | Crossan C, et al. NIHR Journals Library 2015;19(9) | Decharatanachart P, et al. BMC Gastroenterol 2021;21:10 |
| Esophageal squamous cell carcinoma | Wong MCS, et al. Gastrointest Endosc 2022;96:197–207 | Lui TKL, et al. Gastrointest Endosc 2020;92:821–830 |
| Colorectal polyps | Knudsen AB, et al. JAMA 2016;315:2595–2609 | Nazarian S, et al. J Med Internet Res 2021;23(7) |
| Peptic ulcer disease | Talley N, et al. Gastroenterology 1998;114:582–595 | Bang CS, et al. J Med Internet Res 2021;23(12) |

Topics

COVID-19 diagnosis using AI; Artificial Intelligence in Healthcare and Education; Telemedicine and Telehealth Implementation
