OpenAlex · Updated hourly · Last updated: 31.03.2026, 15:38

This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Equity and Generalizability of Artificial Intelligence for Skin-Lesion Diagnosis Using Clinical, Dermoscopic, and Smartphone Images: A Systematic Review and Meta-Analysis

2025 · 1 citation · Medicina · Open Access

Citations: 1 · Authors: 2 · Year: 2025

Abstract

Background and Objectives: Artificial intelligence (AI) has shown promising performance in skin-lesion classification; however, its fairness, external validity, and real-world reliability remain uncertain. This systematic review and meta-analysis evaluated the diagnostic accuracy, equity, and generalizability of AI-based dermatology systems across diverse imaging modalities and clinical settings.

Materials and Methods: A comprehensive search of PubMed, Embase, Web of Science, and ClinicalTrials.gov (inception to 31 October 2025) identified diagnostic accuracy studies using clinical, dermoscopic, or smartphone images. Eighteen studies (11 melanoma-focused; 7 mixed benign-malignant) met inclusion criteria. Six studies provided complete 2 × 2 contingency data for bivariate Reitsma HSROC modeling, while seven reported AUROC values with extractable variance. Risk of bias was assessed using QUADAS-2, and evidence certainty was graded using GRADE.

Results: Across more than 70,000 test images, pooled sensitivity and specificity were 0.91 (95% CI 0.74-0.97) and 0.64 (95% CI 0.47-0.78), respectively, corresponding to an HSROC AUROC of 0.88 (95% CI 0.84-0.92). The AUROC-only meta-analysis yielded a similar pooled AUROC of 0.88 (95% CI 0.87-0.90). Diagnostic performance was highest in specialist settings (AUROC 0.90), followed by community care (0.85) and smartphone environments (0.81). Notably, performance was lower in darker skin tones (Fitzpatrick IV-VI: AUROC 0.82) compared with lighter skin tones (I-III: 0.89), indicating persistent fairness gaps.

Conclusions: AI-based dermatology systems achieve high diagnostic accuracy but demonstrate reduced performance in darker skin tones and non-specialist environments. These findings emphasize the need for diverse training datasets, skin-tone-stratified reporting, and rigorous external validation before broad clinical deployment.
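For readers unfamiliar with the quantities pooled in the abstract, the following is a minimal sketch of how per-study sensitivity and specificity, with approximate 95% confidence intervals on the logit scale, can be derived from a single 2 × 2 contingency table. It is not the authors' method: the review pools studies with a bivariate model, which this sketch does not implement. The function names, the 0.5 continuity correction, and the counts in the usage line are illustrative assumptions, not data from the review.

```python
import math


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit_ci(events, non_events, z=1.96):
    """Proportion and Wald interval computed on the logit scale.

    A 0.5 continuity correction is added to both cells (an assumption of
    this sketch) so that tables containing zeros remain computable.
    """
    a = events + 0.5
    b = non_events + 0.5
    log_odds = math.log(a / b)
    se = math.sqrt(1.0 / a + 1.0 / b)  # standard error of the log-odds
    return (
        inv_logit(log_odds),
        inv_logit(log_odds - z * se),
        inv_logit(log_odds + z * se),
    )


def accuracy_from_2x2(tp, fp, fn, tn):
    """Per-study sensitivity and specificity from one 2 x 2 table."""
    return {
        "sensitivity": logit_ci(tp, fn),  # TP / (TP + FN)
        "specificity": logit_ci(tn, fp),  # TN / (TN + FP)
    }


# Hypothetical counts for a single study, for illustration only.
result = accuracy_from_2x2(tp=90, fp=30, fn=10, tn=70)
for name, (estimate, lower, upper) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

Working on the logit scale keeps the interval inside (0, 1) and makes it asymmetric near the boundaries, which matches the shape of the asymmetric intervals reported in the abstract.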

Topics

Cutaneous Melanoma Detection and Management · Artificial Intelligence in Healthcare and Education · AI in cancer detection