This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Assessing multimodal large language models for localizing dental implant fixtures on panoramic radiographs
0
Citations
7
Authors
2026
Year
Abstract
• Benchmarked multimodal LLMs for implant localization on panoramic radiographs.
• Reasoning-focused models localized about two-thirds of implant fixtures.
• Reasoning-focused models reduced false positives per image versus GPT-4o.
• Run-to-run variability remained high, so current LLMs cannot be used autonomously.
• Findings set realistic expectations for future LLM-assisted implant diagnostics.

Objectives: To assess whether general-purpose multimodal large language models (LLMs) can localize dental implant fixtures on panoramic radiographs and to quantify false positives on implant fixture-absent images.

Methods: Using an open-source dataset, we evaluated 82 implant fixture-present panoramic radiographs (297 fixtures) and 82 implant fixture-absent images balanced by the presence or absence of radiopaque restorations (41 each). We tested three multimodal LLMs (GPT-4o, OpenAI o3, and GPT-5T) with a fixed visual-grounding prompt across five independent runs per image. We scored the outputs using an any-overlap rule within a free-response localization framework. The outcomes on the implant fixture-present images were fixture-level micro sensitivity, image-level complete detection rate (CDR), and false positives per image (FPPI+). The outcomes on the implant fixture-absent images were image-level specificity (no-box rate) and false positives per image (FPPI−).

Results: On the implant fixture-present images, micro sensitivity was 16.97% for GPT-4o, 68.82% for OpenAI o3, and 65.66% for GPT-5T; CDRs were 2.20%, 59.02%, and 56.59%; and FPPI+ values were 3.83, 1.48, and 1.52, respectively. On the implant fixture-absent images, specificity values were 32.68%, 65.85%, and 68.54%, and FPPI− values were 1.95, 1.04, and 0.92, respectively. Radiopaque restorations markedly reduced specificity. The proportions of fixtures detected in all five runs were 1.01% (GPT-4o), 22.22% (OpenAI o3), and 25.93% (GPT-5T).

Conclusions: Reasoning-focused multimodal LLMs outperformed GPT-4o in zero-shot implant fixture localization and reduced false positives, but moderate sensitivity, restoration-driven errors, and run-to-run variability limit autonomous clinical use.

Clinical significance: This benchmark clarifies the current capabilities and limitations of multimodal LLMs for implant-related radiographic workflows.
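The outcome measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' scoring code. The box format `(x_min, y_min, x_max, y_max)`, the function names, and several definitions are assumptions made for illustration: a fixture counts as detected if any predicted box overlaps it, a false positive is a predicted box that overlaps no fixture, CDR requires every fixture in an image to be hit, and each (image, run) pair is pooled as one case.

```python
# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = tuple[float, float, float, float]


def overlaps(a: Box, b: Box) -> bool:
    """Any-overlap rule: True when the two boxes share a non-zero area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def ratio(num: float, den: float) -> float:
    """Safe division; returns NaN when there is nothing to average over."""
    return num / den if den else float("nan")


def summarize(cases: list[tuple[list[Box], list[Box]]]) -> dict[str, float]:
    """Each case is (ground-truth fixture boxes, predicted boxes) for one image in one run.

    An empty ground-truth list marks a fixture-absent image.
    """
    present = [(t, p) for t, p in cases if t]
    absent_preds = [p for t, p in cases if not t]

    # Fixtures hit per image: a fixture is hit if any prediction overlaps it.
    hits = [sum(any(overlaps(f, b) for b in p) for f in t) for t, p in present]
    # False positives per image: predictions that overlap no fixture (assumption).
    false_pos = [sum(not any(overlaps(b, f) for f in t) for b in p) for t, p in present]

    return {
        "micro_sensitivity": ratio(sum(hits), sum(len(t) for t, _ in present)),
        "cdr": ratio(sum(h == len(t) for h, (t, _) in zip(hits, present)), len(present)),
        "fppi_plus": ratio(sum(false_pos), len(present)),
        # Specificity as the no-box rate on fixture-absent images.
        "specificity": ratio(sum(not p for p in absent_preds), len(absent_preds)),
        "fppi_minus": ratio(sum(len(p) for p in absent_preds), len(absent_preds)),
    }
```

Calling `summarize` on a list that contains one entry per image per run yields pooled values across runs; whether the study pooled runs this way or averaged per-run values is not stated in the abstract.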
Similar works
The long-term efficacy of currently used dental implants: a review and proposed criteria of success.
1986 · 3,692 citations
The Gingival Index, the Plaque Index and the Retention Index Systems
1967 · 3,661 citations
The burden of oral disease: challenges to improving oral health in the 21st century.
2005 · 3,579 citations
Staging and grading of periodontitis: Framework and proposal of a new classification and case definition
2018 · 3,113 citations
Periodontitis: Consensus report of workgroup 2 of the 2017 World Workshop on the Classification of Periodontal and Peri‐Implant Diseases and Conditions
2018 · 3,104 citations