This is an overview page with metadata for this scientific article. The full article is available from the publisher.
SHAPE: Symptom-based Hallucination-driven Augmented Prompt Engineering (Preprint)
0
Citations
10
Authors
2025
Year
Abstract
<sec> <title>BACKGROUND</title> Recent advances in large language models (LLMs) have fueled rapid growth in artificial intelligence (AI) research across multiple domains. The performance of AI models depends heavily on the quality and quantity of available data. In the medical domain, however, high-quality data are scarce owing to the sensitivity of patient information and strict privacy regulations. </sec>
<sec> <title>OBJECTIVE</title> LLMs frequently generate hallucinations, which are often regarded as a limitation of medical AI because they can introduce incorrect information. This study sought to turn this limitation into an advantage by deliberately inducing hallucinations to generate clinically meaningful symptom–disease classification data while preserving patient privacy. We developed the symptom-based hallucination-driven augmented prompt engineering (SHAPE) framework and demonstrated its practical applicability by validating the data generated via SHAPE. </sec>
<sec> <title>METHODS</title> Key features (disease name, main symptoms, and current symptoms) were extracted from base data preprocessed under medical expert supervision. Easy data augmentation techniques, random swap and random insertion, were applied to enhance diversity; these were combined with few-shot examples and input into GPT-4o-mini and GPT-4.1-mini to generate 156,000 synthetic symptom-based patient questions paired with disease labels. Dataset quality was evaluated using t-distributed stochastic neighbor embedding (t-SNE) visualization, expert clinical review on a 4-point scale, and the classification performance of Bidirectional Encoder Representations from Transformers (BERT)-based models (KLUE-BERT, RoBERTa, DeBERTa-v3) on an independent test set. </sec>
<sec> <title>RESULTS</title> t-SNE visualization showed that the generated data clustered well by disease category, reflecting symptom similarity. Clinical experts rated 86.9% of the disease categories with an average score of three or higher, indicating strong clinical plausibility. Classification accuracy using KLUE-BERT reached up to 86%, comparable to results reported in studies using the MIMIC-III dataset. </sec>
<sec> <title>CONCLUSIONS</title> SHAPE demonstrates that deliberately induced hallucinations can be harnessed to construct high-quality, privacy-preserving synthetic datasets suitable for training disease classification models. This approach provides a scalable and ethically compliant alternative for medical AI development in environments in which direct patient data access is legally or ethically restricted. </sec>
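The easy data augmentation (EDA) operations named in the METHODS section, random swap and random insertion, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the sample symptom list are hypothetical, and the insertion step reuses an existing word to stay dependency-free, whereas EDA proper inserts a synonym.

```python
import random

def random_swap(words, n=1):
    """Randomly swap the positions of two words, n times."""
    words = words[:]  # work on a copy
    for _ in range(n):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_insertion(words, n=1):
    """Insert a randomly chosen word at a random position, n times.

    Note: standard EDA inserts a *synonym* of a random word; reusing an
    existing word here keeps the sketch free of external resources.
    """
    words = words[:]
    for _ in range(n):
        w = random.choice(words)
        pos = random.randint(0, len(words))
        words.insert(pos, w)
    return words

# Hypothetical symptom phrase, tokenized into words.
symptoms = ["fever", "cough", "headache", "fatigue"]
print(random_swap(symptoms, n=1))       # same words, shuffled order
print(random_insertion(symptoms, n=1))  # one duplicated word inserted
```

Perturbed variants like these would then be combined with few-shot examples in the prompt, so each call to the LLM sees a slightly different seed and produces more diverse synthetic patient questions.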
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,611 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,504 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,025 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,835 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations