Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
MedEqualizer: A Framework Investigating Bias in Synthetic Medical Data and Mitigation via Augmentation
0
Zitationen
4
Autoren
2025
Jahr
Abstract
Synthetic healthcare data generation presents a viable approach to enhance data accessibility and support research by overcoming limitations associated with real-world medical datasets. However, ensuring fairness across protected attributes in synthetic data is critical to avoid biased or misleading results in clinical research and decision-making. In this study, we assess the fairness of synthetic data generated by multiple generative adversarial network (GAN)-based models using the MIMIC-III dataset, with a focus on representativeness across protected demographic attributes. We measure subgroup representation using the logarithmic disparity metric and observe significant imbalances, with many subgroups either underrepresented or overrepresented in the synthetic data, compared to the real data. To mitigate these disparities, we introduce MedEqualizer, a model-agnostic augmentation framework that enriches the underrepresented subgroups prior to synthetic data generation. Our results show that MedEqualizer significantly improves demographic balance in the resulting synthetic datasets, offering a viable path towards more equitable and representative healthcare data synthesis.
Ähnliche Arbeiten
"Why Should I Trust You?"
2016 · 14.361 Zit.
A Comprehensive Survey on Graph Neural Networks
2020 · 8.713 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.243 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.671 Zit.
Artificial intelligence in healthcare: past, present and future
2017 · 4.426 Zit.