Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

MedEqualizer: A Framework Investigating Bias in Synthetic Medical Data and Mitigation via Augmentation

2025·0 Zitationen·ArXiv.orgOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2025

Jahr

Abstract

Synthetic healthcare data generation presents a viable approach to enhance data accessibility and support research by overcoming limitations associated with real-world medical datasets. However, ensuring fairness across protected attributes in synthetic data is critical to avoid biased or misleading results in clinical research and decision-making. In this study, we assess the fairness of synthetic data generated by multiple generative adversarial network (GAN)-based models using the MIMIC-III dataset, with a focus on representativeness across protected demographic attributes. We measure subgroup representation using the logarithmic disparity metric and observe significant imbalances, with many subgroups either underrepresented or overrepresented in the synthetic data, compared to the real data. To mitigate these disparities, we introduce MedEqualizer, a model-agnostic augmentation framework that enriches the underrepresented subgroups prior to synthetic data generation. Our results show that MedEqualizer significantly improves demographic balance in the resulting synthetic datasets, offering a viable path towards more equitable and representative healthcare data synthesis.

Autoren

Themen

Machine Learning in HealthcarePrivacy-Preserving Technologies in DataArtificial Intelligence in Healthcare and Education

Volltext beim Verlag öffnen

MedEqualizer: A Framework Investigating Bias in Synthetic Medical Data and Mitigation via Augmentation

Abstract

Ähnliche Arbeiten

Autoren

Themen