This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Ensuring Data Integrity: The Role of Data Engineering and Pipelines in Labeling AI-Generated Images and Videos

2024 · 0 citations · Journal of Artificial Intelligence & Cloud Computing · Open Access

Abstract

Artificial Intelligence (AI) models such as Generative Adversarial Networks (GANs) have proliferated and shown impressive success in image synthesis. This capability can enhance content and media, but it also poses threats to legitimacy, authenticity, and security. As AI transitions from research to deployment, creating appropriate datasets and data pipelines to develop and evaluate AI models is increasingly the biggest challenge. Publicly available automated AI model builders can now achieve top performance in many applications. This paper discusses the importance of data engineering and pipelines in creating curated and clean data services, emphasizing the role of labeling AI-generated content to mitigate misinformation. It summarizes referenced findings from large-scale experiments on labeling effectiveness and highlights challenges in designing, evaluating, and implementing labeling policies. Key considerations for each stage of the data-for-AI pipeline, from data design to data sculpting (for example, cleaning, valuation, and annotation) and data evaluation, are discussed to make AI more reliable.

Authors

Arjun Mantri

Institutions

Seattle University (US)

Topics

Artificial Intelligence in Healthcare and Education · Privacy-Preserving Technologies in Data
