This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
A Comparative Evaluation of ChatGPT, DeepSeek and Gemini in Automatic Unit Test Generation: a Success Rate Analysis
Citations: 0
Authors: 5
Year: 2025
Abstract
The advancement of large language models (LLMs) has opened up new possibilities for automating unit test generation, a traditionally manual and expensive task. This quantitative study evaluates the performance of three LLMs (ChatGPT 4o mini, DeepSeek v3, and Gemini 2.5 Flash Pro) in generating test cases for C# methods developed in Unity. The execution success rate of the generated tests was measured using both real and synthetic data: the synthetic data was intentionally created to represent common code structures, while the real data came from existing project functions. The experimental design was controlled and included the factors LLM and data type, with cyclomatic complexity and contextual memory as blocking variables and four replicates per combination, for a total of 96 experimental treatments. The results show that LLMs have high potential to support the automatic generation of unit tests. Furthermore, the choice of model was shown to have a significant effect on the success rate of the generated tests. These findings provide useful initial evidence to guide the selection and use of LLMs in test automation processes within software development environments.
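The stated total of 96 treatments is consistent with a full factorial layout over the two factors and two blocks. A minimal sketch of that enumeration follows; the two levels assumed for each block (low/high complexity, with/without contextual memory) are an assumption, since the abstract does not name them:

```python
from itertools import product

# Hypothetical reconstruction of the design described in the abstract:
# 3 LLMs x 2 data types, blocked by cyclomatic complexity and contextual
# memory (2 levels each, assumed), with 4 replicates per combination.
llms = ["ChatGPT 4o mini", "DeepSeek v3", "Gemini 2.5 Flash Pro"]
data_types = ["real", "synthetic"]
complexity = ["low", "high"]          # assumed block levels
memory = ["with", "without"]          # assumed block levels
replicates = range(1, 5)              # 4 replicates

# Enumerate every treatment combination in the full factorial design.
treatments = list(product(llms, data_types, complexity, memory, replicates))
print(len(treatments))  # 3 * 2 * 2 * 2 * 4 = 96
```

Under these assumed block levels, the enumeration reproduces the 96 treatments reported in the abstract.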