This is an overview page with metadata for this scientific work. The full article is available from the publisher.
DeepQuali: Initial results of a study on the use of large language models for assessing the quality of user stories
Citations: 0
Authors: 5
Year: 2026
Abstract
Generative artificial intelligence (GAI), specifically large language models (LLMs), is increasingly used in software engineering, mainly for coding tasks. However, requirements engineering, particularly requirements validation, has seen limited application of GAI. Current uses of GAI for requirements focus on eliciting, transforming, and classifying requirements, not on quality assessment. We propose and evaluate "DeepQuali", an LLM-based (GPT-4o) approach for assessing and improving requirements quality in agile software development. We applied it to projects in two small companies, where we compared LLM-based quality assessments with expert judgments. Experts also participated in walkthroughs of the solution, provided feedback, and rated their acceptance of the approach. Experts largely agreed with the LLM's quality assessments, especially regarding overall ratings and explanations. However, they did not always agree with one another on detailed ratings, suggesting that expertise and experience may influence judgments. Experts recognized the usefulness of the approach but criticized its lack of integration into their workflow. LLMs show potential for supporting software engineers in assessing and improving the quality of requirements. The explicit use of quality models and explanatory feedback increases acceptance.
Related Works
Computers and Intractability: A Guide to the Theory of NP-Completeness
1979 · 44,591 citations
Usability Engineering
1993 · 9,355 citations
Software engineering: A practitioner's approach
1983 · 8,297 citations
Quasi-experimentation: Design and analysis issues for field settings
1980 · 6,109 citations
Extreme Programming Explained: Embrace Change
2000 · 5,889 citations