This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Identifying Security Bugs in Issue Reports: Comparison of BERT, N-gram IDF and ChatGPT

2024 · 3 citations

Abstract

In recent software development, which has become increasingly large and complex, a huge number of issues, including bugs, improvements, and new feature requests, are reported on a daily basis, and there is a risk of missing urgent bugs. Security bugs are particularly urgent because they can cause serious problems such as malware infections, and they must be resolved quickly. Therefore, it is important to develop a technique to automatically identify security bugs in a large number of issue reports. The goal of this study is to empirically evaluate recent machine learning methods for identifying security bugs, using issue report text written in natural language as input. Specifically, this paper focuses on a two-class classification model using BERT, a language model based on the Transformer architecture. The model is constructed by fine-tuning a pre-trained BERT model with the text of issue reports. In our experiment, we classified issue reports obtained from four open source software projects. As a comparison method, we employ a classification model using features obtained by N-gram IDF, which is an extension of the conventional Bag-of-Words approach. We also employ ChatGPT, a general-purpose chatbot that utilizes a large language model (LLM). In our experiment, the BERT-based model showed the best classification performance in terms of F1 score. ChatGPT was better than the N-gram IDF based model, but far behind the BERT-based model.
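
The abstract compares the three approaches by F1 score on a two-class task. As a minimal sketch of how that metric is computed, the following Python function derives precision, recall, and F1 from true and predicted labels. The function name, the label encoding (1 for a security bug, 0 otherwise), and the example labels are illustrative assumptions, not data or code from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class for two label sequences."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive prediction: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight issue reports (1 = security bug).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

Because security bugs are typically a small minority of issue reports, F1 on the positive class is more informative than plain accuracy: a classifier that never flags a report would score 0.0 here while still labeling most reports correctly.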
