Abstract:
Assigning priorities to bug reports is a time-consuming manual task, yet it is
crucial to the software development process. After reviewing each report,
developers must assign it a priority level. The main goal of this study is to
improve prediction performance by automating the current manual priority
assignment process using a combination of machine learning techniques. The majority of
researchers use only one or two feature extraction methods, and no study has
compared the outcomes of combining different feature extraction techniques with
ensemble methods based on Long Short-Term Memory (LSTM) and Artificial Neural
Networks (ANN). To fill this research gap, more than 20,000 bug reports were
collected from Bugzilla. After preprocessing by tokenization and stemming, LSTM
models were trained on three feature representations: Word2Vec, GloVe, and
Term Frequency-Inverse Document Frequency (TF-IDF). Using the aforementioned
feature vectors, an ANN model was built in parallel for comparison. After three
LSTM models had been created, one per feature vector, their results were
compared and combined into an ensemble model intended to outperform each
individual model. The individual LSTM models, the ensemble model, and the ANN
model were then compared to determine which is best suited to bug
prioritization. Accuracy, precision, recall, and F-measure were used as
evaluation metrics, along with mean absolute error (MAE) and mean squared
error (MSE). The accuracy
of the LSTM-TF-IDF model was 88.94%, that of the LSTM-GloVe model was 89.58%,
that of the LSTM-Word2Vec model was 84.84%, and that of the ANN model was
80.28%. The ensemble model reached an accuracy of 92% and achieved the lowest
error rates and the highest values on the other evaluation metrics.
The results of this study can help developers and programmers address bugs
more quickly than before. In future work, data from other sources will be
gathered and used with deep learning algorithms to further improve accuracy.
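
The abstract does not state how the three LSTM models are combined, so the
following is only an illustrative sketch that assumes soft voting (averaging
the class probabilities of each model and taking the most probable class). The
function names `soft_vote` and `evaluate`, the toy probabilities, and the
three-level priority scale are all hypothetical and are not taken from the
study. The sketch also shows how accuracy, MAE, and MSE can be computed when
priorities are treated as ordinal integers.

```python
def soft_vote(prob_matrices):
    """Combine models by averaging their class probabilities (assumed scheme).

    prob_matrices: list of per-model matrices, each a list of rows
    (one row per bug report, one probability per priority class).
    Returns the index of the most probable class for each report.
    """
    n_models = len(prob_matrices)
    predictions = []
    for rows in zip(*prob_matrices):  # rows of all models for one report
        avg = [sum(col) / n_models for col in zip(*rows)]
        predictions.append(max(range(len(avg)), key=avg.__getitem__))
    return predictions


def evaluate(y_true, y_pred):
    """Accuracy, mean absolute error, and mean squared error."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": sum(e * e for e in errors) / n,
    }


# Toy class probabilities for 4 reports and 3 priority classes (made-up data).
p_tfidf = [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.4, 0.2]]
p_glove = [[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.2, 0.6, 0.2]]
p_w2v = [[0.5, 0.4, 0.1], [0.1, 0.4, 0.5], [0.1, 0.2, 0.7], [0.3, 0.5, 0.2]]
y_true = [0, 1, 2, 1]

ensemble_pred = soft_vote([p_tfidf, p_glove, p_w2v])
w2v_pred = soft_vote([p_w2v])  # a single model is just its own argmax

print("ensemble:", evaluate(y_true, ensemble_pred))
print("word2vec only:", evaluate(y_true, w2v_pred))
```

In this toy data the Word2Vec-based model misclassifies one report, and
averaging with the other two models corrects it. This illustrates why an
ensemble can outperform its members, but it says nothing about the figures
reported above.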