Abstract:
Software quality is central to software engineering because it underpins system reliability and efficiency, and achieving it depends largely on software testing. Regression testing is an essential part of this process, but it is time-consuming and resource-intensive. Test case prioritization (TCP) techniques optimize regression testing by reducing test time and resource usage. Conventional TCP mechanisms, such as coverage-based and risk-based prioritization, are limited in how they account for software structure. This study compared machine learning algorithms such as Decision Tree, Random Forest, Neural Network, K-Nearest Neighbor, and Logistic Regression to identify the best technique for TCP using object-oriented metrics such as Coupling Between Objects (CBO), Weighted Methods per Class (WMC), Depth of Inheritance Tree (DIT), Number of Children (NOC), and Lack of Cohesion in Methods (LCOM). The study used a dataset from Zenodo containing 232,468 observations and 53 attributes, in which each observation represents a software file, class, or method. After data preprocessing and feature selection, the dataset was divided into training and testing sets. The models were trained on the training set, tested on the testing set, and evaluated using accuracy, precision, recall, and F1-score. The models were then compared on these results to identify the best performer. The Decision Tree outperformed the other models in TCP, which the study attributes to its ability to manage decision boundaries with minimal overfitting. It achieved 71% accuracy, reduced test execution time by 32.5%, and improved error detection by 15.8% over traditional methods. These results highlight the potential of machine learning-based TCP to increase regression testing efficiency and software quality.
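As an illustration of the evaluation step, the following minimal sketch computes accuracy, precision, recall, and F1-score from binary predictions and then orders test cases by a predicted fault score. It is not the study's implementation: the function names, the binary labeling (1 = fault-prone), and the example scores are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels.

    Assumption: label 1 marks a fault-prone item and label 0 a clean one.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def prioritize(scores: Dict[str, float]) -> List[str]:
    """Order test-case IDs by descending predicted fault score.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    # Hypothetical labels and predictions, not taken from the Zenodo dataset.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))

    # Hypothetical test-case IDs with model-predicted fault scores.
    print(prioritize({"t_login": 0.2, "t_cart": 0.9, "t_search": 0.5}))
```

Running the tests in the order returned by `prioritize` means the cases judged most likely to expose a fault execute first, which is the general idea behind model-driven TCP; how the study itself maps model outputs to a test order is not specified in the abstract.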