An explainable AI-based approach for Java code smell detection to improve software maintainability

Wedaarachchi, R.; Herath, G.A.C.A.; Wasalthilaka, W.V.S.K.

ComURS2026 Computing Undergraduate Research Symposium : Abstracts

dc.contributor.author	Wedaarachchi, R.
dc.contributor.author	Herath, G.A.C.A.
dc.contributor.author	Wasalthilaka, W.V.S.K.
dc.date.accessioned	2026-06-03T09:35:53Z
dc.date.available	2026-06-03T09:35:53Z
dc.date.issued	2026-01-28
dc.identifier.isbn	978-624-5727-44-5
dc.identifier.uri	http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5322
dc.description.abstract	Code smells are early indicators of design flaws that compromise long-term software maintainability. Traditional static analysis tools often generate false positives or fail to detect context-dependent smells because they rely on fixed thresholds. As software systems become more sophisticated, machine learning (ML) approaches provide a viable alternative for capturing deeper structural and semantic patterns. This study proposes an explainable, ML-driven methodology for detecting three maintainability-critical code smells in Java applications: Complex Method, Long Parameter List, and Multifaceted Abstraction. Using the DACOS dataset, which contains 10,341 method-level and 4,424 class-level samples, separate binary classification pipelines were developed for each smell to ensure sufficient granularity and domain-specific feature engineering. Models, including XGBoost, LightGBM, Random Forest, Gradient Boosting, Decision Tree, SVM, KNN, and Logistic Regression, were trained using an 80/20 stratified split to preserve class distribution, SMOTETomek for resampling, and GridSearchCV for hyperparameter tuning to enable an unbiased performance assessment. Based on individual model performance, the top three models for each code smell were combined into a weighted voting ensemble to further improve robustness. The findings suggest relatively consistent performance across the cross-validation folds. For Multifaceted Abstraction, XGBoost achieved an accuracy of 0.9401 (SD=0.0043) and an F1 score of 0.9073 (SD=0.0059), while for Complex Method and Long Parameter List, XGBoost achieved accuracies of 0.8222 (SD=0.0101) and 0.8104 (SD=0.0085), respectively. The instance-level explanations provided by LIME highlighted the key features behind individual predictions. These insights increase transparency and enable better-informed refactoring approaches. 
However, the results are derived from a single dataset and a limited number of code smell types, which may limit their generalizability and external validity. Overall, this study demonstrates the potential of a reliable, precise, and interpretable ML-based technique for automated code smell detection that supports software maintainability within the evaluated context.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Computing. Sabaragamuwa University of Sri Lanka.	en_US
dc.subject	Code smells	en_US
dc.subject	Explainable AI	en_US
dc.subject	Java	en_US
dc.subject	Machine learning	en_US
dc.subject	Software maintainability	en_US
dc.title	An explainable AI-based approach for Java code smell detection to improve software maintainability	en_US
dc.type	Article	en_US
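
The abstract describes selecting the top three models per code smell and combining them in a weighted voting ensemble, but it does not say how the weights were chosen or whether voting was hard or soft. The sketch below is one possible reading, not the authors' implementation: it assumes soft voting over each model's predicted probability of the positive (smelly) class, with weights taken from a validation score. The function names `top_k_models` and `weighted_soft_vote`, the 0.5 threshold, and all numbers are hypothetical and purely illustrative.

```python
from typing import Dict, List, Sequence


def top_k_models(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the names of the k models with the highest validation score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def weighted_soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold
) -> List[int]:
    """Combine per-model positive-class probabilities into binary labels.

    probas[m][i] is model m's probability that sample i exhibits the smell.
    weights[m] is the (non-negative) weight given to model m.
    """
    if not probas or len(probas) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probas[0])
    if any(len(p) != n_samples for p in probas):
        raise ValueError("all models must score the same samples")

    labels = []
    for i in range(n_samples):
        # Weighted average of the models' probabilities for sample i.
        score = sum(w * p[i] for w, p in zip(weights, probas)) / total
        labels.append(1 if score >= threshold else 0)
    return labels


if __name__ == "__main__":
    # Made-up validation scores for four hypothetical models.
    validation = {"model_a": 0.91, "model_b": 0.85, "model_c": 0.88, "model_d": 0.80}
    print(top_k_models(validation))  # ['model_a', 'model_c', 'model_b']

    # Made-up probabilities from three models for three samples.
    probas = [[0.9, 0.2, 0.6], [0.8, 0.4, 0.3], [0.4, 0.7, 0.5]]
    print(weighted_soft_vote(probas, weights=[0.5, 0.3, 0.2]))  # [1, 0, 0]
```

With soft voting, a confident model can outweigh two lukewarm ones, which is one reason to weight by validation performance rather than counting votes equally. In a scikit-learn pipeline the same idea is available through `VotingClassifier` with `voting="soft"` and a `weights` list.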