Sabaragamuwa University of Sri Lanka

Multilayer Perceptron-based Source Code Classification

dc.contributor.author Mohamed, I.
dc.contributor.author Kumara, B.T.G.S.
dc.contributor.author Banujan, K.
dc.date.accessioned 2023-09-16T06:38:05Z
dc.date.available 2023-09-16T06:38:05Z
dc.date.issued 2022-04-06
dc.identifier.isbn 978-624-5727-21-6
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/3937
dc.description.abstract Implementation is one of the most crucial stages of the software development life cycle, and source code is the most critical component of a software application. Developers either write new source code from scratch or reuse functionality from existing code, according to the project's requirements. In practice, most programmers spend considerable time searching old source files rather than developing functionality. An effective and efficient way of searching for source code functions is therefore essential. Topic modeling is one way of extracting topics from source code. Although several topic modeling approaches have been implemented with statistical modeling techniques, they have several limitations. These approaches rely on non-formal code components such as method names, identifiers, and comments. The syntax of a language refers to the rules that define its structure; without syntax, the semantics of a language are nearly impossible to comprehend. To address these concerns, the authors used a machine-learning algorithm to predict source code functionality names, so that the results depend solely on the syntax or algorithm of the source code. This study focuses on three functionalities found in Java projects: prime number, selection sort, and Fibonacci number. The data set was acquired from the Git open-source repository, an open-source platform supported by developers worldwide. Four hundred and fifty software projects were analyzed, and 23 variables were considered. The source code components were extracted using the Java parser library, which builds an abstract syntax tree from which the source code features can be extracted precisely. An algorithm was then developed to compute the count matrices of the source code features. Finally, the data set was fed into an Artificial Neural Network model, which yielded 95.4% accuracy, 95.5% precision, 95.4% recall, and a 95.4% F1-score, with a low error rate of 0.033. en_US
dc.language.iso en en_US
dc.publisher Sabaragamuwa University of Sri Lanka en_US
dc.subject Artificial Neural Network en_US
dc.subject Source Code en_US
dc.subject Java Parser library en_US
dc.subject Abstract Syntax Tree en_US
dc.title Multilayer Perceptron-based Source Code Classification en_US
dc.type Book en_US
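
The abstract describes turning abstract syntax tree features into count matrices before classification. The sketch below illustrates that one step only, and it is not the authors' algorithm. It assumes the node types of each program have already been extracted from the syntax tree. The feature list is hypothetical: the record states that 23 variables were used but does not name them, so the six names here are merely modelled on Java parser node classes, and the two input programs are hand-written toy data.

```python
from collections import Counter

# Hypothetical feature vocabulary (assumption): the record does not list
# the 23 variables that were actually used in the study.
FEATURES = [
    "ForStmt",
    "WhileStmt",
    "IfStmt",
    "MethodCallExpr",
    "ArrayAccessExpr",
    "BinaryExpr",
]


def count_vector(node_types, features=FEATURES):
    """Count how often each feature occurs in one program's node types."""
    counts = Counter(node_types)
    # Counter returns 0 for features that never occur in this program.
    return [counts[name] for name in features]


def count_matrix(programs, features=FEATURES):
    """Build one row per program and one column per feature."""
    return [count_vector(node_types, features) for node_types in programs]


# Toy node-type sequences, written by hand for illustration only.
programs = [
    ["ForStmt", "ForStmt", "IfStmt", "ArrayAccessExpr",
     "ArrayAccessExpr", "BinaryExpr"],
    ["IfStmt", "MethodCallExpr", "MethodCallExpr", "BinaryExpr"],
]

matrix = count_matrix(programs)
for row in matrix:
    print(row)
```

Each row of such a matrix could then serve as the input vector of a classifier, with the functionality name as the label.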

