Sabaragamuwa University of Sri Lanka

Deep Learning Approach for Classifying AI-generated and Human-written Sinhala Answers

dc.contributor.author Ranathunga, R.A.D.K.
dc.contributor.author Rupasingha, R.A.H.M.
dc.contributor.author Kumara, B.T.G.S.
dc.date.accessioned 2025-12-15T07:47:35Z
dc.date.available 2025-12-15T07:47:35Z
dc.date.issued 2025-02-19
dc.identifier.citation Abstracts of the ComURS2025 Computing Undergraduate Research Symposium 2025, Faculty of Computing, Sabaragamuwa University of Sri Lanka. en_US
dc.identifier.isbn 978-624-5727-57-5
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/4986
dc.description.abstract With the increasing use of AI to answer academic questions, cheating has become a real concern, especially in languages such as Sinhala that are poorly supported by effective detection systems. As a result, it is difficult to distinguish AI-written content from human-written content, which compromises fair evaluation and originality in education. To address this issue, this study introduces a deep learning model that differentiates AI-generated from human-written Sinhala answers. This helps to prevent cheating in academic settings while addressing the shortage of resources for the Sinhala language. A stepwise methodology is used. The first stage is data gathering, which covers 1000 questions from the academic areas of history, science, business studies and Buddhism, each with human-provided answers and AI-generated responses. Text pre-processing steps such as stemming, tokenization and stop-word removal are applied to the data. Term frequency-inverse document frequency (TF-IDF) then transforms the textual data into a numerical form that can be fed to the learning algorithms. Two algorithms were used: Artificial Neural Networks (ANN) and Long Short-Term Memory (LSTM). LSTM, with 86% accuracy, outperforms the ANN, so it can be concluded that LSTM is the better of the two models. The recall, F1-score and error values are also better for LSTM. Different hyperparameters and percentage splits were used for the evaluations. Data collection and computer issues were among the challenges faced during this research. This research provides a solution for cheat detection in low-resource languages. It forms the basis for subsequent work that aims to detect AI-generated content in underrepresented languages; further work will use cosine similarity to explore the relationship between lecturer and AI responses. en_US
dc.language.iso en en_US
dc.publisher Faculty of Computing, Sabaragamuwa University of Sri Lanka en_US
dc.subject AI-generated en_US
dc.subject Classification en_US
dc.subject Deep learning en_US
dc.subject Human-written en_US
dc.subject Sinhala language en_US
dc.title Deep Learning Approach for Classifying AI-generated and Human-written Sinhala Answers en_US
dc.type Article en_US
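
The TF-IDF step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or library was used, so the smoothed weighting below, the function name `tfidf_vectors`, and the toy English tokens are all assumptions made for illustration; the study itself would supply pre-processed Sinhala tokens.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Weighting (an assumption, not taken from the abstract):
    raw term frequency times idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # missing terms count as 0
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical, already-tokenized toy answers (placeholders, not study data).
docs = [
    ["water", "boils", "at", "hundred", "degrees"],
    ["water", "freezes", "at", "zero", "degrees"],
]
vocab, vectors = tfidf_vectors(docs)
```

Each answer becomes a fixed-length numeric vector over the shared vocabulary, which is the kind of input a classifier such as an ANN or LSTM can consume. Terms that appear in every answer receive the lowest weight, while terms specific to one answer are weighted more heavily.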