Sabaragamuwa University of Sri Lanka

A machine learning-based evaluation of English-to-Sinhala translation: comparing Google Translate, large language models, and human translators


dc.contributor.author Jayathilaka, K.M.D.P.S.D.
dc.contributor.author Rubasinghe, T.D.
dc.contributor.author Kumara, B.T.G.S.
dc.date.accessioned 2026-05-19T05:46:34Z
dc.date.available 2026-05-19T05:46:34Z
dc.date.issued 2026-01-28
dc.identifier.isbn 978-624-5727-44-5
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5293
dc.description.abstract Reliable English-to-Sinhala translation remains a major challenge even for sophisticated translation systems, because Sinhala is a low-resource language. Although Google Translate is widely used for translation, recent breakthroughs in large language models such as ChatGPT and DeepSeek open new opportunities for translation tasks. This study presents one of the first thorough comparative analyses of English-Sinhala translation systems against human translation, both qualitatively and quantitatively. Google Translate, ChatGPT, DeepSeek, and human translations produced by native Sinhala speakers were compared on a carefully prepared dataset of 150 English sentences covering general, technical, and academic domains. Translation quality was measured using BLEU, METEOR, and COMET scores, in addition to human assessment of fluency, grammatical accuracy, and semantic quality by qualified raters using a prepared rubric, with inter-rater reliability tests. Machine learning models were also built to predict translation quality, using language-based predictors, for translation efficiency and translation system identification. The experimental results show that human translations were rated highest on all translation quality measures. Among the automatic translation systems, LLM-based systems performed better than Google Translate on contextual understanding of complex sentences, while Google Translate performed reasonably on simple inputs. Correlation tests show that COMET correlates better with human judgments of translation quality than BLEU and METEOR. Moreover, the machine learning models were able to detect translation quality trends accurately for translation system predictions, making them promising for translation quality assessment in low-resource language environments. en_US
dc.language.iso en en_US
dc.publisher Faculty of Computing. Sabaragamuwa University of Sri Lanka. en_US
dc.subject Large language models en_US
dc.subject Low-resource languages en_US
dc.subject Machine translation en_US
dc.subject Sinhala language en_US
dc.subject Translation evaluation en_US
dc.title A machine learning-based evaluation of English-to-Sinhala translation: comparing Google Translate, large language models, and human translators en_US
dc.type Article en_US

