A machine learning-based evaluation of English-to-Sinhala translation: comparing Google Translate, large language models, and human translators

Jayathilaka, K.M.D.P.S.D.; Rubasinghe, T.D.; Kumara, B.T.G.S.

dc.contributor.author	Jayathilaka, K.M.D.P.S.D.
dc.contributor.author	Rubasinghe, T.D.
dc.contributor.author	Kumara, B.T.G.S.
dc.date.accessioned	2026-05-19T05:46:34Z
dc.date.available	2026-05-19T05:46:34Z
dc.date.issued	2026-01-28
dc.identifier.isbn	978-624-5727-44-5
dc.identifier.uri	http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5293
dc.description.abstract	Reliable translation from English to Sinhala remains a major challenge for even sophisticated translation systems, because Sinhala is a low-resource language. Although Google Translate is widely used for translation purposes, recent breakthroughs in large language models such as ChatGPT and DeepSeek provide new opportunities for translation tasks. This study presents one of the first thorough comparative analyses of English-Sinhala translation systems against human translation, both qualitatively and quantitatively. Google Translate, ChatGPT, DeepSeek, and human translations produced by native Sinhala speakers were compared for translation quality on a carefully prepared dataset of 150 English sentences covering general, technical, and academic purposes. Translation quality was measured using BLEU, METEOR, and COMET scores, in addition to human assessment of fluency, grammatical accuracy, and semantic translation quality carried out by qualified human raters using a prepared rubric with inter-rater reliability tests. Machine learning models were also built to predict translation quality from language-based predictors, covering translation efficiency and translation system identification. The experimental results show that human translations were rated highest on all translation quality measures. Among the automatic translation systems, LLM-based translation systems performed better than Google Translate on contextual understanding of complex sentences, while Google Translate performed reasonably on simple inputs. Correlation tests show that COMET correlates better with human translation quality than BLEU and METEOR. Moreover, the prepared machine learning models were able to detect translation quality trends accurately for translation system predictions, making these models promising for translation quality assessment in low-resource language environments.	en_US
dc.language.iso	en	en_US
dc.publisher	Faculty of Computing. Sabaragamuwa University of Sri Lanka.	en_US
dc.subject	Large language models	en_US
dc.subject	Low-resource languages	en_US
dc.subject	Machine translation	en_US
dc.subject	Sinhala language	en_US
dc.subject	Translation evaluation	en_US
dc.title	A machine learning-based evaluation of English-to-Sinhala translation: comparing Google Translate, large language models, and human translators	en_US
dc.type	Article	en_US
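
The abstract reports correlation tests between automatic metric scores (BLEU, METEOR, COMET) and human ratings but does not say which correlation statistic was used. The following is a minimal sketch of one common choice, Spearman rank correlation, written in plain Python. The function names (`rank`, `pearson`, `spearman`) and the input values in the usage lines are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(rank(x), rank(y))


# Hypothetical per-sentence scores, invented purely to show the call shape.
human_ratings = [1, 2, 3, 4, 5]
metric_scores = [0.10, 0.25, 0.30, 0.55, 0.70]
print(spearman(human_ratings, metric_scores))
```

Under this sketch, each metric's per-sentence scores would be passed with the matching human ratings, and the metric with the highest coefficient would be the one that tracks human judgment most closely.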