Sabaragamuwa University of Sri Lanka

A Study of Machine Learning Models for Text-Based Mental Health Prediction in Sri Lanka


dc.contributor.author Vitharana, K.S.N.
dc.contributor.author Kumara, P.G.P.
dc.date.accessioned 2026-06-02T05:00:56Z
dc.date.available 2026-06-02T05:00:56Z
dc.date.issued 2026-01-28
dc.identifier.isbn 978-624-5727-44-5
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5312
dc.description.abstract The widespread use of social media has changed how people share personal problems, opening a new avenue for identifying mental illness at an early stage. This approach is especially important in contexts such as Sri Lanka, where cultural stigma is a real obstacle to seeking help. To address the lack of machine learning applications in this field, this paper explores the automated detection of mental health conditions in Sinhalese Facebook posts. A mental health expert annotated a corpus of 3,096 posts using a multi-label classification schema (Anxiety, Depression, Suicidal Ideation, Irrelevant) that can capture possible comorbidities. A traditional Random Forest classifier and two transformer-based models, BERT and RoBERTa, with explicit hyperparameter settings, were tested and compared on this multi-label classification task. The performance analysis revealed substantial gaps between the models. The Random Forest model performed poorly, with a macro F1-score of 0.45, and was particularly weak at predicting the important suicidal ideation class (F1-score: 0.33). This baseline fell well below the transformer models: the BERT model achieved a strong macro F1-score of 0.83, and the RoBERTa model achieved the best overall score of 0.85. These findings indicate the superiority of transformer-based models, particularly RoBERTa, in this sensitive classification task. The analysis shows that natural language processing can be used successfully to detect indicators of mental distress in the specific sociolinguistic environment of Sri Lanka. Limitations include reliance on a single annotator and platform-specific data, and real-world application raises ethical issues.
This study serves as a foundation for developing proactive digital solutions that can support mental health surveillance and early intervention, potentially overcoming the stigmatization barrier. en_US
dc.language.iso en en_US
dc.publisher Faculty of Computing. Sabaragamuwa University of Sri Lanka. en_US
dc.subject Mental Health Detection en_US
dc.subject Multi-Label Classification en_US
dc.subject Natural Language Processing (NLP) en_US
dc.subject RoBERTa en_US
dc.subject Transformer Models en_US
dc.title A Study of Machine Learning Models for Text-Based Mental Health Prediction in Sri Lanka en_US
dc.type Article en_US
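
The abstract reports macro F1-scores over a four-label, multi-label schema. As a rough illustration of what that metric measures, the sketch below computes per-label F1 and their unweighted mean on binary indicator vectors. It is a minimal sketch under stated assumptions: the record does not say how the authors computed their scores, the function names per_label_f1 and macro_f1 are hypothetical, and the toy labels and predictions are invented for illustration and are not data from the study.

```python
from typing import Dict, List, Sequence

# Label order follows the schema named in the abstract.
LABELS: List[str] = ["Anxiety", "Depression", "Suicidal Ideation", "Irrelevant"]


def per_label_f1(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> Dict[str, float]:
    """F1 for each label, treating every label as its own binary task."""
    scores: Dict[str, float] = {}
    for j, label in enumerate(LABELS):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in pairs if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0.0 when a label has no
        # true or predicted positives at all.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    scores = per_label_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Invented toy data (NOT from the study). Each row is one post; a post
# may carry several labels at once, which is how comorbidity is encoded.
toy_true = [
    [1, 1, 0, 0],  # Anxiety + Depression
    [0, 1, 1, 0],  # Depression + Suicidal Ideation
    [0, 0, 0, 1],  # Irrelevant
    [1, 0, 0, 0],  # Anxiety
]
toy_pred = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],  # misses Suicidal Ideation
    [0, 0, 0, 1],
    [0, 0, 0, 1],  # misses Anxiety, wrongly predicts Irrelevant
]

if __name__ == "__main__":
    for name, score in per_label_f1(toy_true, toy_pred).items():
        print(f"{name}: {score:.2f}")
    print(f"macro F1: {macro_f1(toy_true, toy_pred):.2f}")
```

Because macro averaging gives every label equal weight regardless of how often it occurs, a weak score on a rare class such as Suicidal Ideation lowers the macro F1 as much as a weak score on a common class would.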

