Sabaragamuwa University of Sri Lanka

Sentiment Analysis of Self-Published vs. Traditionally Published Books using Machine Learning

dc.contributor.author Jayasundara, J.M.G.N.
dc.contributor.author Adeeba, S.
dc.date.accessioned 2026-05-27T05:40:49Z
dc.date.available 2026-05-27T05:40:49Z
dc.date.issued 2026-01-28
dc.identifier.isbn 9786245727445
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/5306
dc.description.abstract The rapid growth of self-publishing channels such as Amazon Kindle Direct Publishing (KDP) has substantially changed how books are distributed worldwide, because authors can now bypass the traditional publishing framework. Conventional publishers have well-established editorial and marketing procedures, whereas self-published authors have full creative freedom but varying degrees of quality control. Despite this change, few academic studies have used computational sentiment analysis to systematically compare how readers perceive books under these two models. This work addresses that gap by comparing reader attitudes toward self-published and traditionally published books using a large dataset of Goodreads reviews. The study aims to (1) determine the patterns of sentiment across the two publishing models, (2) identify the key themes that shape reader perceptions, and (3) assess the role of platform visibility and metadata in modulating sentiment trends. The reviews were preprocessed with text normalization, tokenization, and metadata-based publisher classification. A TF-IDF vectorizer and a Logistic Regression classifier were used to assign reviews to the self-published and traditionally published classes. This model achieved an accuracy of 0.80 on a test sample of 125,757. Precision and recall were 0.82 and 0.78 for self-published books, and 0.79 and 0.82 for traditionally published books. In addition, a DistilBERT model was used as a robustness check. The findings indicate that reader sentiment is broadly similar across the two publishing models; however, self-published books show a wider spread in their sentiment distribution. The greater consistency of traditionally published books is probably due to professional editing and organized publishing processes. The research has implications for publishers and authors in terms of marketing approaches and content recommendations, and for authors choosing a publication pathway. en_US
dc.language.iso en en_US
dc.publisher Faculty of Computing. Sabaragamuwa University of Sri Lanka. en_US
dc.subject Goodreads en_US
dc.subject Machine Learning en_US
dc.subject Sentiment Analysis en_US
dc.subject Self-Publishing en_US
dc.subject Traditional Publishing en_US
dc.title Sentiment Analysis of Self-Published vs. Traditionally Published Books using Machine Learning en_US
dc.type Article en_US
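
The abstract reports per-class precision and recall but no F1 score. As a minimal sketch, the snippet below derives F1 (the harmonic mean of precision and recall) from the reported figures. The function name f1_score and the dictionary layout are illustrative assumptions, and the derived values are approximate because the inputs are rounded to two decimals; they are not results reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class (precision, recall) pairs as stated in the abstract.
reported = {
    "self-published": (0.82, 0.78),
    "traditionally published": (0.79, 0.82),
}

# Derived values, not reported in the record; approximate due to rounded inputs.
f1_by_class = {label: f1_score(p, r) for label, (p, r) in reported.items()}

for label, f1 in f1_by_class.items():
    print(f"{label}: F1 = {f1:.2f}")
```

Both classes come out at roughly 0.80, which is consistent with the overall accuracy of 0.80 stated in the abstract.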

