| dc.description.abstract |
Transformer-based models like BERT have greatly advanced natural language processing by
providing deep contextualized language representations. However, strong token-prediction capability and syntactic competence alone do not guarantee a clear grasp of semantic meaning, particularly for phrasal verbs. These expressions are frequently non-compositional and highly context-dependent, making them difficult to learn for models that rely on surface patterns. BERT can predict the next token in a phrasal verb sequence, but it does not always capture the intended meaning. We developed a dataset that combines established phrasal verb definitions with 11,894 sentences generated by large language models to increase contextual diversity and robustness.
Using the pre-trained bert-base-uncased model as a baseline, we applied QLoRA, a lightweight fine-tuning method that combines 4-bit quantization with low-rank adapters, to enable efficient adaptation on limited hardware.
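As a rough illustration of this adaptation step, the sketch below shows how a QLoRA-style setup (4-bit quantized base model plus trainable LoRA adapters) could be attached to bert-base-uncased for masked-token training on phrasal verb sentences; the data file name, adapter targets, and hyperparameters are illustrative assumptions, not the exact configuration used in this work.

```python
# Illustrative QLoRA-style fine-tuning of bert-base-uncased for masked-token
# prediction on phrasal verb sentences. File names and hyperparameters are
# assumptions for this sketch, not the configuration reported in the thesis.
import torch
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the frozen base model in 4-bit NF4 precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForMaskedLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters on the attention projections.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                                         target_modules=["query", "value"]))

# Hypothetical JSON-lines file with one LLM-generated sentence per record.
dataset = load_dataset("json", data_files="phrasal_verb_sentences.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-pv-qlora", per_device_train_batch_size=16,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```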
Model performance was assessed with a range of semantic and lexical similarity metrics. Although baseline performance was only moderate, fine-tuning yielded notable improvements across all metrics, considerably strengthening BERT's ability to capture subtle phrasal verb meanings.
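Because the abstract does not name the individual measures, the sketch below shows two representative placeholders of the kind of semantic and lexical similarity scoring described: cosine similarity between mean-pooled BERT embeddings (semantic) and token-level Jaccard overlap (lexical). The example strings are hypothetical.

```python
# Illustrative semantic and lexical similarity measures for comparing a predicted
# phrasal verb sense against a reference definition. These are placeholder metrics,
# not the exact evaluation suite used in this work.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    """Mean-pooled BERT embedding of a sentence."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)          # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # (1, 768)

def semantic_similarity(prediction: str, reference: str) -> float:
    """Cosine similarity between pooled embeddings (semantic criterion)."""
    return F.cosine_similarity(embed(prediction), embed(reference)).item()

def lexical_similarity(prediction: str, reference: str) -> float:
    """Token-level Jaccard overlap (lexical criterion)."""
    p, r = set(prediction.lower().split()), set(reference.lower().split())
    return len(p & r) / len(p | r) if p | r else 0.0

# Hypothetical prediction/reference pair for the phrasal verb "put off".
pred = "to postpone something until a later time"
ref = "put off: delay doing something until later"
print(semantic_similarity(pred, ref), lexical_similarity(pred, ref))
```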
Future work will extend this approach to more diverse datasets, additional multi-word expressions, and multilingual contexts. |
en_US |