Sabaragamuwa University of Sri Lanka

Knowledge-Graph Based Document Chunking and Improving Retrieval Relevancy of RAG Applications with a Recursive Counter-Pointing Agent


dc.contributor.author Perera, W.P.K.I.
dc.contributor.author Vigneshwaran, P.
dc.contributor.author Charles, J.
dc.date.accessioned 2025-12-12T09:43:37Z
dc.date.available 2025-12-12T09:43:37Z
dc.date.issued 2025-02-19
dc.identifier.citation Abstracts of the ComURS2025 Computing Undergraduate Research Symposium 2025, Faculty of Computing, Sabaragamuwa University of Sri Lanka. en_US
dc.identifier.isbn 978-624-5727-57-5
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/4965
dc.description.abstract Retrieval Augmented Generation (RAG) has emerged as a transformative approach to addressing key limitations of Large Language Models (LLMs), particularly hallucination, private data access, and contextual accuracy. Both traditional and current RAG systems, while showing promising results, continue to face significant challenges in maintaining semantic accuracy, optimizing retrieval performance, and adapting to complex knowledge structures. Recent implementations such as HyDE (Hypothetical Document Embeddings) and Self-RAG have made progress on these issues, but fundamental limitations persist in handling dynamic knowledge structures and maintaining contextual relevance. In this study, we introduce an enhanced RAG framework with three novel components: (1) a flexible schema-less knowledge graph that dynamically adapts to new information, (2) an iterative retrieval verification system with integrated web search that validates extracted information, and (3) an adaptive query refinement mechanism that progressively improves search accuracy. Our approach integrates a schema-less knowledge graph as the primary retrieval source, enhanced by an agent-based architecture in which reviewer and query-reformer agents work iteratively. These agents dynamically validate and refine queries, while leveraging web sources as a supplementary knowledge base to compensate for temporal gaps in the primary source. This dual-source, agent-driven architecture enables robust handling of complex queries while maintaining up-to-date contextual relevance. We evaluate our framework on both small-context (≤3,000 tokens: Conversational Question Answering (CoQA) and Document Question Answering (DoQA)) and large-context (≥8,000 tokens: INSCIT and TopiOCQA) datasets, following NVIDIA's benchmark standards. On CoQA, our system achieves higher recall (0.52 vs. 0.47) and precision (0.34 vs. 0.29) than baseline RAG, while maintaining competitive average precision (0.45 vs. 0.47) and perfect reciprocal rank. These improvements are particularly effective in complex, high-dimensional contexts, with practical applications in medical record analysis, financial compliance documentation, legal case retrieval, and real-time customer support systems where accuracy and context preservation are crucial. en_US
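
The dual-source, agent-driven retrieval loop described in the abstract can be sketched in Python as follows. This is a minimal illustrative sketch only, not the authors' implementation: every function name (retrieve_from_graph, web_search, review, reform_query), the Passage structure, and the acceptance threshold are assumed placeholders standing in for the knowledge-graph retriever, supplementary web search, reviewer agent, and query-reformer agent described above.

    # Minimal sketch of an iterative retrieve -> review -> reform loop with a
    # web-search fallback. All components are hypothetical stubs, not the
    # system described in the abstract.
    from dataclasses import dataclass


    @dataclass
    class Passage:
        text: str
        score: float


    def retrieve_from_graph(query: str) -> list[Passage]:
        """Stand-in for retrieval over the schema-less knowledge graph."""
        return [Passage(text=f"graph passage for '{query}'", score=0.6)]


    def web_search(query: str) -> list[Passage]:
        """Stand-in for the supplementary web-search knowledge source."""
        return [Passage(text=f"web passage for '{query}'", score=0.5)]


    def review(query: str, passages: list[Passage]) -> bool:
        """Reviewer agent: decide whether the evidence answers the query
        (here approximated by a fixed, assumed score threshold)."""
        return any(p.score >= 0.7 for p in passages)


    def reform_query(query: str, passages: list[Passage]) -> str:
        """Query-reformer agent: refine the query based on what was found."""
        return query + " (refined)"


    def answer_with_verification(query: str, max_rounds: int = 3) -> list[Passage]:
        """Iterate retrieval, review, and query reformulation, falling back to
        the web source when the primary graph evidence is judged insufficient."""
        evidence: list[Passage] = []
        for _ in range(max_rounds):
            evidence = retrieve_from_graph(query)
            if review(query, evidence):
                return evidence                    # reviewer accepts graph evidence
            evidence += web_search(query)          # supplement with web results
            if review(query, evidence):
                return evidence                    # reviewer accepts combined evidence
            query = reform_query(query, evidence)  # refine the query and retry
        return evidence                            # best effort after max_rounds


    if __name__ == "__main__":
        for p in answer_with_verification("What mitigates LLM hallucination?"):
            print(f"{p.score:.2f}  {p.text}")

In the architecture the abstract describes, the reviewer and query-reformer roles would be LLM-backed agents rather than the simple heuristics stubbed here; the control flow, however, mirrors the stated design of validating graph-retrieved evidence, supplementing it from the web, and reformulating the query across iterations.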
dc.language.iso en en_US
dc.publisher Faculty of Computing, Sabaragamuwa University of Sri Lanka en_US
dc.subject Retrieval Augmented Generation (RAG) en_US
dc.subject Large Language Model (LLM) en_US
dc.subject Hallucination en_US
dc.subject Knowledge Graph en_US
dc.subject Information retrieval en_US
dc.title Knowledge-Graph Based Document Chunking and Improving Retrieval Relevancy of RAG Applications with a Recursive Counter-Pointing Agent en_US
dc.type Article en_US

