Sabaragamuwa University of Sri Lanka

Knowledge-Graph Based Document Chunking and Improving Retrieval Relevancy of RAG Applications with a Recursive Counter-Pointing Agent


dc.contributor.author Perera, W.P.K.I.
dc.contributor.author Vigneshwaran, P.
dc.contributor.author Charles, J.
dc.date.accessioned 2025-12-12T09:43:37Z
dc.date.available 2025-12-12T09:43:37Z
dc.date.issued 2025-02-19
dc.identifier.citation Abstracts of the ComURS2025 Computing Undergraduate Research Symposium 2025, Faculty of Computing, Sabaragamuwa University of Sri Lanka. en_US
dc.identifier.isbn 978-624-5727-57-5
dc.identifier.uri http://repo.lib.sab.ac.lk:8080/xmlui/handle/susl/4965
dc.description.abstract Retrieval Augmented Generation (RAG) has emerged as a transformative approach to addressing key limitations of Large Language Models (LLMs), particularly hallucination, private data access, and contextual accuracy. Both traditional and current RAG systems, while showing promising results, continue to face significant challenges in maintaining semantic accuracy, optimizing retrieval performance, and adapting to complex knowledge structures. Recent implementations such as HyDE (Hypothetical Document Embeddings) and Self-RAG have made progress on these issues, but fundamental limitations persist in handling dynamic knowledge structures and maintaining contextual relevance. In this study, we introduce an enhanced RAG framework with three novel components: (1) a flexible schema-less knowledge graph that dynamically adapts to new information, (2) an iterative retrieval verification system with integrated web search that validates extracted information, and (3) an adaptive query refinement mechanism that progressively improves search accuracy. Our approach integrates a schema-less knowledge graph as the primary retrieval source, enhanced by an agent-based architecture in which reviewer and query-reformer agents work iteratively. These agents dynamically validate and refine queries, while leveraging web sources as a supplementary knowledge base to compensate for temporal gaps in the primary source. This dual-source, agent-driven architecture enables robust handling of complex queries while maintaining up-to-date contextual relevance. We evaluate our framework on both small-context (≤3,000 tokens: Conversational Question Answering (CoQA) and Document Question Answering (DoQA)) and large-context (≥8,000 tokens: INSCIT and TopiOCQA) datasets, following NVIDIA's benchmark standards. On CoQA, our system achieves higher recall (0.52 vs. 0.47) and precision (0.34 vs. 0.29) than baseline RAG, while maintaining competitive average precision (0.45 vs. 0.47) and perfect reciprocal rank. These improvements are particularly effective in complex, high-dimensional contexts, with practical applications in medical record analysis, financial compliance documentation, legal case retrieval, and real-time customer support systems where accuracy and context preservation are crucial. en_US
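
The dual-source, agent-driven retrieval loop described in the abstract can be sketched in Python as follows. This is a minimal illustrative sketch only, not the authors' implementation: every function name (retrieve_from_graph, web_search, review, reform_query), the Passage structure, and the acceptance threshold are assumed placeholders standing in for the knowledge-graph retriever, supplementary web search, reviewer agent, and query-reformer agent described above.

    # Minimal sketch of an iterative retrieve -> review -> reform loop with a
    # web-search fallback. All components are hypothetical stubs, not the
    # system described in the abstract.
    from dataclasses import dataclass


    @dataclass
    class Passage:
        text: str
        score: float


    def retrieve_from_graph(query: str) -> list[Passage]:
        """Stand-in for retrieval over the schema-less knowledge graph."""
        return [Passage(text=f"graph passage for '{query}'", score=0.6)]


    def web_search(query: str) -> list[Passage]:
        """Stand-in for the supplementary web-search knowledge source."""
        return [Passage(text=f"web passage for '{query}'", score=0.5)]


    def review(query: str, passages: list[Passage]) -> bool:
        """Reviewer agent: decide whether the evidence answers the query
        (here approximated by a fixed, assumed score threshold)."""
        return any(p.score >= 0.7 for p in passages)


    def reform_query(query: str, passages: list[Passage]) -> str:
        """Query-reformer agent: refine the query based on what was found."""
        return query + " (refined)"


    def answer_with_verification(query: str, max_rounds: int = 3) -> list[Passage]:
        """Iterate retrieval, review, and query reformulation, falling back to
        the web source when the primary graph evidence is judged insufficient."""
        evidence: list[Passage] = []
        for _ in range(max_rounds):
            evidence = retrieve_from_graph(query)
            if review(query, evidence):
                return evidence                    # reviewer accepts graph evidence
            evidence += web_search(query)          # supplement with web results
            if review(query, evidence):
                return evidence                    # reviewer accepts combined evidence
            query = reform_query(query, evidence)  # refine the query and retry
        return evidence                            # best effort after max_rounds


    if __name__ == "__main__":
        for p in answer_with_verification("What mitigates LLM hallucination?"):
            print(f"{p.score:.2f}  {p.text}")

In the architecture the abstract describes, the reviewer and query-reformer roles would be LLM-backed agents rather than the simple heuristics stubbed here; the control flow, however, mirrors the stated design of validating graph-retrieved evidence, supplementing it from the web, and reformulating the query across iterations.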
dc.language.iso en en_US
dc.publisher Faculty of Computing, Sabaragamuwa University of Sri Lanka en_US
dc.subject Retrieval Augmented Generation (RAG) en_US
dc.subject Large Language Model (LLM) en_US
dc.subject Hallucination en_US
dc.subject Knowledge Graph en_US
dc.subject Information retrieval en_US
dc.title Knowledge-Graph Based Document Chunking and Improving Retrieval Relevancy of RAG Applications with a Recursive Counter-Pointing Agent en_US
dc.type Article en_US

