dCollection Digital Academic Information Distribution System

효율적인 통합형 멀티에이전트 RAG 프레임워크 : 문서 추론을 위한 MCP 통합 기반 접근

An Efficient Integrated Multi-Agent RAG Framework : An MCP Integration-Based Approach for Document Reasoning


Subject (Keywords): Retrieval-Augmented Generation (RAG), Model Context Protocol (MCP), Multi-Agent Framework, Agentic AI, Context Synchronization, Object-Oriented Multi-Agent, On-Device AI (On-Device general multi-agent reasoning)
Issuing institution: Sogang University, AI.SW Graduate School
Advisor: 양지훈
Year of publication: 2026
Degree conferral date: February 2026
Degree: Master's
Department and major: AI.SW Graduate School, Data Science · Artificial Intelligence
URI: http://www.dcollection.net/handler/sogang/000000082215
UCI: I804:11029-000000082215
Language of text: Korean
Copyright: This thesis is protected by copyright.

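Abstract (translated from Korean)

Retrieval-Augmented Generation (RAG) has become a core technique for strengthening factual reasoning in large language models (LLMs), but existing RAG-based systems have structural limitations: degraded retrieval efficiency, evidence misalignment, and insufficient scalability in multi-agent environments [1], [3], [5]. Meanwhile, multi-agent orchestration frameworks, which have recently drawn attention, enable role division and modular reasoning but lack a standardized communication protocol that guarantees shared context synchronization and reasoning traceability [7], [24]. These limitations make it difficult to realize efficient and interpretable General Multi-Agent Reasoning. Building on the hierarchical retrieval structure of the existing DocsRay [16] framework, this study proposes an optimized architecture that integrates the Model Context Protocol (MCP) to strengthen inter-agent communication efficiency and context consistency. The proposed framework converts the pseudo-table-of-contents (pseudo-TOC) information extracted by DocsRay [16] into MCP's standardized message schema, ensuring state synchronization between the Retriever and Verifier agents. This reduces data bottlenecks relative to conventional simple-connection approaches and secures traceability of the reasoning process. In addition, this study defines the Context Consistency Score (CCS), which evaluates the semantic consistency between retrieved evidence and generated responses. Taking CCS as its core component, it proposes the CS-Index, which also incorporates the end-to-end context synchronization latency up to final answer generation, and uses both to verify the reliability and efficiency of the proposed multi-agent integrated structure. In experiments on three benchmarks, MMLongBench-Doc [31], SlideVQA [32], and M3SciQA [33], the proposed RAG-MCP integrated structure shortened context synchronization latency (Sync Latency) compared with the existing single-agent approach and maintained stable performance even with a lightweight 4B-scale model, demonstrating efficiency in resource-constrained environments. The significance of this study lies in going beyond proposing a new RAG algorithm to present an integrated architecture that can maximize the operational efficiency of existing document reasoning models.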

Abstract (English)

Retrieval-Augmented Generation (RAG) has become a key paradigm for enhancing factual reasoning in large language models (LLMs). However, conventional RAG systems suffer from structural limitations such as retrieval bottlenecks, evidence misalignment, and a lack of scalability in multi-agent environments [1], [3], [5]. Meanwhile, recent multi-agent orchestration frameworks enable modular reasoning and task distribution but lack a standardized communication protocol to ensure context synchronization and traceability [7], [24]. These limitations hinder the realization of efficient and interpretable General Multi-Agent Reasoning. This study proposes an optimized architecture that integrates the Model Context Protocol (MCP) to enhance communication efficiency and context consistency among agents, building upon the hierarchical retrieval structure of the existing DocsRay [16] framework. The proposed framework converts pseudo-Table-of-Contents (pseudo-TOC) information extracted by DocsRay into a standardized MCP message schema, ensuring state synchronization between the Retriever and Verifier agents. This approach mitigates data bottlenecks compared to conventional simple connection methods and secures traceability throughout the reasoning process. Furthermore, this study introduces the Context Consistency Score (CCS) as a semantic coherence metric between retrieved evidence and generated responses, and further defines a CS-Index that integrates CCS with end-to-end context synchronization latency. This combined metric quantitatively evaluates how effectively a multi-agent system preserves contextual quality while delivering responses in a time-efficient manner. In experiments on three benchmark datasets, MMLongBench-Doc [31], SlideVQA [32], and M3SciQA [33], the proposed RAG-MCP integrated architecture demonstrates efficiency in resource-constrained environments: it reduces context synchronization latency compared to conventional single-agent methods while maintaining stable performance even with a lightweight 4-billion-parameter (4B) model. This study is significant not merely for proposing a new RAG algorithm, but for presenting a unified architecture that maximizes the operational efficiency of existing document reasoning models.

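Chapter 1. Introduction
Section 1. Research Background and Necessity
Section 2. Research Objectives
Section 3. Significance and Expected Effects of the Research
Section 4. Organization of the Thesis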

Chapter 2. Related Work
Section 1. Overview of Research Integration
Section 2. RAG and Dense Retrieval (DPR, FiD)
Section 3. Adaptive Retrieval and Self-Verification
Section 4. Multi-Agent Orchestration and MCP
Section 5. Document Structuring and Hierarchical Retrieval
Section 6. Lightweight and Resource-Adaptive RAG
Section 7. Progressive Domain Adaptation (Progressive Domain Transfer System: PDTS)
Section 8. Long-Document and Multimodal Benchmarks
Section 9. Comparative Frameworks and Differentiation
Section 10. Step-by-Step Composition of the DocsRay-Based RAG-MCP Integrated Framework
Section 11. Mathematical Model Definition (Mathematical Modeling)

Chapter 3. Research Methodology
Section 1. Research Design Overview (Method / Design Motivation)
Section 2. Framework Design
Section 3. Pseudo-TOC–Based Hierarchical Retrieval
Section 4. Quantitative Context Consistency Metrics (Context Consistency Score Metrics)
Section 5. Dataset Configuration
Section 6. Experimental Environment and Setup
Section 7. Evaluation and Baseline Settings
Section 8. Evaluation Metrics

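Chapter 4. Experiments and Results
Section 1. Experiment Overview
Section 2. Question-Answering (QA) Accuracy Evaluation
Section 3. Context Synchronization Performance Evaluation
Section 4. Resource Efficiency Evaluation
Section 5. Cross-Document and Cross-Page Reasoning Performance
Section 6. Module Contribution Analysis (Ablation Study)
Section 7. Generalization Performance Analysis with an Open-Source LLM (Llama-3-70B)
Section 8. CCS (Consistency-aware Confidence Score) Weight Sensitivity Analysis
Section 9. Reproducibility and Verification
Section 10. Overall Discussion and Implications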

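Chapter 5. Conclusion
Section 1. Summary of Research Results
Section 2. Academic and Technical Contributions
Section 3. Potential Application Areas
Section 4. Limitations of the Study
Section 5. Directions for Future Research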

References
Appendix A. Full Detailed Experimental Results and Analysis
Appendix B. Conservative Verification Procedure for SlideVQA and M3SciQA
Appendix C. Reproducible Experiment Protocol
Appendix D. Reproducibility Guide (Reproducibility README)
