Analyzing the Narration Gap in LLM-Solver Loops | AIChainDay

🇰🇷 한국어 요약by Claude · 2026. 6. 19.

안전·보안이 중요한 질문을 논리로 정식화할 수 있을 때 SAT·SMT 솔버가 언어모델 추론 파이프라인에 점점 더 결합되는데, 솔버는 체인오브소트와 달리 독립 검증 가능한 건전한 답을 내지만 솔버와 모델의 상호작용에서 건전성 보장이 깨질 수 있다. 하이브리드 파이프라인은 질문 정식화·결정·결과 서술(narration) 세 단계로 구성되며, 기존 연구가 다루지 않은 '서술' 단계—형식 도구의 출력을 사용자 답변으로 바꾸는 단계—의 공백을 조명한다. 저자들은 LLM-솔버 루프를 검증된 결정 절차로 모델링하고 5개 오픈소스 모델을 프롬프트 인젝션 하에 평가해, 인증서 게이팅이 솔버 판정을 건전하게 만들지만 공격자가 표현·채널을 바꿔 검증된 결론을 뒤집을 수 있음을 발견했다. 강화 프롬프트는 인젝션을 크게 줄이나 완전히 제거하지 못하고 적응적 공격에 여전히 취약해, 루프의 견고성이 최종 사용자 답변까지 도달하지 못함을 보인다.

•LLM-솔버 하이브리드를 정식화·결정·서술 3단계로 분해, 연구가 적은 '서술' 공백을 조명
•인증서 게이팅은 솔버 판정을 건전하게 하지만 공격자가 표현·채널을 바꿔 결론을 뒤집을 수 있음
•5개 오픈소스 모델을 프롬프트 인젝션 하에서 평가
•강화 프롬프트도 인젝션을 완전 제거하지 못해 견고성이 최종 답변까지 도달하지 않음

AI2026년 6월 19일

Analyzing the Narration Gap in LLM-Solver Loops

출처:arXiv cs.AI

본문 미리보기

arXiv:2606.19588v1 Announce Type: new Abstract: Formal tools such as SAT and SMT solvers are increasingly embedded in language model reasoning pipelines when a safety or security critical question can be formulated in logic. Unlike chain of thought whose steps are sampled from the model distribution without formal guarantee, a solver produces a sound and independently verifiable answer. However, the soundness guarantee can be lost in the interaction between the solver and the model. The hybrid

전체 내용이 궁금하다면?

원문을 직접 읽어보세요

원문 보기

2시간 전

Deontic Policies for Runtime Governance of Agentic AI Systems

arXiv:2606. 19464v1 Announce Type: new Abstract: Autonomous agentic AI systems driven by Large Language Models (LLMs) introduce a new class of security, privacy, and compliance challenges: an agent that can invoke tools, manipulate data, install software, and coordinate with peer agents across organ

📰미디어arXiv cs.AI

원문

Analyzing the Narration Gap in LLM-Solver Loops

본문 미리보기

관련 글

Deontic Policies for Runtime Governance of Agentic AI Systems

Diffusion Language Models: An Experimental Analysis

Hidden Anchors in Multi-Agent LLM Deliberation

LLM Doesn't Know What It Doesn't Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data