Uncertainty Decomposition for Clarification Seeking in LLM Agents | AIChainDay

Summary by Claude · June 19, 2026

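Responding to the argument that the classical aleatoric/epistemic uncertainty framework is insufficient for interactive LLM agents, the paper proposes a simple prompt-based decomposition technique that separates action confidence from request uncertainty (u). Deployment constraints (black-box APIs, latency budgets, and the absence of labeled trajectories) rule out logprob-based, multi-sampling, and training-based methods, so prompt-based estimation is the most practical option. With this decomposition, the agent asks for clarification first when the task specification is ambiguous. For evaluation, the authors introduce the WebShop-Clarification and ALFWorld-Clarification benchmarks, in which 50% of tasks are deliberately underspecified, and compare against ReAct+UE and UAM across five backbones (GPT-5.1, DeepSeek-v3.2-exp, GLM-4.7, Qwen3.5-35B, GPT-OSS-120B). Averaged over the backbones, the method improves clarification F1 on ALFWorld-Clarification by 73% over ReAct+UE and by 36% over UAM, indicating that the gains generalize beyond a single model.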

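• A prompt-based decomposition that separates action confidence from request uncertainty enables proactive clarification requests
• Deployment constraints rule out logprob-based, multi-sampling, and training-based methods, leaving prompt-based estimation as the most practical option
• New WebShop-Clarification and ALFWorld-Clarification benchmarks, with 50% of tasks underspecified
• On ALFWorld-Clarification, clarification F1 improves by 73% over ReAct+UE and by 36% over UAM (average over five backbones)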
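
To make the two ideas above concrete, here is a minimal Python sketch of how a decomposed estimate could gate a clarification request and how clarification F1 could be scored against underspecification labels. It is an illustration under stated assumptions, not the paper's method: the `Estimate` type, the `decide` function, the 0.5 thresholds, and the rule "ask whenever request uncertainty is high" are all hypothetical, and the preview does not show the paper's actual prompts or decision rule.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Estimate:
    """Hypothetical container for the two prompt-elicited scores."""
    action_confidence: float    # how sure the agent is about its next action, in [0, 1]
    request_uncertainty: float  # u: how ambiguous the user's request looks, in [0, 1]


def decide(est: Estimate, u_threshold: float = 0.5) -> str:
    """Assumed rule: ask the user first whenever the request itself looks ambiguous."""
    if est.request_uncertainty >= u_threshold:
        return "ask_clarification"
    return "act"


def clarification_f1(asked: Sequence[bool], underspecified: Sequence[bool]) -> float:
    """F1 of 'asked for clarification' decisions against 'task was underspecified' labels."""
    tp = sum(a and u for a, u in zip(asked, underspecified))
    fp = sum(a and not u for a, u in zip(asked, underspecified))
    fn = sum((not a) and u for a, u in zip(asked, underspecified))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy usage: four tasks, half of them underspecified (mirroring the 50% split).
estimates = [Estimate(0.9, 0.8), Estimate(0.9, 0.7), Estimate(0.4, 0.6), Estimate(0.8, 0.1)]
labels = [True, False, True, False]
asked = [decide(e) == "ask_clarification" for e in estimates]
score = clarification_f1(asked, labels)
```

Keeping the two scores separate is the point of the sketch: in this toy rule a confident next action does not suppress a question when the request itself is unclear.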

AI · June 19, 2026

Uncertainty Decomposition for Clarification Seeking in LLM Agents

Source: arXiv cs.AI

Article preview

arXiv:2606.19559v1 Announce Type: new Abstract: Recent position papers argue that the classical aleatoric/epistemic uncertainty framework is insufficient for interactive large language model (LLM) agents and call for underspecification-aware, decomposed, and communicable uncertainty representations that can unlock new agent capabilities such as proactive clarification seeking and shared mental-model building. Practical deployment constraints -- black-box APIs, interactive latency budgets, and t

Deontic Policies for Runtime Governance of Agentic AI Systems

arXiv:2606.19464v1 Announce Type: new Abstract: Autonomous agentic AI systems driven by Large Language Models (LLMs) introduce a new class of security, privacy, and compliance challenges: an agent that can invoke tools, manipulate data, install software, and coordinate with peer agents across organ

Related articles

Deontic Policies for Runtime Governance of Agentic AI Systems

Diffusion Language Models: An Experimental Analysis

Hidden Anchors in Multi-Agent LLM Deliberation

LLM Doesn't Know What It Doesn't Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data