Exploring Agentic Tool-Calling Decisions via Uncertainty-Aligned Reinforcement Learning

Body preview

arXiv:2606.06976v1 Announce Type: new

Abstract: Large language model (LLM)-based agents often make suboptimal tool-use decisions, including unsupported tool invocation and hallucinated direct responses, which may accumulate errors throughout multi-step interactions. Existing approaches mainly improve these behaviors through inference-time correction or coarse-grained reward signals based on decision outcomes and structured checklists, leaving the uncertainty characteristics of agent decisions u

Related posts

Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation

DiBS: Diffusion-Informed Branch Selection

SafeGene: Reusable Adapters for Transferable Safety Alignment

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory