Trust Between AI Agents: Measuring Formation, Breakage, and Recovery, with Implications for Governing Multi-Agent Systems | AIChainDay

Summary by Claude (originally in Korean) · June 16, 2026

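The paper proposes a framework that measures trust between AI agents through "costly verification" behavior. In a cooperative survival game, verifying a teammate's work consumes resources, while trusting a wrong answer can be fatal, so the degree to which an agent cuts back on verification serves as an observable indicator of trust. An analysis of six frontier model snapshots found that Claude Opus 4.6, Sonnet 4.6, GPT-5.1, and Gemini 3.1 Pro reduced verification by roughly 60 to 85% when paired with a trustworthy teammate, whereas the two smaller models barely adjusted. Trust recovered more slowly than it formed, and consecutive failures kept suspicion alive longer. Models that formed trust verified less, decided faster, and earned higher rewards, which suggests that the key to multi-agent governance is calibration rather than maximal suspicion.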

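• Defines the reduction in "costly verification" in a cooperative survival game as a behavioral indicator of trust between AI agents
• Claude Opus 4.6, Sonnet 4.6, GPT-5.1, and Gemini 3.1 Pro cut verification by 60 to 85% for trustworthy teammates; the smaller models barely adjusted
• Trust recovers more slowly than it forms, and concentrated consecutive failures sustain suspicion longer than scattered failures
• Models that form trust verify less, decide faster, and earn higher rewards; over-verification leads to indecision, not safety
• Trust propensity can be measured before deployment; the core of governance is calibration, not maximal suspicion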
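
As a rough illustration of the measure summarized above, the sketch below computes how much an agent reduces verification relative to a memoryless baseline of the same model. The function names, the boolean log format, and the sample data are all assumptions made for illustration; the paper's actual metric and data may differ.

```python
def verification_rate(checks: list[bool]) -> float:
    """Fraction of teammate answers that the agent paid to verify."""
    if not checks:
        raise ValueError("need at least one decision")
    return sum(checks) / len(checks)


def verification_reduction(with_memory: list[bool], memoryless: list[bool]) -> float:
    """Relative drop in verification versus the memoryless baseline.

    0.0 means no adjustment; 1.0 means the agent stopped verifying entirely.
    (Hypothetical helper, not taken from the paper.)
    """
    baseline = verification_rate(memoryless)
    if baseline == 0:
        raise ValueError("baseline never verifies; reduction is undefined")
    return (baseline - verification_rate(with_memory)) / baseline


# Illustrative data only (not from the paper): True = the agent verified.
memoryless_run = [True] * 8 + [False] * 2    # verifies 8 of 10 answers
with_memory_run = [True] * 2 + [False] * 8   # verifies 2 of 10 answers
reduction = verification_reduction(with_memory_run, memoryless_run)
print(f"{reduction:.0%}")
```

Normalizing by the baseline rate, rather than taking a raw difference, keeps the score comparable across models that start from different levels of verification.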

AI · June 16, 2026

Trust Between AI Agents: Measuring Formation, Breakage, and Recovery, with Implications for Governing Multi-Agent Systems

Source: arXiv cs.AI

Article preview

arXiv:2606.14923v1 Announce Type: new Abstract: As language-model agents increasingly work in teams, each agent must decide how much to trust its teammates. Yet we lack a standard way to measure trust between AI agents. We propose a behavioral measure based on costly verification. In a cooperative survival game, checking a teammate's work consumes resources, while trusting a wrong answer can be fatal. Relative to a memoryless version of the same model, reduced verification provides an observabl

