RTSGameBench: An RTS Benchmark for Strategic Reasoning by Vision-Language Models | AIChainDay

AI · June 18, 2026 · AI score: 95%

Source: arXiv cs.AI

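✨ AI Insights

🧑‍💻 Developers

1. RTSGameBench released: a benchmark for VLM strategic reasoning built on the real-time strategy game 'Beyond All Reason'.
2. Provides minigames that diagnose individual strategic capabilities, plus a self-evolving problem-generation framework.
3. Also presents RTSGameAgent, which uses a finite-state machine (FSM) and agent memory to operate in large-scale RTS play.
4. Shows empirically that the latest VLMs underperform on tight coordination, multi-agent cooperation, and large-scale tasks.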

💡

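Why does it matter?

Existing RTS benchmarks were limited in both evaluation scope and capability diagnosis. RTSGameBench addresses this with extensible minigames and self-evolving problem generation, systematically exposing the weaknesses of VLMs in long-horizon strategic reasoning under uncertainty.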

🏷️ Projects mentioned

RTSGameBench, Beyond All Reason, RTSGameAgent

Article preview

arXiv:2606.18950v2. Abstract: Modern Vision-Language Models (VLMs) often struggle with strategic reasoning, i.e., anticipating and influencing other agents' actions, under uncertainty in competitive and cooperative settings. Real-time strategy (RTS) games can be a natural testbed for diagnosing this limitation, as they demand coordination with allies, adaptation to opponents' strategy, and long-horizon planning under partial observability. However, existing RTS benchmarks offe
