WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents

Source: arXiv cs.AI

✨ AI Insights

🧑‍💻 For developers

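1. WorldLines: a project-driven, long-horizon benchmark for long-term embodied household assistance
2. Converts temporally extended household trajectories (dialogue, actions, execution feedback, and state changes) into evidence-linked samples
3. Proposes ObsMem, a framework that maintains visibility-aware memory and action-grounded state tracking
4. Experiments expose persistent challenges: partial observability, state overwriting, and carrying memory into planning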
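
To make the terms in points 3 and 4 concrete, here is a minimal toy sketch of what "visibility-aware" memory with action-based state overwriting could look like. It is not the ObsMem implementation, which this summary does not describe; the names `Belief`, `ToyStateMemory`, `observe`, `apply_action`, and `recall` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Belief:
    value: str   # last known state, e.g. "on counter"
    step: int    # time step at which this value was recorded
    source: str  # "observed" or "action"


@dataclass
class ToyStateMemory:
    """Hypothetical illustration only; not the paper's ObsMem."""
    beliefs: Dict[str, Belief] = field(default_factory=dict)

    def observe(self, step: int, visible: Dict[str, str]) -> None:
        # Partial observability: only objects currently in view are updated.
        # Beliefs about unseen objects are kept, but grow stale.
        for name, value in visible.items():
            self.beliefs[name] = Belief(value, step, "observed")

    def apply_action(self, step: int, name: str, new_value: str) -> None:
        # State overwriting: the agent's own action replaces the old belief.
        self.beliefs[name] = Belief(new_value, step, "action")

    def recall(self, name: str, now: int) -> Optional[Tuple[str, int]]:
        # Return (last known value, age in steps), or None if never recorded.
        belief = self.beliefs.get(name)
        if belief is None:
            return None
        return belief.value, now - belief.step


mem = ToyStateMemory()
mem.observe(0, {"mug": "on counter"})
mem.apply_action(3, "mug", "in dishwasher")  # overwrites the observed state
mem.observe(5, {"keys": "on shelf"})         # mug is out of view, belief kept
print(mem.recall("mug", 6))                  # ('in dishwasher', 3)
```

The age returned by `recall` is one simple way a planner could decide whether a remembered state is fresh enough to act on or needs to be re-observed first.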

💡

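Why does it matter?

Existing long-term memory benchmarks lean toward language-centric retrieval and QA, while embodied benchmarks stop at short-horizon execution. WorldLines fills that gap with an evaluation that ties long-term memory to actual action planning in dynamic environments.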

🏷️ Projects mentioned

WorldLines, ObsMem

Abstract preview

arXiv:2606.18847v1 Announce Type: new Abstract: To assist humans over extended periods in real homes, embodied agents must remember user routines, world states, and past interactions. Existing long-term memory benchmarks mainly evaluate language-centric retrieval and question answering, while embodied benchmarks often focus on short-horizon task execution without testing long-term memory use in dynamic environments. We introduce WorldLines, a project-driven benchmark for long-horizon embodied h


#Embodied AI #Long-term memory #Benchmark #Agents

