Omni-Perception Policy Optimization for Multimodal Emotion Reasoning

Article preview

arXiv:2606.25325v1 Announce Type: new

Abstract: We find that current emotion-oriented Omni-MLLMs still lack reliable omni-modal perception: they (i) underutilize multimodal cues in their reasoning trajectories and (ii) exhibit unfaithful behavior, often hallucinating modality-specific statements from other modalities. Building on these insights, we propose OPPO (Omni-Perception Policy Optimization), a reinforcement learning framework that explicitly optimizes multimodal perception. First, an Om

Related articles

The Hitchhiker's Guide to Agentic AI: From Foundations to Systems

Project Auto-World: Towards Automated Benchmarking of Neural Relational Reasoners

Diagnosing and Mitigating Compounding Failures in Agentic Persuasion via Taxonomic Strategy Retrieval

Do vision-language models search like humans? Reasoning tokens as a reaction-time analog in classic visual-search paradigms