Navigating User Behavior toward Personalized Multimodal Generation | AIChainDay

Summary by Claude · June 24, 2026

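NaviGen is a personalized content generation framework that converts a user's interaction history into an executable generation instruction. Existing AIGC pipelines presuppose a well-formed instruction, but users rarely articulate visual details, which leaves generators misaligned with user demand. NaviGen addresses two obstacles: encoding behavior in a form legible to language reasoning, and acquiring an instruction-writing capability that is absent from both pretraining and behavior data. Each item is represented by a dual identifier that combines a collaborative code with a textual code, so that behavioral grounding and a semantic bridge share a single token stream. A two-stage SFT+RL pipeline then distills preference reasoning and instruction writing from supervision signals found by evolutionary search, after which hierarchical, self-consistent rewards align generation with user intent. In experiments across product, game, and short-video domains, NaviGen improved personalized image and video generation as well as next-item prediction, and produced more specific, more generatable instructions.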

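• A personalized generation framework that converts user interaction history into an executable generation instruction
• Dual identifiers (collaborative code + textual code) unify behavioral grounding and semantics in a single token stream
• A two-stage SFT+RL pipeline distills preference reasoning and instruction writing, then aligns generation with user intent through hierarchical, self-consistent rewards
• Improved personalized image and video generation across product, game, and short-video domains
• Stronger next-item prediction and more specific, more generatable instructions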
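
To make the dual-identifier idea concrete, here is a minimal sketch of how an interaction history might be flattened into one token stream. It is an illustration under stated assumptions, not NaviGen's actual encoding: the `Item` class, the `<c…>`/`<t…>` token format, the `<item>` delimiters, and the example code values are all hypothetical, and the summary does not say how the collaborative or textual codes are derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical discrete codes; the source does not specify how they are learned.
    collab_code: tuple[int, ...]  # assumed to come from behavior (co-interaction) signals
    text_code: tuple[int, ...]    # assumed to come from the item's text description


def item_tokens(item: Item) -> list[str]:
    """Render one item as its dual identifier: collaborative tokens, then textual tokens."""
    collab = [f"<c{level}_{code}>" for level, code in enumerate(item.collab_code)]
    text = [f"<t{level}_{code}>" for level, code in enumerate(item.text_code)]
    return ["<item>", *collab, *text, "</item>"]


def history_to_stream(history: list[Item]) -> list[str]:
    """Concatenate the dual identifiers of all items, in interaction order."""
    stream: list[str] = []
    for item in history:
        stream.extend(item_tokens(item))
    return stream


# Made-up history of two items that share their first collaborative code.
history = [Item((3, 17), (5, 2)), Item((3, 40), (9, 1))]
stream = history_to_stream(history)
print(" ".join(stream))
```

In this sketch both code families sit side by side inside each item span, so a language model reading the stream would see behavioral and semantic information about an item in the same context window. Whether NaviGen interleaves them this way is an assumption.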

AI · June 24, 2026

Navigating User Behavior toward Personalized Multimodal Generation

Source: arXiv cs.AI

Article preview

arXiv:2606.24196v1 Announce Type: new Abstract: Modern AIGC pipelines deliver high-fidelity images and videos but presuppose a well-formed creation instruction, while end users rarely articulate visual details, leaving generators misaligned with user demand. We study personalized content generation, which turns a user's interaction history into an executable instruction for downstream synthesis, and identify two obstacles: behavior must be encoded in a form legible to language reasoning, and th

2 hours ago

Critique of Agent Model

arXiv:2606.23991v1 Announce Type: new Abstract: What is an agent? What constitutes agency? With the rise of Large Language Model (LLM) systems marketed as "coding agents", "AI co-scientists", and other "agentic" tools that promise to drive up productivity, and at the same time, "existential"

Related posts

Critique of Agent Model

Can Language Model Agents be Helpful Circuit Explainers in Mechanistic Interpretability?

Beyond Trajectory Imitation: Strategy-Guided Policy Optimization for LLM Reasoning

Exploring Academic Influence of Algorithms by Co-occurrence Network Based on Full-text of Academic Papers