DynSess: Dynamic Session-Level Evaluation and Optimization Framework for Role-Playing Agents
📰 ArXiv cs.AI
arXiv:2605.29256v1
Announce Type: cross
Abstract: Role-playing with large language models is fundamentally a session-level task, requiring agents to sustain character identity and interaction quality across extended multi-turn conversations. Yet existing evaluation and optimization methods remain largely turn-level, failing to capture long-horizon quality. We propose DynSess, a unified session-level framework for role-playing agents. DynSess-Eval scores complete dialogue sessions via rubrics tar