EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

📰 ArXiv cs.AI

arXiv:2605.18421v1 (announce type: cross)

Abstract: Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely because existing benchmarks do not provide a systematic way to assess memory mechanisms. In this paper, we study agent memory from a self-evolving perspective and introduce EvoM

Published 19 May 2026
