SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization
📰 ArXiv cs.AI
arXiv:2602.04811v2 Announce Type: replace-cross Abstract: True self-evolution requires agents to act as lifelong learners that internalize novel experiences to solve future problems. However, rigorously measuring this foundational capability is hindered by two obstacles: the entanglement of prior knowledge, where "new" knowledge may appear in pre-training data, and the entanglement of reasoning complexity, where failures may stem from problem difficulty rather than an inability to recall learn