AgentFly: Fine-tuning LLM Agents Without LLM Fine-tuning - Paper Overview
This paper introduces AgentFly, a learning paradigm for Large Language Model (LLM) agents that lets them adapt continually without fine-tuning the underlying LLMs. At its core is a memory-based online reinforcement learning framework, formalized as a Memory-augmented Markov Decision Process (M-MDP). The system stores past experiences in an episodic memory (either differentiable or non-parametric), uses a neural case-selection policy to guide actions, and continually updates this policy based on environmental feedback through memory rewriting and efficie…
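The store, retrieve, and rewrite loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a non-parametric memory, swaps the neural case-selection policy for a plain cosine-similarity nearest-neighbour heuristic, and every name in it (`Case`, `EpisodicMemory`, `select_action`, the toy state vectors) is hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Case:
    state: tuple   # hypothetical state embedding
    action: str    # action taken in that state
    reward: float  # feedback received from the environment


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class EpisodicMemory:
    cases: list = field(default_factory=list)

    def write(self, state, action, reward):
        """Store a new case, or rewrite the reward of a matching (state, action) case."""
        state = tuple(state)
        for case in self.cases:
            if case.state == state and case.action == action:
                case.reward = reward  # memory rewriting from new feedback
                return
        self.cases.append(Case(state, action, reward))

    def read(self, state, k=2):
        """Return the k stored cases most similar to the query state."""
        ranked = sorted(self.cases, key=lambda c: cosine(c.state, state), reverse=True)
        return ranked[:k]


def select_action(memory, state, k=2):
    """Stand-in for a learned case-selection policy: best-rewarded near neighbour."""
    neighbours = memory.read(state, k)
    if not neighbours:
        return None
    return max(neighbours, key=lambda c: c.reward).action


memory = EpisodicMemory()
memory.write((1.0, 0.0), "search", 1.0)
memory.write((0.0, 1.0), "calculate", 1.0)
memory.write((1.0, 0.1), "guess", 0.0)
print(select_action(memory, (0.9, 0.0)))  # prints "search"
```

The LLM itself never appears in this loop, which is the point of the design: behaviour changes because the memory contents change, not because model weights do.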
DeepCamp AI