Show HN: LlamaGym – fine-tune LLM agents with online reinforcement learning

📰 Hacker News · KhoomeiK

28 comments, 239 points on Hacker News.

Published 10 Mar 2024