New fine-tuning of language models: Match meaning, not tokens

Microsoft Research · Advanced · 🧠 Large Language Models
Language models are usually trained to predict the next word, but that does not always lead to the best overall answers. We introduce energy-based fine-tuning, a new method that trains models to produce better full responses, leading to stronger results without the need for complex reward models or verifiers.

Project: https://energy-based-fine-tuning.github.io
Paper: https://arxiv.org/abs/2603.12248
GitHub: https://github.com/sjelassi/ebft_openrlhf

This session aired on May 14, 2026, at Microsoft Research Forum, Season 2, Episode 4.
Register for the series to hear about new releases: https://www.microsoft.com/en-us/research/event/microsoft-research-forum/?OCID=msr_researchforum_YTDescription
Explore all previous episodes: https://aka.ms/researchforumYTplaylist
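The description contrasts next-token training with objectives that score the full response. As a rough illustration only (this is not the paper's actual method; every function, name, and number below is hypothetical), here is a minimal sketch of a sequence-level, energy-style loss: each candidate response gets an energy equal to its negative total log-probability, and the loss rewards assigning lower energy to the preferred whole answer rather than to locally likely tokens.

```python
import math

def sequence_energy(token_logprobs):
    """Energy of a full response = negative total log-probability.
    Lower energy means the model prefers this whole answer."""
    return -sum(token_logprobs)

def sequence_level_loss(candidate_logprobs, good_idx):
    """Softmax over negative energies of whole candidate responses;
    the loss is the negative log-probability assigned to the preferred
    full response (a contrastive, sequence-level objective)."""
    scores = [-sequence_energy(lp) for lp in candidate_logprobs]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -(scores[good_idx] - log_z)

# Two hypothetical candidates, each a list of per-token log-probs.
good = [-0.1, -0.2, -0.1]   # coherent full answer
bad = [-0.05, -2.5, -0.05]  # locally likely tokens, weak overall answer

loss = sequence_level_loss([good, bad], good_idx=0)
print(round(loss, 4))  # ≈ 0.1051: the preferred answer already has lower energy
```

A token-level objective would only push up each next-token probability independently; a sequence-level objective like this one compares entire responses, which is the "match meaning, not tokens" framing in the title.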

Related AI Lessons

You Don’t Have to Fine-Tune Your LLM to Change Its Behavior. You Can Just… Steer It.
Learn how to steer an LLM's behavior without fine-tuning using activation steering, a technique that reshapes an AI's personality at runtime
Medium · Machine Learning
The Machine That Learns How to Learn: Adaption Labs’ AutoScientist and the Quiet Death of Prompt…
AutoScientist replaces human researchers in running experiments; learn how it works and what it implies
Medium · Machine Learning
35 ChatGPT Prompts for Chiropractors (That Actually Work in 2026)
Boost chiropractic practice efficiency with 35 actionable ChatGPT prompts for tasks like SOAP notes and patient education
Dev.to AI
Up next
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)
Watch →