New fine-tuning of language models: Match meaning, not tokens

Microsoft Research · Advanced · 🧠 Large Language Models
Language models are usually trained to predict the next word, but that does not always lead to the best overall answers. We introduce energy-based fine-tuning, a new method that trains models to produce better full responses, leading to stronger results without the need for complex reward models or verifiers.

Project: https://energy-based-fine-tuning.github.io
Paper: https://arxiv.org/abs/2603.12248
GitHub: https://github.com/sjelassi/ebft_openrlhf

This session aired on May 14, 2026, at Microsoft Research Forum, Season 2, Episode 4.
Register for the series to hear about new releases: https://www.microsoft.com/en-us/research/event/microsoft-research-forum/?OCID=msr_researchforum_YTDescription
Explore all previous episodes: https://aka.ms/researchforumYTplaylist
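The description contrasts next-token training with objectives that score the full response. As a rough illustration only (this is not the paper's actual method; every function, name, and number below is hypothetical), here is a minimal sketch of a sequence-level, energy-style loss: each candidate response gets an energy equal to its negative total log-probability, and the loss rewards assigning lower energy to the preferred whole answer rather than to locally likely tokens.

```python
import math

def sequence_energy(token_logprobs):
    """Energy of a full response = negative total log-probability.
    Lower energy means the model prefers this whole answer."""
    return -sum(token_logprobs)

def sequence_level_loss(candidate_logprobs, good_idx):
    """Softmax over negative energies of whole candidate responses;
    the loss is the negative log-probability assigned to the preferred
    full response (a contrastive, sequence-level objective)."""
    scores = [-sequence_energy(lp) for lp in candidate_logprobs]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -(scores[good_idx] - log_z)

# Two hypothetical candidates, each a list of per-token log-probs.
good = [-0.1, -0.2, -0.1]   # coherent full answer
bad = [-0.05, -2.5, -0.05]  # locally likely tokens, weak overall answer

loss = sequence_level_loss([good, bad], good_idx=0)
print(round(loss, 4))  # ≈ 0.1051: the preferred answer already has lower energy
```

A token-level objective would only push up each next-token probability independently; a sequence-level objective like this one compares entire responses, which is the "match meaning, not tokens" framing in the title.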

Related AI Lessons

You Don’t Have to Fine-Tune Your LLM to Change Its Behavior. You Can Just… Steer It.
Learn how to steer an LLM's behavior without fine-tuning using activation steering, a technique that reshapes an AI's personality at runtime
Medium · Machine Learning
The Machine That Learns How to Learn: Adaption Labs’ AutoScientist and the Quiet Death of Prompt…
AutoScientist replaces human researchers in running experiments; learn how it works and what it implies
Medium · Machine Learning
35 ChatGPT Prompts for Chiropractors (That Actually Work in 2026)
Boost chiropractic practice efficiency with 35 actionable ChatGPT prompts for tasks like SOAP notes and patient education
Dev.to AI
Up next
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)
Watch →