What is Reinforcement Learning from Human Feedback (RLHF)?
RLHF is a method for further fine-tuning LLMs: a reward model, trained on human preference data, is combined with a reinforcement-learning algorithm (usually PPO) to steer the model toward outputs that humans rate more highly.
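To make that loop concrete, here is a minimal toy sketch in PyTorch of a single reward-model + PPO step: sample a response from the current policy, score it with the reward model (penalized by the KL divergence from a frozen reference copy of the policy), and apply the PPO clipped-surrogate update. Every module, dimension, and hyperparameter here (`policy`, `reward_model`, `clip_eps`, `kl_coef`, etc.) is a hypothetical placeholder for illustration, not an actual RLHF implementation.

```python
# Minimal, illustrative RLHF sketch (toy sizes; all names are placeholders).
import torch
import torch.nn.functional as F

vocab_size, hidden = 16, 32

# Toy "policy" LM head: maps a hidden state to next-token logits.
policy = torch.nn.Linear(hidden, vocab_size)
# Frozen reference copy of the policy, used for the KL penalty.
reference = torch.nn.Linear(hidden, vocab_size)
reference.load_state_dict(policy.state_dict())
for p in reference.parameters():
    p.requires_grad_(False)

# Toy "reward model": scores a (state, action) pair with a scalar.
reward_model = torch.nn.Linear(hidden + vocab_size, 1)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
clip_eps, kl_coef = 0.2, 0.1  # hypothetical PPO hyperparameters

for step in range(100):
    state = torch.randn(8, hidden)  # stand-in for prompt representations

    # 1. Sample responses (here: single tokens) from the current policy.
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    old_log_prob = dist.log_prob(action).detach()

    # 2. Score samples with the reward model, minus a KL penalty that
    #    keeps the policy close to the frozen reference.
    action_onehot = F.one_hot(action, vocab_size).float()
    reward = reward_model(torch.cat([state, action_onehot], dim=-1)).squeeze(-1)
    ref_log_prob = torch.distributions.Categorical(
        logits=reference(state)).log_prob(action)
    advantage = (reward - kl_coef * (old_log_prob - ref_log_prob)).detach()

    # 3. PPO clipped-surrogate update on the policy.
    new_log_prob = torch.distributions.Categorical(
        logits=policy(state)).log_prob(action)
    ratio = torch.exp(new_log_prob - old_log_prob)
    loss = -torch.min(ratio * advantage,
                      torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantage).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The KL penalty is what distinguishes RLHF fine-tuning from naive reward maximization: it stops the policy from drifting into degenerate outputs that exploit the reward model while scoring well on it.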