How to train Multi Agent Collaborative Agents with Reinforcement Learning (CTDE Explained)

Neural Breakdown with AVB · Beginner · 🤖 AI Agents & Automation · 5mo ago

Skills: Agent Foundations 90% · Multi-Agent Systems 90% · Tool Use & Function Calling 80% · Autonomous Workflows 80%

In this video, we train multi-agent navigation AI agents to collaborate in complex obstacle courses. We learn the basics of creating custom Reinforcement Learning environments, how to design observation spaces, action spaces, and reward functions, and the basics of local coordinate systems (LCS) in agentic systems. We then talk about Actor-Critic methods like A2C and PPO, and how to train agents using them. We discuss two multi-agent RL algorithms: Independent PPO (I-PPO) and the more advanced Multi-Agent PPO (MA-PPO). MA-PPO is inspired by MA-DDPG, which is a Centralized Training Decentralized Execution (CTDE) RL method. We learn why CTDE methods work well for training agents in multi-agent RL environments and why they can promote cooperative and emergent behaviours in RL agents.

The GitHub repo: https://github.com/avbiswas/navigation-mappo-rl

The longer code explainer video is available for Patreon members: https://www.patreon.com/posts/multi-agent-rl-145270524

Follow me on Twitter: https://x.com/neural_avb

To join our Patreon, visit: https://www.patreon.com/NeuralBreakdownwithAVB
Members get access to everything behind-the-scenes that goes into producing my videos, including code. Plus, it supports the channel in a big way and helps to pay my bills.

#machinelearning #reinforcementlearning #programming #devlog

Relevant videos:
Intro to Reinforcement Learning - https://youtu.be/Qpx6WD0qekQ
GRPO and reasoning LLMs - https://youtu.be/yGkJj_4bjpE
RL Playlist - https://www.youtube.com/playlist?list=PLGXWtN1HUjPfays8_pu4nQOW47Q6pzaGP

Useful papers:
- An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning (https://arxiv.org/abs/2409.03052)
- PPO paper (https://arxiv.org/pdf/1707.06347)
- MARL in Pytorch (https://docs.pytorch.org/rl/main/tutorials/multiagent_ppo.html)
- MA-DDPG (https://arxiv.org/abs/1706.02275)

Timestamps:
0:00 - Intro
2:17 - Creating RL environments
6:23 - Local Coordinate Sys
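The two ideas the description leans on most, local coordinate systems and CTDE, can be illustrated with a minimal sketch. This is not the code from the linked repo: the function names (`world_to_local`, `actor_input`, `critic_input`), the 2D setup, and the choice to build the critic input by concatenating every agent's observation are all assumptions made for illustration. The sketch shows an agent expressing a goal position in its own frame, and the difference between what a decentralized actor sees and what a centralized critic may see during training.

```python
import math


def world_to_local(agent_xy, agent_heading, point_xy):
    """Express a world-frame point in an agent's local frame (hypothetical helper).

    The local frame has its origin at the agent and its x-axis along the
    agent's heading (radians, counter-clockwise from the world x-axis).
    """
    dx = point_xy[0] - agent_xy[0]
    dy = point_xy[1] - agent_xy[1]
    cos_h, sin_h = math.cos(agent_heading), math.sin(agent_heading)
    # Rotate the world-frame offset by -heading to align it with the agent.
    return (cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy)


def actor_input(local_obs, agent_id):
    """Decentralized execution: each actor sees only its own observation."""
    return local_obs[agent_id]


def critic_input(local_obs):
    """Centralized training: the critic sees a joint view of all agents.

    Assumption: the joint view is a plain concatenation of per-agent
    observations in a fixed (sorted) agent order.
    """
    return [value for agent_id in sorted(local_obs) for value in local_obs[agent_id]]


if __name__ == "__main__":
    goal = (1.0, 3.0)
    # Two agents with different poses observe the same goal in their own frames.
    poses = {
        "agent_0": ((1.0, 1.0), math.pi / 2),  # facing +y, goal is straight ahead
        "agent_1": ((0.0, 3.0), 0.0),          # facing +x, goal is straight ahead
    }
    obs = {
        agent_id: list(world_to_local(xy, heading, goal))
        for agent_id, (xy, heading) in poses.items()
    }
    print("actor sees :", actor_input(obs, "agent_0"))
    print("critic sees:", critic_input(obs))
```

Because each observation is expressed relative to the agent, a single shared policy can in principle be reused across agents regardless of where they stand in the world, while the critic's wider view is only needed at training time.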

Chapters (3)

0:00 Intro

2:17 Creating RL environments

6:23 Local Coordinate Sys
