Execute code with sandboxes for Deep Agents

LangChain · Intermediate · 🤖 AI Agents & Automation · 7mo ago

Skills: Agent Foundations 80% · Tool Use & Function Calling 70%

Key Takeaways

The video introduces Sandboxes for Deep Agents, a new set of integrations that allow users to safely execute arbitrary code and bash commands in remote sandboxes, using providers such as Runloop, Daytona, and Modal, and demonstrates how to use the Deep Agent CLI to execute code in a sandbox.

Full Transcript

Hey, I'm VC, and in this video I'm excited to introduce sandboxes for deep agents. We're going to talk about what these are and why you might want to use them when developing your deep agents.

So, a common setup is that you have your local machine running your deep agent. And a common ask that we hear is that you want to safely run the code your agent is generating, but you don't want to mess up the machine you're working on, because the agent could be generating arbitrary code. So what do we do to fix that?

Well, the first thing we do is figure out what we want our sandbox to have. So we might say, hey, in my sandbox, I'm working on this GitHub repo, so I want to pull that down, and I want to install these custom packages, maybe something from pip. So now we have a remote sandbox. Today we support three providers for our sandboxes: Runloop, Daytona, and Modal.

Great. So now we have our local machine running our deep agent, and we've connected to some remote sandbox. But what actually lives in that sandbox? Really it's two things. The first is a file system, similar to your local file system, where you can create files, edit them, and store them. The other is an execute tool, which functions as a remote shell. So you can run all of your generated code in the remote sandbox without having to worry about things getting messed up on your local machine.

Great. So now that we've set up the sandbox, how do we actually use it? The way it works is by exposing a tool call to execute the code. So on your local machine, when you're chatting with your deep agent, you might tell it to execute some code. What that actually does is take that command, go to the remote sandbox, and use the execute tool. So that will run the bash command in the remote shell. For example, say you want to run some Python script. It'll go and run it in the remote sandbox. And then the third step is that it takes the output from that script and sends it back to your deep agent running locally. That way your deep agent can operate in a loop: it makes a tool call, the tool call gets executed in the remote sandbox, and the agent can always see the output and decide what to do next alongside you.

Great. So that's sort of a diagram of how it works. Let's dive into a code example. We're going to go over an example using the Deep Agent CLI.

The first thing I wanted to show you is an example of some of the stuff that might live in the setup script we talked about before. This has a ton of stuff in it. The main thing I want you to take a look at is that all I really want to do is pass in my GitHub token and then pull down one of the repos I've been working on. For us, we're going to be doing some work in the deep agents repo, which lives in my GitHub.

Great. Once we do that, we can launch the Deep Agent CLI. You'll see a few things happen as this is running. The first thing we see is the ID of the Runloop sandbox that's created. The other thing it says is that the setup script I specified completed successfully. So now we can go do some work. I'll go off and do that and then show you what happened.

Okay. So let's review some of the work I did with the deep agent in the sandbox. The first thing I did was give it a task: go and read the deep agents folder and the README in there, just to make sure you understand the project. And then the task I actually gave it: I was trying to test out creating a new tool for a deep agent, and I wanted it to create the tool, create a sample for it, and then actually run the tool. So here we have "create a sample tool Python script." I give it the specification, which is to take in a JSON file and return all the top-level keys, and then also create a test and run that tool with Python.

After I give it this task, you can see it starts doing work in the sandbox. It lists some of its memories, which live on my local machine. Then it reads the deep agents folder that I pulled down from Git, reads the README, and starts to create the file I told it to, which is json_keys_tool.py. We can see the diff, so I can always review what it's creating. It looked good to me. Then it also had to create a test, so test_sample.json. Remember, all of this is happening in the sandbox file system.

Great. Then it had to actually execute that command. Again, this happens in the sandbox using the execute tool. So it runs that command, and it tells us the location is in the remote sandbox. It tells us, hey, I read the deep agents README, created the tool, and ran it, and it gives us a summary of the files it created.

Finally, I had it go and submit a PR for this. You can kind of do anything you want. You can push it somewhere else, or this can just be test code that you run. But again, all of this runs in the sandbox, so you can safely execute code there.

So, I hope that was a fun and useful demo of how you can use sandboxes to both execute code safely and do real work with your deep agents. As you can see here, the PR that we made is here; I can compare and PR it. If you want to learn more or just get straight into building, check out our deep agents repo. We're really excited to see some of the cool stuff that you build. Until next time, thanks.
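
The transcript does not show the code the agent wrote, so the following is only a minimal sketch of what a tool matching the stated specification (take in a JSON file, return all the top-level keys) might look like. The function names `top_level_keys` and `write_sample`, the file name, and the sample data are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def top_level_keys(json_path: str) -> list[str]:
    """Return the top-level keys of the JSON object stored at json_path."""
    with open(json_path, encoding="utf-8") as handle:
        data = json.load(handle)
    # Only a JSON object has keys; arrays and scalars are rejected.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return list(data.keys())


def write_sample(directory: str) -> str:
    """Write a small, made-up sample file and return its path."""
    sample = {"name": "example", "nested": {"inner": 1}, "items": [1, 2, 3]}
    path = Path(directory) / "test_sample.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    return str(path)


def demo() -> list[str]:
    """Create the sample in a temporary directory and run the tool on it."""
    with tempfile.TemporaryDirectory() as tmp:
        return top_level_keys(write_sample(tmp))
```

Calling `demo()` exercises both steps from the task: creating a sample file and running the tool against it. Nested keys such as `inner` are deliberately not returned, since the specification asks only for top-level keys.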

Original Description

We're excited to launch Sandboxes for DeepAgents, a new set of integrations that allow you to safely execute arbitrary code and bash commands in remote sandboxes. Your DeepAgent runs locally (or wherever you want), but when it needs to execute code, create files, or run commands, those operations happen in the remote sandbox.

- Learn more: http://blog.langchain.com/execute-code-with-sandboxes-for-deepagents/
- See the docs: https://github.com/langchain-ai/deepagents
- Learn how to build Deep Agents on LangChain Academy: https://academy.langchain.com/courses/deep-agents-with-langgraph/?utm_medium=social&utm_source=youtube&utm_campaign=q4-2025_youtube-academy-links_aw
- Observe, evaluate, and deploy agents with LangSmith: https://smith.langchain.com/?utm_medium=social&utm_source=youtube&utm_campaign=q4-2025_youtube-links_aw
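
The two things a sandbox exposes, a file system and an execute tool, can be sketched as a small interface. This is not the deepagents API or any provider's SDK: `LocalStandInSandbox`, `ExecuteResult`, and every method name below are hypothetical, and the stand-in uses a local temporary directory and a local POSIX shell, so it provides no isolation at all. It only illustrates the shape of the operations.

```python
import subprocess
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExecuteResult:
    """Combined output and exit code of one executed command."""
    output: str
    exit_code: int


class LocalStandInSandbox:
    """Hypothetical stand-in for a remote sandbox (NOT isolated, local only)."""

    def __init__(self) -> None:
        # A temporary directory plays the role of the sandbox file system.
        self._tmp = tempfile.TemporaryDirectory()
        self.root = Path(self._tmp.name)

    def write_file(self, relative_path: str, content: str) -> None:
        target = self.root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

    def read_file(self, relative_path: str) -> str:
        return (self.root / relative_path).read_text(encoding="utf-8")

    def execute(self, command: str) -> ExecuteResult:
        # A real integration would send the command to the remote shell;
        # here it runs in a local shell rooted at the sandbox directory.
        completed = subprocess.run(
            command,
            shell=True,
            cwd=self.root,
            capture_output=True,
            text=True,
            timeout=60,
        )
        return ExecuteResult(completed.stdout + completed.stderr, completed.returncode)

    def close(self) -> None:
        self._tmp.cleanup()
```

In a real setup, the same three operations would be backed by one of the supported providers, so that files and commands never touch the machine running the agent.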

This video shows how to use Sandboxes for Deep Agents to safely execute arbitrary code and bash commands in remote sandboxes, and demonstrates running code in a sandbox with the Deep Agent CLI. With this setup, the agent can run locally while the code it generates executes in the sandbox.

Key Takeaways

Set up a remote sandbox using a provider such as Runloop, Daytona, or Modal
Configure the sandbox with a GitHub repo and pip packages
Use the Deep Agent CLI to execute code in the sandbox
Create and run a Python script in the sandbox
Create a test and run it in the sandbox
Submit a PR for the code changes
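
Each of these steps runs through the loop the video describes: the agent issues a command as a tool call, the command runs in the sandbox, and the output comes back so the agent can decide what to do next. Below is a minimal sketch of that loop. The names `agent_loop`, `scripted_policy`, and `run_in_stand_in_sandbox` are hypothetical, the "agent" is a fixed script rather than a model, and the executor is a local POSIX shell standing in for a remote sandbox.

```python
import subprocess
from typing import Callable, Optional

# One history entry is a (command, output) pair.
History = list[tuple[str, str]]


def run_in_stand_in_sandbox(command: str) -> str:
    """Stand-in for the remote execute tool: runs locally, with no isolation."""
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=60
    )
    return completed.stdout + completed.stderr


def agent_loop(
    decide_next_command: Callable[[History], Optional[str]],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> History:
    """Alternate between choosing a command and executing it in the sandbox."""
    history: History = []
    for _ in range(max_steps):
        command = decide_next_command(history)
        if command is None:  # The agent has nothing more to run.
            break
        output = execute(command)
        # The output is recorded so the next decision can take it into account.
        history.append((command, output))
    return history


def scripted_policy(history: History) -> Optional[str]:
    """Toy replacement for the model: replays a fixed plan, one step per call."""
    planned = ["echo step-one", "echo step-two"]
    return planned[len(history)] if len(history) < len(planned) else None
```

Passing `execute` in as a parameter is the design point: the loop does not care where commands run, so swapping the local stand-in for a remote sandbox would not change the loop itself.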

💡 Sandboxes for Deep Agents allow users to safely execute arbitrary code and bash commands in remote sandboxes, enabling the building and deployment of AI agents with sandboxed code execution.
