Execute code with sandboxes for Deep Agents

LangChain · Intermediate · 🤖 AI Agents & Automation · 7mo ago

Key Takeaways

The video introduces Sandboxes for Deep Agents, a new set of integrations for safely executing arbitrary code and bash commands in remote sandboxes. The supported providers are Runloop, Daytona, and Modal. The video also demonstrates running code in a sandbox with the Deep Agent CLI.

Full Transcript

Hey, I'm VC, and in this video I'm excited to introduce sandboxes for deep agents. We're going to talk about what these are and why you might want to use them in developing your deep agents.

So, a common setup is that you have your local machine running your deep agent. And a common ask that we hear is that you want to safely run the code that your agent is generating, but you don't want to mess up the machine that you're working on, because the agent could be generating arbitrary code. So what do we do to fix that?

Well, the first thing we do is figure out what we want our sandbox to have. So we might say, hey, in my sandbox, I'm working on this GitHub repo, so I want to pull that down, and I want to install these custom packages, maybe something from pip. So now we have a remote sandbox. Today we support three providers for our sandboxes: Runloop, Daytona, and Modal.

Great. So now we have our local machine running our deep agent, and we've connected to some remote sandbox. But what actually lives in that sandbox? Really it's two things. The first is a file system, similar to your local file system, where you can create files, edit them, and store them. The other is an execute tool, which functions as a remote shell. So you can run all of your generated code in the remote sandbox without having to worry about things getting messed up on your local machine.

Great. So now that we've set up the sandbox, how do we actually use it? Well, the way that it works is by exposing a tool call to execute the code. So on your local machine, when you're chatting with your deep agent, you might tell it to execute some code. What that actually does is take that command, go to the remote sandbox, and use the execute tool. So that will run the bash command in the remote shell. For example, say you want to run some Python script.
So it'll go and run this over here in the remote sandbox. And then the third step is it'll take the output from that script and send it back to your deep agent running here. So that way your deep agent can operate this loop where it does a tool call, and that tool call gets executed in the remote sandbox, but it can always see all the outputs and make a decision on what to do next alongside you.

Great. So this is sort of a diagram of how it works. Let's dive into a code example. We're going to be going over an example using the deep agent CLI. So, the first thing I wanted to show you is an example of some of the stuff that might live in that setup script that we talked about before. This has a ton of stuff in it. The main thing I want you to take a look at is that all I really want to do is pass in my GitHub token and then pull down one of the repos I've been working on. For us, we're going to be doing some work in the deep agents repo, which lives in my GitHub.

Great. Once we do that, we can launch the deep agent CLI. You'll see a few things happen as this is running. The first thing we see is that it tells us the ID of the Runloop sandbox that's created. The other thing it says is that the setup script I specified completed successfully. So now we can go do some work. I'll go off and do that and then show you what happened.

Okay. So let's go and review some of the work that I did with the deep agent in the sandbox. The first thing I did was give it a task, which was: go and read the deep agents folder, read the README in there, and just make sure you understand the project. And then the task I actually gave it: I was trying to test out creating a new tool for a deep agent, and what I wanted was to have it create the tool, create a sample for it, and then actually run the tool. So here we have "create a sample tool Python script."
I give it the specification, which is: take in a JSON file, return all the top-level keys, then also create a test, and then run that tool with Python. After I give it this task, you can see it goes and starts doing work in the sandbox. It lists some of its memories, which live on my local machine. Then it reads the deep agents folder that I pulled down from Git, reads the README, and starts to create the file that I told it to, which is JSON keys tool.py. We can see the diff, so I can always go and review what it's creating. It looked good to me.

And then what I had it do was also create a test, so test sample.json. Remember, all of this is happening in the sandbox file system. Great. So then what I had it do was go and actually execute that command. Again, this is happening in the sandbox using the execute tool. So it goes and runs that command, and it tells us the location is in the remote sandbox. It tells us, hey, I read the deep agents README, created the tool, and then ran it, and overall it gives us a summary of the files that were created.

Finally, what I had it do was go and just submit a PR for this. You can kind of do anything you want. You can push it somewhere else. This can just be test code that you run. But again, all of this runs in the sandbox, so you can safely execute code there.

So, I hope that was a fun and useful demo of how you can use sandboxes to both execute code safely and do real work with your deep agents. As you can see here, the PR that we made is here. I can compare and PR it. If you want to learn more or just get straight into building, check out our deep agents repo. We're really excited to see some of the cool stuff that you build. Until next time, thanks.
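
The tool specification given in the demo (take in a JSON file and return all the top-level keys) is small enough to sketch. The code the agent actually wrote is not shown in the transcript, so the following is only an illustrative guess at what such a tool could look like. The function names `top_level_keys` and `demo` and the sample data are assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def top_level_keys(json_path):
    """Return the top-level keys of the JSON object stored at json_path."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Only a JSON object has keys; arrays and scalars do not.
        raise ValueError("expected a JSON object at the top level")
    return list(data)


def demo():
    """Write a small sample file (hypothetical data), then run the tool on it."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.json"
        sample.write_text(
            json.dumps({"name": "demo", "version": 1, "tags": ["a", "b"]}),
            encoding="utf-8",
        )
        return top_level_keys(sample)


if __name__ == "__main__":
    print(demo())  # ['name', 'version', 'tags']
```

In the workflow from the video, both the tool file and the sample JSON would be created in the sandbox file system, and the script would be run through the execute tool rather than on the local machine.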

Original Description

We're excited to launch Sandboxes for DeepAgents, a new set of integrations that allow you to safely execute arbitrary code and bash commands in remote sandboxes. Your DeepAgent runs locally (or wherever you want), but when it needs to execute code, create files, or run commands, those operations happen in the remote sandbox.
- Learn more: http://blog.langchain.com/execute-code-with-sandboxes-for-deepagents/
- See the docs: https://github.com/langchain-ai/deepagents
- Learn how to build Deep Agents on LangChain Academy: https://academy.langchain.com/courses/deep-agents-with-langgraph/?utm_medium=social&utm_source=youtube&utm_campaign=q4-2025_youtube-academy-links_aw
- Observe, evaluate, and deploy agents with LangSmith: https://smith.langchain.com/?utm_medium=social&utm_source=youtube&utm_campaign=q4-2025_youtube-links_aw

This video shows how to use Sandboxes for Deep Agents to safely execute arbitrary code and bash commands in remote sandboxes, and demonstrates running code in a sandbox with the Deep Agent CLI.

Key Takeaways
  1. Set up a remote sandbox using a provider such as Runloop, Daytona, or Modal
  2. Configure the sandbox with a GitHub repo and pip packages
  3. Use the Deep Agent CLI to execute code in the sandbox
  4. Create and run a Python script in the sandbox
  5. Create a test and run it in the sandbox
  6. Submit a PR for the code changes
💡 Sandboxes for Deep Agents allow users to safely execute arbitrary code and bash commands in remote sandboxes, enabling the building and deployment of AI agents with sandboxed code execution.
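
The loop described in the video (the agent issues a tool call, the command runs in the sandbox, and the output comes back to the agent) can be sketched roughly as follows. This is not the deepagents API or any provider's SDK. `StandInSandbox`, `ExecuteResult`, and `run_script_in_sandbox` are hypothetical names, and a temporary directory plus a local subprocess stand in for the remote file system and remote shell. The stand-in provides no isolation at all, so it only illustrates the shape of the interaction.

```python
import shlex
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExecuteResult:
    output: str     # combined stdout and stderr
    exit_code: int


class StandInSandbox:
    """Hypothetical stand-in for a remote sandbox: a file system plus an execute tool.

    A real provider would run commands on a remote machine. Here a temporary
    directory and a local subprocess play that role, with NO isolation.
    """

    def __init__(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.root = Path(self._tmp.name)

    def write_file(self, relative_path, content):
        # Files are created inside the sandbox root, not the caller's working directory.
        target = self.root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

    def execute(self, command, timeout=30):
        # Run a shell command with the sandbox root as the working directory.
        proc = subprocess.run(
            command, shell=True, cwd=self.root,
            capture_output=True, text=True, timeout=timeout,
        )
        return ExecuteResult(proc.stdout + proc.stderr, proc.returncode)

    def close(self):
        self._tmp.cleanup()


def run_script_in_sandbox(sandbox, name, source):
    """One turn of the loop: write a script, execute it, hand the output back."""
    sandbox.write_file(name, source)
    command = f"{shlex.quote(sys.executable)} {shlex.quote(name)}"
    return sandbox.execute(command)


if __name__ == "__main__":
    sandbox = StandInSandbox()
    try:
        result = run_script_in_sandbox(
            sandbox, "hello.py", "print('hello from the sandbox')"
        )
        print(result.exit_code, result.output.strip())
    finally:
        sandbox.close()
```

Returning both the output and the exit code mirrors the point made in the video: the agent sees what happened in the sandbox and can decide what to do next.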
