Anthropic Computer Use - Hands On Tutorial

Sam Witteveen · Intermediate ·🧠 Large Language Models ·1y ago

Key Takeaways

The video demonstrates hands-on use of Anthropic computer use models and tools, including setting up a Docker container, customizing system prompts, and interacting with the computer using the Anthropic interface. Tools such as Docker, Bash, and Text Editor are utilized to access and utilize the Anthropic key, with applications in agent-based systems and reinforcement learning.

Full Transcript

okay since I recorded the first video about the computer use on anthropic they've now released code and details of how to do that and in this video I'm going to show you just a quick walk through of how you can get this set up a little bit about how it works stuff like that as we go through it so the docs are up for computer use you can see there's quite a strong warning and I will suggest at the start probably best not to run this like on your main computer unless you're going to run a containerized version which is what I'm going to show you in here but there's some good things in here about how it basically works and stuff like that you can see we're just instantiating the anthropic client just like normal but one of the key things that we're doing is we're passing in these tools for a computer with a display size so that it can scale where to click and stuff like that we've got a text editor and we' also got bash that we can use so these are tools that they've created to be able to work with this and we can see that they've got a nice explanation of how the computer use Works in here that Claude will decide which tool to use and then give a response back and then you'll basically use that tool on your virtual machine to get the results out now the way it works is it's doing screenshots to see what's on the screen and doing things like that if you watch the agent s video one of the things that it does seem that they're lacking is they're not using any of the accessibility features or anything they're just using display features or just using screenshots in this case so like I mentioned before they're heavily on you know emphasizing that you probably should go with a containerized version of this be careful if you're going to run it on your local machine I don't suggest you start out doing like that I could imagine that you will end up customizing containers to have exactly what you want running in them and then be able to run it at as you go through that so they've got a nice repo that they've given here with the the demo code and stuff like that so we can come in we can see the whole idea of using it as a loop an agentic kind of loop is key in that you want it to basically get a screenshot send to the model get an updated action for the tool Etc take that action then get a new screenshot and that requires a loop going on in here if we jump into this we can also see that they've got a very specific system prompt going on in here as well so you could I I guess run this on sort of virtual Windows machines in the cloud and in that case you'd want to change the system prompt and the other thing too is you can change the system prompt a little bit to customize it to the particular tasks that you want to use this for if we come in here and look at the tools we can actually see that it can take actions of key type Mouse move left click left click drag right click middle click double cck click there's a whole bunch of things in there which we would expect to see and they've got some logic in there for dealing with different screen sizes Etc now coming down to show you how I'm going to use it in here again we've got more cautions about using a dedicated machine using something like this we're going to basically run it with this Docker container so the cool thing is they've made a Docker container where you can just take your anthropic key put it in there and then run this Docker container and it will access it for you you can also do this on bedrock and you can also do this on gcp or on vertex AI in here and if you are doing it on those you will basically want to connect in to access this because we're doing it locally I'm just going to use Local Host 880 in here okay so I've copied over you know that code for setting up with Docker in here and you'll be able to see that okay it it I'm going to basically start off by setting the key so I could run this as a file but I'm just going to run it in here um I'm going to kill the key obviously after the video just showing you that okay first off you just want to you know export to set up your key for this and then you just very simply want to run the docker container now I've already run it so it won't need to be pulled down again if you're running it the first time it will will actually pull down all of the docker containers and the files needed Etc to to do this here we can see it's just basically starting up and going if I come in here and look at my Docker desktop you can see that I'm using a laptop at the moment so I didn't have Docker on this I've just installed it one of the easiest ways to do that is just to go to Docker desktop it's shows me that I've basically brought in these images already and I've just started that container so this is the container all up and running for us to use now to use it I basically just launch Local Host 8080 and sure enough you'll see that now we've got the clawed interface set up here and I can come in and type different things okay so you can see I can do a search for something like find me the docs for anthropic computer use let's see if it will do this so it's going to start off by taking screenshot the screenshot that we've got nothing open on here it will then decide to do something and it will do you know a few different things okay it's opened up Firefox in this case now one of the places that it does seem to get stuck occasionally is if you've got popups that it doesn't know about I'm currently traveling in Spain so some of the things come up in Spanish and that seems to perhaps confuse it a little bit okay so in this case it's basically hasn't found what we wanted let's see how it recovers from the 404 I guess I should point out that at this point that I also find it a pain to often find the anthropic docks I think they're not under there's no link on the homepage to find them easily ah now it's gone to to docs. anthropic okay so you can see that it got to the page I found the docs didn't find the computer use docks and then finally we got rate limited so it actually had to stop working because of that just quickly to show you in here you have you your your setup so you can change your anthropic key Etc you can allow it to send number of images or you can have a custom system prompt Etc as you go through this and can reset it to this open up bash and see how big is my hard drive okay so now I'm going to try to do something that's not a browser element and by the way you up here here you can basically toggle things on so you could actually set up some of the things on the computer and stuff okay now I'm asking it open gedit which is the text editor here and write me a sonnet about anthropics goals so let's see how okay so it's opened up the app uh that we can see here it's taking a screenshot and we can see sure enough there it's writing out our little Sonet and we've got a rate limit again there uh stopping us but let's see please fix the indenting so that each line of the Sonet has a new line okay we can see that that didn't work out very well but it was able to open this was able to save it you can see that okay it's decided to call it the anthropic sonnet and save it in there so it's pretty interesting to look at how could we use this for various different tasks both running sort of a local container on our machine but also running a container in the cloud that you could imagine has access to other coded agents and that it just pings the anthropic one every now and then to make some decisions about things as well all right so I'll leave it there for the video I'll put the links below so that you can get you know started with this I may do a custom video on the patreon about how to customize the docker container and look at different things you could do with that as always if you've got any questions or comments please put them in the comments below if you found the video useful please click like And subscribe and I will talk to you in the next video bye for now

Original Description

In this video, I go through hands-on how to use the Anthropic computer use models and tools. Explain how they work and also show how you can get it started with Docker on your own computer. For more tutorials on using LLMs and building agents, check out my Patreon Patreon: https://www.patreon.com/SamWitteveen Twitter: https://twitter.com/Sam_Witteveen Computer Use: https://www.anthropic.com/news/developing-computer-use Computer Use Docs: https://docs.anthropic.com/en/docs/build-with-claude/computer-use 👨‍💻Github: https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo 🕵️ Interested in building LLM Agents? Fill out the form below Building LLM Agents Form: https://drp.li/dIMes ⏱️Time Stamps: 00:00 Intro 00:17 Anthropic Docs 02:02 Anthropic Github Repo 03:50 Docker Setup 05:10 Testing Demo
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Sam Witteveen · Sam Witteveen · 0 of 60

← Previous Next →
1 LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
Sam Witteveen
2 LangChain Basics Tutorial #2 Tools and Chains
LangChain Basics Tutorial #2 Tools and Chains
Sam Witteveen
3 ChatGPT API Announcement & Code Walkthrough with LangChain
ChatGPT API Announcement & Code Walkthrough with LangChain
Sam Witteveen
4 Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference
Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference
Sam Witteveen
5 LangChain - Conversations with Memory (explanation & code walkthrough)
LangChain - Conversations with Memory (explanation & code walkthrough)
Sam Witteveen
6 LangChain Chat with Flan20B
LangChain Chat with Flan20B
Sam Witteveen
7 LangChain - Using Hugging Face Models locally (code walkthrough)
LangChain - Using Hugging Face Models locally (code walkthrough)
Sam Witteveen
8 PAL : Program-aided Language Models with LangChain code
PAL : Program-aided Language Models with LangChain code
Sam Witteveen
9 Building a Summarization System with LangChain and GPT-3 - Part 1
Building a Summarization System with LangChain and GPT-3 - Part 1
Sam Witteveen
10 Building a Summarization System with LangChain and GPT-3 - Part 2
Building a Summarization System with LangChain and GPT-3 - Part 2
Sam Witteveen
11 Microsoft's Visual ChatGPT using LangChain
Microsoft's Visual ChatGPT using LangChain
Sam Witteveen
12 Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo
Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo
Sam Witteveen
13 LangChain Agents - Joining Tools and Chains with Decisions
LangChain Agents - Joining Tools and Chains with Decisions
Sam Witteveen
14 Investigating Alpaca 7B - Finetuned LLaMa LLM
Investigating Alpaca 7B - Finetuned LLaMa LLM
Sam Witteveen
15 Comparing LLMs with LangChain
Comparing LLMs with LangChain
Sam Witteveen
16 Running Alpaca7B in Colab
Running Alpaca7B in Colab
Sam Witteveen
17 How to finetune your own Alpaca 7B
How to finetune your own Alpaca 7B
Sam Witteveen
18 How to make a custom dataset like Alpaca7B
How to make a custom dataset like Alpaca7B
Sam Witteveen
19 Understanding Constitutional AI - the paper and key concepts
Understanding Constitutional AI - the paper and key concepts
Sam Witteveen
20 Using Constitutional AI in LangChain
Using Constitutional AI in LangChain
Sam Witteveen
21 Talking to Alpaca with LangChain - Creating an Alpaca Chatbot
Talking to Alpaca with LangChain - Creating an Alpaca Chatbot
Sam Witteveen
22 Text-to-video-synthesis with Diffusers and Colab
Text-to-video-synthesis with Diffusers and Colab
Sam Witteveen
23 Meet Dolly the new Alpaca model
Meet Dolly the new Alpaca model
Sam Witteveen
24 Checking out the Cerebras-GPT family of models
Checking out the Cerebras-GPT family of models
Sam Witteveen
25 A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)
A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)
Sam Witteveen
26 Is GPT4All your new personal ChatGPT?
Is GPT4All your new personal ChatGPT?
Sam Witteveen
27 Raven - RWKV-7B RNN's LLM Strikes Back
Raven - RWKV-7B RNN's LLM Strikes Back
Sam Witteveen
28 Talk to your CSV & Excel with LangChain
Talk to your CSV & Excel with LangChain
Sam Witteveen
29 Vicuna - 90% of ChatGPT quality by using a new dataset?
Vicuna - 90% of ChatGPT quality by using a new dataset?
Sam Witteveen
30 Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍
Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍
Sam Witteveen
31 Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)
Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)
Sam Witteveen
32 BabyAGI: Discover the Power of Task-Driven Autonomous Agents!
BabyAGI: Discover the Power of Task-Driven Autonomous Agents!
Sam Witteveen
33 Auto-GPT - How to Automate a Task Based AI with GPT-4
Auto-GPT - How to Automate a Task Based AI with GPT-4
Sam Witteveen
34 Improve your BabyAGI with LangChain
Improve your BabyAGI with LangChain
Sam Witteveen
35 Generative Agents - Deep Dive and GPT-4 Recreation
Generative Agents - Deep Dive and GPT-4 Recreation
Sam Witteveen
36 GPT4ALLv2: The Improvements and Drawbacks You Need to Know!
GPT4ALLv2: The Improvements and Drawbacks You Need to Know!
Sam Witteveen
37 Dolly 2.0 by Databricks: Open for Business but is it  Ready to Impress!
Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!
Sam Witteveen
38 Red Pajama - Operation: Freeing LLaMA
Red Pajama - Operation: Freeing LLaMA
Sam Witteveen
39 Investigating Open Assistant - Models, Datasets and Addons
Investigating Open Assistant - Models, Datasets and Addons
Sam Witteveen
40 Investigating MiniGPT-4 - The Secret behind GPT-V?
Investigating MiniGPT-4 - The Secret behind GPT-V?
Sam Witteveen
41 Stable LM 3B - The new tiny kid on the block.
Stable LM 3B - The new tiny kid on the block.
Sam Witteveen
42 Bard can now code and put that code in Colab for you.
Bard can now code and put that code in Colab for you.
Sam Witteveen
43 Checking out Bark: a Text to Speech system by Suno AI
Checking out Bark: a Text to Speech system by Suno AI
Sam Witteveen
44 Fine-tuning LLMs with PEFT and LoRA
Fine-tuning LLMs with PEFT and LoRA
Sam Witteveen
45 Master PDF Chat with LangChain - Your essential guide to queries on documents
Master PDF Chat with LangChain - Your essential guide to queries on documents
Sam Witteveen
46 Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools
Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools
Sam Witteveen
47 Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)
Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)
Sam Witteveen
48 StableVicuna: The New King of Open ChatGPTs?
StableVicuna: The New King of Open ChatGPTs?
Sam Witteveen
49 WizardLM: Evolving Instruction Datasets to Create a Better Model
WizardLM: Evolving Instruction Datasets to Create a Better Model
Sam Witteveen
50 LaMini-LM - Mini Models Maxi Data!
LaMini-LM - Mini Models Maxi Data!
Sam Witteveen
51 Finding the Best Free ChatGPT
Finding the Best Free ChatGPT
Sam Witteveen
52 MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model
MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model
Sam Witteveen
53 LangChain Retrieval QA Over Multiple Files with ChromaDB
LangChain Retrieval QA Over Multiple Files with ChromaDB
Sam Witteveen
54 LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs
LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs
Sam Witteveen
55 LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!
LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!
Sam Witteveen
56 Transformers Agent - Is this Hugging Face's LangChain Competitor?
Transformers Agent - Is this Hugging Face's LangChain Competitor?
Sam Witteveen
57 StarCoder - The LLM to make you a coding star?
StarCoder - The LLM to make you a coding star?
Sam Witteveen
58 Testing Starcoder for Reasoning with PAL
Testing Starcoder for Reasoning with PAL
Sam Witteveen
59 The New Wizards - Unfiltered & Unaligned
The New Wizards - Unfiltered & Unaligned
Sam Witteveen
60 Camel + LangChain for Synthetic Data & Market Research
Camel + LangChain for Synthetic Data & Market Research
Sam Witteveen

This video teaches viewers how to use Anthropic computer use models and tools, including setting up a Docker container and customizing system prompts. The tutorial covers hands-on applications of Anthropic LLMs in agent-based systems and reinforcement learning. By following the steps outlined in the video, viewers can build custom system prompts, deploy Anthropic models using Docker, and integrate Anthropic LLMs with other tools.

Key Takeaways
  1. Run a Docker container to access the Anthropic key
  2. Set up the Docker container with the key
  3. Customize the system prompt for different tasks
  4. Use the feature in a loop to get a screenshot, send it to the model, and take an action
  5. Launch Local Host 8080 to access the Anthropic interface
  6. Search for documents and interact with the computer using the Anthropic interface
💡 The Anthropic computer use feature can be customized and integrated with other tools using Docker containers and system prompts, enabling a wide range of applications in agent-based systems and reinforcement learning.

Related AI Lessons

The 2026 AI Model Release Race: Every Major LLM Launch You Need to Know
Stay updated on the 2026 AI model release race, including major LLM launches like Claude Sonnet 5 and GPT-5.6, to leverage the latest advancements in AI technology
Dev.to AI
Call GPT, Claude, and Gemini from one API key — a 3-step setup
Access GPT, Claude, and Gemini through one API key with a 3-step setup using Modelishub
Dev.to AI
Your LLM Doesn’t Pick Stocks — It Remembers Them
Discover how LLMs remember stock picks rather than making actual predictions, and why this matters for AI-driven investment strategies
Medium · Machine Learning
Word Representation
Learn how word representation works in NLP and its importance in understanding human language, enabling applications like text classification and language translation
Medium · NLP

Chapters (5)

Intro
0:17 Anthropic Docs
2:02 Anthropic Github Repo
3:50 Docker Setup
5:10 Testing Demo
Up next
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)
Watch →