Code a Vision LLM Agent that plays GeoGuessr using your PC (GPT-4o, Claude 3.5, and Gemini 1.5)

Enric Domingo - AI Engineering · Beginner · 🧠 Large Language Models · 1y ago
How to code an AI bot that plays GeoGuessr autonomously using multimodal vision LLMs. A Python + LangChain script takes screenshots of the game, passes them to the model, and clicks the guessed location in the mini-map region by taking control of the mouse (no human interaction needed!).
Code: https://github.com/enricd/geoguessr_ai_bot
Blog: https://medium.com/@enricdomingo/coding-a-geoguessr-autonomous-ai-bot-with-vision-llms-gpt-4o-claude-3-5-and-gemini-1-5-908faf3bc3c7
GeoGuessr Game: https://geoguessr.com
Poker AI Bot Video: https://www.youtube.com/watch?v=xb88cPyeNe0
In this tutorial, we walk through the process of develop…
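The screenshot, guess, and click loop described above can be sketched roughly as follows. This is not the code from the linked repository; it is a minimal illustration under stated assumptions. `MapRegion`, `parse_lat_lon`, `latlon_to_pixel`, and `play_round` are hypothetical names, the model is assumed to answer with a plain `lat, lon` pair, and the mini-map is assumed to show the whole world in Web Mercator inside a known screen rectangle (which will not hold once the map is zoomed or panned). Screen capture, the LLM call, and the mouse click are passed in as callables so the logic can be tried without touching the real screen.

```python
import math
import re
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class MapRegion:
    """Screen rectangle (in pixels) of the mini-map. Values are hypothetical."""
    left: int
    top: int
    width: int
    height: int


def parse_lat_lon(reply: str) -> Tuple[float, float]:
    """Extract the first 'lat, lon' pair from the model's text reply.

    Assumes the prompt asked the model to answer in that format.
    """
    match = re.search(r"(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)", reply)
    if match is None:
        raise ValueError(f"no 'lat, lon' pair found in reply: {reply!r}")
    lat, lon = float(match.group(1)), float(match.group(2))
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    return lat, lon


def latlon_to_pixel(lat: float, lon: float, region: MapRegion) -> Tuple[int, int]:
    """Project lat/lon to a screen pixel inside the mini-map.

    Assumption: the region shows the full world in Web Mercator.
    """
    # Web Mercator is undefined at the poles, so clamp the latitude.
    lat = max(min(lat, 85.0511), -85.0511)
    lat_rad = math.radians(lat)
    x_frac = (lon + 180.0) / 360.0
    y_frac = (1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) / 2.0
    x = region.left + round(x_frac * region.width)
    y = region.top + round(y_frac * region.height)
    return x, y


def play_round(
    take_screenshot: Callable[[], Any],
    ask_model: Callable[[Any], str],
    click: Callable[[int, int], None],
    region: MapRegion,
) -> Tuple[int, int]:
    """Run one guess: capture the screen, ask the model, click the mini-map."""
    image = take_screenshot()          # e.g. a wrapper around pyautogui.screenshot()
    reply = ask_model(image)           # e.g. a LangChain chat model with image input
    lat, lon = parse_lat_lon(reply)
    x, y = latlon_to_pixel(lat, lon, region)
    click(x, y)                        # e.g. pyautogui.click(x, y)
    return x, y
```

In a real run, `take_screenshot` and `click` could wrap `pyautogui.screenshot()` and `pyautogui.click(x, y)`, and `ask_model` would wrap whichever vision model is being used. Keeping those three as injected callables makes it easy to swap GPT-4o, Claude 3.5, or Gemini 1.5 without changing the loop.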