Evaluating AI Systems | Trends in AI - May 2025

Uploaded: 2025-05-10T04:08:53+00:00
Channel: Zeta Alpha


Join us for the Zeta Alpha "Trends in AI" webinar on Friday, May 9th at 8 AM PST / 5 PM CEST, live from LAB42 in Amsterdam, and online from San Francisco and around the globe. This month, we'll cover everything related to AI evaluations, from public benchmarking of LLMs and relevance metrics for RAG to popular evaluation libraries and the nuances of using the LLM-as-a-Judge approach for automated, continuous assessment of AI system performance. As always, we'll discuss recent model releases like Gemini 2.5 Flash, GPT-4.1, o3 & o4-mini, Qwen 3, and more, along with the most notable developmen…

