Evaluation of LLM Applications: How Do You Know It Actually Works?
Skills: RAG Evaluation
Join us for a practical webinar on LLM evaluation frameworks and strategies for measuring the quality, reliability, and performance of AI applications, including chatbots, AI agents, and RAG systems.
💡 What we’ll cover:
• Hallucinations, prompt sensitivity, and hidden failure modes
• Human evaluation vs. automated evaluation
• Benchmark testing and regression workflows
• Evaluating chatbots, AI agents, summarization, and RAG systems
• Introduction to RAGAS and key LLM evaluation metrics (see the sketch after this list)
• Measuring faithfulness, relevance, groundedness, and latency
• Monitoring LLM applications in production
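To give a flavor of the RAGAS portion, here is a minimal sketch of scoring a RAG pipeline. This is not the webinar's exact code: it follows the v0.1-style RAGAS API (which changes between releases), the sample record is invented, and an OpenAI API key is assumed because RAGAS scores with an LLM judge by default.

```python
# Minimal RAGAS sketch (v0.1-style API; details vary by version).
# Assumes OPENAI_API_KEY is set; the sample record below is made up.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One evaluation record: the user question, the generated answer,
# and the retrieved context chunks the answer should be grounded in.
data = {
    "question": ["What is the capital of France?"],
    "answer": ["The capital of France is Paris."],
    "contexts": [["Paris is the capital and largest city of France."]],
}
dataset = Dataset.from_dict(data)

# faithfulness: is every claim in the answer supported by the contexts?
# answer_relevancy: does the answer actually address the question?
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)  # e.g. {'faithfulness': 1.0, 'answer_relevancy': 0.98}
```

Each metric yields a score between 0 and 1, which is what makes these runs trackable over time in a benchmark or regression workflow.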
🛠 Hands-on exercise included:
Participants will evaluate a small LLM/RAG assistant using structured rubrics and compare human evaluation with automated RAGAS scores.
Perfect for AI engineers, developers, data scientists, and technical leaders working with LLM applications and AI systems.
Watch on YouTube ↗