Structured Output from LLMs: Grammars, Regex, and State Machines

Efficient NLP · Beginner · 🧠 Large Language Models · 1y ago
Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io

Structured outputs are essential for applications that rely on LLMs to make decisions in downstream tasks. In this video, I explain how structured output generation works, a topic that is both highly relevant and an area of active research. First, we look at the OpenAI API's ability to produce structured outputs from schemas defined with Pydantic or Zod. For open-source alternatives, I cover the Outlines library, which operates using state machines and regex under the hood. However, in many cases, we need to …
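To make the "state machines and regex" idea concrete, here is a minimal sketch of regex-constrained token filtering. It is not the Outlines implementation: the DFA is hand-built for the regex `[0-9]+\.[0-9]+`, and the vocabulary and the helper names `advance` and `allowed_tokens` are hypothetical. The point it illustrates is that a token may span several characters, so each candidate token is walked through the state machine character by character, and only tokens that keep the machine alive would be left unmasked at the next decoding step.

```python
# Hand-built DFA for the regex [0-9]+\.[0-9]+ (a simple decimal number).
# States: 0 = start, 1 = integer digits, 2 = just saw '.', 3 = fraction digits.
# This table is an illustrative assumption, not generated by any library.
DIGITS = "0123456789"
TRANSITIONS = {}
for d in DIGITS:
    TRANSITIONS[(0, d)] = 1
    TRANSITIONS[(1, d)] = 1
    TRANSITIONS[(2, d)] = 3
    TRANSITIONS[(3, d)] = 3
TRANSITIONS[(1, ".")] = 2
ACCEPTING = {3}  # generation may stop only in an accepting state


def advance(state, token):
    """Feed a (possibly multi-character) token through the DFA.

    Returns the resulting state, or None if any character has no transition.
    """
    for ch in token:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return None
    return state


def allowed_tokens(state, vocab):
    """Tokens that keep the DFA alive from `state`; all others would be masked."""
    return [tok for tok in vocab if advance(state, tok) is not None]


# Toy vocabulary standing in for an LLM tokenizer's vocabulary.
VOCAB = ["1", "42", ".", ".5", "3.14", "abc", "7."]

print(allowed_tokens(0, VOCAB))  # at the start, a digit must come first
print(allowed_tokens(2, VOCAB))  # right after '.', only digit tokens survive
```

In a real decoder the allowed set would be turned into a mask over the logits before sampling, and the state would be updated with `advance` after each sampled token.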

Chapters (13)

0:00 Introduction
1:06 OpenAI API example
3:02 Outlines library example
4:07 Pydantic to regex conversion
4:57 Finite state machines and regex
5:58 Regex matching with LLMs
8:41 Context free grammars
9:40 Incremental parsing of CFGs
11:22 Pushdown automata
12:18 Token-terminal mismatch problem
14:26 Vocabulary-aligned subgrammars
15:12 State machine composition
16:06 Format restriction and LLM performance