Teaching PyTorch To Read Your Worst PDFs With Docling - Mingxuan Zhao & Peter Staar, IBM & Carol Chen, Red Hat
Building production RAG pipelines starts with a problem most teams underestimate: getting clean, structured data out of real-world documents. PDFs lose table structure, figures get separated from captions, and multi-column layouts become unreadable. Before your PyTorch models even see your data, crucial information is already lost.
Docling is an open-source, MIT-licensed document parsing library that uses PyTorch-based deep learning models to understand documents the way humans read them. It preserves hierarchy, extracts structured data from tables and figures, and supports over ten common file formats through a consistent API. Because everything runs locally, it integrates cleanly into PyTorch-native workflows with low latency and no data leaving your infrastructure.
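As a rough illustration of the "consistent API" described above, here is a minimal sketch of converting one document to Markdown. It assumes the `docling` package is installed and uses its `DocumentConverter` class; the function name `pdf_to_markdown` and any file path passed to it are hypothetical, chosen only for this example.

```python
from __future__ import annotations


def pdf_to_markdown(source: str) -> str:
    """Convert a local file path or URL to Markdown using Docling.

    `pdf_to_markdown` is a hypothetical helper for this sketch,
    not a function provided by Docling itself.
    """
    # Imported lazily so this module can be loaded even on a machine
    # where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # parsing happens on the local machine
    # The parsed document keeps headings and tables; export it as Markdown.
    return result.document.export_to_markdown()
```

Calling `pdf_to_markdown("report.pdf")` (a hypothetical path) would return the document as a Markdown string, with tables and headings rendered in Markdown syntax.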
In this talk, I'll walk through Docling's PyTorch-powered architecture and show how to build document processing pipelines for RAG and other GenAI applications. I'll also cover the architecture of real-world applications built on Docling and how it has improved their workflows. You'll leave with practical patterns for connecting Docling to your own PyTorch-based GenAI stack.
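One pattern a RAG pipeline of this kind might use is splitting the parsed Markdown into heading-scoped chunks, so each chunk carries its place in the document hierarchy. The sketch below is plain Python and is not a Docling API; `Chunk` and `chunk_markdown` are hypothetical names, and the input is assumed to be Markdown text such as a document parser might export.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches ATX-style Markdown headings such as "## Results".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    heading_path: tuple[str, ...]  # e.g. ("Report", "Results")
    text: str                      # body text found under that heading


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per heading, keeping the heading trail.

    Simplification: a line starting with '#' inside a fenced code block
    would also be treated as a heading.
    """
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    buffer: list[str] = []

    def flush() -> None:
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(Chunk(tuple(title for _, title in path), body))
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each `Chunk` can then be embedded and indexed with its `heading_path` stored as metadata, so a retrieved table or paragraph can be traced back to the section it came from.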