Model Serving Systems: Containers, APIs & Scalability

Coursera · Beginner · 🏭 MLOps & LLMOps · 1mo ago

Key Takeaways

Deploys ML models using Docker containers, FastAPI, and ONNX for scalable model serving

Original Description

"Docker and Model Serving: Deploy ML APIs with FastAPI and ONNX is designed for ML engineers, MLOps practitioners, and backend developers who want to take models from notebooks to production. You'll learn to build Docker containers for ML workloads, design scalable REST APIs with FastAPI, serialize models with ONNX and SavedModel, and deploy with zero-downtime strategies like blue-green and canary releases.

The first module covers Docker fundamentals, image optimization, multi-stage builds, secrets management, and Docker Compose for multi-container ML apps. The second module focuses on REST API design with FastAPI, model versioning, input validation with Pydantic, structured logging, and production-grade error handling. The third module teaches scaling strategies: horizontal scaling, async queues, load balancing, batch vs. real-time inference, and latency optimization for high-throughput serving. The final module covers model serialization formats (ONNX, pickle, SavedModel), blue-green and canary deployments, automated rollback, and disaster recovery.

By the end of this course, you will:
- Build and optimize Docker images for ML models using multi-stage builds and Compose
- Design scalable FastAPI endpoints with versioning, validation, and observability
- Scale ML inference with async queues, load balancing, and latency optimization
- Deploy models with ONNX serialization and zero-downtime blue-green rollbacks"
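
To make the canary release and automated rollback idea from the description more concrete, here is a minimal sketch in plain Python. It is not taken from the course material: the `CanaryRouter` class, its field names, and the threshold values are all hypothetical, and a real deployment would usually do this at the load balancer or service mesh rather than in application code.

```python
import random
from dataclasses import dataclass, field


@dataclass
class CanaryRouter:
    """Hypothetical router: sends a fraction of traffic to a canary model
    version and rolls back to stable if the canary's error rate is too high."""

    canary_fraction: float = 0.1   # share of requests sent to the canary (assumed)
    max_error_rate: float = 0.2    # rollback threshold (assumed value)
    min_requests: int = 20         # canary requests required before judging
    canary_requests: int = 0
    canary_errors: int = 0
    rolled_back: bool = False
    rng: random.Random = field(default_factory=random.Random)

    def choose_version(self) -> str:
        """Pick which model version should serve the next request."""
        if self.rolled_back:
            return "stable"
        if self.rng.random() < self.canary_fraction:
            return "canary"
        return "stable"

    def record_result(self, version: str, ok: bool) -> None:
        """Track canary outcomes and trigger rollback past the threshold."""
        if version != "canary":
            return
        self.canary_requests += 1
        if not ok:
            self.canary_errors += 1
        error_rate = self.canary_errors / self.canary_requests
        if self.canary_requests >= self.min_requests and error_rate > self.max_error_rate:
            self.rolled_back = True


if __name__ == "__main__":
    # Simulate a canary that always fails while stable always succeeds.
    router = CanaryRouter(canary_fraction=0.5, rng=random.Random(0))
    for _ in range(200):
        version = router.choose_version()
        router.record_result(version, ok=(version == "stable"))
    print("rolled back:", router.rolled_back)
```

The sketch only judges the canary after `min_requests` observations so that a single early failure does not trigger a rollback; once `rolled_back` is set, all traffic returns to the stable version.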