Model Serving Systems: Containers, APIs & Scalability

Free to audit on Coursera

Coursera · Beginner · 🏭 MLOps & LLMOps · 1 month ago

Skills: Model Deployment 90% · API Design 80%

Key Takeaways

Deploys ML models using Docker containers, FastAPI, and ONNX for scalable model serving

Original Description

"Docker and Model Serving: Deploy ML APIs with FastAPI and ONNX is designed for ML engineers, MLOps practitioners, and backend developers who want to take models from notebooks to production. You'll learn to build Docker containers for ML workloads, design scalable REST APIs with FastAPI, serialize models with ONNX and SavedModel, and deploy with zero-downtime strategies like blue-green and canary releases.

The first module covers Docker fundamentals, image optimization, multi-stage builds, secrets management, and Docker Compose for multi-container ML apps. The second module focuses on REST API design with FastAPI, model versioning, input validation with Pydantic, structured logging, and production-grade error handling. The third module teaches scaling strategies: horizontal scaling, async queues, load balancing, batch vs. real-time inference, and latency optimization for high-throughput serving. The final module covers model serialization formats (ONNX, pickle, SavedModel), blue-green and canary deployments, automated rollback, and disaster recovery.

By the end of this course, you will:

- Build and optimize Docker images for ML models using multi-stage builds and Compose
- Design scalable FastAPI endpoints with versioning, validation, and observability
- Scale ML inference with async queues, load balancing, and latency optimization
- Deploy models with ONNX serialization and zero-downtime blue-green rollbacks"
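
To make the canary release and automated rollback idea from the description concrete, here is a minimal standard-library sketch. It is not taken from the course: the class name `CanaryRouter`, its fields, and the threshold values are all hypothetical, and a real deployment would usually do this at the load balancer or service mesh rather than in application code.

```python
import random
from dataclasses import dataclass, field


@dataclass
class CanaryRouter:
    """Sends a fraction of requests to a canary model version.

    Hypothetical illustration: names and thresholds are assumptions,
    not part of the course material or any library.
    """
    canary_fraction: float = 0.1   # share of traffic sent to the canary
    max_error_rate: float = 0.2    # rollback threshold (assumed value)
    min_requests: int = 20         # canary calls required before judging health
    canary_calls: int = 0
    canary_errors: int = 0
    rolled_back: bool = False
    rng: random.Random = field(default_factory=random.Random)

    def choose(self) -> str:
        """Pick which model version serves the next request."""
        if self.rolled_back:
            return "stable"
        return "canary" if self.rng.random() < self.canary_fraction else "stable"

    def record(self, version: str, ok: bool) -> None:
        """Record a request outcome; roll back if the canary looks unhealthy."""
        if version != "canary":
            return
        self.canary_calls += 1
        if not ok:
            self.canary_errors += 1
        if self.canary_calls >= self.min_requests:
            if self.canary_errors / self.canary_calls > self.max_error_rate:
                self.rolled_back = True  # all later traffic goes to "stable"


# Simulate a canary that fails every request it receives.
router = CanaryRouter(canary_fraction=0.5, rng=random.Random(0))
for _ in range(200):
    version = router.choose()
    router.record(version, ok=(version == "stable"))
```

Once the canary has served `min_requests` calls and its error rate exceeds `max_error_rate`, the router stops selecting it, which is the automated rollback step. A blue-green switch would be the special case where `canary_fraction` jumps from 0 to 1 in a single step.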

More on: Model Deployment

Tutorial 11- How To Deploy End To End ML Projects In Production AWS Cloud Using CI CD Pipeline

Use Amazon SageMaker with PyTorch (Hebrew)

Automate, Evaluate and Deploy ML Models Confidently

Introducing LangSmith Studio and Deployment for LangGraph.js

Ryan Herr - After model.fit, before you deploy| JupyterCon 2020

Deploy & Optimize ML Services Confidently

Related Reads

A Phased Blueprint for Migrating From Google Workspace to Microsoft 365

Learn a step-by-step approach to migrating from Google Workspace to Microsoft 365 with minimal downtime and zero data loss, treating the migration as an infrastructure engineering challenge

Feature Freshness: The Forgotten Problem of MLOps

Learn how outdated features can cause production models to fail, and why feature freshness is crucial to model performance and reliability in MLOps

Day 19 of the 100 Days of MLOps Challenge

Learn to build a complete DVC ML pipeline with remote storage and experiments to streamline your machine learning workflow and improve collaboration

Medium · DevOps

From Critical Infrastructure to AI Factories: Building an AI Operations Copilot on Nebius…

Learn how to build an AI operations copilot by leveraging experience in critical infrastructure and AI-assisted engineering, and why it matters for efficient AI deployment

Pole Pruner How A Rope Lever Shears High Branches

Innoforge Studio