Let's deploy a custom AI model container as an autoscaling API on your private cloud in 10 minutes
I show how to deploy a container that has already been built and published to any container registry as an autoscaling API for AI models. When idle, the server can scale to zero (serverless, i.e. zero cost), and it scales back up on demand as requests come in.
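To make the scale-to-zero idea concrete, here is a minimal sketch of how an autoscaler might pick a replica count from the current load. It is not the platform's actual algorithm: the function name `desired_replicas`, the per-replica target of 10 in-flight requests, and the replica limits are all assumptions chosen for illustration.

```python
import math


def desired_replicas(in_flight, target_per_replica=10, min_replicas=0, max_replicas=5):
    """Return how many replicas to run for the current number of in-flight requests.

    All parameters are hypothetical defaults, not values from the video.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Enough replicas so that each one handles at most `target_per_replica` requests.
    needed = math.ceil(in_flight / target_per_replica)
    # Clamp to the configured range; min_replicas=0 is what allows scale to zero.
    return max(min_replicas, min(max_replicas, needed))


for load in (0, 1, 25, 500):
    print(load, "->", desired_replicas(load))
```

With `min_replicas=0`, an idle server (no in-flight requests) drops to zero replicas, and the first incoming request brings one replica back. Raising `max_replicas` is the knob that would let the same rule absorb high traffic.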
Chapters (19)
0:30 - Find the Deploy API button
0:43 - Find your private or public container
0:52 - Authenticate the container
1:11 - Configure autoscaling (serverless, scale to zero)
1:30 - Configure autoscaling to handle high traffic
1:45 - Advanced settings (for Kubernetes experts)
1:57 - Choose the cloud account
2:30 - Add container commands
2:48 - Add a health check (optional)
3:00 - Configure autoscaling rules (optional)
3:20 - Deploy the model
3:40 - How to share (and collaborate)
4:00 - How to monitor requests, uptime, latency
4:05 - Analyze cold start
5:00 - View audit events
5:20 - Release a new container version
5:50 - View replica telemetry
6:22 - Build a request to test the server
7:20 - Summary, get a free account with pay-as-you-go
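For the "build a request to test the server" step, a test request could be put together along these lines. This is only a sketch: the base URL, the `/predict` path, the JSON payload shape, and the bearer-token header are all hypothetical, so substitute whatever your deployed container actually expects.

```python
import json
import urllib.request


def build_test_request(base_url, payload, token=None):
    """Build (but do not send) a JSON POST request for a deployed model API.

    The "/predict" path and bearer-token auth are assumptions for illustration.
    """
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + "/predict"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint; replace with the URL of your own deployment.
req = build_test_request("https://example.invalid/my-model", {"inputs": [1, 2, 3]})
print(req.get_method(), req.full_url)

# To actually send it, uncomment the lines below. A generous timeout helps when
# the server has scaled to zero and the first request has to wait for a cold start.
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```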
DeepCamp AI