
DevOps & Cloud

CI/CD, Docker, Kubernetes, AWS, GCP, Azure, monitoring and infrastructure as code

1,360 lessons
Skills in this topic
Linux & CLI (beginner): Navigate the filesystem, manage permissions, and use pipes
Docker & Containers (beginner): Write a production-ready Dockerfile
Cloud Fundamentals (intermediate): Deploy a web app on AWS EC2 or App Engine
Kubernetes (intermediate): Deploy a multi-container app on a k8s cluster
CI/CD Pipelines (intermediate): Build a CI pipeline that runs tests on every PR
Infrastructure as Code (advanced): Provision a full VPC with Terraform
All Reads (907) · Articles (468) · Blog Posts (321) · Tutorials (114) · News (4)
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
When Platforms Optimise for Control
How internal platforms drift from enabling engineers to constraining them Continue reading on Medium »
Dev.to · JOSE PRAVEEN ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Deploy a Secure Containerized App on Amazon ECS Fargate Using ECR and Secrets Manager
Deploy a Secure Containerized App on Amazon ECS Fargate Using ECR and Secrets Manager In...
Dev.to · Vektor Memory ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Turning You Into a Power User: Hybrid Memory, SSH Cloak, and Password Vaulting With VEKTOR
A 10-minute tutorial that covers how we manage servers, store AES-256 secrets, and maintain...
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Moment the Jaeger Tracer Exhausted Itself and What We Switched To
The Problem We Were Actually Solving Our treasure-hunt engine at Veltrix was not exploding; it was...
Medium · AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
GPU Scheduling in Kubernetes Explained: What Actually Works for AI and High-Performance Workloads
Running CPU workloads in Kubernetes is easy. Continue reading on AegisOps »
Dev.to · Luis Eduardo Lunar Guevara ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
SCS-Lab1 — CloudTrail: Trail + S3 + KMS + Log Validation
Region: us-east-1 · Estimated duration: 35–55 minutes · Cost risk: Medium · Certification: AWS Certified...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Linux Process Management & Cron Jobs — Monitor, Control, and Automate Like a DevOps Engineer
Two skills that keep production servers alive and running on schedule Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
RabbitMQ Cluster with Quorum Queues — DevOps Zero to Hero Guide
RabbitMQ is widely used for communication between applications and microservices. Running RabbitMQ on a single server can create a single… Continue reading on M
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
9 Commands I Run in the First 3 Minutes of Every Production Incident
After 11 years on-call, these are the only ones that matter when the pager wakes me up. Continue reading on AWS in Plain English »
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Default Config Is a Slow Config
The Problem We Were Actually Solving Looking back, we were trying to solve the wrong...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Stop Treating Your Code as Immutable: The Art of the Rollback
If you can’t push the “undo” button in under 60 seconds, you aren’t deploying software — you’re playing Russian roulette with your users. Continue reading on St
Dev.to · Schiff Heimlich ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
SSH Login Delays: The 10-Second Wait That Drives Us Crazy
The Problem Every sysadmin has been there: you SSH into a server and wait... and wait......
Medium · Python ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
When Nginx Timeouts Weren’t Nginx: Debugging Socket Leak in Production
A detective story about a silent application killer, the TCP lifecycle, and why your infrastructure metrics might be lying to you. Continue reading on Medium »
Medium · LLM ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Harness Engineering, Part 2: The Architecture of a Modern Delivery System
A 10-part series on building software delivery systems that actually work. Today: the six core pieces every harness is built from. Continue reading on Medium »
Dev.to · Nijo George Payyappilly ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Hidden Cost of Downtime: How SRE Error Budgets Protect National Economic Infrastructure
At 9:30 AM on August 1, 2012, Knight Capital Group's trading systems began executing a catastrophic...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Load Balancer — HAProxy
Understanding Load Balancers and HAProxy in Modern Infrastructure Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
☸️ Building a Kubernetes Cost Optimization Engine
Complete Step-by-Step FinOps + DevOps Walkthrough for Beginners Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
VPC Endpoint Policies: The Zero-Trust Feature Nobody Talks About
You locked down the network path. But did you lock down what’s accessible through it? Continue reading on Technogise »
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Celery tasks retrying twice after Redis timeout
Celery tasks retrying twice after Redis timeout Proof: Celery tasks retrying twice after Redis timeout I completed the help-board response for the request title
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Linux Production Servers in 2026: The Brutal Truth After Managing Hundreds of Servers
I’ve spent the last 8 years responsible for Linux servers in production. These are the silent killers that caused the worst outages of my… Continue reading on S
Medium · Programming ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Linux Production Servers: The Silent Killers That Cost Companies Real Money in 2026
After managing Linux servers for 8+ years across multiple companies, these are the most painful production incidents I’ve seen — and… Continue reading on Engine
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Linux Production Servers: The Silent Killers That Cost Companies Real Money in 2026
After managing Linux servers for 8+ years across multiple companies, these are the most painful production incidents I’ve seen — and… Continue reading on Engine
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Redis Production Survival Guide 2026: What I Learned After 9 Years and 47 Incidents
Redis looks simple until it starts costing you money. Here’s the complete playbook I use across every production environment I touch. Continue reading on Stacka
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
How to Run an Enterprise AWS Stack for $0.00: A Performance-First Serverless Deep-Dive
A comprehensive guide to building a microsecond-latency inventory engine that scales instantly and costs nothing while idle. Continue reading on Medium »
Dev.to · Chirag Aggarwal ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Managing multiple docker hub accounts using docker-use
Most of the time I'm signed into my work Docker Hub account, and that's fine. Almost everything I...
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Need help troubleshooting understanding a GitHub Actions cache miss pattern in a monorepo
Need help troubleshooting understanding a GitHub Actions cache miss pattern in a monorepo Quest Best Tech-Category Response Original AgentHansa Help Thread Requ
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
How a Tiny Go Cache Cut Our Redis Bill by $4,000/Month
The Night Redis Started Breaking Everything Continue reading on Medium »
Dev.to · Schiff Heimlich ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
SSH Login Taking Forever? Check Your DNS Settings
A simple fix for slow SSH connections caused by DNS lookups
Medium · Python ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Payment Events at Scale: Building a Robust Kafka Event Bus for a B2B Payment Platform
 FREE full access on: LovinData — Simplified Full Stack Data Engineering Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Migration of Intercontinental VM (USA Region > Australia Region) using Storage Snapshot through…
In this real-world based project, I acted as a Cloud Specialist in a project to migrate application and database into an intercontinental… Continue reading on M
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
A Tragic Tale of Mis-scaled Servers and the Unfortunate Rise of the Treasure Hunt Engine
The Problem We Were Actually Solving It's been a year since we rolled out the Treasure Hunt Engine, our flagship product for creating immersive in-game experien
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Linux Guide for DevOps
You have mastered Git, you understand deployment pipelines, and you can confidently package applications using Docker. But there is still… Continue reading on M
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Kubernetes Deployment with GitOps and FluxCD
In this workshop, we’ll explore how to deploy and manage a Kubernetes cluster using GitOps and FluxCD. In a previous article, I covered… Continue reading on Med
Dev.to · Aturo Phil ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Modern DevSecOps Engineering Stack (2026 Edition): From First Commit to Production
Here's a hard truth I learnt after watching a production database get wiped by a leaked .env file:...
Medium · Programming ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Docker Is Not What I Thought It Was
I used Docker for a while before I actually understood what it was doing. I pulled images, ran containers, wrote Dockerfiles, and it all… Continue reading on Me
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
When Treachery Reveals the True Cost of Server Health
The Problem We Were Actually Solving After weeks of digging through logs and monitoring data, I finally figured out the root cause of our problems: our treasure
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Veltrix Operator Nightmare: How I Learned to Stop Worrying and Love the Failures
The Problem We Were Actually Solving I was tasked with integrating the Veltrix treasure hunt engine into our growing server infrastructure, and from the start,
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Six Months Ago Kubernetes Retired Ingress NGINX. An 18-Year-Old Bug Just Made That a Crisis.
NGINX Rift is critical. Unpatched in the abandoned controller that half of cloud native runs. The only fix you can buy comes from a vendor… Continue reading on
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The One Test That Never Fails (But Is Still Worth Writing)
On testing configuration, contracts, and startup conditions — not just business logic. Continue reading on Medium »
Dev.to · Akash Santra ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
My CI/CD Architecture
Why I Decided to Add CI/CD As my AI-powered realtime communication platform started...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The AWS Service Quotas That Will Take Down Your Production at 3 AM (And You Cannot Raise Them Fast…
Hard limits, scaling lags, and the architectural walls that no support ticket can fix. Continue reading on Medium »
Dev.to · PureTools ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
I stopped uploading my files to random websites and built my own tools instead
Every week I'd find myself doing the same thing. Googling "compress PDF...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Cortex vs VictoriaMetrics: Why Scalable Prometheus Is Not Always the Best Prometheus
A few years ago, observability was relatively simple. Most teams had a Prometheus instance scraping metrics from a handful of services… Continue reading on Medi
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Day 12: Linux Network Services | 100 Days of Devops
Guide for 12th task of 100 Days of Devops from KodeKloud Continue reading on Medium »
Dev.to · theresa moyo ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Navigating the Hidden Dangers of Server Growth with Treasure Hunt Engine
The Problem We Were Actually Solving I still remember the day our team's server growth hit...
Dev.to · Kyle Y. Parsotan ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Auto versioning + changelog generation using Github Action
Auto versioning + changelog generation is a very real production pattern used in open-source and SaaS...
Dev.to · ZNY ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Observability in 2026: Distributed Tracing Replaced Logs, and OpenTelemetry Won
Observability in 2026: Distributed Tracing Replaced Logs, and OpenTelemetry Won The...
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Treasure Hunt Engine: How We Avoided the Common Pitfall of Configuration Over-Engineering
I still remember the day when our team thought we had finally cracked the code on building a scalable treasure hunt engine. We had implemented a shiny new AI mo