Tech Skills

DevOps & Cloud

CI/CD, Docker, Kubernetes, AWS, GCP, Azure, monitoring and infrastructure as code

1,333 lessons
Skills in this topic
Linux & CLI (beginner): Navigate the filesystem, manage permissions, and use pipes
Docker & Containers (beginner): Write a production-ready Dockerfile
Cloud Fundamentals (intermediate): Deploy a web app on AWS EC2 or App Engine
Kubernetes (intermediate): Deploy a multi-container app on a k8s cluster
CI/CD Pipelines (intermediate): Build a CI pipeline that runs tests on every PR
Infrastructure as Code (advanced): Provision a full VPC with Terraform
All Reads (892) · Articles (463) · Blog Posts (311) · Tutorials (114) · News (4)
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Day the Veltrix Configs Blew Up My Treasure Hunt Engine
The Problem We Were Actually Solving: We'd shipped the Treasure Hunt Engine six months earlier as a real-time scavenger hunt overlay for Hytale. Players raced thr
Dev.to · Raju Dandigam ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Multi-Architecture Docker Builds for Node.js: From Apple Silicon to AWS Graviton
Build Docker images that work across ARM64 and AMD64 architectures using Docker Buildx for Node.js applications
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Veltrix Operator Pitfalls: Why I Had to Rip Out the Treasure Hunt Engine to Save My Server
The Problem We Were Actually Solving: I will never forget the day our server started to...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Two Monitoring Stacks Are Better Than One
Lessons from running a production SaaS out of Harare Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Eval-Driven Agentic AI Development: The Most Important Practice Nobody Is Doing (And What I Got…
Moving from 20 to 200 cases. How to build automated CI/CD evaluation gates for AI systems using real production failures, not assumptions. Continue reading on P
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Engineering Pain Points Behind Building Excel clone — SheetWise
Hello everyone, my name is Sujal Sharma, and I am an aspiring cloud and DevOps engineer. Continue reading on Medium »
InfoQ AI/ML ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
How LinkedIn Identified a Kernel Lock Contention Issue Causing Recurring System Freezes
When LinkedIn engineers encountered short-lived, recurring outages where the database powering their user feed became unavailable and then recovered without leavi
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Docker Vs Kubernetes Debate Is Dead. AI Just Changed The Whole Game.
I used to think the Docker vs Kubernetes debate was a serious debate. Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
10+ DevOps & SRE resources everyone should check out in the AI age — 2026
This is an updated, forward-looking post. My original post can still be found here, but this one focuses on current resources, factoring in AI… Continue reading on Go
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Why Your Chainlit App on Azure Container Apps Says “Could Not Reach the Server” (and How to Fix It)
TL;DR — If your Chainlit app loads the UI fine but shows “Could not reach the server” with no errors in your container logs, you’re… Continue reading on Medium
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Small CI Warnings Become Release Risk
GitHub Actions gave us one of those warnings that is easy to ignore: Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Your Cloud Bill Is Lying to You
I blamed traffic growth for months — until I found the hidden infrastructure decisions quietly burning thousands of dollars every week. Continue reading on Medi
Dev.to · Ulrich (Houngbe) ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
CI/CD with GitHub Actions
CI/CD with GitHub Actions: Complete Guide. GitHub Actions is revolutionizing integration and...
Dev.to · Lillian Dube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Day Our Treasure Hunt Engine Blew Up at 3 AM
The Problem We Were Actually Solving: It started with a simple requirement: players should be able to...
Dev.to · Magevanta ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Magento 2 Nginx Optimization for High Traffic — Complete Server Tuning Guide
Tune Nginx for Magento 2 to handle high traffic without breaking a sweat. Worker processes, gzip, keepalive, microcaching, SSL/TLS, and more — all with real con
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Why Hytale Treasure Hunt Servers Throttle at 100 Players (And How We Fixed It)
The Problem We Were Actually Solving: Our public alpha weekend attracted 4,200 concurrent...
Dev.to · Lillian Dube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Treasure Hunt Engine Blew Up My Inbox at 3 AM
The Problem We Were Actually Solving: The Treasure Hunt Engine wasn't supposed to be a...
Dev.to · Lillian Dube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Day Treasure Hunt Broke My Caches—And How We Fixed It
The Problem We Were Actually Solving: The treasure hunt engine used a single Redis sorted...
Dev.to · Mili Cardenas ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
I built a zsh cleanup script for macOS dev machines — and learned more than I expected
If you do iOS and Android development on a Mac, your disk is quietly dying. Between Xcode's...
Medium · Programming ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Day 18: Part 3 of Linux User & Access Management
Understanding su, sudo, and Safe Access Control in Linux Continue reading on Medium »
Dev.to · sai pramod upadhyayula ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Stop Cloning Entire Repos for Your Doc Builds
Your docs live next to your code. That's the docs-as-code promise — version control, pull request...
Dev.to AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Multiplexing SSH Connections with Control Master: Speed Up Deployments and Automation
Every SSH command you run opens a fresh TCP connection and completes a full cryptographic handshake. Here's how to do it once and reuse it hundreds of times. If
Dev.to · Alex ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
What a 4-Hour NOC Response SLA Actually Means at 3am
Originally published at www.tothenoc.com. SLAs are contracts. What matters is...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
What is a Load Balancer? Strategies, Failover & Differences from Gateways/Proxies
Load balancing is the process of distributing traffic across multiple servers to ensure high availability, scalability, and performance. A… Continue reading on
Dev.to · Don Johnson ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Container Runtime Nobody Told You About (And Four Others)
gVisor, Kata, Firecracker, and WASM/WASI demystified with a single Go app, real benchmark numbers, and the ultimate use-case map for each.
Medium · AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Harness Engineering, Part 3: How Your Delivery System Boots Up
A 10-part series on building software delivery systems that actually work. Today: the startup sequence that nobody talks about until it… Continue reading on Med
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Systemd Timers: Modern Task Scheduling (Better than Cron?)
Time to Move Beyond Cron Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
A Practical Guide to Structuring a Minimal Helm Repo for Multi-Environment Deployments
One of the hardest parts of scaling Kubernetes deployments isn’t Helm itself. Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Install Red Hat Enterprise Linux 10.2 in a VM From VMware vCenter
As of May 26th, 2026, the latest version of Red Hat Enterprise Linux (RHEL) is 10.2, and we’re going to run some tests installing RHEL 10.2… Continue reading on
Dev.to · Vipul ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Inside Load Balancers - The Hidden Traffic Controllers of the Internet
Whenever thousands or even millions of users open an application at the same time, one big question...
Dev.to · miruky ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
A Field Guide to AWS CI/CD: CodeBuild, CodeDeploy, and CodePipeline End to End
Production-grade CI/CD on AWS, from CodeBuild and CodeDeploy through CodePipeline V2, deployment strategies, IAM, and day-2 operations. A complete hands-on fiel
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
MySQL, PostgreSQL, and Folder Backups on Linux Without Writing a Separate Script for Every Server…
Backup processes usually start with a small shell script. Continue reading on Medium »
Dev.to · chunxiaoxx ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Compass v1.1.0 · we shipped a memory plugin that catches its own consumption drift
Recall != consumption. Same anti-pattern reproduced across sessions despite recall hitting the right files. Three layers of fix and a capability-driven governan
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
When Platforms Optimise for Control
How internal platforms drift from enabling engineers to constraining them Continue reading on Medium »
Dev.to · JOSE PRAVEEN ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Deploy a Secure Containerized App on Amazon ECS Fargate Using ECR and Secrets Manager
Deploy a Secure Containerized App on Amazon ECS Fargate Using ECR and Secrets Manager In...
Dev.to · Vektor Memory ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Turning You Into a Power User: Hybrid Memory, SSH Cloak, and Password Vaulting With VEKTOR
A 10-minute tutorial that covers how we manage servers, store AES-256 secrets, and maintain...
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Moment the Jaeger Tracer Exhausted Itself and What We Switched To
The Problem We Were Actually Solving: Our treasure-hunt engine at Veltrix was not exploding; it was...
Medium · AI ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
GPU Scheduling in Kubernetes Explained: What Actually Works for AI and High-Performance Workloads
Running CPU workloads in Kubernetes is easy. Continue reading on AegisOps »
Dev.to · Luis Eduardo Lunar Guevara ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
SCS-Lab1 — CloudTrail: Trail + S3 + KMS + Log Validation
Region: us-east-1 Estimated duration: 35–55 minutes Cost risk: Medium Certification: AWS Certified...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Linux Process Management & Cron Jobs — Monitor, Control, and Automate Like a DevOps Engineer
Two skills that keep production servers alive and running on schedule Continue reading on Medium »
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
RabbitMQ Cluster with Quorum Queues — DevOps Zero to Hero Guide
RabbitMQ is widely used for communication between applications and microservices. Running RabbitMQ on a single server can create a single… Continue reading on M
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
9 Commands I Run in the First 3 Minutes of Every Production Incident
After 11 years on-call, these are the only ones that matter when the pager wakes me up. Continue reading on AWS in Plain English »
Dev.to · pretty ncube ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Default Config Is a Slow Config
The Problem We Were Actually Solving: Looking back, we were trying to solve the wrong...
Medium · DevOps ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Stop Treating Your Code as Immutable: The Art of the Rollback
If you can’t push the “undo” button in under 60 seconds, you aren’t deploying software — you’re playing Russian roulette with your users. Continue reading on St
Dev.to · Schiff Heimlich ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
SSH Login Delays: The 10-Second Wait That Drives Us Crazy
The Problem: Every sysadmin has been there: you SSH into a server and wait... and wait......
Medium · Python ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
When Nginx Timeouts Weren’t Nginx: Debugging Socket Leak in Production
A detective story about a silent application killer, the TCP lifecycle, and why your infrastructure metrics might be lying to you. Continue reading on Medium »
Medium · LLM ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
Harness Engineering, Part 2: The Architecture of a Modern Delivery System
A 10-part series on building software delivery systems that actually work. Today: the six core pieces every harness is built from. Continue reading on Medium »
Dev.to · Nijo George Payyappilly ☁️ DevOps & Cloud ⚡ AI Lesson 2w ago
The Hidden Cost of Downtime: How SRE Error Budgets Protect National Economic Infrastructure
At 9:30 AM on August 1, 2012, Knight Capital Group's trading systems began executing a catastrophic...