Auth0 MongoDB Gets Overloaded Causes an Major Outage

Hussein Nasser · Beginner ·🏗️ Systems Design & Architecture ·5y ago
On April 20th, 2021, Auth0 US environments had a full-service disruption. The outage was due purely to performance degradation on their internal backend systems. Auth0 has released a root cause analysis report and In this video, I will summarize their backend architecture and the cause of the outage. Let us get started Just an FYI I’m only going through the backend pieces that auth0 explained in their RCA which isn’t their entire backend infrastructure. The Auth0 feature flag service returns configuration values for tenants and subscribers which is stored in MongoDB. (Although Auth0 doesn’t mention this on the RCA we know their core database MongoDB from a previous blog that I’ll reference below ) The feature flag service also has a caching layer which the frontend apis can query for performance. If cache doesn’t have the value or takes too long to answer, the frontend api queries the flag feature directly which hits the MongoDB directly. Auth0 also execute multiple queries running on the backend for routine maintenance. These queries where identified as inefficient and consume lots of resources primarily because they didn’t have proper indexes. As a result of a sudden increase in traffic, the flag service caching layer was saturated which caused latency in response time, eventually causing the frontend apis to massively query the feature flag service directly putting even more load on the MongoDB. Combined with the backend inefficient queries, MongoDB received even more queries and consumed more IO resources which caused latency to increase even further The auto-scaling triggered in the feature service as a result of latency in response time to scale the service adding even more load to MongoDB this eventually overloaded the database until it exceeded the available disk I/O which caused the full outage. If you love this content I think you will enjoy this video where I discuss the Slack outage, I learned so much about Slack backend architecture f
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Hussein Nasser · Hussein Nasser · 0 of 60

← Previous Next →
1 Extending ArcObjects (IGeometry) - 01 - Getting Started
Extending ArcObjects (IGeometry) - 01 - Getting Started
Hussein Nasser
2 Extending ArcObjects  (IGeometry) - 02 - The Document, The Map and The Layers
Extending ArcObjects (IGeometry) - 02 - The Document, The Map and The Layers
Hussein Nasser
3 Channel Update - New Book, New Job, New Videos
Channel Update - New Book, New Job, New Videos
Hussein Nasser
4 Learn Programming with VB.NET - 01 - Getting Started
Learn Programming with VB.NET - 01 - Getting Started
Hussein Nasser
5 Learn Programming with VB.NET - 02 - Classes and Objects (Part 1)
Learn Programming with VB.NET - 02 - Classes and Objects (Part 1)
Hussein Nasser
6 Learn Programming with VB.NET - 03 - Classes and Objects (Part 2)
Learn Programming with VB.NET - 03 - Classes and Objects (Part 2)
Hussein Nasser
7 Learn Programming with VB.NET - 04 - User Interface
Learn Programming with VB.NET - 04 - User Interface
Hussein Nasser
8 Learn Programming with VB.NET - 05 - By Value v. By Reference
Learn Programming with VB.NET - 05 - By Value v. By Reference
Hussein Nasser
9 Learn Programming with VB.NET - 06 - Variable size, 32 bit vs 64 bit
Learn Programming with VB.NET - 06 - Variable size, 32 bit vs 64 bit
Hussein Nasser
10 Learn Programming with VB.NET - 07 - Conditional Statements
Learn Programming with VB.NET - 07 - Conditional Statements
Hussein Nasser
11 Learn Programming with VB.NET - 08 - Inheritance
Learn Programming with VB.NET - 08 - Inheritance
Hussein Nasser
12 Learn Programming with VB.NET - 09 - Strategy Design Pattern
Learn Programming with VB.NET - 09 - Strategy Design Pattern
Hussein Nasser
13 Learn Programming with VB.NET - 10 -  How did I learn programming
Learn Programming with VB.NET - 10 - How did I learn programming
Hussein Nasser
14 IGeometry 2016 Retrospective - Channel Update
IGeometry 2016 Retrospective - Channel Update
Hussein Nasser
15 Javascript by Example - The Vook
Javascript by Example - The Vook
Hussein Nasser
16 Vlog - Keep your servers close and your database closer
Vlog - Keep your servers close and your database closer
Hussein Nasser
17 Vlog - Client/Server Programming Languages
Vlog - Client/Server Programming Languages
Hussein Nasser
18 Javascript By Example L1E01 - Getting Started
Javascript By Example L1E01 - Getting Started
Hussein Nasser
19 Persistent Connections (Pros and Cons)
Persistent Connections (Pros and Cons)
Hussein Nasser
20 Javascript By Example L1E02 - Building the Calculator Interface
Javascript By Example L1E02 - Building the Calculator Interface
Hussein Nasser
21 Happy new Year from IGeometry!
Happy new Year from IGeometry!
Hussein Nasser
22 Synchronous v. Asynchronous
Synchronous v. Asynchronous
Hussein Nasser
23 Javascript By Example L1E03 - Displaying the Digits on Calculator Screen
Javascript By Example L1E03 - Displaying the Digits on Calculator Screen
Hussein Nasser
24 Show Your Work. Blog, Vlog, Write, Create and Develop!
Show Your Work. Blog, Vlog, Write, Create and Develop!
Hussein Nasser
25 Relational Database Atomicity Explained By Example
Relational Database Atomicity Explained By Example
Hussein Nasser
26 Javascript By Example L1E04 - Operators, All Clear with Arrow Functions
Javascript By Example L1E04 - Operators, All Clear with Arrow Functions
Hussein Nasser
27 What Comes First, User Experience or Software Architecture?
What Comes First, User Experience or Software Architecture?
Hussein Nasser
28 Javascript By Example L1E05 -  Evaluate the Calculator Expressions with eval
Javascript By Example L1E05 - Evaluate the Calculator Expressions with eval
Hussein Nasser
29 Fastest Way to Learn Programming Language or Technology
Fastest Way to Learn Programming Language or Technology
Hussein Nasser
30 Javascript By Example L1E06 -  Fix Leading Zero Bug with Conditions
Javascript By Example L1E06 - Fix Leading Zero Bug with Conditions
Hussein Nasser
31 Stateful vs Stateless Applications (Explained by Example)
Stateful vs Stateless Applications (Explained by Example)
Hussein Nasser
32 Javascript By Example L1E07 - Running our Calculator on the Mobile Phone
Javascript By Example L1E07 - Running our Calculator on the Mobile Phone
Hussein Nasser
33 Advice for New Software Engineers and Developers
Advice for New Software Engineers and Developers
Hussein Nasser
34 Why JSON is so Popular?
Why JSON is so Popular?
Hussein Nasser
35 Building Scalable Software - SLA, HS, VS
Building Scalable Software - SLA, HS, VS
Hussein Nasser
36 Vlog (Istanbul) - Datacenter Proximity
Vlog (Istanbul) - Datacenter Proximity
Hussein Nasser
37 Should Software Engineers Learn Bleeding-Edge Technologies?
Should Software Engineers Learn Bleeding-Edge Technologies?
Hussein Nasser
38 Do Developers Build Bad User Interfaces/Experience?
Do Developers Build Bad User Interfaces/Experience?
Hussein Nasser
39 Learn By Doing.
Learn By Doing.
Hussein Nasser
40 I Wrote Bad Front-End Code That Broke Chrome
I Wrote Bad Front-End Code That Broke Chrome
Hussein Nasser
41 My Story
My Story
Hussein Nasser
42 Vlog - Horizontal vs Vertical Scaling
Vlog - Horizontal vs Vertical Scaling
Hussein Nasser
43 Can User Experience Help Build Better Rest API?
Can User Experience Help Build Better Rest API?
Hussein Nasser
44 Reverse engineering Instagram in flight mode
Reverse engineering Instagram in flight mode
Hussein Nasser
45 The Benefits of the 3-Tier Architecture (e.g. REST API)
The Benefits of the 3-Tier Architecture (e.g. REST API)
Hussein Nasser
46 Stateless v. Stateful Architecture (Podcast)
Stateless v. Stateful Architecture (Podcast)
Hussein Nasser
47 The evolution from virtual machines to containers
The evolution from virtual machines to containers
Hussein Nasser
48 Proxy vs. Reverse Proxy (Explained by Example)
Proxy vs. Reverse Proxy (Explained by Example)
Hussein Nasser
49 Canary Deployment (Explained by Example)
Canary Deployment (Explained by Example)
Hussein Nasser
50 No Excuses
No Excuses
Hussein Nasser
51 Synchronous vs Asynchronous Applications (Explained by Example)
Synchronous vs Asynchronous Applications (Explained by Example)
Hussein Nasser
52 What is an Asynchronous service?
What is an Asynchronous service?
Hussein Nasser
53 Difference between Client Polling vs Server Push in Notifications
Difference between Client Polling vs Server Push in Notifications
Hussein Nasser
54 Software vs. Hardware AdBlockers (Explained by Example)
Software vs. Hardware AdBlockers (Explained by Example)
Hussein Nasser
55 HTTP Caching with E-Tags -  (Explained by Example)
HTTP Caching with E-Tags - (Explained by Example)
Hussein Nasser
56 Simple Object Access Protocol Pros and Cons (Explained by Example)
Simple Object Access Protocol Pros and Cons (Explained by Example)
Hussein Nasser
57 Nodejs Express "Hello, World"
Nodejs Express "Hello, World"
Hussein Nasser
58 Reverse Engineering Instagram feed
Reverse Engineering Instagram feed
Hussein Nasser
59 Popup Modal Dialog with Javascript and HTML
Popup Modal Dialog with Javascript and HTML
Hussein Nasser
60 MIME and Media Type sniffing explained and the type of attacks it leads to
MIME and Media Type sniffing explained and the type of attacks it leads to
Hussein Nasser

Related AI Lessons

SDD is now my go-to when developing a Website / Software
Use Story-Driven Development (SDD) for efficient website and software development
Dev.to · Muhamad Sulaiman
Welcome to the Distributed Systems World — The Challenges Nobody Warned You About
Learn about the unexpected challenges of distributed systems and how to overcome them
Dev.to · mohamed Tayel
Advanced Rust Concepts: Iterators, Closures, Generics & More (Part 2)
Master advanced Rust concepts like iterators, closures, and generics to improve your programming skills
Dev.to · mihir mohapatra
Goroutine Scheduling: GMP Model, Schedule Loop, Preemption & Stack Management
Learn how the Go scheduler works, including the GMP model, schedule loop, preemption, and stack management, to improve your concurrent programming skills
Dev.to · James Lee
Up next
¿Cómo elimino o termino un servicio de Amazon ECS?
Amazon Web Services
Watch →