Back-Of-The-Envelope Estimation / Capacity Planning
Key Takeaways
The video discusses the importance of back-of-the-envelope estimation in system design, providing tips and techniques for effectively using this tool to quickly sanity check designs and estimate numbers such as requests per second and storage requirements. It covers examples of estimating tweets created per second on Twitter and storage required for multimedia files, and emphasizes the importance of getting within an order of magnitude rather than striving for absolute accuracy.
Full Transcript
Back-of-the-envelope math is a very useful tool in a system design toolbox. In this video, we'll go over how and when to use it and share some tips on using it effectively. Let's dive right in.
Experienced developers use back-of-the-envelope math to quickly sanity check a design. In these cases, absolute accuracy is not important. Usually it is good enough to get within an order of magnitude or two of the actual numbers we are looking for. For example, if the math says that at our scale our web service needs to handle 1 million requests per second, and each web server can only handle about 10,000 requests per second, we learn two things quickly. One, we'll need a cluster of web servers with a load balancer in front of them. Two, we will need about 100 web servers. Another example: if the math shows that the database needs to handle about 10 queries per second at peak, it means that a single database server could handle the load for a while, and there's no need to consider sharding or caching for a while.
Now let's go over some of the most popular numbers to estimate. The most useful by far is requests per second at the service level, or queries per second at the database level. Let's go over the common inputs to the requests per second calculation.
The first input is DAU, or daily active users. This number should be easy to obtain. Sometimes the only available number is monthly active users (MAU). In that case, estimate the DAU as a percentage of the MAU.
The second input is an estimate of the usage per DAU of the service we're designing for. For example, not everyone active on Twitter makes a post. Only a percentage does, so 10 to 25% seems reasonable. Again, it doesn't have to be exact. Getting within an order of magnitude is usually fine.
The third input is a scaling factor. The usage rate for a service usually has peaks and valleys throughout the day. We need to estimate how much higher the traffic would peak compared to the average. This would reflect the estimated requests per second at peak, where the design could potentially break. For example, for a service like Google Maps, the usage rate during commute hours could be five times higher than average. Another example: for a ride-sharing service like Uber, weekend nights could have twice as many riders as average.
Now let's go over an example. We'll estimate the number of tweets created per second on Twitter. Note that these numbers are made up; they are not official numbers from Twitter. Let's assume Twitter has 300 million MAU and 50% of the MAU use Twitter daily, so that's about 150 million DAU. Next, we estimate that about 25% of Twitter DAU make tweets and each one on average makes two tweets. That is 25% times 2, so 0.5 tweets per DAU. For the scaling factor, we estimate that most people tweet in the morning when they get up and can't wait to share what they dreamed about the night before, and that spikes the tweet creation traffic to twice the average when the US East Coast wakes up, let's say. Now we have enough to calculate the peak tweets created per second. We have 150 million DAU times 0.5 tweets per DAU times a scaling factor of 2, divided by 86,400 seconds in a day. That is roughly 1,500 tweets created per second.
Let's go over the techniques we used to simplify the calculations. First, we convert all big numbers to scientific notation. Doing math on really big numbers is very error-prone. By converting big numbers to scientific notation, multiplication becomes simple addition of the exponents and division becomes subtraction. In the example above, 150 million DAU becomes 150 × 10^6, or 1.5 × 10^8. There are 86,400 seconds in a day. We round it up to 100,000 seconds, and that becomes 10^5 seconds. Since it's a division, 10^5 becomes 10^-5. Next, we group all the powers of 10 together and then all the other numbers together. So the math becomes 1.5 × 0.5 × 2, and 10^8 × 10^-5, which equals 10^(8-5), which is 10^3. Putting it all together, it is 1.5 × 10^3, or 1,500.
Now, with practice, we should be able to convert a large number to scientific notation in seconds. Here are some handy conversions we should memorize. As an example, we should know by heart that 10^12 is a trillion, or a terabyte, and when we see a number like 50 terabytes, we should be able to convert it quickly to 50 × 10^12, which is 5 × 10^13. We're going to ignore the fact that 1 kilobyte is actually 2^10 bytes, or 1,024 bytes, and not 1,000 bytes. We don't need that degree of accuracy for back-of-the-envelope math.
Now let's wrap up by going through one last example. We'll estimate how much storage is required for storing multimedia files for tweets. We know from the previous example that there are about 150 million tweets per day. Now we need an estimate of the percentage of tweets that could contain multimedia content and how large those files are on average. Without meticulous research, we estimate that 10% of tweets contain pictures and they're about 100 kilobytes each, and 1% of all tweets contain videos and they're about 100 megabytes each. We further assume that the files are replicated with three copies each and that Twitter will keep the media for 5 years.
Now here's the math. For storing pictures, we have 150 million tweets times one in 10 tweets with pictures times 100 kilobytes per picture times 400 days in a year (rounding 365 up) times 5 years times three copies. That turns into 1.5 × 10^8 × 10^-1 × 10^5 × 4 × 10^2 × 5 × 3. Again, we group the powers of 10 together. This becomes 1.5 × 4 × 5 × 3, which is 90, and 10^(8-1+5+2), which is 10^14. That becomes 9 × 10^15, which is, from the table, 9 petabytes.
For storing videos, we take yet another shortcut. First, since videos are on average 100 megabytes each while pictures are 100 kilobytes, a video is a thousand times bigger than a picture on average. Second, only 1% of tweets contain a video while pictures appear in 10% of all tweets, so videos are one-tenth as popular. Putting the math together, the total video storage is 1,000 × 1/10 of the picture storage, which is 100 × 9 petabytes, or 900 petabytes.
In conclusion, back-of-the-envelope math is a very useful tool in a system design toolbox. Don't over-index on precision. Getting within an order of magnitude is usually enough to inform and validate our design.
If you'd like to learn more about system design, check out our books and weekly newsletter. Please subscribe if you learned something new. Thank you so much, and we'll see you next time.
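The two estimates worked through in the transcript can be reproduced with a few lines of Python. This is a minimal sketch, not something shown in the video: the function names (`peak_requests_per_second`, `storage_bytes`, `to_scientific`) and their parameters are hypothetical, the inputs are the made-up Twitter numbers from the transcript, and decimal units (1 KB = 1,000 bytes) are assumed, as the transcript does.

```python
SECONDS_PER_DAY = 86_400


def peak_requests_per_second(mau, dau_ratio, actions_per_dau, peak_factor,
                             seconds_per_day=SECONDS_PER_DAY):
    """DAU x usage per DAU x scaling factor / seconds in a day."""
    dau = mau * dau_ratio
    return dau * actions_per_dau * peak_factor / seconds_per_day


def storage_bytes(items_per_day, fraction_with_media, bytes_per_file,
                  days_per_year, years, replicas):
    """Total bytes kept over the retention period, including replicas."""
    return (items_per_day * fraction_with_media * bytes_per_file
            * days_per_year * years * replicas)


def to_scientific(value):
    """Return (mantissa, exponent) so that value is about mantissa x 10^exponent.

    The mantissa is rounded to two decimals, which is plenty for estimates.
    """
    mantissa, exponent = f"{value:.2e}".split("e")
    return float(mantissa), int(exponent)


if __name__ == "__main__":
    # Assumed inputs from the transcript: 300M MAU, 50% daily,
    # 25% of DAU tweeting twice a day (0.5 tweets per DAU), 2x peak factor.
    exact = peak_requests_per_second(300_000_000, 0.5, 0.25 * 2, 2)
    rounded = peak_requests_per_second(300_000_000, 0.5, 0.25 * 2, 2,
                                       seconds_per_day=100_000)
    print(f"peak tweets/s, 86,400 s per day: {exact:.0f}")
    print(f"peak tweets/s, day rounded to 10^5 s: {rounded:.0f}")

    # The transcript's storage inputs: 150M tweets per day, 400 days per year,
    # 5 years of retention, 3 copies of every file.
    pictures = storage_bytes(150_000_000, 0.10, 100_000, 400, 5, 3)
    videos = storage_bytes(150_000_000, 0.01, 100_000_000, 400, 5, 3)
    print(f"pictures: {pictures / 1e15:.0f} PB, videos: {videos / 1e15:.0f} PB")
    print("150 million in scientific notation:", to_scientific(150_000_000))
```

With the day kept at 86,400 seconds the peak comes out near 1,700 tweets per second rather than 1,500, which is well within the order-of-magnitude tolerance the transcript asks for. The video figure also matches the transcript's shortcut: a video is 1,000 times larger and one-tenth as common, so video storage is 100 times the picture storage.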
Original Description
Weekly system design newsletter: https://bit.ly/3tfAlYD
Check out our bestselling System Design Interview books:
Volume 1: https://amzn.to/3Ou7gkd
Volume 2: https://amzn.to/3HqGozy
Other things we made:
Digital version of System Design Interview books: https://bit.ly/3mlDSk9
Twitter: https://bit.ly/3HqEz5G
LinkedIn: https://bit.ly/39h22JK
ABOUT US:
Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.
Playlist
Uploads from ByteByteGo · ByteByteGo · 13 of 60