Retry Pattern: The Secret to Resilient Python Code
Key Takeaways
This video demonstrates the retry pattern in Python, which makes code more robust and fault-tolerant when it interacts with APIs, networks, or LLMs. The examples use the Chuck Norris API and the tenacity library, and the sponsor segment covers SerpApi.
Full Transcript
Here's an example of a piece of code that calls the Chuck Norris API to retrieve a random joke. Let's run this and see what happens. Oops, that's a crash. Let's try this again. Ah, and now it works. I didn't change anything. So clearly, if there is some temporary issue with the API that you're connecting to, the whole script crashes. Now, of course, this is a bad example. Actually, this API never fails because, well, it's the Chuck Norris API. Merely suggesting the possibility probably means that this is my last video ever.

Anyway, today I'll show you how to stop your code from crashing randomly whenever you're connecting with APIs or other services by using the retry pattern. I'll go step by step and show you simple retries, exponential backoff, decorators, fallbacks, and more. This video is sponsored by SerpApi. More about that later.

The problem that I encountered at the start of the video is a so-called transient failure. Most modern systems rely on a bunch of external services: APIs, databases, large language models, you name it. And these services sometimes fail briefly, even when your code is correct. Now, the retry pattern wraps an operation that might fail and automatically retries it before giving up. It's actually a really simple way to make your code more robust and fault tolerant.

And by the way, maybe you noticed I had this little wound here. Actually, what happened is that I was in a fight to the death with Chuck Norris. I'm really tempting fate here. Anyway, while I'm still alive, let's build a simple retry function ourselves. And as you can see, what I secretly added to the example is a random failure because, like I said, the Chuck Norris API never fails.

So let's create a function called retry, and this function is going to get an operation, which is going to be a callable. So it will be a function, and retry will call this function up to three times, and we want it to wait, let's say, 1 second between the attempts.
So from typing I'm going to import the Callable type, and let's assume this operation doesn't take any arguments because otherwise we don't know how to call it. And well, we could make the return type Any, but instead what we can also do is make retry a generic function. So it's going to return something of type T. So that's the operation. Then we have the number of retries that we want to have, and let's say that by default it's three, and we'll also have a delay in seconds. So let's say that's going to be 1 second by default. And then this is going to return a type T because that's the result of the operation.

So for attempt in range, let's say from one to retries + 1, we're going to try to return the call of the operation like so. And then we handle any exception. This is actually one of the rare cases where you want to basically catch everything, because we're going to re-raise it if we can't retry anymore. So if attempt equals retries, that means we have now had the maximum number of retries. We're going to re-raise the exception. Otherwise, we're going to do time.sleep with the delay. And obviously, we're going to need to import time like so. And perhaps let's print some logging. There we go. So this is our simple, pretty naive retry function.

And then what we can do here is we have a joke and that equals retry. We will retry fetch joke and we'll leave everything else as the default value. So three times and one second delay. And then we're going to print the joke like so. Let's run this and see what happens. So immediately we get a good result. Let's try that again. Ah, there we go. We got two failed attempts, and then finally, after three attempts, it doesn't work anymore. So, as you see, it retries until it reaches that end point. And as you can see, this doesn't always work. In some cases, it still fails because we set the maximum number of retries to three. But also, sometimes it actually helps.
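The function described above might look roughly like the following. This is a minimal sketch reconstructed from the spoken description, not the code shown on screen. The `fetch_joke` stand-in, its 50% failure rate, and the joke text are assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], retries: int = 3, delay: float = 1.0) -> T:
    """Call `operation` up to `retries` times, sleeping `delay` seconds between attempts."""
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except Exception as exc:  # broad on purpose: re-raised once attempts run out
            if attempt == retries:
                raise
            print(f"Attempt {attempt} failed: {exc}")
            time.sleep(delay)
    # Only reachable when retries < 1, so the loop body never ran.
    raise RuntimeError("retries must be at least 1")


def fetch_joke() -> str:
    """Hypothetical stand-in for the API call; fails at random to simulate a transient error."""
    if random.random() < 0.5:
        raise ConnectionError("simulated transient failure")
    return "Chuck Norris can divide by zero."
```

Calling `retry(fetch_joke)` then either returns a joke or, if all three attempts fail, re-raises the last exception.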
Like in this case, for example, the first time it failed, but then the second time it succeeded. So, it's a small change. I mean, this is pretty easy to add to your code, right? Especially if you turn this into a function that lives in another module that you can just import wherever you need it. And it already makes your code a lot more robust.

Before I continue, let's talk about something that's closely related, and that's reliable and accurate APIs. If you've ever tried to scrape data from Google or other search engines, you know how fragile that can be. One request times out, another gets blocked, or the HTML changes and then your scraper basically breaks. That's exactly the problem SerpApi solves. SerpApi gives you access to Google, YouTube, and many other search engines through a really easy-to-use, consistent API. They handle proxies, CAPTCHAs, and retries automatically and return well-structured JSON so you can focus on using the data, not fighting the network.

It's really easy to use. Here's an example in Python of how to access the Google search API using the SerpApi package that's available for free. Now, I just get the API key from an environment variable that I set in a .env file. That's why I'm doing load_dotenv. Then I create a client, pass the API key, and then I run client.search. I provide the query. I provide the target engine. In this case, that's Google search. And I also provide my location. And then I simply get back JSON data in a dictionary-like structure that I can work with directly in Python like so. So this prints the first organic result. Let's run this and see what happens. As you can see, we get a dictionary here with all the information that we need.

Or how about searching YouTube for content about the SOLID design principles in Python? It works almost exactly the same way, except of course we change the engine to YouTube. And then we can print the video results. And here we go. We have as a first result my video on Uncle Bob's SOLID principles.
If you want to make your data pipelines more reliable, check out SerpApi. The link is in the description, or you can scan the QR code on screen. Now back to the video.

So we have this basic retry function, but retrying too fast can make things worse. You might overload a service or get rate limited. So just passing a delay like I did here may not be enough. Now, a solution that you commonly see used to address this issue is exponential backoff. And that means that each retry waits a little bit longer. So how do you add this? Well, first of course we need a backoff value. So let's say that's going to be 2, like so. And then instead of just sleeping for the delay, we're going to compute the sleep time. And that's going to be the delay times the backoff to the power of attempt minus one. And we do minus one because then, in case of the first attempt, this is going to be backoff to the power of zero, which is one. So we just have delay as the sleep time, and then the sleep time grows exponentially with the backoff every time we need to retry. So that's the sleep time, and then of course we also need to pass that to time.sleep like so. And finally, for debugging purposes, let's also print a message here: retrying in, and then the sleep time. Let's format this to one decimal, in seconds. Like so.

Let's run this here. Clearly, this works. This also works. These jokes are really bad anyway. Doesn't matter. Okay. Let's increase this number a bit so we can see what is actually happening. And let's also increase the number of retries to, let's say, 10, so we can see what exponential backoff is actually doing. I'm not going to run this completely until the end, but you can clearly see that the exponential backoff is working appropriately. And of course, it's up to you as a developer to decide how much backoff you actually allow for. Because at some point, well, you're just going to have to give up.

Instead of a function, you can clean things up even more by using a decorator.
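The backoff computation from this part of the video can be sketched as follows. The helper name `backoff_delay` and the exact log message are assumptions. The formula, delay times backoff to the power of attempt minus one, is the one described in the transcript.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, delay: float = 1.0, backoff: float = 2.0) -> float:
    """Seconds to wait after the given 1-based attempt: delay * backoff ** (attempt - 1)."""
    return delay * backoff ** (attempt - 1)


def retry(
    operation: Callable[[], T],
    retries: int = 3,
    delay: float = 1.0,
    backoff: float = 2.0,
) -> T:
    """Retry `operation`, waiting longer after each failed attempt."""
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == retries:
                raise
            sleep_time = backoff_delay(attempt, delay, backoff)
            print(f"Attempt {attempt} failed: {exc}. Retrying in {sleep_time:.1f}s")
            time.sleep(sleep_time)
    # Only reachable when retries < 1.
    raise RuntimeError("retries must be at least 1")
```

With a delay of 1 second and a backoff of 2, the waits after the first four failures are 1, 2, 4, and 8 seconds, so a large retry count quickly adds up to a long total wait.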
And this lets you add retry behavior to any function with just a single line. Before I implement the decorator, a note: later on in the video, I'll show you a case where it's actually not the best design. So stay tuned for that.

So what I'm going to do is define a retry decorator. This is also going to be a generic function. And this is going to get these arguments. So we still need the retries and the delay and the backoff just like before. But then inside this function we will have our decorator function. Let's call that decorator. And this doesn't need to be explicitly generic because the top level is already generic. And then within the retry decorator we will have this decorator function. And this is going to get as an input a function. So that's the function that the decorator wraps around, and that's a callable that we don't know the input arguments of. It doesn't matter. And it returns something of type T. And the decorator is also going to return a callable that also returns something of type T. So this basically means that the decorator doesn't change the signature of the function.

But we're not there yet. If we simply define the decorator like this, we're actually going to break introspection. So if you would put the decorator on top of fetch joke and you would try to print the name of the fetch joke function, you would actually get the decorator function name. So in Python, that's why it's actually best practice to use wraps, and that's from functools. So we're going to wrap around the function, and then we will define a wrapper function within the decorator function. And I'm simply using the Any type here because I don't really use these arguments. So let's keep it simple. And the wrapper is actually going to return this T type. I'm going to indent this. This is going to call the function. Here we're going to return the wrapper. And there we will return the decorator like so. So this is our decorator.
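Put together, the three levels described above (factory, decorator, wrapper) might look like this. It is a sketch based on the spoken walkthrough. The type annotations and the `fetch_joke` stand-in are assumptions.

```python
import time
from functools import wraps
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def retry_decorator(
    retries: int = 3, delay: float = 1.0, backoff: float = 2.0
) -> Callable[[Callable[..., T]], Callable[..., T]]:
    """Decorator factory: returns a decorator that adds retry behavior."""

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)  # copies func.__name__ and __doc__ onto the wrapper
        def wrapper(*args: Any, **kwargs: Any) -> T:
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == retries:
                        raise
                    time.sleep(delay * backoff ** (attempt - 1))
            # Only reachable when retries < 1.
            raise RuntimeError("retries must be at least 1")

        return wrapper

    return decorator


@retry_decorator(retries=3, delay=0.0)  # parentheses needed: the factory takes arguments
def fetch_joke() -> str:
    """Hypothetical stand-in for the API call."""
    return "a joke"
```

Because of `wraps`, `fetch_joke.__name__` is still `"fetch_joke"` and not `"wrapper"`.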
It looks a bit convoluted, but it's basically a function that wraps another function. And there's an extra level so we don't break introspection. And now the way we use this is to simply write retry decorator like so above the function that we want to decorate. And that also means that in our main function we can now call fetch joke just like a normal function, except that it's actually decorated, but we don't notice that when we actually try to use it. So let's see what happens when we run this. It looks like we had an error here, and I think the problem is that this actually needs to have parentheses because there are arguments involved. Let's try that again. Yeah, there we have the retry decorator at work. And we end up with a pretty good Chuck Norris joke, actually.

Now, a retry decorator like this is not just useful when you're dealing with REST APIs. In particular, if you're using LLMs and you want to get JSON data from them, well, that's something that breaks quite a lot, actually, in my own experience. So you basically tell the LLM to return valid JSON and then it just replies, "Sure, here's your JSON," followed by something that's, well, not JSON. When your code interacts with LLMs, the retry pattern is actually incredibly useful.

So here you see an example of how you would use this exact same decorator that I have here to wrap around a function that calls an LLM. So in this case I'm using OpenAI. I have an API key, and what this simply does is act as an assistant that extracts information from text and returns it as JSON data, and the input is to extract the username and age from a text. So in the main function I have this text, and then I call this function with retry, because on top of it we have the retry decorator. So let's run this. As you can see, it successfully parses the JSON. Actually, most times this goes well.
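To illustrate the LLM case without depending on a real model, here is a small sketch in which a placeholder function plays the role of the LLM call. Everything about `call_llm` is an assumption: it is not the OpenAI client, and the canned replies exist only so that the first attempt returns invalid JSON.

```python
import json

# Hypothetical canned replies standing in for a real model; the first is not valid JSON.
_replies = iter(
    [
        'Sure, here is your JSON: {"name": "Alice"',
        '{"name": "Alice", "age": 30}',
    ]
)


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM client call (an assumption, not a real API)."""
    return next(_replies)


def extract_user(text: str, retries: int = 3) -> dict:
    """Ask for JSON and retry whenever the reply does not parse."""
    prompt = f"Extract the name and age as JSON from: {text}"
    for attempt in range(1, retries + 1):
        reply = call_llm(prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as exc:
            if attempt == retries:
                raise
            print(f"Attempt {attempt}: invalid JSON ({exc.msg}), retrying")
    # Only reachable when retries < 1.
    raise RuntimeError("retries must be at least 1")
```

Here the parse step sits inside the retried code, so a malformed reply triggers a fresh request in the same way a network error would.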
So I can't easily show you an example where this breaks, but actually in some cases the code did retry a couple of times while I was testing this. So LLMs, that's a really good use case for the retry pattern. Now, in production you could even improve this further by adding a repair step between retries. For example, you could wrap the response in braces or validate it against a schema before you parse it. But already this simple version of the retry pattern turns unreliable API behavior into more stable software. Now, Chuck Norris obviously doesn't need retries to get valid JSON. His Chuck Norris LLM simply never fails.

Now, sometimes retrying the same operation isn't what you want to do. For example, if a service is completely down or your credentials are invalid or whatever, retries just waste time. In that case, you might want to use a fallback strategy. Basically, trying a different route. Let's go back to our Chuck Norris API. So we have fetch joke, which fetches the joke from the API. But what we can also do is introduce, let's say, a fetch backup API that returns a backup joke. Now what you could do in the main function is try to do this and, if there is an exception, actually do that. But you can also integrate the fallback strategy into the decorator. For example, here what you can do is add a backup function. And this is also going to be a callable that will have the same signature as the original operation. And then what we can do in the decorator is, if we have some exception and we've reached the number of retries, simply return the call of the backup function like so. And here there is another one. So we will do exactly the same thing. Like so. And what you can do now in fetch joke is state that our backup function is actually the fetch backup API. And now let's increase the failures like so, so that we can see this in action.
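A decorator with a built-in fallback, as described above, could be sketched like this. The parameter name `backup`, the function `fetch_backup_api`, and the always-failing `fetch_joke` are assumptions chosen to mirror the spoken names. The on-screen code may differ.

```python
import time
from functools import wraps
from typing import Any, Callable, Optional, TypeVar

T = TypeVar("T")


def retry_decorator(
    retries: int = 3,
    delay: float = 1.0,
    backoff: float = 2.0,
    backup: Optional[Callable[..., T]] = None,
) -> Callable[[Callable[..., T]], Callable[..., T]]:
    """Retry the decorated function, then fall back to `backup` if one is given."""

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> T:
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    print(f"Attempt {attempt} failed: {exc}")
                    if attempt == retries:
                        if backup is not None:
                            return backup(*args, **kwargs)  # same signature as func
                        raise
                    time.sleep(delay * backoff ** (attempt - 1))
            # Only reachable when retries < 1.
            raise RuntimeError("retries must be at least 1")

        return wrapper

    return decorator


def fetch_backup_api() -> str:
    """Hypothetical backup source."""
    return "Backup joke"


@retry_decorator(retries=3, delay=0.0, backup=fetch_backup_api)
def fetch_joke() -> str:
    """Hypothetical primary source that is completely down."""
    raise ConnectionError("simulated outage")
```

In this sketch, calling `fetch_joke()` logs three failed attempts and then returns the backup joke.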
So here you see, after a couple of tries, we get three failed attempts and then it automatically calls the backup API.

Like I mentioned earlier, the decorator may not always be the best design. And here you see an issue in that the backup function is treated differently from the main function. And that means that this design doesn't work if you have, let's say, two backup APIs, or you want to try the normal API twice and then you want to try the backup API and then maybe a second backup API or whatever. So it doesn't give you a lot of control, and that's partly because the backup function here is treated differently from the function that you're calling here, while in essence there's no need to do that.

So what I want to do to show you a slightly different design is go back to that simple retry function. So this is not going to be a decorator anymore and I will remove all of this. This is simply going to return something of type T like so. And then here we're going to de-indent this. This is no longer a decorator. And we're going to delete this as well. So now it's again a bit shorter. Of course, now we need to pass the operation again that we're going to call. But we're going to do something different here. So instead of passing the operation and then a backup function, maybe other functions, we're going to pass a list of operations. And each of these operations is going to have exactly the same shape. And now what we can do is simply provide a list of functions that the retry function should call in order. And we don't need to distinguish anymore between what is like a main API and a backup API or a second backup API. So this I'm going to delete. And we don't even need to know the number of retries because that's simply the length of the list. Now you might think, hey, but then I have to write the function name three times. Actually, no, that's not necessary. I'll show you in a second how that works.
So we have our operations and we can now also drastically simplify the code. So in this case, what we can do with a for loop is iterate over these operations. I still want to have the attempt number, so I'm going to use enumerate here. So we're going to enumerate over the operations. So we're going to get the attempt and the operation. And then I'm simply calling the operation like so. And then if there is some exception, we simply write the attempt number. This is not needed because that's going to be caught by our loop in any case. Of course, we still need the sleep time. And then what happens after the sleep period is over? Then we actually go on to the next iteration of the loop. So here what we can do at the end is basically raise a RuntimeError that the retries have failed like so.

And now what I can do is call retry, and then I'm going to simply supply the operations. So, for example, if I do it like this, we're simply going to try fetch joke only once. Let's run this. There we go. One single retry. But if we want to do this three times, I'll simply multiply this list by three. Let's try that again. And now we get the results that we expect. And actually, this looks like it's an actual Chuck Norris API failure. What is happening? Ah, interesting. Okay. Anyway, this is a really easy syntax to basically specify that we want to call fetch joke three times. And then if after three times we want to try the backup API, we can simply add that to the list. Like so. Let's run this again and see what happens. Finally, it calls the backup API.

So even though the decorator is an interesting design, it also makes a distinction between, in this case, the main API call and the backup API call, and you may not want that. For example, you could even take this a step further and, after the backup API, if that also fails, we could retry fetch joke again. We have a lot of flexibility here. So it depends on where you want flexibility in this pattern.
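The list-based design can be sketched as below. It follows the spoken description (enumerate over the operations, sleep between attempts, raise a RuntimeError at the end), with one small deviation that is my own choice: it skips the sleep after the final failure. The stand-in functions are hypothetical.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def retry(
    operations: Sequence[Callable[[], T]], delay: float = 1.0, backoff: float = 2.0
) -> T:
    """Try each operation in order and return the first successful result."""
    for attempt, operation in enumerate(operations, start=1):
        try:
            return operation()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            if attempt < len(operations):  # no point sleeping after the last one
                time.sleep(delay * backoff ** (attempt - 1))
    raise RuntimeError("All retries failed")


def fetch_joke() -> str:
    """Hypothetical primary source that is currently down."""
    raise ConnectionError("simulated outage")


def fetch_backup_api() -> str:
    """Hypothetical backup source."""
    return "Backup joke"
```

A call such as `retry([fetch_joke] * 3 + [fetch_backup_api])` tries the primary source three times and then the backup, and the order and mix of operations is entirely up to the caller.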
Do you want to use it in a very simple way with existing functions? Then a decorator works really well. Or do you need a lot of control over how it's going to retry things? Then maybe the decorator is not an ideal solution.

Now, I created all these retry decorators and functions from scratch. You don't need to do that. There are libraries for that in Python. A common one is tenacity, which handles all of these things, including exponential backoff, selective retries, and logging. So when you're dealing with production code, you don't need to reinvent the wheel. Here's an example of how tenacity works. So it looks very similar to what we had. You can specify waiting times, when it should stop, et cetera. And there's actually quite a lot of flexibility in tenacity. So don't build this yourself. I purely did this to explain how it works. Use a library like tenacity instead.

Now, when is the retry pattern a good solution? Especially if you know that the error is temporary. Let's say a network hiccup or rate limits or maybe some LLM failure. It's a really great solution for that. Also, you should make sure that the operation is actually safe to repeat. So, it shouldn't matter that you call something two, three, four times. And finally, it's a good solution if you want a better user experience or very resilient backend code that works with APIs that may not always be predictable.

When do you avoid it? Well, in particular, if you know that the failure is permanent. For example, if your API key is wrong. Well, in that case, it doesn't matter that you retry because the key is always going to be wrong. Or maybe the user has supplied invalid input. Well, then retrying also doesn't make sense because the input is invalid. It doesn't fix that magically.
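The distinction between temporary and permanent failures can be expressed by retrying only on selected exception types. The sketch below uses only the standard library and is not tenacity's API. Which exceptions count as transient is an assumption that depends on the client library in use, and the name `retry_transient` is hypothetical.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Assumption: these built-in types stand in for "temporary" errors such as network hiccups.
TRANSIENT: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError)


def retry_transient(
    operation: Callable[[], T],
    retries: int = 3,
    delay: float = 1.0,
    transient: Tuple[Type[Exception], ...] = TRANSIENT,
) -> T:
    """Retry only on exceptions listed in `transient`; anything else propagates at once."""
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except transient as exc:
            if attempt == retries:
                raise
            print(f"Attempt {attempt} failed with a transient error: {exc}")
            time.sleep(delay)
    # Only reachable when retries < 1.
    raise RuntimeError("retries must be at least 1")
```

With this shape, an error such as a `ValueError` for invalid input is raised on the first attempt, while a `ConnectionError` is retried.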
Another issue with the retry pattern is that you want to be careful that your retry action doesn't have some sort of side effect, like storing something in a database or whatever, because then maybe the operation fails but you may get all sorts of duplicate data, and that's of course something you want to avoid. Also, you want to avoid retrying too often, or you end up with some sort of retry storm that just makes the problem worse and worse. So, it's a great pattern, but use it with care. Like I showed you, you can even combine this with fallbacks, and then your systems become much harder to break, or even use a circuit breaker, which is another pattern that I also want to explore on the channel later on. But to me personally, it's especially helpful when I'm dealing with LLMs.

Now, if your like button fails, don't worry. You can just retry it, maybe even with exponential backoff. Anyway, I'd love to hear your thoughts. Do you already use retry logic in your projects? Have you built fallbacks like I've shown you today? Let me know in the comments below. Now, if you want to explore more software design patterns like this one, check out my design patterns playlist right here. Thanks for watching and see you next
Original Description
👉 Get real-time, search result data from Google, Youtube and more with SerpApi: https://serpapi.link/arjan-codes.
This video shows you how to stop your Python code from crashing when APIs, networks, or LLMs fail at random. I walk through the Retry Pattern step-by-step: starting with a flaky example, adding simple retries, improving them with exponential backoff, turning the logic into a clean decorator, and finally adding fallback routes when retrying the same thing no longer makes sense. You’ll also see how retries help when working with LLMs that sometimes return invalid JSON. By the end, you’ll know exactly when to retry, when not to, and how to make your applications far more resilient.
Design pattern playlist: https://www.youtube.com/playlist?list=PLC0nd42SBTaNf0bVJVd9e2oBV-mcUuxS0
🔥 GitHub Repository: https://git.arjan.codes/2025/retry.
🎓 ArjanCodes Courses: https://www.arjancodes.com/courses.
💬 Join my Discord server: https://discord.arjan.codes.
⌨️ Keyboard I’m using: https://amzn.to/49YM97v.
🔖 Chapters:
0:00 Intro
0:53 The Problem: Transient Failures
1:35 A Simple Retry Function
7:14 Exponential Backoff
9:32 Using a Decorator (with @wraps)
12:53 LLM Example (JSON with the New API)
14:48 When You Shouldn’t Retry the Same Thing
21:06 Production-Ready Option with Tenacity
21:47 When (Not) to Use the Retry Pattern
23:15 Final Thoughts
#arjancodes #softwaredesign #python