Deploying Python on AWS

DataCamp · Intermediate ·☁️ DevOps & Cloud ·8mo ago

Skills: ML Pipelines80%Tool Use & Function Calling70%Prompt Systems Engineering60%

Key Takeaways

The video covers deploying Python on AWS, focusing on serverless computing using AWS Lambda and Gradio for machine learning app development. It also explores the use of AWS Bedrock, Light LLM SDK, and CDK for infrastructure as code.

Full Transcript

Hello everyone and thank you for joining today's session. My name is recent. I'll be your moderator for today. We're going to get started at the top of the hour. We're just waiting so everyone has a chance to join. Uh if you haven't already, make sure that you register for this session. You can do so. Uh there's a link in the YouTube description. If you're watching on YouTube, I would recommend watching on YouTube uh rather than LinkedIn. Uh but feel free to watch from wherever you like. Um yeah, there is a link to register for today's session. If you can't find the link, you can head over to datacamp.com/webinars. I'll also update the the QR code that we've got on screen as well. So yeah, you can register for today's session. Uh also, if you've already registered, you will have noticed that uh we've got a GitHub repo that is the link to today's session. And so that's what we're going to be working from. If you do plan on coding along with us live, uh you are going to need to check out the readme in the GitHub repo. So please do check out that. There is a link to that uh in the comments. I will post it again very shortly. So yeah, if you want to go along with us live, you're going to need an AWS account that's active and yeah, you're going to need to check the read me for the setup instructions as well. Uh so yeah, if you don't have that set up, I would consider this more of a watch along. Obviously, you'll get the recording in your inbox if you've registered for the session and that'll come through tomorrow. So, yeah, please keep your eyes peeled for that and make sure that you register for the session. Uh, aside from that, we are going to be taking your questions. So, if you have any questions at any point throughout the session, let us know in the chat. We're going to be running through your questions uh for the last 10 minutes of the session. So, make sure that you stick around for that as well. Uh, I'm just going to repeat these messages for anyone that's just joined. So, hello if you've just joined. Welcome to the session. My name is Ree and I'll be your backend moderator for today. Uh please do register for today's session. That means we can send you the recording and the resources right to your email inbox tomorrow as soon as it is up in our resource center. Um and yeah, if you want to go along with us today, you are going to need an active AWS account. Uh there is a GitHub repo that we are working from. You can check out the GitHub repo uh via my comment in the chat. I'll be posting it again very shortly. There is also a QR code on screen as well. So yeah, check out the read me in there if you'd like to code along with us live, but also uh yeah, do check out the resources that we've uh put together for you and all the extra links as well. If you have any questions at any point throughout the session today, let us know in the chat. We're going to be running through your questions for the last 10 minutes of the session. Brilliant. I think that's it from me. So now I hand you over to your host for today's session, Richie. Richie, please take it away. >> Hi there, data scamps and data champs. This is Richie. Uh good to be back for a streaming session. It's been a while since we've done one of these. Uh all right, so uh kicking things off. Um if you are a data scientist or a machine learning scientist, then once you've done your analysis or you've created a model, that's often not the end of your work. So these days, your work often needs to be incorporated into other software or perhaps otherwise placed in a production environment in the cloud. So that's what we're going to learn about today. It's going about how to uh deploy code onto Amazon Web Services and this is one of the most in demand skills for data professionals. So very excited to learn the techniques. Uh now that's a whole career in itself. There's lots to learn about. So today we're going to focus on a sort of more narrow slice of that uh which is going to be uh around using AWS Lambda which is the serverless computing service. Say that 10 times fast. Serverless computing is a bit of a misnomer. There are services involved. You don't need to care about them. I will let today's guest explain more about that. Uh the other thing we're going to be looking at is Gre Gadio which is a Streamlit fork. So this is basically a machine learning app development platform. So you get a user interface uh to your models. Our guest is Eric Riddok. He is the director of machine learning platform at the e-commerce platform pattern. Hi Eric. Uh great to have you here. >> It's good to be here. Thanks for having me. >> Wonderful. So uh Eric runs uh the machine learning engineering team and his focus is on building tools to allow data science teams to self-deploy uh data and machine learning products. He's got decade of experience uh both as a data engineer and a machine learning ops engineer and he also teaches MLOps at MLOps club. Uh wonderful. So uh with that uh please take it away Eric. >> Yeah, thanks. Um I'm sharing my screen. I have a slide up. I realized I I should probably introduce I'm my plan is to motivate uh how I got started with AWS because it'll I I'm hoping to get you excited about what it is we're building. And so this this is why it would have been exciting to me early on. So anyway, so my name is Eric Red like Richie said and I went to school at Brigham Young University and I studied math and I really struggled to get a job doing anything after studying math. Um, and so like I I think many of us have found ourselves in the position where you have to go out go beyond what school is asking you to do in order to be enticing to someone who would hire you. And so I finally did get a job as an analyst at a company called Inside Sales. Um, and I coming from math, I thought that was so beneath me. But you know what? That's where I learned that the majority of value in data comes from basic counting. like how many things did we sell or how long has it been since we reached out to customer whatever. So um so it was it was a healthy experience and uh there then I then I moved to working at MLOps at a company called Ben Labs and you know if you've seen Into the Spider-Verse uh there's all these different Spider-Man and one of them has this VR headset and you can see she's holding a bag of Fritos and we actually are the ones who placed that that product that we did that product placement into the Spider-Verse and like we're also the reason that Bumblebee is the Transformer as a Chevy Camaro and we're the reason Tom Cruz was Ray-B band sunglasses and maybe a whole bunch of other brands, you know. Um, and we we also had a Chrome extension to help YouTubers succeed on YouTube. And one of the things, one of the features I worked on was like this regression model where we'd take a YouTube thumbnail and we'd guess what the click-through rate would be. And so you could plug in all these variants of a YouTube thumbnail and we'd try to help you like rank them in order of like most clicked to least clicked, you know, guessing. So we created an ML platform and we went and presented at Netflix about how we built it. Um, that was great that we did that because then I got laid off and uh I think the the prestige from giving a talk at Netflix definitely was a key part of me getting my next job. But in between I started demo ops club and made a course about AWS because I had noticed working with data scientists that a lot of data scientists don't understand what they're let's see like like I I think they're comfortable in a notebook creating a PC or you know you you talk about AI engineering too they're comfortable like creating maybe like a chain lit app like a little chat application locally but they can't but so many people get stuck on actually deploying something to the cloud and and I was hoping that would be like the sugar the spoonful of sugar hey we're going to show how to deploy your stuff that we would bundle with a bunch of medicine which is like and by the way uh I also want to make sure you know good practices that make you a co-orker that I don't hate like testing and and uh and clean code and that sort of stuff. So uh so yeah then I joined Pattern. Pattern's the biggest seller on Amazon besides Amazon themselves and um we basically run people's e-commerce for them and there's a whole bunch of data science use cases there. So, so the thread through all this was the thing that really helped me get started uh was a personal project that I did uh that made Ben Labs want to hire me. Um, and it was basically a giant AWS architecture. Um, and this was a language learning app. And the thing we're going to deploy today is a pretty decently sized component of this architecture, probably better than I did it at the time. Um, so so um here's the plan. Here's what we're going to code. um and and deploy. So if you've heard of Streamlit before, I think I think more people have heard of Streamlit than Gradio. Streamlit and Graddio are both uh tools in Python, these libraries that help you create a UI really quickly. And they're really popular for demos. And so Streamlit and Graddio have two use cases, I would say. Like one is personal projects. If you just want to get a quick demo, like if you Yeah. If you want to like deploy something for yourself, you know, show an employer or just show your friends like uh a chat app or something that you built, you can easily create a UI with Streaml and Grado in just Python. But also, a lot of companies use Streamlit and Graddio as well for internal apps. So, you can kind of be a hero if you're an AI engineer or a data scientist and you create a Streamllet app that you send some b some business users in your company that lets them like interact with your models. So uh so this will be sort of like I I think this project is going to be a nice um going to be a useful thing for two groups of people like people who are already working and want to deploy internal apps and people who would like to get a job and and be able to deploy their own personal project. Um and full disclosure I will say that I was hoping to impress you by uh by making a chat application kind of like kind of like chat GBT and it does work is lo it definitely works locally. I think this chat interface with Graddio does struggle on AWS Lambda where we're going to deploy this. So yeah, chat applications in particular might be the the main like something that Gradio struggles with, but you know, simpler things like buttons and forms and fields you'll probably have a better time with. So okay, so with that said, um let's start from the architecture and then I'll start a fresh project and just kind of live code it with you. And uh let's see, Richie, do we have the ability to have questions come in live like live stream style? Uh yeah. So uh the audience can ask questions in the chat. Uh you can either answer them as you go or we can wait till the end to answer them as you wish. >> Sweet. Yeah. Um maybe use your discretion, but like I would say I would feel less lonely if if I saw stuff coming in like uh that I could react to real time. So >> okay, note to the audience then. Ask as many questions as you like. Uh please uh stop Eric feeling lonely. >> Nice. And I I can always just ignore you. So like um anyway so okay so here's here's the plan here's the plan um here here's my my architecture diagram here. So um so AWS Lambda if you haven't heard of it uh it is a a serverless function a cloud-based function service and it has a lot of competitors. Google Cloud has one, Azure has one, Kubernetes has one. There's there's a lot of there's a lot of equivalent tools out there. So if you're learning how Lambda works, you're learning how like a whole class of serverless tools work. So basically what's going to happen is we as a user will go to our browser. We'll go to this URL where our app is going to be. Then our browser is going to it's going to hit AWS. It's going to hit this AWS Lambda service. Um and it's going to say, "Hey, hey Lambda in AWS. Uh give me whatever app you you I have registered under this URL." And so then the AWS Lambda service is in a just in time fashion going to reach out real quick and download our code. It'll it's going to download our code into this like runtime and then run it once and then dispose it. So like so the thing about um a lot of recipes out there for deploying like a deploying an app involve paying for a long running service like you're paying for something to run around the clock. So maybe it's like 4 cents an hour and you know what is what is let's see. So 4 cents times 24 hours times 31 days in a month is $30 a month for something that costs 4 cents an hour. So you know even for a personal project like when I was in college $31 a month would that would have felt heavy to me. That still feels needlessly heavy to me. Um with Lambda with this architecture you only pay if people are actively using it. Uh and let's be honest most of our personal projects sit and don't get used 99.9% of their existence. And so this will be virtually free. I mean, I have I have serverless apps that I deployed years ago that I'm maybe getting charged like two cents a month for just to sit there and exist. So, um, so anyway, this is this is very cheap. Um, oh, and by the way, the bugs that I talked about earlier, like if you want to do a chat interface with radio, those seem to show up on Lambda, if you want to pay for a version of this that does run around the clock, those all those that this becomes a non-issue. So I think I think something about the serverless nature of Lambda where it like disposes your environment is is what causes the bugs in the chat interface. Anyway, so yeah, so what our Lambda function will do though is it'll basically serve us this HTML like this this UI uh the thing that you see here and then when we interact with the UI like put messages in this search bar sorry in this chat box um then the Lambda function will have some Python code in it that reaches out to AWS bedrock and hits an LLM and you know and then we're having a conversation. So that is the that's the plan. That's the plan. That's the architecture. So at this point I'm just going to start coding. Uh, I would wager I I would venture to say that this will probably be fast-paced enough that if you try to follow along, you might get lost, but it but it will be followable. So, definitely if you want to recreate this after the live stream, I So, what what what I would recommend is just ask questions and pay attention and learn during the live stream. Uh, if you try to follow along live, you probably will get lost. And so, maybe maybe recreate this after uh when we when we publish the recording of this. So, and of course we the there are resources here and there's there's an answer key you can use to get this running as well. So, let's get started. Um, so I have here a completely fresh blank uh project in cursor. Um, which is a terrifying place to start. Uh, so the the place I always like to start when I'm making a new project is uh with UV initializing uh a basic Python app. So that's what we're going to do. We're going to make a a UV app. So, I already have UV installed. Um, if you don't know, I mean, I hope you know, but if you don't know, UV is like the greatest thing that ever happened to the Python ecosystem, and it's essentially replaced pip as well as several other tools. And so, one of the things UV can do for us is get us some boilerplate. Um, so here is So, I'm going to run this UV init command. Uh, I I want to make a library and I want to use Python 3.12. And so now that I'm running this, it's going to generate a bunch of boiler plate that we can use. So it gave me this src folder um with uh an init py file. So you know, fresh Python application. And one thing that one thing that I really like about using lib is it makes it particularly easy to have imports work between files. And I think we will well actually for this application we're probably only going to have one file. But you know, if you wanted to add more and make it more complex, this is the right this is probably the best uh scaffold for that. So um let's see here. So I'll run uv sync and what that did is what UV had created here was a package. UV just installed that package and it created a virtual environment where it'll go. So all right that was step one. We got our boiler plate project. So now let's uh let's make a app. Um so actually I'm so sorry. First thing we should do before we do anything, uh there is later on in this uh in this live stream, um we're going to be hitting AWS bedrock to get uh to get responses from its LLMs. There's actually we actually have to ask for approval from AWS to have access to one of its LLMs before we can hit it. So, I'm going to go submit uh submit my request for that right now so that in case it takes a few minutes for them to like grant approval that by the time we get to that point in the stream, it's it's there. So um so sorry quick quick break while we go do that. So uh before the stream I created a fresh AWS account. Uh so I will go to that account and um let's see my font's probably small. So if you go to the AWS console uh go to the Amazon Bedrock service and in Amazon Bedrock go to model catalog and in model catalog there's a whole bunch uh there's a list there's basically a registry of foundation models that you can hit via the AWS bedrock service. Uh I want to hit claude probably let's go cloud sonnet. So um so I've scrolled down to cloud I could I could search for it. I mean you could really pick any of these. I'm going to do this one. Um, so click the three dots, modify access. Let's see. Oh, I did notice this. This yellow box has been here for a little while, but hopefully this whole step of requesting approval may not even be required after a while. So, um, okay, let's go. Claude, just need Okay, available to request. We need to make that request. How do we do that? There's probably let's see enable specific models. Okay, Claude. Oh, here we go. Check the box. Scroll to the bottom. Nice. Enter company name. MLOps club. I hope they grant my request for this given that, you know, I'm presumably bringing them sales. I always worry that they're not going to let me use their LMS. Okay, this is going to be for internal employees only. Okay, next. submit. Okay, so my request here is in progress. So hopefully by the time we get to that spot in the talk um we'll have access to this. So awesome, awesome, awesome, awesome. Okay, let's start working with Gradio. So um I'm going to Google search um build a chat application with Gradio. And one of the first things that should come up, I think since I've seen it before, it's the first thing for me, but creating a chatbot fast. This is actually a really good guide. Um, so Gradio, of course, with the hype of of LMS and chat interfaces. They have a pre-made, highly opinionated component that helps you make a chat interface in uh like one line of code. So, it's pretty fantastic. Um, so I'll just give you a quick uh quick rundown of what they're doing here. So uh let's see. So this this right here is a completely valid uh grado application all by itself. So we're going to import the gradio library. We'll instantiate this thing called a chat interface. And all we got to do is call.launch on this and suddenly we'll find ourselves with a web server running on localhost that has a chat interface. Um and that chat interface whenever we chat with it and put things in the search bar, it's going to call this function, whatever function we pass here. And that will deter the function will receive our chat. It will receive all the previous chat messages and then it'll do something like what we're going to have it do is reach out to AWS Bedrock and like uh send the full history plus our new chat message to Bedrock and get a response. Um anyway, so let's uh let's try I want to grab a less trivial No, no, that's actually fine. We'll just take their basic code example and we'll go in here and inside of our app, I'm going to create an app. py and paste in their code. So now let's see if we can actually get this to to work. Oh, that's not defined. I'm going to copy copy the little sample function here. Cool. Nice. So they have a function that's taking in a message which I think is a string might be a dictionary and then uh a list of a list of dictionaries is what this this takes. Um and these the string or dictionary and this list of dictionaries represents the message that we just put in and then all the previous messages that were like prior in the chat as well. And we send all that stuff to the chat interface and potentially to the LLM every single every single time we chat. So um cool. So I'm going to try I'll I need to install the grado library. So I'll do UV add gradio and that should add gradio. Cool. It's installed. Um so now UV run uh let's see src gradio on lambda live UV is it struggles with its autocomplete unfortunately. I wish I could tab complete this. Um how are we doing? Do we have Cool. So, I got a little URL and it looks like it opened my wrong. Yeah, here we go. Nice. So, hello. Yes, is what it says. What? Yes. No. Yes. No. I was hoping to get it to say no. Okay. Um, nice. So, this is our chat interface. So, this is this is what we're going to be deploying to AWS. um you know hopefully with some smarts to actually hit an LLM as well. So um to do that uh so we need to hit an LLM and we need uh we need to let's see we we need to we need that hitting the LLM needs to happen inside of this function here. The one that the chat interface is using. Um, and I just want to point out really quick that like this function currently is returning a value. We could yield a value. So this this would make it so we could yield one thing at a time. If you use y if you use yield instead of return, um, this will allow us to sort of stream like get that effect where it looks like we're streaming in token by token the response into the UI. So anyway, so my plan is in this function, we'll use some library to hit an LLM and then get its responses back token by token and then we'll yield each of those responses back to the user. So that's the plan and I what I was thinking about how I wanted to do this because we we could use BTO3 because Bodto3 is AWS's SDK that you know you use to hit all the services in AWS but um but then I thought to myself like well what if um what if I want to leave bedrock at some point like I would hate to have to make extensive code changes to my application every time I want to change LLM providers. Uh and also I don't trust AWS to design an SDK that's ergonomic. I mean AWS is good at many things but it's also bad at many things and I would say one of those things is ergonomics and so uh so I wondered is there is there some sort of SDK where I can write code once but point it to any LLM provider and there are multiple and the one that I found that I got working was light LLM. So, LLM um light LM they have an enterprise service but they also have a Python SDK that's well-maintained. And what I wanted to show you is look how many um look how many LLM providers they support um and non and like image gen and and other kind of gen providers as well. Let's see. So, I'm on their GitHub page. I just Googled it and uh look at this table. they they support OpenAI, Metal API, Azure, a like tons of different services. Um, and because they because there's an enterprise offering behind Light LLM, I trust them to keep this working. I don't it doesn't I don't think this is a SDK that is at risk of uh one day having a significant breaking change because I think they do have customers that are relying on at least their enterprise service and presumably their SDK as a part of that. So that was my reaction when I discovered them. Um, so cool. Uh, I would really just like to vibe code this this bit. Um, so but I'll start at least by installing light LLM. So UV add light lm. Okay. Awesome, awesome, awesome. And let's do some vibe coding. So um please use light lm SDK to uh hit the claude 4.5 AWS bedrock uh LLM um to make the chat interface dynamic and actually plugged into an LM. Okay, while we're waiting for that to roll, let's go check our Oh, good. Our request has been granted. That's good. I noticed yesterday when I ran through this that you that my request was pretty quickly granted, so hopefully that's true of you, too. Um, and this next bit was confusing to me. So, shout out to my friend Jeremy Mumford who explained this. But um so how do we actually go about hitting these end points? So what I did is I went to model catalog and then I uh found claude 4.5 sonnet and let's see click on this. Okay so here's here's like this page in AWS for cloud sonnet 4.5 and it has a model ID here and so I was trying to plug in this model ID to light LLM and have it hit this model. It wasn't working. um we will need this we will need this model ID but there was another piece of information that we also needed for this to work and it's called an inference profile and from what I understand an inference profile is a separate concept in AWS bedrock that uh allows you to track the basically track your your calls to one of these foundation models so you can create a bunch of inference profiles maybe one for person or one for team or one for application so you can with at the granularity level of a particular profile track the number of invocations and the cost and so so forth that you're making to these models. So, um anyway, so you have to come down here to model access. So, I'll open that up in a new tab. And here, if we search for our model, it should it should um show us the inference profile. Let's see. Cloud 4.5. Yeah, we're here. Let's see. Oh, I'm sorry. I think it's in infer actually. Here we go. Cross region inference. Um I'm actually I'm I'm clearly not a bedrock expert. This is more just flavor for the for our deployment here. Um okay. So here is the model ID for clouds onet 4.5 and that's great. But also over here there's the inference profile ARN. So that's how you find um there's like a default inference profile I believe for each model. You can also create your and so we can use the default one to track all of the calls we make to this LM as sort of one unit. But if we had different teams or apps we could make our own inference profiles. So uh that's my layman's understanding of how that what what function that's doing. Let's see. So we are at 925. Okay. I got to go a little bit faster. Um, so let's see how are we doing here. We got got some generated code. Um, so we're importing light LLM completion chat with Claude and Okay, great. Yeah, it's uh we're calling this completion function and we're passing in this model ID. Um, I already think this is probably wrong and that's fine. Um, so let's pass let's go grab this one, this model, and pass this in here under the model function. And then there's actually another field we have to set called model ID. And this is where we pass the inference profile. Um, great. Okay. So, excuse me. So for this to work um I will need on my local machine I will need uh access to an AWS profile. So I am going to let's see do AWS SSO login. So I'm going to create a profile called MLOps club and I'm going to be using that profile. Um great. So you can't see this, but on a browser there was this whole handshake that just happened and now now I'm I'm logged in via the AWS UI and those those login credentials have been persisted into an AWS profile. Um you don't have to create a profile that way. If you know how to create a profile using an IM role, that works too. So um this profile has pretty sweeping access. I give it administrator access to AWS even though the only thing it really needs to do at this point is invoke this model. Um later on I'll be using this profile for like deploying the infrastructure and that's why I made the permissions so broad for this stream. Okay so uh UV run and let's see oh I'm sorry I'm sorry I just ran UV run to run the app. I need to set as an environment variable my profile name. So MLOps club and hopefully this will make it so that this app is using these permissions. All right, we got a little URL. So, let's go look at this. Hello. Okay, error. What did we get? Um, a nice thing about the chat interface is you can, I believe, make it show errors. Let's see. Maybe that's actually in the launch command. Let's see. Show error true. Let's try that. Oh, no module named boto3. I just saw that uh just saw that error in the console. So, let's to UV add boto3 which is used under the hood in light lm to actually call bedrock. Um, okay. Let's try this again. I just I just made two changes at once, so we'll see. But what show error does is if we get an error, it'll actually show us the error up here in a in a in like a toast. But look at that. We have a chat application developed very rapidly. So, um, cool. Nice. Anyway, it's not it's not the world's fanciest chat interface, but hey, I mean, look how quickly we're able to get a chat interface going. Um, so fantastic. Uh, let us now deploy this thing to AWS. So the plan for that is um we're going to use AWS CDK. So we're going to AWS CDK as an infrastructure as code tool. Um there's there's a couple different options for infrastructure as code. There's Palumi, there's Terraform, there's AWS CDK, there's cloud formation, there's AWS SAM. So there's a whole bunch of options. It's fairly overwhelming. for anyone starting out and for for teams of a small size who are all in on AWS, I actually recommend AWS CDK every time because it's Python and it has some features built in that you would otherwise have to sort of like configure yourself with some expertise or buy from a vendor if you're using Palumi or Terraform. So, so to get a great experience enterprise features for free as long as you're all in on AWS because CDK is AWS specific, I would say CDK is the right choice. So just plug for CDK. That's what we're going to use. Um so let's keep all these and you know now would be a fantastic time actually to commit because we're in a working state. So let's initialize this as a git repo and get at all and get commit um local grado app working. Nice, nice, nice, nice. Okay. So um so with AWS CDK you need to you need to have Node.js JS installed on your machine and then you do something like AWS uh install global CDK or AWS CDK and this will um this will install the AWS CLI for you. Oh, maybe it's just CDK. Sweet. Awesome. Awesome. So basically you it's CDK is is interestingly enough a NodeJS project even though we're going to be interacting with it with Python. And so you you actually manage its installation using the Node ecosystem. So, I guess you could have used PNPM instead of npm, but you know, I'm just a I'm just a JavaScript peasant, so I'm not don't have that set up. Um, okay, cool. So, I'm going to create a single file called infrastructure.py infra.py. And I'm going to ask our vibe coding app to Okay, please um fill out this infra.y UI file with a CDK application and a single stack that include that uh deploys our uh gradio app as a lambda function. Um use the following docker file. So, there's a couple blogs out there that show well actually I'm just this is this is one spot where I'm going to cheat and and grab stuff from the answer key to speed us up. So, here's a Docker file that I wrote before. Um, and you might if if you've never used Docker, you might be like, whoa, this is moving fast because now we're just writing a Docker file, which is this Docker concept. And yeah, we're we're about to build a Docker image, write a Docker file, and deploy that Docker image. Um, and obviously we don't have enough time to explain how Docker works, but fundamentally this is going to take all of our it's going to take our Gradio app. It's going to take all our libraries for our Gradio app. It's going to package them into this zip file called a Docker image. And then that that zip file is going to be what Lambda like downloads and executes when it runs. So that's the that is the very short version of what's going on here. So um so let's make this docker file. Um and I need to change this little bit. Okay. So the main thing that's happening here is uh okay. So yeah, what's what's happening here? I'll I'll explain this copy statement if I get time later, but uh we are going to first install uh all of our let's see. Yeah, we're g we're going to first install all of the dependencies like boto3 and gradio and such for our application in this line and then we're going to copy all of the stuff in our source folder into this into this docker image zip file thing. Once it's all there when whenever we go to when we want to start the application what we'll do is we'll run python 3 uh and then we'll directly execute our app file and this is going to actually in this docker image in this docker container it's going to start the gradio server. Um, so and then there's there's what this line doing here is it's it's a separate process that's going to be running in the container. This is actually the entry point and it it's going to be acting as like this this proxy interface sitting in front of our gradu application. So it will it'll be able to understand basically uh lambda events or it'll understand the the payload that gets sent to AWS Lambda and translate that into a format that our that our Gradio server can understand. If that didn't make sense, uh we'll talk about some resources where you can go learn more. So, um all right, we have I think if we can get this thing deployed in the next seven minutes, then we are doing well and the and the the live is done. So, um let's see. So, yeah, um use a docker imagebased function and use the docker file in the project for that. Uh, note that this um well, we'll just start with that. Okay. So, let's make sure it has access to this context. I don't even know. I'm not a cursor expert, but I I assume this will help. Um, it it may already it may already be smart enough to get that. So, let's let's run this. Okay. Okay, so while this is running, we'll just talk about cloud form a little bit. So cloud form is AWS's infrastructure as code tool. Uh it is a a ser way to create large amounts of resources in AWS like buckets, roles, lambdas, containers, all that stuff. Um if you have a group of resources that belong to one app and are configured sort of as a unit, then infrastructure as code really is the way to go to define how those things are deployed. Um great. So let's see here. Oh, this is kind of funny. Okay, cool. So, um, quick quick like anatomy of a of a CDK app. So, at the very bottom here, uh, there's this concept called an app. So, we're making an app. An app in CDK is a set of stacks and a stack is a set of resources that get deployed. So, we have one app and on that app there is a stack and on the stack there are resources like the song. And uh and so um we'll the code to define the stack we'll see in a moment. But the stack is going to end up getting rendered as a JSON file. Uh it's cloud formation JSON that we didn't have to write ourselves. We're generating programmatically. That JSON file is going to be sent to AWS and it's going to represent our order of like here's all the resources I want to create. So um cool. So let's see what's in the stack here. So the stack has a couple different resources. So we have a docker image function. This is going to create a a lambda function and this is I I think this is absolutely awesome. So um the code for the function is this docker image asset and we're going to point to our docker file in our directory. So as part of our deployment process CDK is going to take care of actually building and pushing our docker image out to like the the image storage area in AWS where it'll be needed uh by where it'll be referenced by AWS Lambda. So this is going to take care of our Docker build and push steps for us, which is just awesome. Like in other infrastructure as code tools, you usually have to like manage this yourself. Um I I won't go really into these other settings here except to say that uh I would like to build this for ARM instead of x86. So let's see. Can we do ARM 64? Hopefully. And then platform This is just for this for sake of time, I'm going to cheat here. Um, let's go to infer.py and let's go grab this. So, because I'm on a Mac, when I build this Docker image, um, my Mac is going to default to building this this Docker image for ARM. um whether I'm running on a Mac, whether I'm running my deployment on a Mac or a Windows, I don't want that to have to I don't want that to have to matter or I guess based on the CPU. I don't want the developers when a developer is running this program, I don't want it to be sort of random based on the CPU they have whether or not this works. So, I'm just going to force this Docker image to be built for ARM every time. And that's great because that ARM deployments in AWS are actually cheaper than x86. So, um let's see. So we got that and then we're going to do this. Okay. Um these environment variables are actually not necessary. So um one thing I will do though is when our container starts up when our application starts up I I I want it to start up on port 8080. Um and let's see. Yeah. Is that is that the right Yeah. Here we go. And I want it to listen on all IP IP addresses. So that's a little config that I just set in the gradio app. So now now it should play nicely with certain assumptions that our Docker image has to satisfy to run in Lambda. So um okay, so quickly going over this, we have it we have a function. It's going to build our Docker file and it's going to register that into a Lambda function. Um we don't need these environment variables. Um, if we wanted to, we could make we could parameterize our app so that maybe you could pass in different model ids via environment variables. That would be pretty cool, but I won't do that now. Um, let's see. So then then we actually modify the permissions of the role on our lambda function and we grant it bedrock invoke model for uh for for these resources like for the foundation models that we want to hit. Um, this could be a source of issues. We may need to like tweak this this role when we end up creating it. Um, it's also going to register our application like it's it's going to create something called a function URL with lambda which is going to actually give us a URL we can hit in our browser that will turn around and invoke the lambda function. So anyway, the function URL is basically what enables like this mode where you can access the lambda function as a website in your browser. So that'll be pretty awesome. Um, okay. Um, so what's the best way to proceed with this? All right, I got to move faster. So, I'm going to cheat and take this task runner script. I was going to tell you all about my opin my opinions on how I like to make task runners, but I'll just have to skip that part for now. So, here's a bunch of bash functions that are useful for running our application. And I have a couple of these functions that have to do with CDK. So when you're using AWS CDK, you actually have to before you can use CDK to deploy anything, you have to run this command called CDK bootstrap, which is going to create some essential resources in AWS the CDK relies on. So that's the first thing I'm going to do is run um let's see, run this bootstrap command. So do that. Run cdk bootstrap. Uh oh. Unbound variable AWS account ID. Yep. Okay, that's going. Um, another thing I want to do is I want to see if our CDK app is valid even. And so for that we can run this CDK synth command which is going to try to take our CDK code and produce the cloud for JSON out of it. So let's try that. It'll it'll also I think build our docker image as well to make sure our docker image is valid. So okay running CDK synth. Oh modul foundated with CDK. All right. Um let's do uv add-cript and these two uh these two. So I'm going to I'm going to add dependencies directly to our infrastructure. py file as comments. Awesome feature of UV. I'm gonna add these two libraries in CDKs. Cool. Let's try that again. It's giving me this warning message saying that it doesn't like the version of Node.js I have installed. I think that it um CDK wants you to have an exact version of Node.js like exactly 24.0.0 or exactly 22.0.0. So, it's it it in my experience it's never mattered. Um, yeah, if I can't get this deployed very quickly, then we might just have to turn this into a casserole in the oven demo where I just show you the version that I already have deployed, but we're on the we're on the home stretch. Okay, so our synth actually sent successfully. That's pretty nice. So, at least we know that our our generated infrastructure as code is valid. It's like wellformed. We don't know if it'll work. We have to deploy it for that. But this is this is definitely promising. So I'm going to run CDK deploy. Nice. So first thing it's going to do is synthesize our Python code into this. This synthesis step here is um what am I trying to say? is creating our our cloud for JSON and then the next thing it's doing it's actually building our docker image here. Um so that's fantastic and we will see further down if it succeeds to build. All right, I guess we'll give that a sec. And then the next thing we'll see is we'll see CDK pushing our Docker image which is nice. And okay, cool. So it looks like the bootstrap process finished. So good. Here's here it is pushing our image. All right. So essentially just like Terraform has Terraform plan and Palumi has like an approval step and Palumi up CDK won't just go create resources in our account without confirming with us that we like what's about to be done. And so it's highlighting, hey, here's a bunch of permission changes that are about to happen. We're going to create a RO. Are you okay with all this? Um yeah. Anyway, so I'm just going to say yes, I approve of these things. And while we're waiting for this, let's go over to the cloud for UI and I'll show you what's what's happening over here. So in the cloud formation UI, you can see stacks. I'm in the wrong region. So go to Oregon. So here's a stack called CDK toolkit. When we ran CDK bootstrap, this was created and that gave us things like an S3 bucket and a Docker image registry. So when we ran when you saw CDK pushing our Docker image, it was actually pushing our Docker image to a repository created in this stack. So this is like a like this is one of the quality of life things that CDK does for you. It makes it so you don't actually have to think about what what repository I'm storing my image in. That's sort of abstracted away from you and like behind under the hood it goes into this repository. And and here's our stack with create in progress. So we can we can click on the stack and we can see the resources that it's trying to create. Um, so can I this view is a little annoying. Okay, here we go. So yeah, it's creating like a it's creating a AWS Lambda function and a Lambda URL and and a Lambda permission and an IM role and an IM policy. All this stuff you didn't explicitly see get configured in in uh in CDK. And that's because similar to how Terraform has modules that are opinionated, CDK has constructs which are like sets of already opinionated configured resources. So when we created our Lambda function under the hood invisibly, we were also creating an IM role in a policy as well. So we didn't even have to really think about setting up those permissions, which is fantastic. Okay, so we're we're close. The create in progress is the the last thing that needs to be created is the function itself. So, we are working on that. Now, here's like a timeline view of, you know, each resource in the stack and and how long it took each one to be created. Um, let's see. Oh, create complete. Maybe are we done? Create complete. Okay. Well, moment of truth. Does our app work? If it doesn't, this is where we enter the debug loop to debug our infrastructures code and see the app. But it would be amazing if we can see it right now. It's taking a long time to load. Why is it taking a long time to load? It's because this is a Lambda cold start. So this is the first time we're hitting our Lambda function in a little while. And so right now Lambda is actually it's going to take about 10 seconds, I'd say, to Lambda has to download our Docker image so that it can run it. Um, oh no. Oh no. What was that? What was that download link? That is absolutely not what we want. Okay, we don't have time to debug this um because we we're really overdue for Q&A time on the stream. So, I wish I could I wish I could say that, you know, we got the Lambda function fully deployed. I could probably cheat and go grab In fact, maybe I'll go do that right now. I think I'll go just deploy the the working version. >> Yeah, we got to see the working version. So at least uh at least deploy that. >> Cool. Let's do it. So let's uh I'll go to my my answer key. So let's go run see play. Okay. So while this is running, let's uh let's take some questions. >> Um all right, super. Actually um before we get to that uh I just want to talk through uh what's happening with uh the next few uh webinars cuz uh they can get slightly tricky. So actually uh next week Reese and I are away so no sessions for you next week and I'm very sad you're going to have to watch reruns or get a hobby otherwise entertain yourselves. After that we've got um AI agents. We've got two weeks worth of content purely around agents. We got nine webinars over the two weeks. It's going to be uh intense and a lot of fun. Uh I think Reese has put a QR code there on screen. So if you want to register for that, it's uh one form to register for all nine sessions. So uh please do come back for that. Uh in the meantime, yeah, let's go to some questions. So um all right, so there are a few people asking around like why are we using radio rather than I guess Streamlit or any of the other sort of platforms for this? >> Yeah, great question. So, Streamllet requires you to run in like a container setting. So, you have to pay for it around the clock. So, Streamlit is a fantastic interface and we use it at work because we're not afraid to pay $30 a month for running a container, you know. Um, the nice thing about Gradio is I believe it's actually built on fast API. So, I guess there's this concept of websockets where Streamlit requires the browser to establish a websocket connection with the Streamlit container and in order for that to work the container has to sort of be persistent running all the time. Grady doesn't rely on websockets and therefore it can be run in this like serverless setting. >> Okay. So that seems very useful if you're like an individual and $30 a month is like you know I mean it's it's not nothing. Uh so uh yeah so if you're only using the thing intermittently then uh that seems like a good option actually. So there's a a related question here from um from Evangel talking about like why using lambda here uh because the other option is like a a low resources VM and so >> in fact I know hundreds of services on AWS. Can you talk through like which services you might want for like different use case patterns? >> Yeah, so Lambda is like the only one that gets you this serverless I only pay when I use it thing. Um oh hey look it turned on. Let's uh let's try setting a message. Hello Oh, none type has no attribute get. We'll see. What's funny is what's funny is I think this is actually an issue with grado on lambda and I think we'll actually see it intermittently work here in a bit. So anyway, we'll give it some time. We'll let it cook. Um >> I I feel like there's some lessons around vibe coding here and thoroughly checking code, but I it's very tricky in a live situation. >> Absolutely. >> Okay. And we we you were talking about um the >> different services for different use cases. >> Yeah, Lambda is like the one service in AWS that gives you this serverless um like I only pay for it when I'm actively making a request. If you if you spin up a VM, you're going to be paying for the VM to run all the time. Make do containers. So that's like ECS, AppRunner, uh Elastic Beantock. All these things are going to require you to like pay for something around the clock. Um yeah, those are sort of the options. Kubernetes is of course even even worse. you know, you pay for a expensive control plane and containers. So, >> okay. Yeah. So, I guess there's a big difference between I'm doing something like it's a personal project, it scales right through to I'm running some like corporate app with like hundreds of thousands of users or whatever. You're going to want very different infrastructure in different cases then. >> Yes. Yes. Exactly. So, so yeah, if you're looking to have a persistent app, like easiest way to get started is probably Amazon Elastic Container Service. If you're looking for a serverless app, Lambda is just the only way to go. So >> okay all right uh oh okay so the question on terminology here >> ergonomic uh design do you want to talk us through what what that means in this context so not ergonomic I just mean like >> walk around >> yeah sorry like I mean like low learning curve and simple like and and usually not running in not bumping into lots of like rough edges that make a tool hard to use. So like er I would say ergonomic is the opposite of hard to use. >> Okay. Easy to use. I like that. That's seems like a very useful property of like any software. I think something something more designers should take take notice of I think. >> Yes, please. Yeah. >> Uh all right. Uh nice. So uh someone only known as RJ uh said, "Do you are you actually really usin

Original Description

Register for this session to get the recording and resources sent to you! https://www.datacamp.com/webinars/deploying-python-on-aws GitHub repo for today's session: https://github.com/mlops-club/gradio-on-lambda Resources (including link to code along GitHub Repo): https://bit.ly/3IHOgSo Eric Riddoch, a Director of Machine Learning Platform at Pattern, will guide you through the full lifecycle of deploying a Python-based machine learning model on AWS. You’ll walk through best practices for model packaging, deployment, and monitoring—plus learn how to build workflows that hold up under pressure.

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from DataCamp · DataCamp · 0 of 60

← Previous Next →

SQL Server Tutorial: Date manipulation

SQL Server Tutorial: Date manipulation

R Tutorial: Intermediate Interactive Data Visualization with plotly in R

R Tutorial: Intermediate Interactive Data Visualization with plotly in R

R Tutorial: Adding aesthetics to represent a variable

R Tutorial: Adding aesthetics to represent a variable

R Tutorial: Moving Beyond Simple Interactivity

R Tutorial: Moving Beyond Simple Interactivity

Python Tutorial: Why use ML for marketing? Strategies and use cases

Python Tutorial: Why use ML for marketing? Strategies and use cases

Python Tutorial: Preparation for modeling

Python Tutorial: Preparation for modeling

Python Tutorial: Machine Learning modeling steps

Python Tutorial: Machine Learning modeling steps

R Tutorial: The prior model

R Tutorial: The prior model

R Tutorial: Data & the likelihood

R Tutorial: Data & the likelihood

R Tutorial: The posterior model

R Tutorial: The posterior model

R Tutorial: An Introduction to plotly

R Tutorial: An Introduction to plotly

R Tutorial: Plotting a single variable

R Tutorial: Plotting a single variable

R Tutorial: Bivariate graphics

R Tutorial: Bivariate graphics

Python Tutorial: Customer Segmentation in Python

Python Tutorial: Customer Segmentation in Python

Python Tutorial: Time cohorts

Python Tutorial: Time cohorts

Python Tutorial: Calculate cohort metrics

Python Tutorial: Calculate cohort metrics

Python Tutorial: Cohort analysis visualization

Python Tutorial: Cohort analysis visualization

R Tutorial: Building Dashboards with flexdashboard

R Tutorial: Building Dashboards with flexdashboard

R Tutorial: Anatomy of a flexdashboard

R Tutorial: Anatomy of a flexdashboard

R Tutorial: Layout basics

R Tutorial: Layout basics

R Tutorial: Advanced layouts

R Tutorial: Advanced layouts

Python Tutorial: Time Series Analysis in Python

Python Tutorial: Time Series Analysis in Python

Python Tutorial: Correlation of Two Time Series

Python Tutorial: Correlation of Two Time Series

Python Tutorial: Simple Linear Regressions

Python Tutorial: Simple Linear Regressions

Python Tutorial: Autocorrelation

Python Tutorial: Autocorrelation

R Tutorial: The gapminder dataset

R Tutorial: The gapminder dataset

R Tutorial: The filter verb

R Tutorial: The filter verb

R Tutorial: The arrange verb

R Tutorial: The arrange verb

R Tutorial: The mutate verb

R Tutorial: The mutate verb

R Tutorial: What is cluster analysis?

R Tutorial: What is cluster analysis?

R Tutorial: Distance between two observations

R Tutorial: Distance between two observations

R Tutorial: The importance of scale

R Tutorial: The importance of scale

R Tutorial: Measuring distance for categorical data

R Tutorial: Measuring distance for categorical data

Python Tutorial: Plotting multiple graphs

Python Tutorial: Plotting multiple graphs

Python Tutorial: Customizing axes

Python Tutorial: Customizing axes

Python Tutorial: Legends, annotations, & styles

Python Tutorial: Legends, annotations, & styles

Python Tutorial: Introduction to iterators

Python Tutorial: Introduction to iterators

Python Tutorial: Playing with iterators

Python Tutorial: Playing with iterators

Python Tutorial: Using iterators to load large files into memory

Python Tutorial: Using iterators to load large files into memory

SQL Tutorial: Introduction to Relational Databases in SQL

SQL Tutorial: Introduction to Relational Databases in SQL

SQL Tutorial: Tables: At the core of every database

SQL Tutorial: Tables: At the core of every database

SQL Tutorial: Update your database as the structure changes

SQL Tutorial: Update your database as the structure changes

Python Tutorial: Classification-Tree Learning

Python Tutorial: Classification-Tree Learning

Python Tutorial: Decision-Tree for Classification

Python Tutorial: Decision-Tree for Classification

Python Tutorial: Decision-Tree for Regression

Python Tutorial: Decision-Tree for Regression

Python Tutorial: Census Subject Tables

Python Tutorial: Census Subject Tables

Python Tutorial: Census Geography

Python Tutorial: Census Geography

Python Tutorial: Using the Census API

Python Tutorial: Using the Census API

R Tutorial: A/B Testing in R

R Tutorial: A/B Testing in R

R Tutorial: Baseline Conversion Rates

R Tutorial: Baseline Conversion Rates

R Tutorial: Designing an Experiment - Power Analysis

R Tutorial: Designing an Experiment - Power Analysis

R Tutorial: Introduction to qualitative data

R Tutorial: Introduction to qualitative data

R Tutorial: Understanding your qualitative variables

R Tutorial: Understanding your qualitative variables

R Tutorial: Making Better Plots

R Tutorial: Making Better Plots

SQL Tutorial: OLTP and OLAP

SQL Tutorial: OLTP and OLAP

SQL Tutorial: Storing data

SQL Tutorial: Storing data

SQL Tutorial: Database design

SQL Tutorial: Database design

Python Tutorial: Introduction to spaCy

Python Tutorial: Introduction to spaCy

Python Tutorial: Statistical Models

Python Tutorial: Statistical Models

Python Tutorial: Rule-based Matching

Python Tutorial: Rule-based Matching

This video teaches how to deploy Python on AWS using serverless computing with AWS Lambda and Gradio for machine learning app development. It covers the use of AWS Bedrock, Light LLM SDK, and CDK for infrastructure as code. By the end of this video, you will be able to deploy Python applications on AWS and create serverless functions with AWS Lambda.

Key Takeaways

Create a UI with Streamlit and Gradio
Deploy a Streamlit app on AWS Lambda
Test the app on AWS Lambda
Use AWS CDK for infrastructure as code
Deploy Docker images to AWS Lambda
Use Light LLM SDK for LLM integration

💡 Serverless computing with AWS Lambda allows for cost-effective and scalable deployment of Python applications on AWS.

🔒 Pro feature: Ask AI to explain this lesson →

More on: ML Pipelines

View skill →

Building a Dog Breed Identifier App from scratch - DogNet

Building a Dog Breed Identifier App from scratch - DogNet

Aladdin Persson

Complete Dockers For Data Science Tutorial In One Shot

Complete Dockers For Data Science Tutorial In One Shot

Part 6 | Deploy ML Model on Kubernetes | Auto-Scaling with HPA and Monitoring with Prometheus

Part 6 | Deploy ML Model on Kubernetes | Auto-Scaling with HPA and Monitoring with Prometheus

Abonia Sojasingarayar

Vertex Pipelines: Qwik Start

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation

Automate R scripts with GitHub Actions: Deploy a model

Related AI Lessons

Docker Explained: From “What Even Is This” to Deploying a Full-Stack App

Learn Docker fundamentals and deploy a full-stack app with this beginner-to-advanced guide

Medium · DevOps

I Used to Pay for Cloud Servers. Then I Found a Way to Run One Free, 24/7

Learn how to run a cloud server for free, 24/7, and overcome hosting cost limitations for automation ideas

KEDA 2026: Event-Driven Autoscaling Patterns That Shrank Our AWS Bill by 40%

Learn how to apply event-driven autoscaling patterns using KEDA to reduce cloud costs by 40%

Medium · DevOps

AWS CloudFormation and CDK Explained: Infrastructure as Code on AWS

Learn how to use AWS CloudFormation and CDK for Infrastructure as Code on AWS to streamline your deployment process

Medium · DevOps

Containers on Amazon ECS with Mama J