Building Live Production Systems with RAG (LLMs & RAG: An Interactive Guided Tour Part 4)

Outerbounds · Beginner ·🔍 RAG & Vector Search ·2y ago

Key Takeaways

This video series provides an interactive guided tour to LLMs, RAG, and Fine-Tuning, focusing on building live production systems with RAG and exploring methods for working with LLMs, including fine-tuning and vector databases.

Full Transcript

Totally, totally. That's very much in the direction of the next run of slides we have here, which is about moving this system that I just unrolled in a terminal, in a Jupyter notebook or whatever: how do we actually put this into a production system? We need to think about exactly Hugo's point. Imagine I made this RAG pipeline and pushed the whole thing into production, and now a new feature lands in Metaflow's documentation, like the @pypi decorator. How do I know when the documentation of that thing is actually going to be in this RAG pipeline? That becomes a super important question in practice. We're trying to solve around this issue that ChatGPT doesn't know the context when we're talking about Metaflow; it doesn't know the context past whatever the cutoff of OpenAI's training data is. So how do we actually build systems to continuously refresh something like the Pinecone index or the LanceDB index, or whatever tool you use?

Absolutely, and I think you've motivated it well here, Eddie, in terms of, quote unquote, wanting your flow to know when this stuff has been done. But even think: in some organizations, different teams may be responsible, and a lot of the time will be responsible, for different parts of what's going on here, right? So your data pre-processing flow may be owned by a data engineering team and the model flow by a data science or machine learning team. And you don't want to be messaging people on Slack every time the pre-processing... You know, Slack is where work happens, but APIs are where work happens as well, right? And event triggering and event buses are where some of the really exciting work happens. So perhaps what we're moving towards is figuring out how a data pre-processing flow can actually just trigger a model flow, which then could trigger an inference flow, right?

Exactly, exactly. And
maybe as you're doing your data pre-processing around curating your RAG pipeline, something goes wrong, and then you want to use a Slack API to alert Hugo: hey man, something broke, what do we do about this? And then we have a little debugging session together or something like this. It's important to start thinking about: let's get out of the Jupyter notebook mindset for one second, zoom out, and think about, if we want to run this stuff in these workflows, what sort of hooks do we need in that system, and how do we afford them? So I guess the argument of this section is that Metaflow is a nice tool for that, and we're going to show you how to do it with the same code that we just used in our Jupyter notebook. Metaflow has a lot of these features we've been discussing, but it also has this notion of being an orchestrator: you can either schedule jobs and run them on a calendar, or you can trigger them based on events happening, or, like in the story Hugo is describing, based on other workflows. Maybe a different team finishes a data pre-processing workflow that runs every night at 12 p.m. or something (or, that would be the afternoon, every afternoon at 12 p.m.), and then whenever that workflow finishes, we train a new model or do whatever downstream.

And these are all really nice examples, Eddie. As we're moving towards this, I do just want to be clear about the abstraction layer we want to think about: these things are events, right? When something relevant happens, that's an event, such as data being updated, something in an S3 bucket changing, a workflow finishing, whatever it is, whatever you think is an important event.

Exactly. Now, connecting this event-triggered world, or maybe scheduled-workflow world, to RAG: one of the ways that we wrote about this in the blog post that we mentioned earlier is, how can you
visualize this entire system happening? Basically, we've already gone through the dark gray box on top here in the last notebook. It's this idea that we have this online user who sends in a query, like the vague question about Metaflow. That thing needs to get embedded in the same way that all the other things in the vector store got embedded. Then we do this matching process, get something that we consider factual data out of that matching process, and use it to make the LLM generate things that are much more tightly scoped to our actual documentation, or whatever source we consider a source of truth. Now, how do we actually populate that index is really the question that we're asking when we get into the workflow realm, and it's what we're going to show you how to do in the Metaflow sandbox. This is basically taking the code that we just wrote to chunk up the Metaflow documentation and embed it into Pinecone, and then we're going to run that every night. We're going to chain the different parts into three separate workflows that all trigger each other, essentially as an end-to-end chain.

Super cool, man.

Specifically, we have three sections. We're going to have this markdown chunker, which is named... I just like the word chunker.

Yeah.

This thing is going to run on a schedule, maybe every night; you could change it to hourly, run it as a cron job, whatever. When this thing finishes, we're going to do the pre-processing and log a card, which is the Metaflow terminology for something that shows a data visualization of what happened. And then when that thing finishes, it's going to trigger the Pinecone indexing part. One reason why you might want to think about separating stuff like this, either into different steps in your workflow or, like we see here, into three independent workflows, is that the first two are relatively low resource intensive,
maybe they're just things that we can do massively in parallel on very cheap computers. But this third one, where we're actually going to compute embeddings and then put them into the Pinecone vector indexer, could be very compute intensive. If we had four million documents, the story we were talking about earlier, maybe we need 10 machines with GPUs on them in parallel, and we want to be able to quickly turn that on. So it's useful to separate that part of the workflow into its own subsystem.

Sorry, maybe it's just worth adding: Eddie has just talked about the three different steps that we want to trigger one another, and why those three steps, with very reasonable choices there. These are design choices that you want to make yourself. You might have a whole series of steps involved, and maybe we'd choose to do it in a different way to you, but depending on your organizational concerns, how your teams work, business concerns, compute concerns, you might, you know, chunk this differently. So, about chunking...

No, that's... yeah, it's really up to you as a system designer, is the idea. Metaflow is not telling you what to do; it's just giving you primitives to build whatever system you need to build.

Exactly.

So what are these things actually, though? I spoke a lot at a high level on the last slide. What I mean by "you can schedule the markdown chunker workflow" is that this is a simple decorator in Metaflow, where I say that I want to run it every week. When I say that these are relatively low resource, I'm just saying these are 8,000-megabyte machines; that's all I really care about. I want you to run it from this Docker image where I've already packaged up all my dependencies. So before, Hugo was talking about conda and the PyPI decorator from Metaflow; you can also use Docker, is kind
of what this is showing here. So Metaflow has very complete coverage of the different dependency management solutions.

And Eddie, just to stop you for one second: I've been so excited about all the LLM stuff we've been doing, I actually forget if we've really introduced the basics of writing a Metaflow flow.

Yeah, yeah.

Maybe I'll just say a few words. You use a class definition; it's object-oriented programming, and we have templates for this type of stuff if you're not so comfortable with that. You use @step to define each step. We want you to do whatever steps you want, essentially. Metaflow does enforce a start step and an end step, but you can do anything in between, including big branch-outs and massive parallelization, as long as you join correctly, of course. So you're building a DAG here. It's all Pythonic: instead of writing YAML and this type of stuff, we want you to be able to use decorators such as @kubernetes, and we can see where we're specifying the containerization story here, the Docker image. You can use @pypi, and you can specify the amount of resources as well; there's a whole variety of decorators which allow you to specify different things. As part of productionizing, what we really need to use here is the @schedule(weekly=True) decorator as well. And I presume what we're going to see is how you deploy that using Argo, or something along those lines, on the command line, which is a relatively simple command. The reason I wanted to make this clear is that Metaflow wants you to write your Python code the way you like to write it, and all Metaflow enforces is some basic design principles around how to write a DAG, a directed acyclic graph, or machine learning workflow. We've got onboarding tutorials on the sandbox where you can check all of that stuff as well. That was just a whirlwind two-minute introduction to what's happening here, for those
interested.

Yeah, perfect. Just to piggyback on one thing Hugo is saying that I think is very important here: we just see the outline of a Metaflow workflow here. All the user code, your research code and your business logic, is these dot-dot-dots inside of the start step and the end step. This is where maybe you call a script that you've written and that would otherwise run independent of Metaflow, or maybe you just put the code directly here, because you can run arbitrary Python code. The point is that the code inside the start step is getting lifted up to run in Kubernetes, with 8,000 MB of RAM inside the Kubernetes pod, these kinds of things.

Cool, cool.

So, the second flow, just another example of what this stuff looks like. We can connect to our markdown chunker workflow, the one that we're going to deploy in a second and whose Python code we were just looking at: we can connect this workflow to run whenever that one finishes, using this @trigger_on_finish decorator. And then we see maybe another Metaflow use case here, this notion of parameters, so you can pass different values in from that workflow. Maybe something happens in the markdown workflow that is information we specifically want to pass into this one, things like this. Again with the @kubernetes decorator, and a new decorator here called @card. This is a way to do some experiment tracking, to visualize data in Matplotlib charts, or plot with Plotly or Altair, whatever your favorite framework is. There are many other things you can do too: you can make different tables and represent different widgets inside of whatever sort of experiment tracking dashboard you want to compose, really. And then the final one that we're going to see is the Pinecone vector indexer workflow. This one is also @trigger_on_finish, with the data table processor, as we discussed. And then just kind of furthering the dependency
management story, we're using another Docker image, but this time a much meatier Docker image, because this one needs to have GPU dependencies and all the Transformers machinery built into it. So we might request an instance with more memory on it, more CPUs and/or GPUs, and then we can easily just swap Docker images in and out. This is a big superpower of Metaflow.

Cool.

So now we're getting a little bit more into the conceptual layer here, and I think it'd be a good time to actually run some code. All right, so I am going to expand this thing again, and when we open up the sandbox and go to lesson five, we'll see the full workflow with the code actually written in.

And maybe I'll just add one word about Argo. Argo is a way that we schedule or deploy machine learning workflows. If you've used something like AWS Step Functions, maybe that's a point of comparison here. But essentially what Argo allows us to do, and maybe you can add a bit more color to this, Eddie: in terms of productionizing machine learning models, we want to make sure that when I close my laptop at the end of the day and go and have dinner with my family, we have it running somewhere external to my laptop. And that's something that Argo allows us to do, right, Eddie?

Yeah, that's right, that's right, Hugo. So I'm going to go ahead and open up the terminal here. I'm actually just going to type the command out. These buttons are automating the process, but I think it's more illustrative to actually just type it out. So we have the flow.py: we type python, then the name of the flow, the path to the flow, right? This is just a Python script, so that's the name of the script that we see visualized here. And then to actually deploy this workflow, you type argo-workflows and create. Let me make this terminal a little bit bigger here. So you see it looks somewhat similar to when we just run the workflow directly: deploying to Argo Workflows. Then we get a bunch of interesting information from Metaflow, which I won't really go too deep into, but there's a whole machinery around namespaces and production branches and things that afford A/B testing, multi-armed bandit testing, these sorts of things in your actual production system. So you can have side-by-side deployments, is really the upshot there.

Cool. Now let's keep moving. The second one that we want to deploy (let me just make this a bit smaller) is the same exact pattern. Looking through this, we see a bunch of Seaborn and Matplotlib code, and a bunch more parameters for our data processing stuff. Then the thing that we're going to look at in a little bit is this @card. So let's just go ahead and get this stuff running. We have argo-workflows and then create again. Looks good.

Oops, wait, we're creating another flow now? Sorry, is that...

Yeah, we're deploying the second flow. We deployed that first markdown chunker one, and now the one we're deploying is the data processing one.

But weren't we going to have one trigger the other?

Exactly, yeah. This second one has the @trigger_on_finish decorator, right?

So deploying it isn't running it, is what I'm hearing, right?

Correct. When I say deploy, I'm running this create command. Now, if we wanted to just trigger this thing that's already on Argo, we could just replace create with trigger, and that would do what you're suggesting.

Yep, cool.

Now let's deploy the third one, and hopefully, if the demo gods are kind to us today, we'll just get this end-to-end, all three of them
running after just running one trigger command, is the idea. Now, there's one thing that I left blank up here because I didn't want to put it in a GitHub repository. We'll again do the super hack of just pasting my Pinecone key. But of course, actually, I should show you this instead of just saying it: Metaflow has this feature called secrets. If you're cringing at the fact that I'm doing this, you should check it out, because this is the real way you would approach it in a production system.

Yeah, and I'm not cringing, I'm totally cool if you do it. We've just got to tell people about secrets and environment variables, and also you've got to delete that key afterwards.

Well, I just get nervous that I'm suggesting, like, a terrible...

The thing that's making me most nervous now is your battery level, dude.

Oh yeah. Well, it's going to be a buzzer beater. We might have to finish early on the fine-tuning part.

It's like the episode of Seinfeld where Kramer's going below zero on the gas tank.

Yeah, dude, this is my way. Okay, so our last argo-workflows create, in section six: the Pinecone indexer.

Okay, and are we going to look at the Argo UI or something like that as well?

Yeah, so let's... Okay, I'm going to get the workflow running and then we'll click into the UI.

Well, actually, I just thought it may be interesting to see that it's created but not deployed yet.

Oh sure, good idea, good idea. So I'll actually just open up the Metaflow UI, which is connected to Argo. And soon, by the way, this Metaflow UI will be a complete superset of the features in the Argo UI, so just a point of order, but it's not in the current sandbox edition of the Metaflow UI. So, to Hugo's point, we don't see any workflows running in Argo right now, but we should see them created.

Yeah, exactly: these things that Argo calls workflow templates. So Metaflow is kind of, sort of, compiling the DAG that we wrote in our Python code into Argo's language of
these workflow templates, which we can click into, and then we can specifically see what I'm talking about. For people familiar with Argo, these are the familiar YAML monstrosities that you would run into. Okay, cool. So again, what we want to do is trigger this markdown chunker thing, which should automatically trigger the data processor, which should automatically trigger our Pinecone indexer. Let's do it. In order to do this, we have basically the same command that we ran before. So let's find our section four command, and all we should need to do is take away create and write trigger.

Incredible.

And we'll see... right, okay. So we get this helpful printout pointing to the Metaflow UI, but, to Hugo's point, let's just look in Argo, because it'll be a little bit more responsive right now. So we see that this workflow's triggered, and we can kind of watch what's happening in the DAG of the workflow.

So that's the start step?

Exactly, exactly.

Very cool.

Well, while we're waiting on this, Hugo, because it might take a few minutes: there's a bit of a product plug, but for anybody who's still watching this video I feel totally okay doing that. Have you seen the visualizations of these DAGs on the Outerbounds platform recently, all the work that's been going on there on the front-end side of things?

Not recently. Do you want to show me?

Oh man, super exciting. I don't think I have a great example loaded up. Maybe we should actually do a full video on that sometime, I think, comparing to Argo, because there's a bunch of cool stuff that's now being consumed into the Metaflow UI that is equivalent to what we're looking at now.

Well, watch this space, everyone.

Yes, watch this space. Consider the Outerbounds platform.

So the other thing is, I mean, we've got a few more slides to show and then we're going to talk about... I wonder whether just coming back to make sure that this has executed correctly after going through the
rest of the workshop would be the way to go.

Yeah, we can definitely go ahead and do that. Do you want to look at the fine-tuning code right away, or should we go back to the slides, do you think?

I'll take your lead, Eddie.
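
The retrieval path described in the transcript (chunk the documentation, embed every chunk, embed the user's query the same way, then match) can be sketched in plain Python. This is a toy under stated assumptions, not the workshop's code: `chunk_markdown`, `embed`, `cosine`, `build_index`, and `retrieve` are hypothetical names, the word-count "embedding" stands in for a real embedding model, and an in-memory list stands in for a vector index such as Pinecone.

```python
import math
import re
from collections import Counter


def chunk_markdown(text):
    """Split markdown into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a model."""
    return Counter(re.findall(r"[a-z0-9_@]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Stand-in for a vector index: a list of (chunk, embedding) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=1):
    """Embed the query with the same function as the chunks, then rank."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical two-section documentation snippet used only for illustration.
DOCS = """# Scheduling
Use the schedule decorator to run a flow on a calendar.

# Dependencies
Use the pypi decorator to pin python packages for a step.
"""

index = build_index(chunk_markdown(DOCS))
top = retrieve(index, "how do I pin python packages?")
print(top[0].splitlines()[0])
```

The point the sketch tries to preserve is that the query and the documents go through the same `embed` function; the retrieved chunk is what would then be passed to the LLM as context.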

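The chaining described in the transcript, where one workflow finishing triggers the next, can be illustrated with a toy in-process event bus. This is not Metaflow or Argo code; the transcript names `@schedule` and `@trigger_on_finish` as the decorators that play this role there. `EventBus`, `on_finish`, and the three function bodies below are hypothetical stand-ins chosen to mirror the three workflows the speakers deploy.

```python
from collections import defaultdict


class EventBus:
    """Toy event bus: maps an upstream workflow name to its downstream workflows."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.log = []  # names of workflows in the order they finished

    def on_finish(self, upstream):
        """Decorator: run the wrapped workflow whenever `upstream` finishes."""
        def register(workflow):
            self._subscribers[upstream].append(workflow)
            return workflow
        return register

    def run(self, workflow, payload=None):
        """Run a workflow, record it, then trigger everything subscribed to it."""
        result = workflow(payload)
        self.log.append(workflow.__name__)
        for downstream in self._subscribers[workflow.__name__]:
            self.run(downstream, result)
        return result


bus = EventBus()


def markdown_chunker(_):
    # Stand-in for chunking the documentation.
    return ["chunk one", "chunk two"]


@bus.on_finish("markdown_chunker")
def data_table_processor(chunks):
    # Stand-in for pre-processing the chunks into rows.
    return [{"text": c, "n_chars": len(c)} for c in chunks]


@bus.on_finish("data_table_processor")
def vector_indexer(rows):
    # Stand-in for embedding rows and writing them to a vector index.
    return {"indexed": len(rows)}


# Triggering only the first workflow runs the whole chain.
bus.run(markdown_chunker)
print(bus.log)
```

Keeping the three stages as separate units, rather than one function, is what lets each be owned, scheduled, and resourced independently, which is the design argument made in the transcript.
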
Original Description

This is a 6-video series interactive guided tour to LLMs, RAG, & Fine-Tuning.

The next part is here: https://youtu.be/DLnY5nZKTDM
The playlist is here: https://youtube.com/playlist?list=PLUsOvkBBnJBcZglk6QQyKGZsgEzClGnv-&si=66stnfv3-HXa60m9
You can also watch the full workshop here: https://youtu.be/uDBGwQ7JAzQ

In this workshop, attendees will learn about methods for working with LLMs. Our stories will be guided by examples you can run on your laptop or in a (free) hosted cloud environment provided to attendees. Developers will expand their awareness of how researchers and product designers are working with LLMs, with emphasis on connecting high-level concepts such as fine-tuning and vector databases to the fundamental math and APIs data scientists should understand. Business-minded executives can either get hands-on or follow the higher-level stories to deepen their sense of what is possible with LLMs, the technicalities behind risks they introduce, and how they fit into the arc of ML. The primary value of this workshop will be as a guide to help teams set reasonable goals in the complex and fast-moving world of LLMs, and understand what you need to successfully support your team’s next LLM projects.

What You’ll Learn: There are cheap (e.g., APIs) and expensive (e.g., fine-tuning, training) ways to build on top of LLMs. The methods you choose have consequences in apps you can build and how your dev team works. We will learn how to think about these choices as we develop basic apps you can use as templates for future genAI projects. Learners have the option to follow along in a provided dev environment where we will unpack these choices and make the tradeoffs and decision space concrete.

The GitHub repository is here: https://github.com/outerbounds/generative-ai-summit-austin-2023


This video series provides a comprehensive introduction to building live production systems with RAG and explores methods for working with LLMs, including fine-tuning and vector databases. Learners will gain hands-on experience with LLMs and RAG, and learn how to make informed decisions about building on top of these technologies.

Key Takeaways
  1. Set up a dev environment for LLMs and RAG
  2. Explore fine-tuning methods for LLMs
  3. Implement RAG search in a production system
  4. Optimize RAG for specific use cases
  5. Integrate LLMs with other AI models
💡 The choice of method for building on top of LLMs has significant consequences for the apps that can be built and how the dev team works, and learners should consider these tradeoffs when making decisions about LLM-based projects.
