No Priors Ep. 52 | With Pinecone CEO Edo Liberty

No Priors: AI, Machine Learning, Tech, & Startups · Beginner · 🔍 RAG & Vector Search · 2y ago

Key Takeaways

The episode discusses how the RAG architecture and vector databases such as Pinecone improve search and make LLMs more reliable, covering retrieval-augmented generation, fine-tuning, hybrid search, and semantic search.

Full Transcript

Hi listeners, and welcome to another episode of No Priors. Today Elad and I are talking with Edo Liberty, the founder and CEO of Pinecone, a vector database company designed to power AI applications by providing long-term memory. Before Pinecone, Edo was the director of research at AWS AI Labs, and previously at Yahoo. We're excited to talk about the increasingly popular RAG architecture and how to make LLMs more reliable. Welcome, Edo.

Hi.

Okay, let's start with some basic background. Can you tell us more about Pinecone for listeners who haven't heard of it? What does it do, and how does it differ from other databases?

So Pinecone is a vector database, and what vector databases do very differently is that they deal with data that has been analyzed and vectorized (I'll explain in a second what that means) by machine learning models, by large language models, by foundational models, and so on. Most large language models or foundation models, actually any models really, understand data in a numeric way. Models are mathematical objects, right? When they read a document or a paragraph or an image, they don't save the pixels or the words; they save a numeric representation called an embedding, or a vector. That is the object that is manipulated, stored, retrieved, searched over, and operated on by vector databases very efficiently at large scale. And that is Pinecone. When we started that category, people called me, concerned, and said, "What is a vector, and why are you starting a database?" And now I think they know the answer.

How did you think about this early on? You started the company in 2019. At the time, this wave of generative AI hadn't happened quite yet, so I was wondering what applications you had in mind. Given that there's so much excitement around Pinecone for the AI world, and the prior AI world had a slightly different approach to a variety of these things, I'm just curious: were you thinking of different types of
embeddings back then? Were you thinking of other use cases? What was the original thinking in terms of starting Pinecone?

The tsunami wave of AI that we're going through right now didn't hit yet, but in 2019 the earthquake had already happened. Deep learning models and so on had already been grappled with. Large language models and transformer models like BERT and others started being used by the more mainstream engineering cohorts. You could already kind of connect the dots and see where this was going. In fact, before starting Pinecone, I myself had founder anxiety between "are we already too late?" versus "nobody knows what the hell this is, and we're way too early." It took me several months of wild swings between those two things until I figured that maybe the fact that I have those too-early, too-late mood swings means it's exactly the right time.

Maybe you can expand a little bit on what use cases people want to use embeddings in. I think there are ways to interact directly with language models, and then there are reasons, for example reliability, context length, and performance, that people interact with embeddings in, like, a RAG architecture or in semantic search. So maybe you can talk about some of the driving use cases.

I mean, the obvious way, in some sense, to add knowledge to your conversational agent, whether it's chat or what have you (we talk about it as generative AI now, but it's much more general than that), is, again not shockingly, to bring the relevant information into the context, so that you can actually arm the foundational model with the right piece of content: with text, with images, with what have you. You want to be able to retrieve that from a very large corpus of knowledge that you have, whether it's your own company's data or the internet or what have you. It so happens that LLMs are already very, very good at
representing data in the way that they want to consume it, which is these embeddings. So you can, at question time, in real time, at the time of the interaction, go and find relevant information. "Relevant" might be associated with, or correlated with, or similar to whatever it is that you're being asked about. Once you bring that into the context, you can give much more accurate answers. As a side experiment, we actually loaded what's called Common Crawl, which is the top internet pages, crawled fairly frequently. We loaded that into Pinecone and saw what happens when you augment GPT-3.5 and 4, and Llama and Mixtral, and models from Cohere and Anthropic. You could see that if you augment all of them with RAG, even on the internet, which is data that they were trained on, you can reduce hallucinations significantly, up to 50% sometimes. Interestingly enough, many of them actually start behaving quite similarly in terms of level of accuracy, even though without RAG they have quite different behaviors. So it's both a uniform improvement and a little bit of leveling the playing field. And because we know we can do that very well, you can now do that also with proprietary data, with data inside your company and so on, stuff that of course is not available on the internet and that those models were never trained on. Interestingly enough, again, the quality ends up being incredibly high.

I assume most Pinecone users are not using LLMs and retrieving against general internet data. What kinds of companies were your earliest or biggest users? What kind of data do they want to retrieve against?

So most companies do use their own company data. It could be whatever it is, depending on the application they're building: legal data, medical records, internal wiki information, sales calls, you name it. There's an infinite variety. I want to say
that this is just RAG, I mean, this is just semantic search. There are many other applications that we didn't talk about, but we can keep it focused on this application for this conversation.

And is it dominated by a specific use case? Were there customers that you feel really represent the Pinecone use case?

Well, yeah, 100%. First, text is probably most of what we see. Nowadays models are really good at images and so on, but text is still the predominant data type. Notion Q&A now runs on Pinecone, and they serve essentially question answering with AI to tens of thousands, and probably hundreds of thousands, of their own customers. Gong does the same thing with sales calls; again, it serves all of their use cases for all of their customers. So one of the most common patterns is companies that themselves become trailblazers and innovators with AI, and that hold a lot of their own users' or customers' text, and they want to search over it or generate information on top of it. That ends up being an incredibly common pattern.

I guess earlier this month one of the things that Pinecone announced was the serverless offering called Canopy. Could you tell us a little bit about why you decided to go down the serverless direction, and how you view that in terms of use cases or adoption or other things?

So Canopy is actually an open source course that we put out there as a framework for people to learn how to use RAG. Pinecone serverless is just called Pinecone; it's just Pinecone, but serverless. What it does is basically remove the limits from what people used to experience before. When we started Pinecone, a lot of the applications had to do with recommendation engines and anomaly detection and other problems where usually the scale was actually fairly small, and the requirements had to do with super low latencies and sometimes high throughput. As a result, you still see a lot of
databases kind of playing that field. We very quickly figured out, with our own customers and our own experimentation, that something else is much more significant, which is just scale and cost. If you want to be able to answer correctly, you just have to know a lot. If you want to do that, you have to ingest hundreds of millions, billions, sometimes tens of billions of vectors into your vector database. And you want to query it efficiently in terms of cost; you don't want that to explode in terms of spend. And finally, you want to do that easily, so you don't want to spend weeks and months setting things up and getting it to work. Doing that in our old architecture, and frankly with any other architecture today that's not serverless, is very difficult, and serverless is here to basically resolve those main problems. It's incredibly easy to operate. It scales massively; again, there's no theoretical limit to how much it can scale. We've tested it with tens and tens of billions, with live customers in live traffic. I'm not going to go into the architectural design, but it's actually designed to be incredibly efficient, like asymptotically better than what can be done with any other architecture. It's fundamentally about removing all the limits, so people can actually have all the information they need ready for the foundational models.

You mentioned Canopy is to help enable more people to build RAG products. Where do you see developers or your customers struggle to get embedding-based AI products generally successful? Or what were you trying to achieve with Canopy?

Yeah, so vector databases, and Pinecone specifically, are very foundational pieces of technology. We're very deep in the stack, and to build a proper, full end-to-end solution, say like Notion Q&A, there's quite a lot that you have to build on top of it. You have
to ingest documents and what's called chunk them: you have to figure out how to break them into factoids and pieces of information. You have to embed everything with models. You have to ingest them into the vector database. When you get a query, you have to figure out how to manipulate it and how to embed it. You have to search over it, you have to rerank. There's a whole system you have to build around it, and a lot of people told us that this is actually quite complex. And they're right. We put out Canopy as really an example; it's an end-to-end kind of cookbook. If you just take this, it should work, and once it works you should figure out how to make it better for your own application, because medical data is not Jira tickets, and Jira tickets are not Slack messages, and you might be building a different product. But at least you have some end-to-end starting point that already does something, and you can start improving on it.

Two of, I think, the most common comparison points for vector databases that people use are (a) traditional databases, like, why not just use Postgres pgvector or some index associated with an existing database, or (b) more traditional or incumbent search technologies or services, like Elastic or Algolia. Can you talk about why not other databases, or how you think about traditional search?

Yeah, I'll just go back to the fundamentals of what you are trying to achieve. What we're trying to achieve is to give as much context and as much knowledge to foundational models as possible; do that easily, at scale, on a budget; and get to unit economics that actually work for your product, which is incredibly hard to do with AI, with many discussions going on about that now. Those other products don't work. They don't work either because they don't scale in terms of the
efficiency, scale, and cost tradeoffs that they can offer, because they're not designed to do this; they're designed to do something else. They kind of thought about a vector index as a bolt-on, retrofitted feature. So yes, it works at small scale, but when you try to actually go to production with it, you understand the limitations. With other search technologies, again, this is the wrong search mode. You're searching with keywords and just not finding the relevant information, because the contextual space in which these pieces of text, documents, or images live is high-dimensional numeric space, not keyword space. And everyone who has ever searched their inbox for an email that they know for a fact they have, and not found it, knows that keyword search is a deeply flawed retrieval system.

I'm just curious if customers or developers are trying to combine the existing search systems they have. I know you are also increasingly supporting hybrid search, so I wanted to understand: where are embeddings amazing and useful and delivering new experiences, and where are they not enough, or not the full experience that end users want?

So it's interesting. Our research actually shows that when you do this well, you very rarely need keywords alongside embeddings. But getting embeddings to perform perfectly can be quite intricate, and we find that it's very convenient to have keywords alongside embeddings and to score those things together. We call this hybrid search. In fact, we made this even more general. Keywords under the hood are actually represented as sparse vectors; that's true of any keyword search, by the way, it's mathematically identical. So we said, why don't we just make this more general and say, hey, you can give either sparse or dense vectors, or both of them, and kind of have the
best of both worlds. People find that very convenient, so I'd highly encourage people to look at it and improve by boosting and all sorts of other tricks that you can bake into sparse vectors, including keywords. My guess is that that's not going to be the dominant mode of search in the very near future.

You think hybrid search is a more temporary convenience?

I mean, I think it'll be used for boosting and other types of levers to control your search. I think the mode of baking keywords into that is going away, yes.

And maybe just going back to the traditional database market: why not in my Postgres or my Mongo, whatever I'm using already?

Again, we see this in the market a lot. People tell me, "Hey, I already use tool X or database Y, so why not?" And frankly, oftentimes when it's some tiny workload, or just learning how to use embeddings for the first time and so on, it might actually work okay. It's when people try to actually do something in production, when they're trying to scale up, or push the envelope, or launch a product that needs to have unit economics attached to it that make sense for that product, that's where people run into huge problems. So many of them just start with us to begin with. To be honest, a lot of them are enthusiasts, and they actually kind of enjoy learning how to use a new kind of database. Our user experience is smooth enough, and there are so many tutorials and notebooks and examples, that they actually find it exciting. But I guess some don't, and that's fine.

So maybe one more on database dynamics. Pinecone is closed source. It's gotten great adoption, but many databases in a mature market are open source. How do you think about this decision, and has it been an issue for you?

I'll say that most databases started before cloud was really a fully
mature product or market or platform. So that was the precursor to PLG, essentially, or whatever it was; it was PLG. That was the only way to put a technically complex product in the hands of engineers: to open source it. I think for, maybe not all, but definitely the larger databases that are open source out there, that's the reason they did that. When we started Pinecone, we asked the very basic question of why people open source the platform. One reason was to earn trust, one was to get contributions from the community, and one was as a channel to users. We figured we can earn trust by being excellent at what we do and providing an amazing service. We don't need external contributions, and in fact, if you look at statistics, even for companies that are open source, 99% of the contributions are actually from the company itself. Not 99, but high 90s. So that doesn't actually make a huge difference. And in terms of experience, we figured that we can actually provide a much better experience and much better access to the platform than what open source does. Pinecone is a fully managed, multi-tenant service, and to be able to run that at scale and provide the cost-scale tradeoffs, we actually run a very, very complicated system. In some sense, even if we gave it as open source to somebody, they wouldn't know what to do with it; it would be a Herculean effort to even run this thing. The right decision was basically that we should offer this as a service and manage it end to end. As long as you give people a fully reliable interface, and you keep doing that year after year, you earn the trust and the ease of use, so that open source becomes, I hope, not an issue.

It's funny, along those lines, I remember talking with, I think it was Ali from Databricks, and he said that if you can avoid
doing open source, you should. He felt like it was an incremental challenge, because you get distribution through open source, but then you have to figure out the business model. I think the analogy he uses is that making an open source project work is like hitting a hole in one in golf, and then you pick up a baseball bat and you have to hit a grand slam, because then you have to do the second act to make sure the thing actually works as a company.

That's right. I agree 100%; this is exactly what we're experiencing. In fact, even though the new players in the vector database space that basically started to try to take us down all took the open source angle, we already see that, young as they might be, they are already struggling with their open source strategy. Serverless is the fourth almost complete rewrite of the entire database of Pinecone.

Mhm. The one other thing that's coming in the LLM world, which may or may not impact you, and I'm curious how you think about it, is increasingly long context windows for foundation models. Does that change how people interact with embeddings and vector databases, or does it not really impact things much? There are things people are talking about in terms of infinite context, or other things like that.

I mean, I don't know what infinite context means, to be honest.

It's like very big. It's infinite. It's, like, huge.

Oh, it never ends. Thank you.

Yeah, you're welcome.

I should take a note. First of all, those companies sell their services by the token, so the fact that they allow you to use infinite context windows is not shocking. That's good for business. The second thing is that there's plenty of evidence that increasing the context size doesn't actually improve results unless you do this very carefully. So just what's called context stuffing is not helping; you just pay more and don't actually get more for it, and
the last thing is that even if you kind of buy into the marketing, that runs its course, right? It's like saying, "Oh, I don't need Google, because every time I query Google I can send the internet along with my query." Well, theoretically that may be possible, but practically that's clearly not feasible. At some point the context window just becomes gigabytes and gigabytes of data, like terabytes. Where do you stop? Already today we have users who use not even very large models, maybe a few billion parameters, and the vector database next to their model contains trillions of parameters, and they get much better performance that way. Just attaching all the context to everything you do, I think, runs its course very, very quickly, and it's also unnecessary, to be honest.

Yeah. I guess related to that, another place where people have been talking about embeddings and vector databases is in aspects of personalization and privacy, and I'm a little bit curious how you think about that. One of the risks people see in running an LLM over a large data corpus, or fine-tuning it against a specific company's data, is the issue of data leakage. Say, for example, you're an HR company and you don't want different people's salaries to leak across an LLM, because you're using it as a chatbot to help you with context regarding your own personal data in an enterprise, or things like that. Can you talk a bit more about how embeddings can provide personalization, and in some cases potentially other features that may be attractive to enterprises?

Yeah, so that's a very common and reasonable thing to be concerned about. Data leakage can happen in two main ways. The first is if you use a service for your foundational model that, frankly, retrains their models with your data, or records it, or saves it in some way that
is opaque to you; that is a huge problem, and I think a lot of people are struggling with that. The second is if you're building an application in-house, whatever it might be, and you fine-tune your models on added data, that added data might end up popping up where it shouldn't, in answers to other people's questions or whatever. What people do with vector databases is actually incredibly simple. You don't fine-tune the model on your own proprietary data, at which point you know for a fact it doesn't contain any proprietary data, because it's never seen any of it. Then at retrieval time, whenever you apply the chat or the agent, you retrieve the right information from the database and give it as context to the model, but only do inference. You don't actually retrain, and you don't save that interaction, at which point that data doesn't exist anywhere. It's like an ephemeral thing. And the added benefit, by the way, is that you can be GDPR compliant. You can actually delete data. If you're a company, like a legal company, and somebody deletes a document, you can just delete it from the vector database, and that information will never be available to your foundational model again. So you don't even have to devise some complex mechanism for forgetting; you just don't know it anymore. That's one of the main reasons why people attach vector databases to foundational models: it gives you this operational sanity that is almost completely impossible without it.

That's interesting. I guess it feels like there are three different approaches that people are using. They're not mutually exclusive, and they kind of overlap in terms of what the hoped-for output is. One is really changing or engineering prompts, or adding more information into the prompt. The second is fine-tuning. And the third would be RAG, or different aspects of embeddings, or other approaches like that. How do you
think about fine-tuning in this context? When should you fine-tune versus use some of the approaches that you've talked about earlier?

I can answer both as a scientist and as a business owner. As a scientist, I'm all for fine-tuning. We have all the evidence to show that, done right, it helps tremendously. As a business owner, I can tell you that it's actually extremely hard to do well. Unless you have the research team and the AI experts that know how to fine-tune, you might actually make things significantly worse. There's nothing that says that more data is going to make your model do better; in fact, it oftentimes regresses to something significantly worse. With prompt engineering, again, I think it's necessary, especially when you build applications and you want responses to conform to some format or have some property. I think that's a given; you should do that. It runs its course after a while; in some sense, you get what you get. It's necessary, but there's a limit to what you can do with that. And RAG, I think, is incredibly powerful, but like I said before when we talked about Canopy, that's not simple either. It's simpler than the other ones, but it still requires work and understanding, experimentation, and so on. This is almost the hallmark of a nascent market, when the simplest solution is still somewhat complex.

Yeah, makes sense. What's next for Pinecone? What are some major things coming that you'd like to talk about?

So, I mean, there's a ton. We're an infrastructure company, and so we obsess about ease of use and security and stability and cost and scale and performance. Also, as an engineer at heart, I'm very excited about those things, and all of that is coming. Again, serverless is becoming faster, bigger, better, more secure, easier to use, and we're starting to really grapple with what very large companies and very, you
know, kind of trailblazing tech companies are going through. I said that getting AI to be truly knowledgeable is still complex. I think we're starting to grapple with deeper issues that the entire information retrieval community has been dealing with for about 40 or 50 years now. We're starting to see those come to the fore in RAG and in AI in general.

I guess, putting aside Pinecone and the database world and everything else, what are you most excited about in terms of what's coming next in AI?

It's hard to say. I really do want to see a distillation, in some sense, of foundation models. And by distillation I don't mean what people usually mean when they say distillation of models. I mean the separation of reasoning and knowledge. Foundational models get it fundamentally wrong. When we learn how to build the subsystems of AI correctly, and for each one of them to do its role optimally, either we're going to be able to achieve the same tasks much cheaper, faster, better, or we're going to still want to use the same amount of resources but achieve much more. What happens today is that we have very crude tools and we try to use everything for everything. Delightfully or shockingly enough, depending on who you are, that kind of works. We found these very, very efficient, very general-purpose tools, but they're still very general purpose. They're still super blunt instruments. As a technologist, or somebody who cares deeply about how things are built, you kind of see the inefficiency, and it hurts the brain to figure out that we take half the internet and cram it into GPU memory. I'm like, holy... what? Why? This can't be the right thing to do. So I'm very excited about us as a community truly understanding how the different components interact and how to build everything much more, in some sense, correctly. And I hope we get to build some
exciting products. By "we" I mean the community gets to build some exciting products this year. I think we're going to see a lot of the experimentation that people went through last year taken to production to build cool products this year, and I can't wait to see what that looks like. I have a feeling that this field is going to be very, very exciting for consumers of AI.

Yeah, I totally agree. It's a very exciting year ahead. So thank you so much for joining us today. Thanks, Edo.

Thank you, guys.

Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen; that way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.
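
The RAG flow described in the conversation (chunk documents, embed the chunks, store them in a vector database, embed the query, retrieve the most similar chunks, hand them to the model as context, and delete a document when it must be forgotten) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pinecone's or Canopy's API: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, and `TinyVectorStore`, `chunk`, `build_prompt`, `DIM`, and the sample documents are all hypothetical names made up for this sketch.

```python
import hashlib
import math

DIM = 4096  # toy vector size (assumption); real embedding models define their own


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")
        if word:
            idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
            vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk(text: str, max_words: int = 12) -> list[str]:
    """Split a document into fixed-size word windows (the simplest chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class TinyVectorStore:
    """In-memory, brute-force stand-in for a vector database (hypothetical)."""

    def __init__(self) -> None:
        self._rows = {}  # chunk id -> (doc id, chunk text, vector)

    def upsert(self, doc_id: str, text: str) -> None:
        for i, piece in enumerate(chunk(text)):
            self._rows[f"{doc_id}#{i}"] = (doc_id, piece, embed(piece))

    def delete(self, doc_id: str) -> None:
        # Once deleted, the text can no longer be retrieved into any prompt.
        self._rows = {k: v for k, v in self._rows.items() if v[0] != doc_id}

    def query(self, question: str, top_k: int = 2):
        q = embed(question)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc_id, text)  # cosine similarity
            for doc_id, text, vec in self._rows.values()
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_k]


def build_prompt(store: TinyVectorStore, question: str) -> str:
    """Retrieve context at question time; the model itself is never retrained."""
    context = "\n".join(f"- {text}" for _, _, text in store.query(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = TinyVectorStore()
store.upsert("handbook", "Employees accrue vacation days monthly and may carry over five days.")
store.upsert("contract-17", "The supplier contract renews automatically every January.")

question = "When does the supplier contract renew?"
hits_before = store.query(question)
prompt = build_prompt(store, question)

store.delete("contract-17")  # the "forgetting" step from the data-leakage discussion
hits_after = store.query(question)
```

A production system would replace `embed` with a real embedding model, replace the brute-force scan with an approximate nearest-neighbor index, and add the query rewriting and reranking steps mentioned in the conversation; the point here is only that knowledge lives in the store, so deleting a document removes it from every future prompt.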

Original Description

Accurate, customizable search is one of the most immediate AI use cases for companies and general users. Today on No Priors, Elad and Sarah are joined by Pinecone CEO, Edo Liberty, to talk about how RAG architecture is improving syntax search and making LLMs more available. By using a RAG model Pinecone makes it possible for companies to vectorize their data and query it for the most accurate responses. In this episode, they talk about how Pinecone’s Canopy product is making search more accurate by using larger data sets in a way that is more efficient and cost effective—which was almost impossible before there were serverless options. They also get into how RAG architecture uniformly increases accuracy across the board, how these models can increase “operational sanity” in the dataset for their customers, and hybrid search models that are using keywords and embeds.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @EdoLiberty

Show Notes:
0:00 Introduction to Edo and Pinecone
2:01 Use cases for Pinecone and RAG models
6:02 Corporate internal uses for syntax search
10:13 Removing the limits of RAG with Canopy
14:02 Hybrid search
16:51 Why keep Pinecone closed source
22:29 Infinite context
23:11 Embeddings and data leakage
25:35 Fine tuning the data set
27:33 What’s next for Pinecone
28:58 Separating reasoning and knowledge in AI

Playlist

Uploads from No Priors: AI, Machine Learning, Tech, & Startups · No Priors: AI, Machine Learning, Tech, & Startups · 53 of 60

← Previous Next →

The video covers how RAG architecture and vector databases such as Pinecone can improve semantic search and make LLMs more available, with applications in retrieval-augmented generation and fine-tuning. It also highlights the importance of ease of use, security, and stability for infrastructure companies and discusses the future of RAG-based search and AI.

Key Takeaways

Build a vector database using Pinecone
Implement RAG models for semantic search
Fine-tune LLMs for improved performance
Evaluate RAG models and vector database performance
Develop advanced RAG applications
Integrate RAG with other AI models
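The first two takeaways above (storing data as vectors, then retrieving it for a RAG prompt) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Pinecone's API: `embed` is a toy hashed bag-of-words function standing in for a real embedding model, and `TinyVectorStore` and `build_prompt` are hypothetical names invented for this example.

```python
import hashlib
import math

DIM = 64  # Size of the toy embedding space (an arbitrary choice for this sketch).

def embed(text):
    """Toy embedding: hash each word into one of DIM buckets, then L2-normalize.

    A real system would call an embedding model here instead.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

class TinyVectorStore:
    """In-memory stand-in for a vector database (brute-force search, no index)."""

    def __init__(self):
        self.items = {}  # doc_id -> (vector, original text)

    def upsert(self, doc_id, text):
        """Insert or overwrite a document and its vector."""
        self.items[doc_id] = (embed(text), text)

    def query(self, text, top_k=2):
        """Return up to top_k (score, doc_id, text) tuples, best match first."""
        q = embed(text)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc_id, doc_text)
            for doc_id, (vec, doc_text) in self.items.items()
        ]
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]

def build_prompt(question, store, top_k=2):
    """Retrieve context for the question and assemble a prompt for an LLM."""
    hits = store.query(question, top_k)
    context = "\n".join(f"- {text}" for _, _, text in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

store = TinyVectorStore()
store.upsert("a", "pinecone is a vector database")
store.upsert("b", "bananas are yellow fruit")
print(build_prompt("what is a vector database", store, top_k=1))
```

The resulting prompt string would then be sent to whichever LLM the application uses; that call is left out because it depends on the provider. Keeping retrieval separate from generation is what lets the stored knowledge be updated without retraining the model.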

Chapters (11)

0:00 Introduction to Edo and Pinecone

2:01 Use cases for Pinecone and RAG models

6:02 Corporate internal uses for syntax search

10:13 Removing the limits of RAG with Canopy

14:02 Hybrid search

16:51 Why keep Pinecone closed source

22:29 Infinite context

23:11 Embeddings and data leakage

25:35 Fine tuning the data set

27:33 What’s next for Pinecone

28:58 Separating reasoning and knowledge in AI
