Integrating Knowledge Graphs & Vector RAG for Efficient Information Extraction / Reading Grp Sept 12

MLOps.community · Beginner ·📄 Research Papers Explained ·1y ago

Skills: LLM Foundations 90% · LLM Engineering 80% · RAG Basics 80% · Fine-tuning LLMs 70% · Multimodal LLMs 60%

Key Takeaways

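The reading group discusses the HybridRAG paper, which combines knowledge graphs (GraphRAG) with vector retrieval-augmented generation (VectorRAG) to extract information from financial earnings call transcripts. The hosts walk through the motivation, the graph construction and retrieval pipeline, the evaluation metrics (faithfulness, answer relevance, context precision and recall), and the paper's shortcomings.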
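
As a rough illustration of the hybrid idea summarized above, here is a minimal sketch, not the paper's implementation. It assumes a toy bag-of-words cosine similarity as a stand-in for real embeddings, a simple entity-name match as a stand-in for graph traversal, and made-up example data (`Acme`, `Globex`, `Alice`). All function names are hypothetical. The only detail taken from the discussion is that the vector context is placed first and the graph context second.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a toy stand-in for an embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_context(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def graph_context(question, triplets):
    """Return triplets whose subject or object is named in the question."""
    words = set(tokenize(question))
    hits = [
        t
        for t in triplets
        if set(tokenize(t[0])) <= words or set(tokenize(t[2])) <= words
    ]
    return [" ".join(t) for t in hits]


def hybrid_context(question, chunks, triplets, k=1):
    """Concatenate vector context first, then graph context."""
    return "\n".join(
        vector_context(question, chunks, k) + graph_context(question, triplets)
    )


# Hypothetical example data, invented for illustration only.
chunks = [
    "Acme reported revenue growth of 12 percent this quarter.",
    "The weather was mild in the region.",
]
triplets = [("Acme", "has_ceo", "Alice"), ("Globex", "launched", "Widget")]
question = "Who leads Acme and how did revenue grow?"
context = hybrid_context(question, chunks, triplets)
print(context)
```

The resulting string would then be passed to an LLM together with the question. Swapping the two lists in `hybrid_context` changes the ordering, which the discussion notes can affect precision.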

Full Transcript

Okay, let's get started. Once again, everybody, welcome back to the September edition of the MLOps Community reading group. If you weren't here last time, we meet once a month to discuss and dive deep into the latest research in the AI and ML space. We had a really fun time last week. We have people from all over the world, covering pretty much all the time zones, joining us from various different fields and at various experience levels, coming here to give opinions and share insights. It's really awesome to have you all here. Today we'll be discussing another cool research paper; I've just dropped the link. We're also live streaming to the data engineering virtual conference that's happening right now, so if you're watching us from there, come say hello in our Slack channel. We have a reading group channel, and we usually keep the conversation rolling after the session.

As usual, we have a bunch of amazing community members joining us to host the session. We've got Sonam as usual; Sonam is from developer relations at aiXplain. We've also got Matt S, he's CTO and co-founder atbs. Hi Matt, thank you for being here. And Valdimar, he's an AI developer at Smart Data. Also, Nehil is running a few minutes late; it's pretty early for him. Nehil is the co-founder at an AI startup that's still in stealth mode. So we have a stellar lineup of presenters for you today. At any point, if you want to share your opinions or insights, please unmute at any time; we'd like to keep things fairly open-ended. So without further ado, over to you, Sonam.

Well, first of all, welcome everyone, good morning, good evening, wherever you are. It's early morning for me, I can promise that. I don't know if you got a chance to read the paper, but it's RAG, and RAG has been the talk
of the town, and since it came out there have been a variety of versions of RAG that companies and researchers have built around it. Now the question is, does everybody understand RAG? I'll be honest, I'm still learning about it. In today's session we are discussing the paper called HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction. In this paper, HybridRAG combines the strengths of both vector retrieval and knowledge graphs to create a more accurate and efficient system for information extraction.

So what was the motivation behind this research? It arises from the challenges faced by financial analysts when they try to extract valuable insights from unstructured documents such as earnings call transcripts or reports. The good thing about the paper, I found, is that they took a very domain-specific example and worked through it. In my opinion it's easier to understand when you take a concrete example, but it has its own disadvantages, and we can discuss that later.

So what was the challenge, why did they come up with HybridRAG? Traditional methods or models, even with the help of RAG techniques, struggle to handle the specialized terminology or any complex structure found in these documents. Financial documents are definitely crucial for industries when it comes to decision making, but LLMs face issues like hallucination, as we are all aware, and lack of context. While vector RAG helps by retrieving text based on similarity, the contextual component of it, it doesn't fully account for the hierarchies that are involved in the document
or the data, and the nuances of these documents. This is where knowledge graphs come in. Knowledge graphs, on the other hand, represent information as entities and relationships, as we discussed earlier, which can help overcome some of these limitations. Combining all of this, the authors introduced HybridRAG, which uses both knowledge graphs and vector retrieval to improve the accuracy and appropriateness of results from a question answering system, especially when you're dealing with financial data. Specifically, they demonstrate this using earnings call transcripts, which are structured as Q&A pairs, and this is where they show how HybridRAG could outperform traditional methods in extracting relevant and accurate information.

That was the motivation: what they wanted to solve, and why they took this domain-specific example. Going deeper into the methodology, starting with vector RAG: I briefly introduced what it is, but in the methodology section of the paper they describe vector RAG in more detail. It's the approach where external documents are divided into chunks and then converted into embeddings, which are of course numerical representations. The system retrieves the most relevant chunks based on similarity to the user query, and these chunks are used as context for the LLM to generate responses. We could take a whole other session talking about chunking, I'm sure, but here I'm going to talk about it only briefly. So what is the limitation if you're just using vector RAG? It often loses the hierarchical structure, especially if you're talking about financial documents, and that could lead to less accurate retrieval of the
context. Moving on to how you can construct the knowledge graph: there's not much mentioned in the paper, in my opinion, about how they're constructing it, but what I understood is that this process involves knowledge extraction and knowledge improvement. What is knowledge extraction? It's basically identifying the entities and the relationships in this unstructured data using different NLP techniques. And what is knowledge improvement? You are refining the graph by removing any redundancies, completing the missing links, and fusing information from multiple sources. The authors describe how they used LLMs to extract entities, like companies or financial metrics, and relationships, like a company's CEO or product launches, from these transcripts, which were then structured into what I think is the knowledge-graph-specific terminology: subject, predicate, object. They talk about GraphRAG as the next step, but I'm going to pass on to Valdimar to discuss further.

Yes, thank you. Okay, so I was going to continue where Sonam left off, with GraphRAG. We talked about vector RAG, but GraphRAG is a hot new topic that I've noticed recently. It's one of the reasons I was interested in this paper, because I haven't really tried it. I've worked with knowledge graphs a bunch, and I'm really interested in this intersection, where structured knowledge and these statistical methods can help with reliability, to have some kind of symbolic knowledge behind it. That's basically the essence of this paper: injecting knowledge from a knowledge graph into the context of a language model to get slightly better answers. Let me share my screen. Here, you see my mouse, there is a schematic showing how they constructed the graph. We just can
have a little look at it. We start with these earnings reports, which I'll get into in a minute; I'm just going through the paper section by section. They've got a bunch of documents, not that many though, and the goal is to get a graph out of them. With a knowledge graph you can do lots of things: you can use graph algorithms, or measure properties of the network, and whatnot. You start with text. They have a prompt for text preprocessing, which is the first step, to get a processed report, which is just an abstracted version of the text with fewer redundancies. Then they process that again to get the actual triplets, the subject-predicate-object statements, or entity-relationship connections, which build up the graph. And that can help with answering questions, which is the ultimate goal: to ask the LLM questions about the data and get a reliable answer back.

LLMs can sometimes struggle with these kinds of relationships, the "parent of" and "son of" kind of questions that are very easily found in the graph, just by hopping a couple of hops. If you're asking a question about a particular company, you can fetch all of the owners of the company, or the partners, employees, or whatever, and have all those connections there in the context. That's effectively what they do: a subgraph which consists of the relevant nodes and edges is extracted from the full knowledge graph to provide context, and then that's just put into the LLM. So it's a different way of searching: instead of similarity search, it's based on the entities and relationships.

And then there is their technique, which is just: hey, let's try combining the traditional RAG and the GraphRAG into the same context for the LLM. It's what they call the amalgamation of the two contexts, which allows them to leverage the strengths of both approaches. The VectorRAG component provides a broad, similarity-based retrieval of relevant information, this is what
everybody's been doing, while the GraphRAG element contributes structured, relationship-rich contextual data. It's pretty cool. They had a few different metrics that they use to measure this. These are all kind of automatic metrics; I guess they've handed them to GPT-4. First they have faithfulness, which is basically a measure for checking hallucinations. What does the paper say? Faithfulness is a crucial metric that measures the extent to which the generated answer can be inferred from the provided context. An LLM can answer a question seemingly perfectly, because it wants to be super helpful, but then it turns out the answer doesn't come from any data and it's just telling you what you want to hear. So they break the answer into fundamental statements. Say it's about the earnings of a company and who owns the company and whatnot; it's a complicated question and answer, so you break it apart. Then they just prompt an LLM to check each statement against the evidence, which is the retrieved info from the vectors or from the graph, and say yes or no, or how much it fits, to get some kind of score. It's good to have this. I'm a bit skeptical of using LLMs to evaluate LLMs, but it should still spot some hallucinations.

Then we have answer relevance, which is another metric used to evaluate the system and show that their approach is better in some aspects than the other ones. This one was a bit confusing. It's about how well the answer matches the question: they generate different questions that could fit the answer and use a similarity metric to compute it. I don't want to go into this much, but they generate a question for a given answer, and then they find out whether many different questions could fit this answer, and if they're all the same, then it's a relevant answer. I'd never seen this before; seems kind of cool. Then there's precision and recall, but in a different definition. It's something used in RAGAS, this
automated framework for evaluating, but it's effectively just good old precision and recall. What really matters for me, and I've been doing lots of retrieval, is just the recall: you want to see how often the true answer is in the retrieved documents. So if you fetch 20 chunks, or 10 chunks, or five chunks, how often is it there? And then there's precision, where we want to know how many of the documents found are relevant. There's usually a trade-off between those. They were actually using ChatGPT 3.5, or GPT-3.5, but for me the precision doesn't really matter, because GPT-4 can kind of filter out the noise in the retrieved info and keep just what matters. So you really just want the important parts to be somewhere in the retrieved info. Does anyone have any questions or comments at this point? I'll talk briefly about the data next before handing the word over.

Yeah, I just want to comment on the precision part. I think that makes sense, you should focus on recall before precision, but it's not black and white. Even in the last paper we read, as the context gets longer and longer, at least in the long-context problems, there's the lost-in-the-middle problem. So that is one problem: you might have too much context, and if you don't have high precision the LLM will get confused. The second problem can be that if you're working with SLMs, which have smaller context windows, then you need to be able to get the relevant stuff into a smaller context window, otherwise you may not be able to fit it. So that's how I see it, but you definitely need recall first.

Yeah, for sure. And also, if you're retrieving 30,000 tokens for every question, it's much better to just get 5,000 tokens and save time and compute. So for sure.

Another remark or question, if someone else noticed this: I was very interested in the faithfulness metric. I think it's a very
interesting idea to split up the answer into statements and then evaluate whether the statements are covered by the data. However, I've been experimenting a little bit myself with splitting text up into statements for an exercise I was doing, and I noticed that sometimes the LLM creates longer statements when it extracts them and sometimes shorter ones. I think there might still be some room for improvement there, because currently it seems like the fewer statements the answer is split into, the better the score will be: as long as the couple of statements that are in there contain some amount of truth, it will have a very high faithfulness. It would be interesting to integrate something there with the number of words per statement, or something like that, so that the faithfulness is kind of normalized on the amount of words used.

It also depends a little bit on how they define being right, on exactly how they query the statement verification. If there is something true in a longer statement, is it right even if there is also something wrong? That depends.

Exactly. I think those details are provided here.

Yeah, that's a good point. Just a thought I had: because we're talking about knowledge graphs and these logical triplets, we could maybe break it into very simple statements, like, let's say, X is Y.

I find it quite interesting that they've had to construct their own evaluation metrics here. Evaluation of LLMs as a general field is not very well developed, so sure, they've had to figure out some techniques here, but at the same time none of these techniques seem to be graph-specific, I don't think.

But I do think that what Valdimar says is kind of right: these statements could have been done with graph extraction
too, and that would also be an interesting variant. You could do a regular text statement extraction and a graph statement extraction, and have these two faithfulness metrics next to each other, maybe weighted or compared. That could be really interesting. And I think what you're saying is also right: evaluation is still a very young, developing field, so it's normal that people have to invent their own metrics. But these are very creative metrics, I think, that are interesting to see.

Okay, I'll continue with the data description part. It's great to have these talks about the contents, because it's probably more interesting than a lot of the content itself. So basically they needed to create a dataset for their purpose: financial documents from which you can construct a knowledge graph. I guess constructing the knowledge graph was maybe the tricky part. What they say, highlighted here if you can see it, is in short: there's no publicly available benchmark dataset to compare VectorRAG and GraphRAG, either for financial or general domains, to the best of their knowledge. I'm kind of guessing that there is something somewhere; it just needs to be linked to a graph. There's Wikipedia and Wikidata, for example. Anyway, this was all done for the financial purpose; they were working for a hedge fund or something. What they did was take transcripts from earnings calls of the Nifty 50, which are the top companies on the Indian stock exchange, one quarter's worth. It's kind of a small dataset, about 50 transcripts. They're long documents, one for each company. I guess we can see most of it here in this table: there were 50 companies, one document for each company. It's relevant to keep in mind that this is not a very large dataset. If it were 5,000 it would still not be that large, but when we look at the results later it's kind of relevant. They have 16 questions per document. They constructed the
graph from it. They go into some detail here about how they scraped stuff, and blah blah blah, it doesn't matter that much. But it has diverse domains: it's finance, and we're talking about money, but it covers healthcare, oil, telecommunications, etc. And I think my part is done. Maybe a couple of comments about the metrics: I would have liked to see some kind of human review. They say they're using GPT-3.5 for everything else; I don't know what they used for the auto-evaluation, I'm guessing GPT-4. Mostly, all my other comments are for the other sections; I guess we'll have a round later to say what we think. So Matt, are you going to cover the implementation?

Yes. Surprising, no human review, for that matter. I think there's a number of surprises in this paper, and not in a good way, I'd suggest, so maybe we'll get into that. Similar to Valdimar, I was really excited by the idea of this paper. We're doing a lot of work internally to try and understand how knowledge graphs and vector-based RAG intersect. The paper claims that they have a novel method here that hasn't appeared in the literature before, which I'm not entirely sure is true, but again, that's something we can debate. In any case, let's talk about how they built it and how they implemented it.

At a very high level, they needed to do two things: they needed to construct a knowledge graph, and they needed to chunk the text up into a more traditional vector database so that we can do so-called standard RAG on that. In fact, they needed to chunk the data in both cases, but for different reasons. As they explain here, they used a loader called PyPDFLoader to import these documents, and then they break each document down into chunks. The purpose for which they're chunking here is so that they can examine those chunks one by one and build up the graph. Then those chunks are
going to be discarded. I just wanted to disambiguate that, because we're going to talk about chunks later in the context of vector RAG; it's for a different purpose in each case, so hopefully that makes sense. There are little bits of technical detail here that don't particularly matter for our purposes, like the specific chunker and chunking strategy they're using. They are using LangChain here as well, so this chunking strategy is part of LangChain. It's notable that there are more advanced ways to do chunking that they're not considering in this paper, things like semantic chunking.

In any case, they break the document down into these chunks, and their intention is to scan through the chunks and extract entities that are relevant to this financial domain. They have a predefined list of entities they're interested in: they want to pull out companies, financial metrics, products, services, locations, and so on. But they don't just want the entities, they want the relationships between the entities. So they want to be able to say that company A has a board member, Bob, and another board member, Alice, and maybe Alice is also a board member of company B, and maybe the earnings for company A are such-and-such this quarter, and last quarter it was some other number, and so on. I'm trying here to describe and draw a picture in your minds of what this graph would look like, but I think one of the shortcomings of this paper is that they don't do that for us. I would really have liked them to give an example, to say: here is our earnings report, here is what we extract from it, here is what the graph looks like. That would really help in terms of intuition building, to help us as readers understand why, as they claim, the combination of graph plus vector database actually adds value here. In any case, we'll move on
through the thing. Oh, the thing to highlight here is the verbs. When they say verbs, they're talking about relations between these entities. So they'll have a predefined set of things they're interested in, like "is an investor in" something, or somebody has a directorship on the board of some company; those are the sorts of verbs they're interested in. Again, I speculate, because unfortunately they don't tell us what those verbs are. They then talk about building this pipeline in LangChain to extract those features. I guess I've kind of covered that, but they're pulling out entities, pulling out verbs, and they ultimately persist all of that as a pickle file. When you think about a graph, you think about this grandiose thing where lots of entities are connected to other entities, which are connected to other entities, and so on. But you can simplify that as these triplets, which have been mentioned before: subject, predicate, object, or however we want to describe it. The idea is that we have a thing which is connected to another thing; that's three items that we need to persist. So they store these triplets as Python data structures, they persist them into a pickle file, and later on you'll see they load that back in.

They go on to describe the two approaches, vector RAG and GraphRAG. In both cases they're using the GPT-3.5 model, a perhaps somewhat outdated model, something to consider there. In any case, with the vector approach they are doing what you would usually do for RAG, and for those who don't know what that is, I'll briefly describe it; Sonam provided a wonderful intuition for it already. We take the text and split it up into chunks of a fixed size, they're using 1024, and then each of those chunks gets assigned a vector that represents the meaning of that chunk. That allows us to later on say,
I have a question and I want to find chunks of text that are semantically related to that question. We retrieve those from what's called a vector database, and then we perform our RAG query, so we try to answer that question. In their pipeline for vector RAG, given a question, which incidentally will be a question about the financial statements, they want to retrieve the chunks and build what we call a context, which is just a concatenation of the relevant chunks. Usually the next step would be to generate a response using a large language model: we ask the model, given the user question and given the context, can you answer the question? However, they stop there, because later on they need to combine the two approaches.

So let's talk about the second approach, which is GraphRAG. Here the goal is to build up a graph. We've already kind of done that, we've saved it as a pickle file, so actually all they do here is load up all of those entities and build a graph. They're using a Python library called NetworkX to represent that graph in memory, and that allows them to do a traversal. All traversal means is that we can pick a starting entity in that graph and explore the graph from there: I'm going to go to my neighbors, then to the next neighbors, and so on. Their pipeline is then to traverse the graph to find things that are relevant to the question that's been asked. Rather than finding chunks of text that are relevant, we find entities that are relevant. Those entities have ultimately been derived from the text, but we're looking not just at the entities but at the relationships between them. Hopefully all that makes sense so far. I'll just pause in case there are any questions before I go on.

There was a small question I had around there. In these numbers they talk about the
number of triplets in the graph and the number of edges, and those are almost identical, but I would have thought they would be identical. Any idea what the difference is?

That's a really good question. I don't know. I suppose, going to speculate a little bit, are there things that don't have any relationships in that graph? I don't know if anyone else has any thoughts on that, anyone who's read the paper.

Is it a directed or undirected graph? Do they specify what type of graph?

See, they don't actually specify it. That's a good point.

So it's one big graph, it's not separate graphs for each document or each company? Then how do they manage entity resolution and duplicated entities, like Orange County versus OC, and references like that?

I'd say that's one of the major shortcomings, in my mind. I did a couple of papers on entity disambiguation, and it's easy to construct a graph with all the relationships in the text, but that's not a good graph; you need to have reliable entities.

I feel like these are all really good questions that are basically not answered in the paper. We can speculate about it, but that is in itself a shortcoming of the paper, I feel.

Matt, I was just asking if they mentioned how big the graph was in the end, like how many entities or how many nodes there were.

Yeah, well, I suppose this number here should be how many nodes they had. It's 11,400.

Yes, but then the number of triplets doesn't quite line up.

Well, we need to see the graph, right? We need to know how dense it is, because it looks like it's a very highly connected graph, which, to Valdimar's point, is a questionable thing, isn't it? Okay, so I'm not able to add any clarity there, but I'm going to blame the paper more than myself on that one. The final thing that I want to cover in describing the implementation is this last point, 4.4,
where, and I'm going to sound like I'm leaning on the paper quite heavily here, in a critical way, but again, I looked at this and thought it's kind of short for what it's doing. This is the bit where they talk about how they combine the two techniques, and essentially all it says is: we take the context that we built from the vector search and the context that we built from the graph search, and we just combine them together, we concatenate them. It says here that they concatenate the two contexts to form a unified context, and they place the vector results first and the graph results second, which is kind of arbitrary. Firstly, that feels kind of naive, just concatenating two things together. There are things they could do, like apply reranking or otherwise combine the results in a more sophisticated way, but they don't do that. The other thing they note is that the precision is impacted by the ordering. This echoes the lost-in-the-middle problem we discussed a little in the previous reading group, where the amount of stuff you have in the context impacts how a large language model pays attention to it; it doesn't weight everything in that context equally. That's a problem I feel they're probably coming up against with this concatenation. If they swap the order of concatenation, they probably see different results, which is kind of interesting, and at the very least they did try both ways. So anyway, that hopefully covers the implementation. I'll stop sharing my screen, but we'll continue the discussion if there are more questions.

Yeah, so I had some thoughts, as Matt shared, about how the paper has been written: the size of the data, the way they approach certain things, the models they use, etc. So I was searching other literature that has been recently published around using graphs or
graph-based neural networks, etc., for RAG, and actually Matt briefly touched on all of the points that I had. One is that it's pretty naive: you need to do some kind of disambiguation and deduplication of the context. There are papers now which are saying, hey, we used a special type of prompt or something else to take the graph and flatten it into some kind of hierarchical text structure, and that would be an interesting thing to read. I was just reading the abstracts and kind of only understanding the insights; I didn't read all the papers, because then I might be writing my own survey paper at that point. But people are spending time trying to understand how to combine them more effectively, which is not done in this paper.

The second thing was that there is a paper which is already claiming that reranking with graphs is really helpful. They use GNNs, which are graph neural networks; again, I don't know much about them, we'd have to learn what they are. But they're using a GNN as a step after retrieval, which is a common step: reranking has now become a common step in what they call advanced RAG, where you do the retrieval, then you rerank the retrieved results, and then do the generation, and so on. So they're asking: can you rerank the different chunks and get a better context using a GNN?

And then the last one, I think that's it. There's also a lot of conversation in those papers, which is not covered here, about how you go about pruning the graphs. For a given query, how do you improve the retrieval: this is the query and this is the graph, so how do you make sure you've covered the most relevant pieces, and how do you work with a graph database, I guess, or whatever you're using to store the graph, to retrieve from it better? Those are some things we should definitely dive deeper into in other papers. This paper kind of lacks a bunch of detail around
that, which I think we are all kind of desiring, and we're speculating here and thinking, oh, maybe this is what was... anyway. So in the results piece, I mean, there isn't anything really new, in my opinion. I read through the results, and this table basically covers most of the stuff that they're talking about here, and it's short and sweet to look at. These are the metrics that Valdimar talked about. Faithfulness is kind of the same. Then in terms of recall, they're saying that the recall of graph is not the best, but they're not fully describing why that is. But when you combine them, because they're just concatenating, you will always have all the context available in the hybrid RAG, so you will always have good recall; basically, because of the concatenation, you get the best recall. Then in terms of precision, because the number of chunks they're retrieving from vector is not the best, it is hurting the overall precision as well. But as we discussed, I think recall is more important in RAG than precision in most practical cases. And then, yeah, faithfulness, there isn't that big of a difference, I would say. The interesting thing is the, I forget what this was, yeah, it's the similarity of hypothetical questions that could correspond to this answer. So it's like they make... oh yeah, a weird one, yeah. So that's a new metric, which is kind of creative and interesting. They are not using any graph-related metrics either, so it's hard for me to reason about it, because they came up with the metric and we haven't really looked at the data to see what it really means. If they had shared at least a transcript and a comparison to understand how they're doing it, that would help internalize it. Faithfulness, precision and recall are common in Ragas and other eval frameworks nowadays, so that's definitely also useful. Any
thoughts or questions around this, guys? Actually, just a clarification: I thought the four eval metrics were all from Ragas. Are they not, or do they differ in some way? This answer relevance is also part of Ragas, I think. Yeah, at least there's a reference, like a footnote. Okay, so that might be a miss, maybe it is Ragas. Yeah, they just cite Ragas for the context precision and context recall, but not the other ones. Oh, but not the others. Oh, I'm not saying I necessarily like the metrics, just that I don't think they pulled them out of a hat. What do you think about this: all the metrics are RAG-related, but is there anything we should also look at in how they traverse the graph, and the pieces around retrieval related to graphs? Not sure about that. And yeah, I think that's all I have from the results perspective, if you don't have any other questions. Yeah, so in the results they're comparing: either you have the vector context, or the graph context, or both. But that's not really fair, because if you have both you have twice as much stuff in there. So that's one of the things that kind of stood out to me, because maybe if they had a much larger vector context, or a bigger part of the graph included, then you wouldn't... yeah, basically the number of tokens used isn't really fair. And then about just the concatenation, I mean, we don't want to dwell on this, but you could use which entities are in the graph to somehow filter the documents found by the vector search, or vice versa; have something more elaborate than just concatenation. It's kind of weird, right? They're not taking advantage at the end of the fact that it's a graph. They take advantage of the fact that it's a graph when traversing the graph, clearly, but they don't take advantage of the graph nature of it when they construct the context. And as you say, if all they're doing is... suppose there's a lot of overlap between the two contexts, then
could they just repeat the same context twice and get the same result, for that matter? They've not done a good enough job of convincing me that what happens at that end point, where they combine the graph result with the vector result, is actually because it's a graph, versus because it's duplication of context or additional information, like you suggest. Yeah, because they have these two steps for how to generate the graph: they first take the report and make a more abstract version of it, and then they take that abstract version and make it into a graph. But maybe it was enough to just have the abstract version of the reports in textual form. Yeah, maybe you don't need to have the graph; you just need the relational info, like this company is owned by this company, all these statements that are ultimately in the graph. But then we have these graph algorithms, and I hope maybe someone in the room knows more about them, but instead of just taking all the one-hop neighborhoods you could actually search the graph based on the thing. Doing graph algorithms is a fantastic opportunity, and I can imagine it's being done somehow. I mean, knowledge graphs are part of Google search, and all these big companies have a knowledge graph behind what they do, and now they have LLMs too, and I bet they're combining those two in a smart way somehow. Yeah, I feel like they are not utilizing the graph structure. They might have used SPARQL queries, or maybe generated SPARQL queries, for better graph traversal, but yeah, that's that. I'm also concerned about the scalability of this method, because the graph here is rather small compared to a lot of real-life enterprise graphs. And then, touching on the precision-recall tradeoff, I wonder, if the graphs get to billions, or maybe millions, of nodes,
the precision and recall... the way of simply concatenating them could be problematic if applied to a real-life application. Yeah, I was also wondering, if we normalized to the amount of tokens provided, whether you would almost be able to put the whole graph into the context, because I imagine that the graph will be a lot more condensed. If you compare it to the vector database, maybe ten reference texts might be as much as almost the full graph would be from the graph database. So normalizing might actually be a very interesting thing to do, to really see how the one compares to the other. Thinking out loud here, and I'd be interested to hear if anyone's done it or seen it: making a large language model itself traverse the graph. In all of these things we're taking a graph and ultimately condensing it into a context. If it's a subgraph, whatever it is, the context can contain the same structural information as the graph contains, but there's no reason the large language model has to follow that structure; ultimately it's going to do what it's going to do according to its text generation model. Whereas imagine if you had a model with a kind of chain-of-thought reasoning, so you have a model that's been told, I don't know, look at this node, this is your starting point, and you have a bunch of options, and one of your options is to go to the database and get the nearest neighbors. It's therefore able to traverse the graph through a sort of reasoning cycle, if you see what I mean. I don't know what you'd do with it, I don't know what applications you might have, but you've basically given the model the power to traverse the graph directly and forced it to work within a graph structure. I've heard of some of these agents that are able to traverse the map; that's similar. Sure. Yeah, I haven't looked into it much, so to me it sounds like it's obviously a very good idea, that would be much more
successful, but I think it takes a lot more engineering to actually implement that idea than what was done here. So I think this is just a first approach, where you're doing it the straightforward, easy way, but what you're describing is most likely way more powerful. If anyone builds it, let me know. Yeah, on the gaps in the paper: I think everybody saw hybrid RAG and everybody was like, yeah, let's do this one, and that's fair enough, but I feel like there were a lot of gaps in the paper. So if you guys have any suggestions for what paper we should do next time, please feel free to email us, and we're happy to go through it and see if there are any gaps, regardless of whether it's about hybrid RAG or anything that's trending. But yeah, this was a fun session, there were a lot of diverse opinions. Does anybody have any thoughts? How's everybody feeling? I had one last thought, maybe. So we have the documents in the vector store, and then we have the graph, but usually when I think of a graph I think of Wikipedia, where each entity has a document as well, like the Wikipedia article. So you could have the knowledge organized in a graph somehow. I don't know, not a very concrete thought, but it's my last note I wanted to throw out there. Otherwise, on the hybrid, combining this, there's also a bit of disappointment that we didn't see some crazy new method, but graph methods and knowledge graphs can definitely be used to improve RAG. And I think we often have hierarchically organized documents, like a document tree, which is a kind of graph, and I just look forward to actually trying out graph RAG; I haven't done it yet. When you're talking about attaching documents to the nodes, I don't know if that's necessary, because I think you can model it fully with the graph too. But I feel like if you go to Wikipedia or Wikidata, there's also a distinction between
relations between entities and attributes of the entities. I think you can model this with, like, this person has an age, and it has a relationship "age" to a number; in theory you can kind of get by with just a pure graph. But I don't see them talking about these attributes, which are normally modeled more as something separate, which is a little bit like what you're saying, having this text next to each entity. And I wonder if that would also improve it quite a bit, explicitly modeling attributes as something more separate than just relationships. For sure. Awesome, I guess we can wrap it up then, right? If anybody else has any thoughts, please feel free to unmute and let us know, or we always keep the conversation rolling in our Slack. I've put the link to join our Slack workspace in the chat, so if you want to join us and suggest what paper we should cover next time, or if you want to host one of these sessions like Matt, Nehil, Sonam and Valdimar, just feel free to say hi to me. Thank you so much for joining; it's always fun to see people from all the different time zones joining us and giving us all these diverse opinions. So stay tuned, we will discuss a much cooler, better paper next time, and lovely to see you all here. Thank you so much.
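
The context-combination step the panel critiques can be sketched in a few lines. This is a minimal illustration, not the paper's code: the function names, the example chunks, and the whitespace-based token count are all assumptions made for the sketch. It shows the plain concatenation described in the session (vector results first, graph results second), plus two of the refinements the speakers suggest: removing duplicated context and holding every setup to the same token budget so that the comparison is fairer.

```python
from typing import List


def concatenate_contexts(vector_chunks: List[str], graph_chunks: List[str],
                         vector_first: bool = True) -> str:
    """Naive hybrid context: join both retrieval results in a fixed order."""
    if vector_first:
        ordered = vector_chunks + graph_chunks
    else:
        ordered = graph_chunks + vector_chunks
    return "\n\n".join(ordered)


def deduplicate(chunks: List[str]) -> List[str]:
    """Drop chunks whose normalised text was already seen (first one wins)."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = " ".join(chunk.lower().split())  # lowercase, collapse whitespace
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


def truncate_to_budget(chunks: List[str], max_tokens: int) -> List[str]:
    """Keep whole chunks, in order, until a whitespace-token budget is hit."""
    kept, used = [], 0
    for chunk in chunks:
        n_tokens = len(chunk.split())  # crude stand-in for a real tokenizer
        if used + n_tokens > max_tokens:
            break
        kept.append(chunk)
        used += n_tokens
    return kept


# Made-up example chunks, purely for illustration.
vector_chunks = ["Acme revenue grew 10% in Q2.", "Acme is owned by Globex."]
graph_chunks = ["Acme OWNED_BY Globex", "acme is owned by  globex."]

naive = concatenate_contexts(vector_chunks, graph_chunks)
unique = deduplicate(vector_chunks + graph_chunks)
budgeted = truncate_to_budget(unique, max_tokens=11)
```

Calling `concatenate_contexts(..., vector_first=False)` gives the swapped ordering that the speakers wanted to see compared, and applying the same `max_tokens` to the vector-only, graph-only and hybrid contexts is one way to normalise the amount of text each setup receives.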

Original Description

Paper: HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction https://arxiv.org/abs/2408.04948 // Abstract In our September 12th MLOps Community Reading Group session, live-streamed at the Data Engineering for AI/ML Virtual Conference, we covered the paper "Hybrid RAG: Integrating Knowledge Graphs and Vector Retrieval for Information Extraction." The panel discussed using hybrid RAGs to pull information from unstructured financial documents. The paper's approach combines knowledge graphs with vector-based retrieval for better results. We critiqued the context-mixing methods and lack of graph pruning techniques. Overall, it was a solid session with great insights on improving financial data extraction using hybrid models. // Hosts Nehil Jain: Stealth AI Startup @ Co-Founder - https://www.linkedin.com/in/nehiljain/ Sonam Gupta: AICamp @ Chapter Lead - https://www.linkedin.com/in/sonamgupta11/ Matt Squire: Valdimar Eggertsson: AI Development Team Lead @ Snjallgögn (Smart Data inc.) - https://www.linkedin.com/in/valdimar-%C3%A1g%C3%BAst-eggertsson-8210236/ Moderator: Binoy Perera: Community Operations @ MLOps Community -https://www.linkedin.com/in/binoy-perera-811282204/ // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Subscribe to this calendar to receive updates about all future sessions- https://lu.ma/mlreadinggroup Info page: https://www.notion.so/mlops/MLOps-Community-Reading-Group-2764a02352734213b25af07e5f835d45 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Timestamps: [00:00] Hybrid rank combines know

This reading-group session discusses the HybridRAG paper, which combines knowledge graphs with vector RAG to extract information from unstructured financial documents. The panel covers how the paper combines the graph and vector contexts and how precision and recall are evaluated, and critiques the plain concatenation of contexts and the lack of graph pruning.

Key Takeaways

Extract a subgraph from the full knowledge graph
Put the subgraph into the language model's context
Combine traditional RAG and graph RAG into the same context
Measure faithfulness of generated answers with automatic metrics
Construct a knowledge graph from financial documents
Create a dataset for comparing VectorRAG and GraphRAG
Use GPT-3.5 for evaluation and retrieval
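
The first two takeaways (extract a subgraph, then put it into the language model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the graph is represented as a plain list of (head, relation, tail) triples, the entity names are invented, and the subgraph is the one-hop neighborhood of the query entities, which is the retrieval the session describes.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def one_hop_subgraph(triples: List[Triple], query_entities: Set[str]) -> List[Triple]:
    """Return every triple that touches at least one query entity."""
    return [t for t in triples if t[0] in query_entities or t[2] in query_entities]


def triples_to_text(triples: List[Triple]) -> str:
    """Flatten triples into one fact per line so they fit in a prompt."""
    return "\n".join(f"{head} {relation} {tail}" for head, relation, tail in triples)


# Toy graph with made-up entities, purely for illustration.
TRIPLES: List[Triple] = [
    ("Acme", "owned_by", "Globex"),
    ("Globex", "headquartered_in", "Springfield"),
    ("Initech", "competes_with", "Hooli"),
]

subgraph = one_hop_subgraph(TRIPLES, {"Acme"})
graph_context = triples_to_text(subgraph)
```

The resulting `graph_context` string is what would be placed in the prompt, either alone or alongside the vector-retrieved chunks. Pruning or ranking the returned triples, which the panel notes is missing from the paper, would slot in between the two function calls.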

💡 The integration of knowledge graphs and vector RAG can improve precision and recall in RAG systems, and HybridRAG is a novel method for efficient information extraction.
