Drasi - Change-Driven Data Processing

Microsoft Developer · Intermediate ·🛡️ AI Safety & Ethics ·5mo ago


Key Takeaways

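Drasi is a data change processing platform that detects changes in data across multiple systems and takes action in response. The episode shows it keeping the vector store of a retrieval-augmented generation (RAG) pipeline in sync in real time, detecting the absence of change, and feeding change notifications to AI agents.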

Full Transcript

Hello everyone. Welcome back to the Open at Microsoft show. I'm here with Aman and Daniel. Hello.
>> Hello.
>> We're going to talk about change-driven architecture and a data change processing platform called Drasi. It is a CNCF project, and if you like event-driven platforms, I think this episode of the show will be very important for you. Aman, can you tell me in a sentence what Drasi is and what kind of problem it can solve for people watching this?
>> Yeah. So Drasi is a data change processing platform. In modern architectures we have systems that are distributed, and data is spread across multiple different systems. We frequently run into problems where we want to detect changes happening in this data that is spread across multiple systems, and when these changes happen we want to take an action in response. That is where we see Drasi playing a role. Daniel, do you want to add to that?
>> I'd maybe just say that a lot of people think in terms of events, where we think in terms of changes. It usually hooks into the CDC (change data capture) pipeline of a database, and then we use a query language to describe the patterns of changes that you're looking for in your data, rather than trying to decipher very specific events.
>> Yeah. And I think Drasi definitely provides a much simpler approach for building any kind of change-driven solution compared to the traditional solutions that we have available today.
>> That sounds exciting. I think a lot of people are going to recognize names like Kafka and other products that people use to build event-driven platforms. Do you want to share your screen and your slides, so people can understand the problem a little deeper and see exactly how Drasi can help them?
>> Yeah.
>> On that, you know, change-driven, let's share.
>> Yeah.
>> So, welcome Drasi to the CNCF. I think it wasn't long ago, and we just had KubeCon Atlanta, where we had a nice presentation. I'm going to put the link in the video description so you can watch the session from KubeCon Atlanta. I really recommend that. I've also put the links here for the project website and GitHub so you can start having a look. Go ahead, Aman, and start explaining what kind of problem we're talking about here.
>> Yeah. Before I start, I want to mention that Drasi is completely open source and is now part of the CNCF, the Cloud Native Computing Foundation, as a CNCF sandbox project. Today let's take a look at how Drasi can help us simplify change-driven scenarios. To begin with, now that everybody is excited about AI, let's look at chat applications. How do we build a basic chat application today? We simply wrap the LLM behind some application code, and then we can ask it questions like "What is the largest country?" Because the model has been trained on a large amount of internet data, it is able to give us a factual answer. But what happens when we want to ask questions about our own data, which may be sitting in a Postgres database or a MySQL database? Let's say we have an online store where the product information is in Postgres and the review information is in MySQL. How do we ask a question like "What are the laptops in the store?" The LLM has no idea about the data sitting in our databases. If we ask the model directly, it just gives us a canned response: "I don't have access to your data." So how do we bridge this gap? We need to provide the data that is sitting in our databases to the LLM, and we need to figure out some kind of solution for that.
One solution in use today is called retrieval-augmented generation (RAG). We join the data coming in from our different sources and create text documents from it. Each text document is converted into vector embeddings using an embedding model, which turns the semantic meaning of the text into numerical vectors, and those are stored in a vector database. The vector database is then used by the chat application to respond to the user's question about the laptops in the store. The chat application first queries the vector store, gets the relevant documents, and injects them into the prompt that is sent to the model. That allows the model to come up with an appropriate response that has accurate information from both sources, the product information and the review information, as we can see here. Now what happens if the product information changes? We see that the information given by the AI agent is now wrong, and even the rating can get out of date as new reviews keep getting added to the MySQL database. So what we really need here is a real-time RAG system that updates the vector store as our data keeps getting updated in the products and reviews tables in Postgres and MySQL. This is the whole pipeline. So how do we keep the vector store in sync in real time? Well, one option could be to just use plain old polling. We can schedule a batch job that reads all the data from Postgres and MySQL, generates text documents using templates, uses the embedding model to generate vectors, and completely overwrites the vector database.
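To make the full-rebuild polling option concrete, here is a minimal sketch of one polling cycle followed by a retrieval step. Everything in it is an assumption for illustration: the rows are hypothetical, the `embed` function is a toy bag-of-words counter standing in for a real embedding model, and a plain dictionary stands in for the vector database.

```python
import math
from collections import Counter

# Hypothetical rows standing in for the Postgres products table
# and the MySQL reviews table.
PRODUCTS = [
    {"id": 1, "name": "Laptop A", "category": "laptop"},
    {"id": 2, "name": "Phone B", "category": "phone"},
]
REVIEWS = [
    {"product_id": 1, "rating": 5},
    {"product_id": 1, "rating": 4},
    {"product_id": 2, "rating": 3},
]


def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_documents(products, reviews):
    """Join products with their reviews and render one text document per product."""
    docs = {}
    for p in products:
        ratings = [r["rating"] for r in reviews if r["product_id"] == p["id"]]
        avg = sum(ratings) / len(ratings) if ratings else 0.0
        docs[p["id"]] = (
            f"{p['name']} is a {p['category']} with {len(ratings)} reviews "
            f"and average rating {avg:.1f}"
        )
    return docs


def full_rebuild(products, reviews):
    """One polling cycle: re-read everything, re-embed everything, replace the store."""
    docs = build_documents(products, reviews)
    return {pid: (text, embed(text)) for pid, text in docs.items()}


def retrieve(store, question, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(store.values(), key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


store = full_rebuild(PRODUCTS, REVIEWS)
context = retrieve(store, "which laptop is in the store")
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Note that `full_rebuild` calls `embed` on every document on every cycle, whether or not the underlying rows changed.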
Now this is simple and it guarantees consistency, but the results can be stale, and it's also quite expensive to keep calling the embedding model on all of the data, most of which probably has not changed. So how can we make this better? Well, we can modify it a bit so that we track the state and compute only the updates. That way, whenever we see data changes, we compute the embeddings only for those updates and update the relevant entries in the vector database. That allows us to save on embedding costs, but now finding the delta and tracking those updates is kind of complex. And even with that, because of the nature of polling, we are increasing the load on our source databases. Depending on the polling interval, either we get stale results or we put a lot of load on our databases. So how can we do better? Well, the enterprise solution to this is CDC plus streaming. We can subscribe to the database change logs using tools like Debezium and publish those to an event broker like Kafka, and then we can write stream processing jobs with technologies like Flink, where we join the two streams, do some processing, and write the output to another stream. Now that is real time, but we still need more code to process the output stream, generate the text embeddings from it, and store them in the vector database. This architecture is definitely powerful. It is near real time and has no impact on the data sources. But the operational complexity is massive, because now we need to deploy and manage the CDC connectors, a Kafka cluster, and a Flink cluster, and we are still not done. We need to write separate pieces of custom code: one for the stateful stream processing job, another for the embedding pipeline.
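The delta-tracking variant of polling mentioned in this passage can be sketched with content hashes. This is only one possible way to find the delta, not something the speakers describe; the function names and sample documents are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Stable content hash used to detect whether a document changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_updates(previous_hashes, current_docs):
    """Compare this poll's documents with the hashes saved on the previous poll.

    Returns (ids to re-embed, ids to delete from the store, hashes to save).
    """
    new_hashes = {doc_id: fingerprint(text) for doc_id, text in current_docs.items()}
    to_embed = [
        doc_id for doc_id, digest in new_hashes.items()
        if previous_hashes.get(doc_id) != digest
    ]
    to_delete = [doc_id for doc_id in previous_hashes if doc_id not in new_hashes]
    return to_embed, to_delete, new_hashes


# Hypothetical documents as rendered on two consecutive polls.
first_poll = {1: "Laptop A, 2 reviews, average 4.5", 2: "Phone B, 1 review, average 3.0"}
second_poll = {1: "Laptop A, 3 reviews, average 4.0", 3: "Laptop C, 0 reviews"}

to_embed_1, to_delete_1, saved = plan_updates({}, first_poll)
to_embed_2, to_delete_2, saved = plan_updates(saved, second_poll)
```

Only the changed and new documents reach the embedding model on the second poll, but the job still has to read and render every row to compute the hashes, which is the source-database load the speaker points out.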
Now we have one option, polling, which is simple but kind of slow, and streaming, which is fast but quite complex. So what if there was a better way?
>> Just before you go on to the better way. I like that slide there. There is always a better idea, but tell me: for people watching this who are already working like that, what's the effort to change? I think the main point is how this is going to simplify that scenario.
>> Yeah.
>> And the main difference I see is that the complexity could be bigger once you have more sources. You have two databases there, but you could have five.
>> Yeah.
>> And then the complexity becomes even higher. On this point, was the first idea of the project to solve this problem of multiple sources of data?
>> Yeah, definitely. One of the key selling points of Drasi is that it allows you to join data coming in from multiple different sources into a single query, a Drasi query, which we'll get to in a minute. But comparing it with existing CDC or streaming architectures: remember, the architecture that I'm showing on my screen is powerful and real time, but we need to not just deploy Flink and Kafka, we need to constantly manage them. That is an ongoing operational load on the team. And even with these two deployed, we always need to keep writing and maintaining custom code. For this RAG example you may just write a single stream job. But what if you come up with a new scenario? Now you need to write another piece of custom code. That is true for any kind of stream processing job. These stream processing jobs are also quite complex. They may involve watermarks and windowing, and you need to take care of the back-end indexes and all that. So that is one part of the problem: the complexity.
And the second part is feasibility. Not every environment is conducive to deploying technologies like Kafka or Flink. Some of them don't really have the capacity to run these kinds of large systems, which is where Drasi can really shine.
>> As an example, we could say Azure Functions or Lambda: they get a source, but normally it's a single source, and they have some logic and can do some reaction on that. Let's see the Drasi architecture. I think it will make more sense for what we're discussing here. That's exciting. I really like this subject, and I have built the solutions you showed here many times for customers. So let's go ahead and see what can be done better.
>> Yeah. As we can see in the pipeline that I'm showing on the right, this is definitely working and it's pretty fast, but let's see if we can make it better. What if this entire pipeline could be collapsed into a single component? That's how simple the architecture becomes with Drasi, where we don't need to keep writing custom code or manage multiple systems. All we need is a set of config files, and Drasi handles the rest. So let's see how that works. In our RAG example we have data in Postgres and MySQL, so we need to tell Drasi how to connect to our existing data sources. Remember, with Drasi you don't need to copy your data into some centralized data lake or even into Drasi. You just need to tell Drasi where your data lives, and Drasi can meet your data where it lives today. For that we first write a simple YAML file that tells Drasi about its sources. An example could be something like this. This is pretty simple.
It tells Drasi about our Postgres database and has some basic connection details like the database name and the table name, pretty standard stuff. With that applied, Drasi connects to the Postgres database and starts watching for changes happening in the data. Now we have another data source, so we need a source YAML file for that as well, and that will tell Drasi about the MySQL data source. With these two in place, Drasi can handle the connection and the underlying change detection for both data sources. Once that is in place, Drasi allows you to join the data coming in from various disparate sources and treat all of it as though it were part of a single virtual graph, and with that we can write graph queries like the one shown here. If you look at the query, this is pretty straightforward Cypher syntax, which should be familiar to the audience if they have dealt with Neo4j or other graph databases. It is a simple declarative query that does not care about the incoming streams, the windows, the joins, the watermarks, and so on. It simply declares the final joined result set that we want. For instance, here we are declaring that we want the products associated with their reviews, and then we return some of the information about the products and compute some aggregate review statistics. Now Drasi queries like the one we just saw are a bit different from normal database queries. Daniel, do you want to chime in about how Drasi queries differ from normal database queries?
>> Yeah. Think of a normal database query. It runs once: it looks at the storage, fetches your data, returns a result, and then it's done. The continuous query is constantly running. You can almost think of it as more of a diff engine.
It fires up initially and grabs the initial data. What it outputs is not query results but diffs. From those diffs, you can then build up the current view of the result set, which is what we do as well. Then every time you do an update or delete in your source database, that change runs through the Cypher query, which evaluates it, determines what the difference to the results should be based on the change in the source database, and outputs that diff. Something downstream can then use that to build a view, run an action, or run some code.
>> Yeah. And just to add to that, Drasi queries, in comparison to regular database queries, are called continuous queries because they maintain a perpetually accurate result set, as Daniel said. Drasi will watch your data sources for incremental changes and keep the query result set always up to date in near real time. But the question is, what can we do with this result set once we have it available? So let's go ahead and deploy this query YAML with the graph query. Then the final piece of config tells Drasi about the actions that we want to take with the query result set when it changes. That is the reaction YAML file. For our RAG scenario, it tells Drasi about the vector store that we want to use. In this case, I'm using a Qdrant vector store, and I'm also telling Drasi about the embedding model, which is hosted in Azure OpenAI. All the text templates can be stored inside this YAML file as well, and with that applied Drasi can handle the rest. It handles everything end to end: document generation using templates, embedding generation using the model, and the updates to the vector store. Now let's take a look at the real demo.
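The "diff engine" idea described here, where a query emits changes to its result set rather than the result set itself, can be illustrated with a small simulation. This is a sketch under stated assumptions, not how Drasi is implemented: the `ContinuousQuery` class and its methods are hypothetical, and it naively re-runs the whole query on every change, whereas an incremental engine would avoid that.

```python
def diff_results(old, new):
    """Compare two result sets keyed by row id and describe what changed."""
    added = {k: new[k] for k in new if k not in old}
    deleted = {k: old[k] for k in old if k not in new}
    updated = {
        k: {"before": old[k], "after": new[k]}
        for k in new
        if k in old and old[k] != new[k]
    }
    return {"added": added, "updated": updated, "deleted": deleted}


class ContinuousQuery:
    """Hypothetical stand-in for a continuously evaluated query."""

    def __init__(self, query_fn, initial_data):
        self.query_fn = query_fn
        self.data = dict(initial_data)
        # Bootstrap: evaluate once against the initial data.
        self.results = query_fn(self.data)

    def apply_change(self, key, row):
        """Apply a source change (row=None means delete) and return the diff."""
        if row is None:
            self.data.pop(key, None)
        else:
            self.data[key] = row
        new_results = self.query_fn(self.data)
        diff = diff_results(self.results, new_results)
        self.results = new_results
        return diff


def laptops(data):
    """The 'query': keep only rows in the laptop category."""
    return {k: v for k, v in data.items() if v["category"] == "laptop"}


query = ContinuousQuery(laptops, {1: {"name": "Laptop A", "category": "laptop"}})
diff_phone = query.apply_change(2, {"name": "Phone B", "category": "phone"})
diff_laptop = query.apply_change(3, {"name": "Laptop C", "category": "laptop"})
diff_delete = query.apply_change(1, None)
```

A downstream consumer, the role a reaction plays in the talk, only needs to handle the three diff categories to keep its own view current; a source change that does not affect the result set (the phone above) produces an empty diff.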
Here I have a MySQL database with review information and a Postgres database with product information, and I also have a Qdrant vector store running. Let's jump inside the Postgres database, where we have the products table containing some electronic items like phones and laptops. Inside the MySQL database there is a reviews table that has reviews for the same products that we saw earlier. For the Qdrant vector store, we can use the curl command to investigate the contents of a collection. I have written a simple script that fetches the product knowledge collection from the vector store and, if found, reports the total number of points and the most recently updated ones in the collection. All of this is wrapped inside a watch command that refreshes the results every 2 seconds. So let's go ahead and use this script to watch the vector store. Right now we see that the product knowledge collection does not exist, so we can use Drasi to build it. We start by going to the Drasi website and downloading the CLI from there. Once we have installed the Drasi CLI using the simple command, we can use the drasi init command to install Drasi onto our Kubernetes cluster. With this simple command, Drasi is now up and running in a separate namespace. Now let's take a look at the first set of YAML files, which tell Drasi about our sources. Here we are telling Drasi about the Postgres source. This is pretty similar to what we saw on the slides earlier: we give the table name and so on. The second one is the YAML file for the MySQL data source. Let's use the Drasi CLI to apply these two to Drasi. This is pretty similar to kubectl apply. Once those two sources are available, they are ready for us to use in our query. So then we can go to our query, which is where we define the logic.
First we say that we want to use the product information coming in from the Postgres database. Remember, the Postgres database is relational in nature, whereas Drasi queries are graph in nature. So we get a graph node for every product row in the Postgres table, and similarly we get a node inside Drasi's graph index for every review found in the MySQL table. In the joins section of the YAML file, we can tell Drasi how to connect data coming in from two different sources. Here we are establishing a "has review" relationship between the products and their respective reviews. We tell Drasi that this relationship exists between the product node with a given ID and all the review nodes with the same value in their product ID field. With that graph model in place, we can write a simple query like this. Here we have a MATCH clause that captures all the products and their connected review nodes together, and then we project out some of the fields of the product nodes and compute some aggregates over the reviews: the average rating and the number of reviews. Now let's go ahead and deploy this query using the same Drasi CLI. When the query first gets deployed, it goes into bootstrapping, where the indexes are being built, and we wait for that to complete. Once it's up and running, we can use the Drasi CLI to watch the live result set. And that's the live result set that we want. Now we want to persist or materialize this into a vector store. So let's use this final piece of config to tell Drasi what to do with the result set. This is the simple document template that I'll be using, and Drasi will fill in the values from every single result of the query: the product name, category, and so on. Then it will send all of that information to the Azure OpenAI models.
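The join, the aggregates, and the document template described in this passage can be mirrored in plain Python to show what the result set and the rendered documents might look like. The rows, field names, and template wording are hypothetical, and the sketch assumes that a product with no reviews does not match the pattern, as would be the case for a pattern that requires a connected review node.

```python
from collections import defaultdict
from string import Template

# Hypothetical rows from the two sources.
products = [
    {"id": 1, "name": "Laptop A", "category": "laptop"},
    {"id": 2, "name": "Phone B", "category": "phone"},
    {"id": 3, "name": "Laptop C", "category": "laptop"},  # no reviews yet
]
reviews = [
    {"id": 10, "product_id": 1, "rating": 5},
    {"id": 11, "product_id": 1, "rating": 4},
    {"id": 12, "product_id": 2, "rating": 3},
]

# Hypothetical document template, filled once per result row.
DOC_TEMPLATE = Template(
    "$name ($category): $review_count reviews, average rating $avg_rating"
)


def join_and_aggregate(products, reviews):
    """Join reviews to products on product_id and aggregate per product."""
    ratings_by_product = defaultdict(list)
    for review in reviews:
        ratings_by_product[review["product_id"]].append(review["rating"])

    rows = []
    for product in products:
        ratings = ratings_by_product.get(product["id"])
        if not ratings:
            continue  # assumption: the pattern requires at least one review
        rows.append({
            "name": product["name"],
            "category": product["category"],
            "review_count": len(ratings),
            "avg_rating": round(sum(ratings) / len(ratings), 2),
        })
    return rows


rows = join_and_aggregate(products, reviews)
documents = [DOC_TEMPLATE.substitute(row) for row in rows]
```

Each entry in `documents` is the text that would be handed to the embedding model in the pipeline the speaker describes.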
Here I'm using the text-embedding-3-large model to generate embeddings from those text documents, and the secrets I'm using are just regular Kubernetes secrets. Finally the embeddings get stored in the Qdrant vector store, so I just provide a connection string for that. As soon as I apply this, if you watch the left side of the screen, we see that the product knowledge collection gets built in real time and populated with nine data points from the Postgres and MySQL tables. And that's the real-time solution with Drasi. Now, I built this simple .NET app which is connected to the Qdrant vector store. Here we can ask it questions about the data that we have. We ask about the laptops in the store, and we see the two laptops, and this is based on the three relevant documents that the chat application found inside the vector store. The first one is TechPro X1 and the second one is a fictional laptop, UltraBook Air, and we also see some aggregated review statistics. We can always go back to the Postgres database and confirm that these were the exact two laptops in our products table, and we can see in the MySQL table that there were two reviews for the first one and only one review for the second. That is pretty good, but what about real-time updates? Let me connect to the Postgres database here and also to MySQL, and let's try to add a new laptop. On the left side we are watching the vector store collection in real time. As soon as I hit enter, we see the 10th item get added to the collection in real time for the new laptop, called GamerX Elite. If we go back to our .NET application and ask the question about laptops again, we see that the third laptop appears in the result, and it also has its two reviews, which had a 5 out of 5 rating.
So that shows us the real-time solution built with Drasi. This is the simple architecture that we just used for building a real-time RAG pipeline. Here the vector store is just another form of a derived view, and the need for derived views is pretty common in modern systems, because the data is often scattered across multiple systems. I want to clarify that a vector store is just one example of a derived view. Another could be, let's say, a real-time dashboard. Maybe you want to build a React-based web dashboard or a Vue-based dashboard. The dashboard itself may have data coming in from multiple different systems, perhaps from a products microservice, a reviews microservice, and an inventory microservice. If you want to build such a system today, you need to fetch that information in real time, combine it, and surface it through a WebSocket. With Drasi, because we have a SignalR reaction ready for you to use, with a couple of config files you can have a custom WebSocket up and running in minutes, and then you can just focus on your front end. Another example of maintaining derived views is the CQRS pattern, which is pretty common in microservices architectures. In this example the read models are also just another form of derived view. Now this whole pattern of maintaining derived views is just one example of a broader category of problems in software design, what we call change-driven architectures. We define change-driven architectures as those where a system must perform a timely action in response to critical data changes. That is the core problem we have set out to solve with Drasi, where we want to provide a new and simpler way of building these kinds of systems. Now we have a bunch of different capabilities in Drasi.
One of them, which we saw just now, is the ability to maintain derived views. Drasi can also trigger real-time actions whenever your result sets change. For example, you can send a message to Slack or initiate a new workflow. But Drasi also has some unique capabilities that Daniel will tell us more about.
>> Yeah, sure. Do you mind if I steal the screen from you there?
>> Just before you start, Daniel, there are a lot of points I want to summarize here that I think were very important. The first is the flexibility on the sources. The second is CQRS, where, as you said, it's very hard to maintain that read view of the data.
>> Mhm.
>> Also the SignalR integration: you can do that very easily with Drasi. And, like I said, the dashboard. I really like that one, I think it's a very important point. But I think you haven't shown, and I don't know if Daniel is going to show, the syntax of the query and the actual components of Drasi. I think it would be important to show that.
>> Yeah, I'm just going to dive into that.
>> How flexible is it for someone who wants to extend Drasi, who wants to create new sources or new reactions, and how does that really work?
>> Yeah. We have SDKs available to write your own sources as well. We already have a wide variety of sources built out of the box, but we also have SDKs in multiple languages, so you can build your own custom source within a day or so.
>> I don't think we have shown the slide with sources, queries, and reactions. Are you going to show that?
>> Yeah, I'll come to that. All right. I'm going to steal the screen real quick. All right. Is it working? There we go. Okay. So, one of the neat features of Drasi is not only the ability to detect change, but to detect the absence of change.
The example we like to use for this: imagine you have a bunch of freezers and an IoT monitoring system for them. It's fine if they go above 32 degrees for a minute or two, but if they're sustained above 32 degrees for, let's say, 10 minutes, then you want some kind of alert. You need to go and do something, right?
>> Yeah. Let's say there's a warehouse scenario where I open the freezer door just to take something out. The temperature might fluctuate a bit, and we don't want to take an action based on that. But we want to take an action when the freezer stays open for a significant amount of time, or the temperature stays elevated for a significant amount of time.
>> Right. And you can imagine the temperature goes up. You might get some sort of, call it an event or a change, your data changes, something that you can trigger an action on. So if it goes from 28 to 35, you can trigger an action. But if it goes to 35 and stays there for 15 minutes, there's nothing to trigger anything off. That's the problem. For the purpose of this demo, I have a simple Postgres table, if you can see this, called freezer. I've got three freezers, one, two, and three, and a temp column, so we have 22, 35, and 39. I have also already set up a Drasi environment here. We have a VS Code plugin, and we can see all the sources, queries, and reactions I have deployed to my Drasi cluster.
>> If you can do a Control-plus just to zoom in a little bit.
>> Yep. Oops. That was too much, I think. There we go.
>> Oh, this is a Windows machine. Sorry.
>> Oh, sorry. Yeah, this is a Control-plus.
>> There we go.
>> One more.
>> How's that?
>> Yep, sounds good.
>> All right, so what I'm going to do is just jump back. I've already deployed some sources there.
I've connected to that Postgres table that I showed you. Now, this is the query that we're going to have a look at. This part is basically saying I'm connecting to my named source. But this is the interesting bit: this is the Cypher that expresses the pattern we're looking for. We're matching the freezer, which maps directly to that Postgres freezer table. But we have this function. We've got three different functions, which we call future functions, for detecting the absence of change. What this trueFor function does: the first parameter is an expression, and that expression has to hold true for a specific duration, which is the second parameter. So only once the temperature is sustained above 32 degrees for 10 seconds in this case (for the purpose of the demo we're changing 10 minutes to 10 seconds), only then will the WHERE clause resolve to true and we'll actually get a result. So you can imagine it goes from 28 to 35 and 5 seconds later goes back down to 28. This will never emit any change. But if it stays at 35 for 10 or more seconds, we should see a change. Part of our VS Code plugin is the ability to debug queries without deploying reactions, and it shows us the live materialized result set of the query. So I'm going to hit that, and we can see freezers two and three are already there. Now we'll go to my Postgres table here. Let's just resize everything so we can see. We have freezer 2, which is at 35 degrees. I'm going to update that to 25, and when I commit this, it should disappear immediately. There we go, it's gone. That's because the WHERE clause no longer resolves to true. But if I update it from 25 to 35 and hit commit, we see nothing happens.
But if we wait and it stays at 35 for more than 10 seconds, then we should see it appear. And there it is. Again, if I take it back down to 25, it'll disappear. And if I put it back up to 35, we have to wait 10 seconds. But if I interrupt that and take it back down to 25 before the 10 seconds elapse, it'll never emit a result.
>> Yeah. And Daniel, can you go back to the query once?
>> Yes, I can.
>> So, as you see, George, the query from line 10 to 15 is just a simple MATCH clause with a WHERE clause and a RETURN clause. This is pretty standard if you're dealing with any kind of graph database. The only extension that we have made here is providing custom functions that enable newer functionality. The trueFor function on line 12 is a future function that Drasi adds on top of regular Cypher. Daniel, would you want to briefly mention the wide variety of extra functionality that we have added into Drasi?
>> Yeah. The suite of functions we call the future functions: we have trueFor, which you're seeing here, which means the condition has to hold true, sustained, for a specific duration. We also have trueLater, which you give a future timestamp and a condition, and at that timestamp it'll evaluate the condition. And the other one is, I think, trueUntil, which is similar to trueFor but takes a target timestamp rather than a duration.
>> Can that be extended? Can someone create functions, or is that part of Drasi itself?
>> Yes. Internally we have a function registry where new functions can be loaded. We have not yet surfaced any way to plug new functions in, but that is on our roadmap.
>> Okay. But internally it works. I understand.
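The behavior of the sustained-condition function in the freezer demo can be mimicked with a small simulation that tracks when a condition first became true for each row. This is an illustrative sketch only: the `SustainedCondition` class is hypothetical, timestamps are passed in explicitly as seconds, and the caller asks for matches at a given time, whereas the engine in the demo emits the result on its own once the duration has elapsed.

```python
class SustainedCondition:
    """Tracks rows whose condition has held continuously for `duration` seconds."""

    def __init__(self, predicate, duration):
        self.predicate = predicate
        self.duration = duration
        self.true_since = {}  # row key -> time the condition last became true

    def update(self, key, value, now):
        """Record a data change for one row at time `now`."""
        if self.predicate(value):
            # Keep the earliest timestamp while the condition stays true.
            self.true_since.setdefault(key, now)
        else:
            # Any interruption resets the clock for this row.
            self.true_since.pop(key, None)

    def matches(self, now):
        """Rows whose condition has been true for at least `duration` seconds."""
        return sorted(
            key for key, since in self.true_since.items()
            if now - since >= self.duration
        )


hot = SustainedCondition(lambda temp: temp > 32, duration=10)

hot.update("freezer-2", 35, now=0)
at_5 = hot.matches(now=5)        # too early, nothing reported
at_10 = hot.matches(now=10)      # sustained for 10 seconds

hot.update("freezer-2", 25, now=12)
after_cooling = hot.matches(now=12)   # drops out immediately

hot.update("freezer-2", 35, now=20)
hot.update("freezer-2", 25, now=25)   # interrupted after 5 seconds
interrupted = hot.matches(now=40)
```

The sequence follows the demo: the row appears only after the condition has held for the full duration, disappears as soon as the temperature drops, and never appears if the warm spell is interrupted.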
And just to make sure people understand, if you're not used to Kubernetes: Drasi is installed on top of Kubernetes, and it depends on Dapr, I know. And also, are you using a custom resource definition, a Drasi CRD for Kubernetes? Is that correct? Is everything I said fine?
>> It's actually not a custom CRD. It's our own YAML.
>> Your own. Okay, that's good.
>> And it's a very similar experience to using the kubectl CLI. You have these YAML files, and they look very similar, but they're not CRDs. You run drasi apply and point to the YAML file, and it's the same experience.
>> That's good. So don't worry: the resources are not going to be stored in the Kubernetes database. They'll be stored by Drasi itself.
>> Yeah. And one of the reasons why we have our own separate CLI is that we plan to support Drasi outside of Kubernetes as well. Today Drasi is a platform on top of Kubernetes, but we also want to offer Drasi as a standalone server process and provide it in an embeddable library form. We are trying to build whatever people are asking for, so we are actively working towards those goals.
>> Yeah, that's funny. There's another project I worked on called Brigade that was exactly the same way. It had no CRDs, for exactly the same reason. It's nice to see Drasi bring back a lot of those ideas in a more modern way.
>> Yeah. And I think Drasi also has a role to play in upcoming AI-based areas, just like we saw earlier with the RAG scenario. Daniel here has been working a bit with the LangChain community as well. Daniel, do you want to talk a bit about that?
>> Yeah. I think one of the opportunities we see with the emergence of agentic workflows is this: everyone's been focused on chat, this interactive, synchronous back-and-forth with the agent.
But we must have agents running in the background all the time, and I think LangChain has coined the term "ambient agents". We see an opportunity for Drasi to be the change detection infrastructure for these ambient agents. And so what we've done is we've built a native LangChain extension library which bridges the Drasi functionality into a LangChain or a LangGraph workflow. And we have a demo to show you around that. It's going to be a two-dimensional game, and we'll send you a link so we can all play together in a moment. Basically, there's a two-dimensional map, and there'll be AI players and human players. When you move around on the board, all it's doing is updating your coordinates in a Postgres database. So this is the YAML for the source to the Postgres database that we applied to Drasi. And then we have two queries. The way it's going to work is the AI players can't actually see where you are. The only information they get is by subscribing to these queries. I'll explain how that works, all the mechanics, in a second. But to make the gameplay interesting, they're only going to be able to see players if they stay in the same position for more than 3 seconds. So that's what we're doing here with this query, using the trueLater function. Basically, this will emit results when a player has been stationary for 3 seconds or more. And the second one will emit if the player moves more than once in less than one second. So if you move too slowly or too quickly, then the AI players are going to see you, and they're going to come and get you.
>> Right.
>> It would be nice. You could create different behaviors there for your game; you could do different things.
>> Yeah.
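The two visibility rules described above can be sketched in plain Python. This is only an illustration of the rules themselves, evaluated once over a recorded list of moves; in the demo they are expressed as Drasi continuous queries, which are not reproduced here. The function name visible_players, the Move tuple, and the constant names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

IDLE_SECONDS = 3.0  # stationary for this long or more: visible to AI players
FAST_SECONDS = 1.0  # two moves closer together than this: visible to AI players

Move = Tuple[str, float]  # (player_id, timestamp in seconds)


def visible_players(moves: List[Move], now: float) -> Dict[str, Set[str]]:
    """Return the players matching the idle rule and the fast rule at time `now`."""
    times: Dict[str, List[float]] = defaultdict(list)
    for player, ts in sorted(moves, key=lambda m: m[1]):
        times[player].append(ts)

    # Idle: the most recent move is at least IDLE_SECONDS in the past.
    idle = {p for p, ts in times.items() if now - ts[-1] >= IDLE_SECONDS}
    # Fast: the two most recent moves are less than FAST_SECONDS apart.
    fast = {
        p for p, ts in times.items()
        if len(ts) >= 2 and ts[-1] - ts[-2] < FAST_SECONDS
    }
    return {"idle": idle, "fast": fast}


moves = [("ann", 3.2), ("ann", 3.6), ("bob", 0.5), ("cy", 1.0), ("cy", 2.5)]
print(visible_players(moves, now=4.0))
# ann moved twice within 0.4 s (fast), bob has not moved for 3.5 s (idle),
# cy is moving at a steady pace and stays hidden.
```

A player who moves at a steady pace between the two limits matches neither rule, which is what keeps that player hidden from the AI players.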
And so the way this all fits together is we have an MCP reaction, which basically takes these two queries and exposes them as an MCP server, as MCP resources. Part of the MCP spec is this concept of resources, which can send asynchronous notifications back to the client endpoint. So here we've configured an MCP reaction for each query: there's the idle-players query and there's the fast-players query. And each one has a description. I'll show you the workflow now, but when the agent starts up, it discovers all the queries available to it by reading these text descriptions, decides which ones it wants, and then subscribes to the ones it's interested in. And whenever a result is added, for example to the idle-players result set here, this text is going to be emitted by the MCP server and fed into the LangGraph workflow. The AI player can then use that information to re-evaluate its plan as to where it's moving. And then we have these templates. These are just Handlebars templates that construct the text that gets generated every time there's a change. So it's saying, you know, the player ID was moving fast to position X, Y, or, if it was an update, you can get the after and the before positions, and the agent can use that information to plan where it's going to move. All right, so those are all the pieces. We have the Postgres source and the two queries, and you can swap those out as you like to change the dynamics of the game. And then we have the MCP reaction. This all connects to our LangChain Drasi library, which understands how to talk to the MCP server and set up those subscriptions. And the LangGraph workflow looks something like this.
So it's going to start up. There's a prompt to tell it, hey, there are some queries that you need to go and have a look at and decide which ones you want. That gets sent to the model. The model calls a tool which discovers the queries, and then it individually subscribes to each one it's interested in. Once the setup phase is complete, it moves to the second phase, which runs indefinitely, where it first checks its sensors. Basically, what's happening in the background is that every time the MCP server emits something, it gets buffered, and this check-sensors node in the workflow will unload that buffer into the agent's memory. Then, if there's something new, it will re-evaluate the plan, evaluate the targets, pick which target it's going for, and decide which route it's going to take. Then it'll execute that move and loop back. If there's no new data from the sensors, it keeps looping around and executing its move until it reaches its target.
>> Does that all kind of make sense?
>> Yeah, it makes sense to me. It is a little bit complex for someone who has never built AI agents or worked with MCP, but it will make sense if someone is watching in six months' time, because those things are getting more popular. I think it's nice to have this here.
>> So, I'm going to send you this link where we can all join the game. One sec. Oops. And so I'm going to join the game. We don't have any agents, any AI players, at the moment. I can see you there. So I'm going to fire up one or two.
>> So I just joined as one of the players in the same game as well.
>> He joins.
>> I'm spinning up some background agents. So if we jump here. Whoops. They're gone. Crashed. Okay, looks like our demo is broken.
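The second phase of the workflow described above (buffer incoming notifications, check sensors, replan if something is new, execute one move, loop) can be sketched as follows. This is plain Python under stated assumptions, not the LangGraph workflow from the demo: the Agent class, its field names, the grid coordinates, and the nearest-target rule are all hypothetical stand-ins, and no model call or MCP client is involved.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional, Tuple

Pos = Tuple[int, int]


@dataclass
class Agent:
    pos: Pos
    # Notifications buffered in the background as (player_id, position).
    inbox: Deque[Tuple[str, Pos]] = field(default_factory=deque)
    # Last known position per player, built up from drained notifications.
    memory: Dict[str, Pos] = field(default_factory=dict)
    target: Optional[Pos] = None

    def check_sensors(self) -> bool:
        """Drain the buffer into memory; report whether anything was new."""
        changed = bool(self.inbox)
        while self.inbox:
            player, where = self.inbox.popleft()
            self.memory[player] = where
        return changed

    def replan(self) -> None:
        """Pick the closest known player (Manhattan distance) as the target."""
        if self.memory:
            self.target = min(
                self.memory.values(),
                key=lambda p: abs(p[0] - self.pos[0]) + abs(p[1] - self.pos[1]),
            )

    def step(self) -> None:
        """One loop iteration: sense, replan if needed, then move one cell."""
        if self.check_sensors() or self.target is None:
            self.replan()
        if self.target is None or self.target == self.pos:
            return
        x, y = self.pos
        tx, ty = self.target
        if x != tx:
            x += 1 if tx > x else -1
        elif y != ty:
            y += 1 if ty > y else -1
        self.pos = (x, y)


agent = Agent(pos=(0, 0))
agent.inbox.append(("p1", (5, 0)))   # a sighting arrives from a subscription
agent.step()                         # heads toward p1
agent.inbox.append(("p2", (1, 2)))   # a closer sighting arrives mid-route
agent.step()                         # replans and turns toward p2
print(agent.pos, agent.target)
```

Keeping the buffer separate from the agent's memory means notifications can arrive at any time without interrupting a move that is already in progress; the agent only reconsiders its target at the top of each loop.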
>> Yeah, but the point is that the game is just one example of how this can be used for real-time agents. You can have any kind of agent that's constantly running and monitoring some kind of data system, and whenever the relevant change happens, or maybe the absence-of-change condition happens, that's when Drasi can take an action for you. In this case it would just be the AI player chasing you in the virtual game.
>> And I think it's working now, so let's try: expose the port for the MCP server.
>> So the red one was very quick, no?
>> Yeah. The red ones are the AI players, and if you move fast enough, they'll notice your position and then they'll try to run after you and catch you. So I think this one is really coming up behind me.
>> Yeah.
>> That is scary. Oh, looks like I just got caught.
>> So, I don't think it works too well when there are lots of targets for it to go through. It can't pick which is the best one.
>> Yeah.
>> You get the general idea.
>> The whole idea is to give information to the AI so it can try to find us. And in this game, if you notice, the red dots have no idea where the blue dots are, right? They only know when we move and update the database. That's when Drasi comes in and emits those signals, feeding the information to the red dots.
>> That can be used for a lot of monitoring and real-time solutions. And that's great.
>> Yeah.
>> Not sure if you want to go and show another demo, or just stop here for now.
>> Yeah, so we can kind of wrap up, and maybe I'll quickly touch on a bit about Drasi. As we've seen with a couple of examples here: we saw the RAG example earlier, we've seen the absence-of-change example where we were discussing the freezers, and then we've also seen the hide-and-seek kind of game here. So we've seen that Drasi provides a much simpler, declarative approach, which simplifies building all these kinds of change-driven architectures. Within Drasi, we just have sources, reactions, and continuous queries. What we need to know is that Drasi does not force you into a big data migration project. You don't need to copy all of your data into Drasi or any other central place. It meets your data where it is, and it provides a declarative approach: you can write declarative queries that join heterogeneous sources. We support multiple data systems out of the box, like PostgreSQL, MySQL, and SQL Server, and even non-database systems like Kubernetes, which has some stateful information. Let's say you have pods and containers running, which are also changing all the time, and you want to monitor that: you can use Drasi for that. We have a bunch of sources already built for you today, and we also have SDKs, as I said, in multiple languages for you to build your own sources. That is one part of the equation. On the other side, Drasi also provides versatile reactions, which allow you to act on a wide variety of external systems. These let you build new scenarios without the headache of maintaining custom code. We provide a bunch of reactions out of the box, and we also provide SDKs in multiple languages for you to build your own reactions.
So we have one for Qdrant that we used for our RAG example. We have Amazon EventBridge. We have SignalR. We have Azure Event Grid, Microsoft Dataverse, etc. So yeah, we have this new, burgeoning ecosystem of both sources and reactions. And that's pretty much what we had for Drasi. Drasi is open source. We would love for all of you to join us in the endeavor of simplifying change-driven architectures, and maybe provide your feedback on wherever you want to use Drasi. Our team is always ready to help, maybe help you build your own source or reaction. File issues on our GitHub. Give us a star.
>> I think that's great. That's similar to what the DeA project did. You know, building more sources and reactions will be, I think, the way to go. And just to make it clear for everyone, the project is still early stage. Is it recommended for production, or not yet?
>> Yeah, so we just went open source; we launched late last year. So it's just been one year for us, and we are pretty new. We are still in our early stages, and in the CNCF we got inducted into the Sandbox, so we are making some progress there. But in terms of production readiness, Daniel, want to chime in a bit?
>> Yeah, it's something we're still focusing on. We've got a bit of work to do, I think, before we're production ready, but we're definitely seeking out anyone who wants to build a proof of concept with us, and we'd love to work with you on that if you reach out.
>> Yeah, and we would love to hear from folks what it would take for people to use this in production. Whatever the missing pieces are, we are always willing to build them.
>> Yeah. That's great. That's the point, you know. Follow the project. Leave a star for the project if you like it. Also follow the Open at Microsoft show. We're going to bring more content that could be using Drasi as well.
We could be bringing the Drasi team back to see all the new features that they're going to be building over the next few months. So I think it's a good time to join the project, to start contributing and helping the project improve. We had a nice session, as I said, at KubeCon. There are a lot of resources I'm going to leave in the video description so you can follow along. There's also the YouTube channel where you can get the content. Hope to see you guys soon. Thank you, everybody. Thank you. Thanks. Thanks.

Original Description

In this episode, we’ll deep dive into Drasi, a new data processing system that simplifies detecting critical events within complex infrastructures and taking immediate action tuned to business objectives. Developers and software architects can leverage its capabilities across event-driven scenarios, whether working on Internet of Things (IoT) integrations, enhancing security protocols, or managing sophisticated applications.

✅ Resources:
Drasi https://drasi.io/
Source code https://github.com/drasi-project

📌 Let's connect:
Aman Singh | https://www.linkedin.com/in/amansinghoriginal
Daniel Gerlag | https://www.linkedin.com/in/daniel-gerlag
Jorge Arteiro | https://www.linkedin.com/in/jorgearteiro

Subscribe to the Open at Microsoft: https://aka.ms/OpenAtMicrosoft
Open at Microsoft Playlist: https://aka.ms/OpenAtMicrosoftPlaylist
📝 Submit Your OSS Project for Open at Microsoft https://aka.ms/OpenAtMsCFP

New episode on Tuesdays!

Drasi is a data change processing platform that simplifies detecting critical events within complex infrastructures and taking immediate action tuned to business objectives. This lesson covers the basics of Drasi, its architecture, and its applications in change-driven data processing and AI agent workflows.

Key Takeaways

Subscribe to database change logs using tools like Debezium
Publish change logs to an event broker like Kafka
Write stream processing jobs with technologies like Flink
Join two streams, do some processing, and output to another stream
Generate text embeddings from the output stream and store them in a vector database
Deploy and manage CDC connectors
Write custom code for the stateful stream processing job
Write custom code for the embedding pipeline
Connect to existing data sources

💡 Drasi provides a simpler approach for building change-driven solutions compared to traditional solutions, with demos covering a RAG pipeline and ambient AI agents.
