LlamaIndex Webinar: Build Personalized AI Characters with RealChar

LlamaIndex · Intermediate ·🔍 RAG & Vector Search ·2y ago


Key Takeaways

The LlamaIndex Webinar discusses RealChar, an open-source project for creating and customizing AI characters, utilizing large language models and retrieval augmented generation for real-time conversations. The project aims to provide equal access to large model technology and allows users to control their own characters.

Full Transcript

Welcome back, everybody, to another episode of the LlamaIndex webinar series. Today we're super excited to feature RealChar, by Sean and Kyoying: build personalized AI characters, all using open source technologies. They have a presentation and they'll go through some demos as well of this amazing project that's exploded in popularity in the past month or so. It's sitting at around 4.6 thousand stars right now, so it's gotten very popular in a very short amount of time. We're super excited to dive in, talk a little bit about the tech stack, and then also open it up to Q&A to talk about some challenges and future directions. Without further ado, passing it to Sean.

Hey, thank you. Thank you to LlamaIndex and thank you, Jerry, for the invitation. Let's get started. My name is Sean, I'm the CEO and co-founder of RealChar, and the topic today is how to build your personalized AI characters with RealChar, along with my co-founder Young. First, here is a quick rundown of all the things we will be discussing. We will try to stick to the timeline and leave time at the end for Q&A. Meanwhile, if you have any questions, drop them in the chat and we will go over them one by one. Also, if you have not heard about RealChar, post in the chat and let us know; it will help us adjust the depth of our content.

First, RealChar is your real-time AI characters. It's a fully open source project on GitHub. It's a one-stop shop to help you create, customize, and talk to AI characters, and the AI companion is powered by state-of-the-art large language model tech. We launched this project three weeks ago and, as mentioned, we've got 4.6 thousand GitHub stars. We're also building a community: we already have 700 community members in our Discord. So again, I'm Sean. I'm
the co-founder and CEO of RealChar. Why don't you introduce yourself?

Yeah, so I'm playing the CTO role at RealChar right now, building together with Sean. We both previously worked at Google and have some experience in self-driving cars too.

Okay, next. I'm going to show you a very boring demo. First is how you talk to Elon Musk. Can you guys hear the audio?

You might have to do screen sharing with the audio toggle enabled.

Okay, let me accept it. Let me try again. It's interesting. Okay, wait, let me try again. Okay, now? No? Okay, maybe try one more time.

Yeah, just try stopping the share and then re-sharing.

Okay, some technical issue.

It's all good.

Once I share the audio, it looks like it stops playing. Let me try again. Okay, one more time; if it doesn't work, we'll have to... Does it work?

No. Maybe you can just give a narration; you can pretend to be Elon Musk.

Yes, pretend you can hear Elon Musk. What we're showing here allows you to have a very nice conversation with Elon Musk, and at the same time it has a really good personality. Give it a try in RealChar and you should be able to get the same results. So the first thing we want to show you is how you can have a natural conversation with any public figure and ask them for their opinions on the news and on any subject. The second is that it also allows you to interrupt your characters: just like when you talk to a human and think you've got enough information, you can ask them to stop, and have a really natural conversation. The second demo, if you pretend you can hear the voice, is podcast style: Steve Jobs talking to Elon Musk on all subjects. I could pretty much listen to them all day; it's really fun when they talk to each other. Okay.
Next, let me continue. I'm going to hand over to Perion to go through all the technical details of RealChar.

Actually, before that, this might be jumping the gun a little bit, but I'm very curious: what was the motivation or spark for this project?

Oh yes. When we built this project, we believed everyone should have equal access to state-of-the-art large model technology, and we don't think keeping all the technology closed is the solution. That's why we open-sourced the whole code base, from the front end and the client side to the whole technical stack. By doing this we can build a community; as I put in a later slide, we can build a character and AI companion community, so everyone will be able to contribute and own their own characters, instead of them being owned by other companies. So that's the initial motivation for me to build this technology.

Got it, that makes a lot of sense. Were you looking at other providers in the space, like Character.AI or Inflection, and did that spark the interest in a project like this?

Yes. I want to have full control of my character. I also want the character to be trained on my data, and I want to have control of the data. So that's why we started this. Another thing is we believe the conversation should be more natural, just like a human talking to a human. Right now a lot of conversations are turn-based: you send a text and wait for the full response. We feel that's like you learning how the AI talks; we should let the AI learn how we talk. That's the right way, and that's why we built this real-timeness into RealChar.

Cool. Yeah, thanks Sean, and thanks Jerry for the invitation. Let's go a little bit deeper into the technical side. First, the overview of our project. It's actually pretty simple. We make it really fast to set up, and we rely on only a minimal number of external dependencies, so
everyone should be able to set it up within a few minutes. We also chose a modular design, so every major component in our system can be replaced by a new version or just a different option. As Sean mentioned, we built real-time into our product very early on, because we want to make the conversations very interactive and engaging. And the last point is we try to embrace open source projects as much as possible, because we ourselves are an open source project, we want to use the popular options in the community, and we are proud to advocate for more open source adoption in the AI space.

Moving on to the data preparation side: this is the top part of the overall system architecture. The point of data preparation is to give memories and context to our AI characters. As you know, most language models' training data has a cutoff date; for OpenAI it is September 2021, which is almost two years ago. So the model doesn't really know much about recent news. But in Sean's previous demo we actually asked Elon Musk about his view of the proposed cage fight with Mark Zuckerberg. That's definitely something the language model doesn't know, so we have to tell the AI character that this is actually something happening right now. Also, in many cases you may be interested in data that has low importance in the training data or simply doesn't exist there. For example, if it's your own character, the training data these companies collect won't have the information you want. So you can put all the things you want the AI character to know into the data preparation step.

Moving on, let's talk about how we actually use LlamaIndex in our project. This is kind of the backbone of what we call the character catalog, basically the piece that manages the data for each character. As of now, we
choose a quite simple approach: we preload the offline data into our system for each character. But we are thinking that next we'll do more real-time loading and also load customized data; for example, each user could choose different data to be loaded. We also plan to utilize more of the indexes in LlamaIndex for better augmentation. One nice thing about using LlamaIndex is we can easily support multiple file formats, so now you can use plain text, PDF, HTML, EPUB (an ebook), things like that. People can easily put the relevant data into the system as context data for the character. We believe this is pretty easy to use, and we benefit a lot from the existing features in LlamaIndex.

Next up, we also talk about how we use the embeddings and the vector store. Because we serve multiple characters in our system, this is not a single context but actually multiple contexts. So we also keep some metadata for each character, to make sure that when we do retrieval we're only retrieving for the relevant character. For now we default to Chroma as our vector store, but you can also easily use other vector stores in the system.

There's also a part about the prompts, because to make the AI character engaging you also need to teach the character how it should speak and how it should have its own personality. That's where we put in some prompt engineering techniques to fine-tune the character, and this is why we believe each character can talk and interact with you differently. For example, Elon has a very noticeable character if you talk to him, so I think this is a credit to the prompt engineering work here. This is definitely some kind of manual work, though, so we have found that you
can actually use ChatGPT to help you write these kinds of prompts. For example, you can give it the prompt we have for Elon and then say, let's change it to, maybe, Dr. Oppenheimer, and it will be able to help you. We also have a community contributors group to help automate collecting data and writing the prompts for a new character, so if you are interested, check it out.

Next up we are talking about an interesting piece, which is voice cloning. We didn't have a chance to hear the audio in the previous demo from Sean due to the technical issues, but what's interesting is that the voice is very much like the actual character's voice. We do this currently by using ElevenLabs. They have a pretty neat feature called Instant Voice Cloning: you only need a one-minute-long audio clip of the character speaking, and then you can train a model which can imitate its voice. There are also other solutions available, including some open source solutions, where you can clone your voice. In the future we want to make it automated, probably even just by recording the user directly in our UI, so that users can automatically get the benefit of this feature.

Cool, so that's enough for the data preparation side. Let's talk about the serving design, where many quite interesting things also happen. We want to make our system super fast, super reactive, so that's why we put a lot of effort into making sure every piece in the flow is streaming and real-time as much as possible. There are three major components here: speech-to-text; the orchestration, which includes the language model processing, the vector store, and other auxiliary processes; and then text-to-speech back to the user.

Let's dive into the speech-to-text first. To make it fast, we use client-side speech recognition by
default. For example, Chrome and WebKit-based browsers already have client-side speech recognition, which works reasonably well. Mobile devices in use today also have client-side speech recognition built in. So we leverage this, but the quality is not always good; sometimes they produce uncertain results and the confidence score may be low. In these cases we immediately fall back to our back-end speech recognition. We used Whisper by default, and we have also switched to faster-whisper, which is an implementation that makes Whisper considerably faster for inference. It's also possible to run on API services, for example the OpenAI Whisper API, and Google also has its own voice transcription service. So if your machine cannot run a powerful enough speech recognition model, you could just use the API services. You can see this is our modular design: it's easy to add new services for speech-to-text as long as you follow the interface.

Next up, text-to-speech. This is unfortunately usually the slowest step in this process, because it actually takes time to construct the audio data. Again, we make it really swappable: you could easily add a new speech engine or make tweaks to an existing one. Right now we have ElevenLabs, Google TTS, and also a fellow startup called Unreal Speech, which also produces pretty good speech. In this part we also do real streaming. Unfortunately, some of the vendors didn't really have native streaming support, maybe until recently, but we have sort of implemented streaming ourselves, basically by detecting the minimum structures within the text. Maybe this is a phrase or a sub-sentence that we believe can be used to generate a complete piece of speech; then we immediately send this piece
to the text-to-speech engine so that we can get the speech data back faster. That's why we can generate these speeches in a pretty real-time manner. So that's the point: you cannot really send the text word by word, because that will produce pretty bad speech output, since natural human pronunciation depends on the surrounding phrase in the context. But as long as you are taking out a piece of text that's self-consistent and self-contained, it can still produce pretty good audio output. That's how we managed to partially stream it.

Cool. I see the question about the streaming feature from just this week. We actually took a look, and it's done similarly to what we do, by just breaking down the text into smaller pieces and then generating audio from these smaller pieces. So it's interesting that we implemented this kind of thing before their official support.

One more thing to mention here is that both the speech-to-text and the text-to-speech have multi-language support. I think we have eight languages right now. It's pretty much blocked by the text-to-speech part, because this is where multi-language support is weaker. But I think these eight languages are still pretty useful for all the users out there.

Now moving on to the orchestration part. This is where the language model is working and some of the heavy lifting is happening. Before we let the language model generate any response, we do a bunch of preparation. First, we retrieve any relevant context from the vector store; that's how you can ask the characters about recent news and they will be able to respond to the latest news. We also do some prompt structuring to make sure that this context and the previous information are
given to the language model properly. We also optionally have the ability to use external tools. For example, we can do Google Search right now, and we also have an integration with Quivr, recently added, so that these external tools can add more context for the language model.

The next step is that the language model streams back both to the text output to the user and also immediately to the TTS task, as we just mentioned, in a streaming fashion. We also save the output in the conversation history and materialize it to a database for future reference. What I want to highlight here is that the language model part is also pretty much a modular design, so you could easily switch from OpenAI's model to Anthropic's model. We also recently added support for the Llama 2 model from Meta. It's pretty easy to add new language models, and we are also working with some other open source libraries which enable you to easily run language models in real time.

So maybe a question: you said you optionally use external tools, like search and Quivr. Are you implementing this as an agent abstraction, or how does that work?

It's not in a formal agent structure at this point, because it's actually pretty easy to implement. We also wanted to make it really fast, and using an agent structure it's sometimes slow to run through all the steps. But I think we will be adding more complexity into this part, so it will probably look more like an agent going forward.

Cool, makes sense. I know I'm just jumping in to interrupt. You mentioned conversation history, and I'm kind of curious: how do you maintain conversation history per session, and how do you think about sessions over time? Are you able to reload a conversation history and just pick up
where you started? How do you think about that?

Yeah, this is a good question. Right now the conversation history starts with each chat session. We don't yet have the feature to preload the previous conversation history, but this is our plan, and we are going to implement it very soon. With that, of course, there will be a problem if your conversation becomes really long. That's where I think we will add some processing, maybe to summarize the previous conversation history, or to summarize the key memories you had with this AI character, so that the conversation history is still manageable.

Yeah, what's really interesting to me from a UX perspective is that once you extend this and start building relationships with these AI characters, and you start having these longer-term conversations and build up that memory structure, it'd be very interesting to see how those conversations evolve, and how you are able to not just store the raw conversations but extract the main memory modules, and how you represent that: is that in a vector store, a knowledge graph, those types of things, over time.

Yeah, I think there are a lot of things to play around with here. You could put something in the literal conversation history; you could also put it somewhere for the language model to know your character, so it knows how to interact with you better. So as you said, there's a bunch of interesting things we can do.

Cool. So let's move on to the summary. Overall, we believe there are two principles we are following to make RealChar really good. One is the design for real-time conversations, including all the streaming pieces, and the other is how
we utilize the retrieval-augmented generation scheme for better conversations. I think this is where we draw a lot of inspiration from LlamaIndex and others in this field to make it really work. If you think about probably six months ago, we already had ChatGPT, but people were probably not so familiar with all these RAG techniques. Today we have done a ton of research and practice in this space, and we know how to generate these conversations with all this external information to make the conversation better. So this is why we think the conversation is really fun and entertaining in English.

Great, that's a great presentation. Maybe just taking a pause right here and taking some questions from the audience. Tal asks: could you please share details on the costs of running a web app like the one that you showed in the demo? I imagine this is relevant for anyone trying to spin up a hosted service using LLMs.

Yes. To break down all the costs: if you look at the overall architecture, there are several costs coming from the choice of technology. There's the voice cloning and the text-to-speech; currently that is actually taking the majority of the costs, and that's why we are actively switching to other speech services like Unreal Speech, which we just added two days ago, and Google Text-to-Speech. We also have a small group of people researching this part, so that we will have a self-hosted version, and then there's no additional cost for text-to-speech and voice cloning. The second part is calling the large language model. If you use GPT, GPT-3.5 is relatively cheap and GPT-4 is definitely a little bit more expensive, but we also give you a solution with Llama 2. We just added Llama
2, so if you want to host it yourself, it works with the same prompt and the same setup; just switch to Llama 2. I tested it: in the code base right now we use the 70-billion-parameter Llama 2, but I also tried the 7 billion and 13 billion, and both work. So if you really worry about the large language model cost, you can definitely switch to a local version. Then the last one is the hosting cost. For deployment, we currently host on GCP, but we also offer a Docker solution, so you can pretty much host on your local computer and expose an endpoint, and that's free. Our goal is to make everything self-hosted, with an open source alternative, so at the end of the day it's just your electric bill, and that's it.

Oh, sorry, I just want to call out that GPT-3.5 is already pretty good for this use case, because we make our prompt really easy for the model to follow. So even if you're just using GPT-3.5, it also works pretty well. Just a consideration for the cost.

Sweet. This actually brings me to a follow-up UX question, which I was thinking about when you were talking about speech-to-text and text-to-speech. Do you have telemetry showing whether a lot of users are really just interested in text-based interactions, or are they interested in the end-to-end speech-based interactions?

Pan, I think we have a number, right?

Oh sorry, what is the question? I don't fully follow.

The question is just: are people interested in interacting with RealChar mostly through text, where they just type in stuff via chat, or, since you have speech-to-text as well as text-to-speech, are they really interacting with RealChar through voice?

Okay, that's a good question. I think they are probably equal in number right now, because text is really easy to use and to start with.
But once people are more familiar with the system, I think a lot of people actually use the voice part. So we have seen that both are probably very popular.

Cool, makes sense. Next question from the audience: could you please talk a little bit more about how you're using faster-whisper for streaming audio? Were there any challenges that you faced in this?

Yeah, so Whisper itself doesn't natively support streaming. What we do is similar: we detect when you actually have a pause, and then we pick up that audio clip and send it to Whisper for inference. So far it's been pretty good. The quality is definitely better than the client-side speech recognition. It's a little bit slower than client-side speech recognition, so we are working to make it even faster.

The next question is related to something I was curious about. You mentioned a system prompt for a character; for Elon Musk there's a system prompt for Elon. When a question is asked of any given character, do you have the same prompt you use every time as the base system prompt? And are you thinking about evolving that, even doing some sort of retrieval augmentation or tuning on that prompt over time? Maybe we could start there.

Yeah, I can take the question. Each character has its own system prompt right now, but right now all the system prompts are hard-coded, which means when you load the character it's predefined. But we are working on dynamically changing the system prompts. I envision that as managing the state of the character. So you have two systems: one manages the state of the character and one manages the state of the human when you have the
conversation, so those will be changing dynamically during the conversation. That's how I envision this. Does that answer the question?

Yeah. So would you do something like retrieval augmentation over time, basically filling in the relevant context in the system prompt as well?

Yes.

Interesting. I think the next question, I guess the broader question, is about multilingual support, so support for languages like Chinese and other languages. What are your thoughts on that?

Yeah, so unfortunately, for Chinese, we don't have a good text-to-speech solution right now, at least with the existing providers. There are probably also some other languages that people want to use. So if you find a good TTS engine, feel free to let us know, but we will also be exploring this space. Here the language support is usually just blocked by the TTS part.

Yes, we have a pretty high bar on the TTS, so that's why it's really hard to find a service that's really satisfying and really fun to listen to.

Great. I have a general question, and this is something that would be good to get your thoughts on: what are some of the biggest challenges you faced in trying to build this thing using open source technologies, and how are you thinking about the biggest items in your mind that you want to prioritize for future work?

Jerry, we still have some content. Should we save some questions for the end?

I did not realize that. Yeah, please go for it.

Okay, yes, let's keep all the questions for the end; we still have a few slides to share.

Got it.

Okay, next. There are some fancier demos. First, you are the first group to see this: we are secretly working on a Vision Pro version of RealChar, and probably next year, when everyone's starting to get Vision
Pro, you will be able to talk to any of your favorite characters in your living room. Sorry, you still cannot hear the audio, but I promise you it's there, and it's really awesome to hear Bruce Wayne talking to you. So this is one of the demos.

All right, the next demo is a very short one. Starting from today, you will be able to create a character without touching any code. We provide the system prompt, ask you to choose the voice, and that's it: about 10 seconds. If you want to clone your own Elon Musk, in 10 seconds you will be able to do that. I think this answers some of the questions in the channel: we are going to make this more customizable, but this is the very short version, and like I said, you guys are the first group to see this.

Next, we are working with some very interesting technologies. If you know, you know. This is when you're talking with Elon Musk and trying to ask him to hire you for a mission. What I want to show you is that, using some of these technologies, Elon Musk decides whether he wants to continue the conversation or not. So this is part of the state management I mentioned for the character. Then the last one: we also work with Arbor Tech on some technology to show animations for the avatar.

Then the last one. I'm not sure how many people have seen this. We are also working with MultiOn to bring an AI agent to RealChar, so it can send tweets, book flights, and do a lot of things for you, and you will be able to stay in the same chat window and continue your conversation while at the same time getting work done. Okay, those are some of the fancier demos. Let me continue. So that's
RealChar. RealChar is your real-time AI characters, powered by the whole community. We are embracing a lot of technologies developed by the community, both open source and closed source. For the avatar interactions, that's Arbor Tech; I have a table, so you can take a look later on. For the character states, we work with a company called Open Souls, and they have an open source version called SocialAGI. For the back end, we are replacing large language models with self-hosted ones; we're working with BentoML on replacing those. For audio, we are also working with Unreal Speech to give you more realistic speech. For the large language model and data, you know all the big names, like LlamaIndex, LangChain, and Chroma; we work with them closely to make sure the large language model technologies, when we build RAG, are state of the art. We are also working with one of the open source projects, called Quivr: you will be able to store your data in Quivr and pull all the information with one query. Last, we're working with multiple AI agents; right now it's MultiOn, to help you get work done. We are also adding more AI agents to the platform, so stay tuned.

So here's a table of all the solutions mentioned, showing which solutions are open source and which are closed source. LlamaIndex will probably be able to share the slides with everyone, so you can take a look at all the cool things they're building.

In the short term, we're building the open source AI character community, so everyone will be able to own their characters and deploy privately with no external service dependency. We're also working on long-term memory; we use it in RAG and, working with LlamaIndex, we're working to
make sure we have the best RAG. In the longer term, we want to build the largest open source AI character and AI companion community, so everyone will be able to collaborate and share thoughts on how to make characters and companions more real, more like humans. At the same time, we want to make sure the technology we are building is accessible to everyone. Lastly, we want to make sure the conversation is really real time and can be everywhere, even on your Vision Pro, and that it is fully customized and learned from your own data. Okay, so that's it. Those are two QR codes; scan them to join our Discord or check out our latest repo. Now we're ready for questions.

Sweet, that was fantastic. Thanks for sharing. Going back to the questions from the audience: Dominic says, "I found your iPhone app to work really well compared to running it locally on the MacBook," and that potentially has to do with integrated speech-to-text. But what are your thoughts in general about the form factor: mobile versus laptop versus, I guess, a VR headset?

Yeah, so I think it's more about native apps versus the web. For the web, there are different browsers, like Edge, Firefox, [unclear], and the new Arc, and each has its own slightly different implementation. But for iOS apps, if you're using a mobile app, it is pretty standard, so compatibility hasn't been an issue for us. That's why we are switching more to back-end solutions instead of doing speech recognition in the front end. Also, once we have the faster Whisper working better, I think we should have even better support for web users. As for the form factor, it's actually interesting that many people are using the mobile
app to interact, and the engagement on mobile devices is also pretty high. This is aligned with our vision: we want RealChar to be running on every device you would like to use. So our roadmap is to add more platforms; for example, Android is a work in progress. Bringing it to more platforms is something we are definitely looking into.

Awesome. For the next question, Logan asks: maybe this is just on the iOS app, but when I tried it, I realized that the bot is very eager to answer, so if you pause a little it jumps in and doesn't wait for the user to finish. How do you think about that? How do you think about waiting for the user, or addressing that gap, to make sure that the complete sentence is uttered before giving a response?

Yes, I think that's a good question. There is a cost trade-off: if you want to do more detection, you have to know whether the user has completed a sentence or not. If you read the code, there is a secret experimental feature I'm working on that tries to detect, in the back end, whether the user has already finished a sentence. In the long run we will probably make it more systematic and automatic. Right now, both implementations just wait probably one or two seconds in the front end: if it is not detecting any speech, it starts to talk, predicting that you have already finished what you were saying.

Sweet. I don't know if you guys have thought about this, but how do you think about minimizing hallucinations, given an index or context that you retrieve?

Yeah, I think this is the part about RAG and state management for the character. At least from my personal experience, hallucination usually
comes when you pull the wrong context, when you pull the wrong information. If you find the really relevant information, it can pretty much stay in character. So the most important thing for us is this: when we store the data in our systems, whether in a knowledge graph or elsewhere in the back end, how do we make sure that the relevancy increases when we pull the information, and at the same time monitor the content actually pulled for the question? This is an even harder, long-term question that we have to answer continuously as we build the system and add more data to it. The second part of hallucination is how you actually put together the prompt. If you look at the sample question we send to the model after a long conversation, and you as a human cannot answer the question, how do you expect the language model to answer the question correctly?

Yep. One interesting thing I've personally noticed when doing retrieval augmentation is that when the language model actually knows about the person or the character, sometimes you have to force it to actually use the context from your knowledge base, as opposed to using its own internal knowledge. So how do you balance between the stuff that it already knows about Elon Musk and the stuff that's actually in your up-to-date knowledge bases? That's something that's interesting. It might require prompting; it might require explicitly telling the LLM to forget stuff. That's especially true since a lot of the characters that you have are public characters that it probably knows about a priori.

Yeah, so we also have private ones, like community characters: there's a cat, there's a dolphin, there are a few other characters, so they are created
with no prior knowledge of those characters. What we find right now is that you have to have more data for those characters. The famous figures already exist in the large model, from before the training data cutoff. But for a newer character, if you want to create any character, you have to provide more data in a data folder, so that when we retrieve, the model is able to reference it. We have thought about this, but we haven't experienced any major issues, because these are characters; sometimes it's okay for them to be a little bit creative.

Got it, got it. This is a nice segue into the next question, and maybe we'll take two to three more before we close out. Steve says, "I've tried using GPT to invent a character from the very beginning with a very basic prompt," so basically the entire character is created with GPT. How are you thinking about letting users build these very custom characters automatically versus via human specification? How would that loop into your product experience?

Yes. In the new character creation experience we are building, I think we want exactly that: when you build a character, you get help from GPT and GPT-4, and we will start from that. Right now we provide a very standard prompt, so you can just copy and paste it. If you want to copy a public figure, just replace the name and everything works. In the future, if you want to do more, like your own new character clone, how we imagine it is that you just provide a lot of data in a data folder, and we help you sort out how to abstract the personality and how to abstract the system prompt from it. And as was mentioned, there's also a
community script to help you get all the data you need and sort out that particular personality. I imagine this will become more automated: once you have set it up, the more you use it and the more data you give it, the more it will be like the character you want. That's just how I envision this.

Gotcha. Then maybe the last question is: how can these characters or agents help you do work, like send emails, take calls, or update records? And more broadly, something I've been thinking about: I feel like the human perception of agents that can help you do stuff versus a personalized character that you can talk to is that they are two slightly distinct use cases. I'm curious how you think about that. Would you see people do more of one or the other, or a combination of both? What's your vision there?

Yes. In my vision, character, AI companion, or agent is a spectrum. There are people who care extremely about productivity, and there are people who only care about the emotional side, having someone to talk to. So we're trying to balance which tools we bring to that spectrum, from the very interesting characters you can just have fun talking to or play a game with, to agents, if you like sending emails. That's why we partner with multiple companies across this spectrum. For the AI agents, we work with a few leaders in the space to help you send emails. For example, if you ask Elon Musk things and you want him to help you send the email in his tone, you can do that. Like I said, people may land differently on the spectrum, but we also want RealChar to offer the tools to help you build the whole experience.

Awesome. Well, this is super exciting stuff. Thanks, Shaun and Piaoyang, for your time. Thanks for the
super exciting webinar and the really cool demos, especially the Apple Vision Pro demo and all the other ones. So yeah, thank you again. I think the audience really enjoyed this talk, and we'll have the recording up on YouTube for anyone who missed it live. So thank you, have a good day.

Thank you, bye.

Thank you.
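
The answer to Logan's question describes the front end waiting roughly one or two seconds without speech before the character starts talking, plus an experimental back-end check for whether the sentence is finished. A minimal sketch of that idea follows. It is not RealChar's code: the `Frame` type, the 1.5 second default, and the trailing-word heuristic in `looks_finished` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Frame:
    """One audio frame as labeled by a hypothetical voice activity detector."""
    start: float      # frame start time in seconds
    duration: float   # frame length in seconds
    is_speech: bool   # True if speech was detected in this frame


def find_end_of_utterance(frames: Iterable[Frame],
                          silence_timeout: float = 1.5) -> Optional[float]:
    """Return the time at which the user is assumed to be done speaking.

    The utterance ends once `silence_timeout` seconds of continuous silence
    follow at least one speech frame. Returns None if that never happens.
    """
    heard_speech = False
    silence = 0.0
    for frame in frames:
        if frame.is_speech:
            heard_speech = True
            silence = 0.0  # any speech resets the silence timer
        elif heard_speech:
            silence += frame.duration
            if silence >= silence_timeout:
                return frame.start + frame.duration
    return None


def looks_finished(transcript: str) -> bool:
    """Crude, assumed heuristic: a sentence that ends on a connective or
    article ("and", "to", "the", ...) is probably not finished yet."""
    words = transcript.strip().lower().split()
    if not words:
        return False
    trailing = {"and", "but", "so", "because", "or", "the", "a", "to"}
    return words[-1].strip(".,!?") not in trailing
```

A longer timeout makes the character less eager to interrupt but adds latency to every reply, which is the cost trade-off raised in the answer. Combining the timer with a completeness check would let a system wait longer only when the transcript looks unfinished.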

Original Description

In this webinar, we chat with Shaun and Piaoyang (creators of RealChar) about RealChar: create, customize, and talk to your AI character/companion in real time. GitHub: https://github.com/Shaunwei/RealChar It has exploded in popularity over the past few weeks, and promises to be an open-source alternative to proprietary services such as Inflection, Character.ai, and more. It stitches together a variety of open-source AI solutions: llamaindex/langchain, chroma as vector db, speech to text and text to speech, etc.

The LlamaIndex Webinar discusses RealChar, an open-source project for creating and customizing AI characters, utilizing large language models and retrieval augmented generation for real-time conversations. The project aims to provide equal access to large model technology and allows users to control their own characters. By following the steps outlined in the webinar, users can build personalized AI characters and create customized AI characters for various use cases.

Key Takeaways

Use client-side speech recognition for fast streaming
Fall back to back-end speech recognition for uncertain results
Switch to API services for speech recognition if necessary
Use ElevenLabs, Google TTS, and Unreal Speech for text-to-speech
Implement streaming by detecting minimum structures within text
Dynamically change the system prompt during conversation
Create a character without touching code
Choose the voice for the character
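
One way to read "detecting minimum structures within text" is to cut the language model's token stream into sentences and hand each sentence to text-to-speech as soon as it is complete, rather than waiting for the full reply. The sketch below assumes that reading. The function name `stream_sentences` and the punctuation rule are illustrative choices, not taken from the RealChar code base.

```python
import re
from typing import Iterable, Iterator

# Assumed rule: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence; the last part
        # may still be growing, so it stays in the buffer.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    reply = ["Hello", " there", ".", " How", " are", " you", "?", " I am", " Batman"]
    for sentence in stream_sentences(reply):
        print(sentence)  # each sentence could be sent to a TTS service here
```

This simple rule also splits after abbreviations such as "Dr.", so a production system would likely need a more careful segmenter.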

💡 The use of retrieval augmented generation and large language models enables real-time conversations and personalized AI characters, providing a unique and interactive experience for users.
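
The discussion in the transcript points out that hallucination tends to follow from pulling the wrong context, and that a model may prefer what it already knows about a public figure over the retrieved passages. A minimal sketch of a prompt that pushes the model toward the retrieved context is shown below. The function name, the wording of the instructions, and the passage numbering are assumptions for illustration, not the prompt RealChar uses.

```python
from typing import Sequence


def build_character_prompt(character: str, question: str,
                           passages: Sequence[str]) -> str:
    """Assemble a prompt that asks the model to prefer retrieved context."""
    # Number the passages so the retrieved content is easy to inspect and log.
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        f"You are role-playing as {character}. Stay in character.\n"
        "Answer using the context below. If the context conflicts with what "
        "you already know, trust the context. If the context does not cover "
        "the question, say so in character instead of inventing facts.\n\n"
        f"Context:\n{numbered}\n\n"
        f"User: {question}\n"
        f"{character}:"
    )


if __name__ == "__main__":
    print(build_character_prompt(
        "Bruce Wayne",
        "Where do you live?",
        ["Bruce Wayne lives in Wayne Manor.", "Wayne Manor is outside Gotham City."],
    ))
```

Logging the numbered passages next to each answer is one way to monitor which content was actually pulled for a question.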
