Open Assistant Live Coding (Open-Source ChatGPT Replication)

Yannic Kilcher · Intermediate · 🧠 Large Language Models · 3y ago

Skills: LLM Engineering 90% · Fine-tuning LLMs 80% · Prompt Craft 70% · Prompting Basics 60%

Key Takeaways

The video demonstrates the live coding of an open-source ChatGPT replication, focusing on fine-tuning, prompting, and deployment of large language models (LLMs) using various tools and techniques, including GPT-J 6B, Stable Diffusion, and FAISS.

Full Transcript

Hmm. Let's see, this thing needs to go here. Ah, it's a bit large. That's better, and this can probably be like this. Hello.

"Will Stability be involved?" I don't know, not yet.

"Which program is this for streaming?" This is OBS. I think pretty much everyone uses OBS, at least for YouTube. I think it's open... the thing covered my message, yes, sorry.

"Talk about censoring and safety mechanisms; from what I've read that's a major issue." I don't think it's a major issue. It's a major issue in the OpenAI system, but for us the safety is just on the data side: personally identifiable data, things like this. First of all we have to make clear that when users submit data, we are going to use it to train stuff, so they should not submit sensitive data that they don't want in there. Then I think it comes down to cleaning the data set. But I am going to advocate strongly to not put in this stupid "I am a large language model, I am not able to do this and that." It's so pretentious to think that you are the moral arbiter for the rest of the world, though.

"What hardware..." Sorry, I haven't eaten yet, so I'm going to munch chocolate. "What hardware will you run this on?" The plan is to get it to a point where you can run it on consumer hardware, or to a point where the individual person can run it. Initially that might mean you need an AWS machine, a cheap one. But with Stable Diffusion it was pretty cool to see that once they put the model out there, other people took it and made it smaller and more lightweight so that it can run on smaller and smaller machines. If we achieve that as well, we only need to bring it down to a reasonable size so that other people can take it and make it even more efficient. So, yeah.
Yeah. "What are you doing for the new year?" This. You see, every time I go on GitHub and refresh the notifications, there are already 30 again. Thirty. I mean, it's absolutely overwhelming, the level of engagement that people have. I put the GitHub link somewhere up here. You can check out the GitHub, you can grab an issue, and you can also make issues if you feel like it. So my job is twofold. First of all... okay, for my privacy I will every now and then have to hide stuff. I quickly need to read something that you're not allowed to see, so I'll do this, read whatever I need to read, and then come back. That's just the life, because my day job is a startup. Nice, good, okay, that's very good. I got very good news.

So the job today is administrative, believe it or not. We have this Discord here, which is for work coordination. If you're just interested, feel free to follow us on the GitHub repository, on the appropriate channels in the LAION Discord, and on our Discord, the Yannic Kilcher Discord. The internal one here is for work coordination, so it's really for people who want to contribute. If you grab a GitHub issue or something like this, you'll get in there, also if you use the sign-up form. I've disabled the sign-up form right now; we have to see how to continue here, because it's just an overwhelming amount of people, which is crazy. I'm really happy. Okay, here's what we're going to do: we're going to go through the new people in here who are all contributing or wanting to contribute. Wait, okay, chat, I need the chat and this. Okay, I'm not a good YouTuber yet.

"Can we swap the matmul with a random operation so it can be better sparse?" Sure, but it's just not a given that it will also work. There is
like, people who make things faster sometimes make trade-offs for speed where stuff stops working, and then you've essentially made a heater. But if you can find something that works as well as matmul in the attention, I mean, that would be killer.

"How many parameters are good for a consumer GPU?" For a previous project, which I'm not going to name but you might know it, I used a GPT-J style model, and it ran fine on one GPU. So like a GPT-J 6B. It's probably going to be around that size, maybe a bit larger, because larger models do have more knowledge inside.

"What's your take on diffusion models for text?" Well, diffusion models seem to work for generative things, but text is not that easy to noise, and that has always been a problem with text. For as long as we've known, image models have just been way nicer for deep learning because everything is continuous and differentiable, and text is just discrete, and that's not good. But if someone can figure out some discrete combination, yeah.

"What about training the model with academic papers?" Okay, this is an important point. "Sure, have fun." I usually don't really drink alcohol, but I'll have an iced tea or something. I'm very boring. No, this is an important point which I think many people miss in the whole "what data" question. It's quite clear that with, for example, ChatGPT, all of the capabilities and all of the knowledge are already in the base model, in the GPT-3 that they use the human feedback on. Pretty much all of the knowledge is already in there. I don't think it's the purpose of this human feedback loop to put more knowledge into the model. People say, "Oh, we can tune it to do math," and so on, to a degree, but what's
much more relevant with this human feedback is that you just want to use it to get the model into a mindset, into a prior state of really being aligned with what the human wants, to be this assistant. That's what you do with the human fine-tuning. You don't put new knowledge in there, you don't put new abilities in there, you simply get it into a state where it essentially knows it's tuned to fulfill tasks, but the knowledge to do that is already in there. You can see that ChatGPT is super good at answering questions about quantum physics and math proofs and coding, yet OpenAI says, at least in the InstructGPT paper, that they used crowd workers, common crowd workers. Crowd work is, what should I say, a possibility for people to earn a bit of money on the side, and usually people with a higher degree of education don't need that. So there's just going to be a lot of people with, let's say, less education, or maybe a different education, if you're a literature major or something like this, but not quantum physicists. Well, though, quantum physics, I'm not sure if that's a well-paid job or not. However, you can already see that this knowledge does not come out of the human fine-tuning, and that's I think quite important to realize.

So the whole human feedback thing is to really get the model into this task-fulfillment mode, and that means we need diversity of inputs. That's why we need you, that's why we need humans to come up with prompts, to come up with answers by the assistant, and so on. That's why we are building right now this whole infrastructure to collect the human data, and that's what this Discord is about. So that's what the whole effort is. All right, all
right. So wait, that was one chat question. Sorry to make it lengthy, I'll be quicker.

"Can you make it browse the web and not be static like ChatGPT?" Yes, that's the plan. Not the initial plan, but the plan is to make it browse the web to dynamically retrieve information. We think we have good ideas of how this could be done, and we're pretty sure that you will have even better ideas. So I'm fairly confident we can do that. Let's see. "Can you stop talking?" Sure.

"Where can I find the link to join the Discord channel?" If you're active, follow us on the LAION Discord and our Discord, and eventually, once I have a better way of managing all the newcomers here, I'll post links there. But this is what we're going to do right now: I'm going to look through the new people who came in and see who they are. They need to introduce themselves. "ML practitioner." What we do is delete the unverified role and add the verified role. I know this can be automated, but it's a bit of screening, because this is really for work coordination, so we want to know a little bit who people are. I don't know that there's a good way, because as soon as people realize this is automated they're just going to write "asdf" and stuff.

We're aiming to build an all-in-one chat model. We're aiming to build an assistant, a true assistant, a thing that you can give a task and it will do the task, at least for knowledge work. It's not necessarily "vacuum my kitchen," but for knowledge work. So it needs to be able to retrieve things, and it also needs to be able to interact with third-party systems, to essentially take control of your mouse and keyboard, go to another application, and do something, although that's quite a bit into the future, and there are
other cool projects that do that. One is called robot me, which also plans on that. We'll see if they maybe want to collaborate. This could also be built into applications: if you want to make your application AI-powered, to have it communicate with the human in a more natural way, this could also be used for that.

"ML engineer, interested in improving..." Very cool, thank you for being here. So my job is essentially going through these, reading them, and letting people in. That's one of my jobs. The other job is to go through GitHub issues, categorize them, assign them, and so on. This person actually is already on a GitHub issue, so we'll also give them the contributor role. It's just about keeping the noise down for now. Very cool. Oh, Tim! Tim's doing robot me. Hello Tim, thanks for being here. "Would love to join." See, I say it and it happens. Amazing. "Can it use AI to validate roles?" Yes, we could, absolutely. "Watching the stream, very interested." Very cool, thank you for being here. Excellent. "Software developer," amazing, "and new to Discord." You joined for us, thank you, that's cool. Okay, caught up so far. Someone's typing, but we'll sneak away before that.

So the other thing is, okay, reading up on discussions... should end up in that repo. "Will it have persistent memory?" Hopefully. I mean, that's still to do, right? Who knows how to do that, persistent memory. "What kind of background do the contributors have?" All kinds of backgrounds, really. I don't necessarily care too much about the background, more about what people want to do and can do. There's a TechCrunch article mentioning us now. We trended yesterday on GitHub. How crazy is that? Oh. Hmm, okay, we're going to have to solve this here, if someone wants to do something: pre-commit on a Windows 10 machine. Ah,
Windows. "Try to clone the repository again, it worked for me." That's such a... "We want to change all the line endings to LF in config for new files, but we still need to update old files." I thought pre-commit did that. "No Docker image?" Of course, we have Docker images, but this is for con... So if you commit, pre-commit will yell at you if you don't follow its thing. It's very good, it just enforces style and so on, and you really need that if you work with other people. "Mixed line ending: replaces or checks mixed line ending." Let's see, detect mixed line endings, yep, mixed line ending, yep. Okay, I think that's good. I can't click on something here. Well, in any case, maybe we can solve that.

For joining the Discord, follow us on the LAION Discord or our Discord, and I'll paste the link as soon as I have figured out how to let more people in. I'm not sure, although there are not too many people here, right? I'll post it there. For now, GitHub is the main place where you can see what's happening, or the open Discords. I'll post a link eventually. It's not to keep you out, it's really just because there are so many people that I have no idea what to do.

Okay, is this already in the line-endings pre-commit? Ah, pre-commit config, mixed line ending, it's in. Why do we have... huh. Okay, mixed line ending, fix: auto. Auto fix, no. How do I change all line endings to LF for a whole directory tree? That's what I want. dos2unix, fairly straightforward. What is this Linux command, dos2unix? It's a utility. Can I have that? Nope. Let's see. "You should use ChatGPT to approve." Yes. "Am I right in saying we're training an open-source initiative chatbot for when ChatGPT becomes paid?" Again, it's an alternative, I would say, and it's going to be better. "Is this a real project or a functional demo?"
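The line-ending cleanup discussed in this part of the stream (convert everything to LF across a whole directory tree, leaving node_modules alone) can be sketched in Python. This is only a sketch under stated assumptions, not what was run on stream, where the dos2unix utility is used; the names `to_lf`, `convert_tree`, and `SKIP_DIRS`, and the NUL-byte check for binary files, are illustrative choices.

```python
import os
from pathlib import Path
from typing import List

# Assumption: directories that should not be converted (the stream mentions node_modules).
SKIP_DIRS = {"node_modules", ".git"}


def to_lf(data: bytes) -> bytes:
    """Convert Windows CRLF line endings to Unix LF."""
    return data.replace(b"\r\n", b"\n")


def convert_tree(root: str) -> List[str]:
    """Rewrite files under root with LF endings and return the paths that changed."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            data = path.read_bytes()
            if b"\0" in data:
                # Crude binary check (assumption): leave files containing NUL bytes alone.
                continue
            fixed = to_lf(data)
            if fixed != data:
                path.write_bytes(fixed)
                changed.append(str(path))
    return sorted(changed)
```

Calling `convert_tree(".")` from the repository root would rewrite only the files that actually contain CRLF, which keeps the resulting diff small. Pruning `dirnames` in place is what avoids the slow walk through node_modules.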
This is a real project, this is an actual thing. Okay, we have something now, so let's go here. I am really bad at find commands, so, whoop. We probably don't need node_modules to be converted, right? "lucidrains posted about this, open source, from the GitHub." Very cool. I haven't seen him around yet. I know he has an implementation, but I haven't seen him around yet. Lobby: more people. Hello people, hi, very cool, thank you. Ah, I need the mouse. Okay, cool people will have to wait a tiny bit, and we're going to go into actual things to do. Excellent. All right, let's see. Okay, so we have a Discord bot. For dos2unix: "Try using... to do it in parallel." That would have been a good idea. Do we really need to do the node_modules? Probably not. I probably don't need the node_modules, and I'd have to exclude them somehow. Well, what was the issue exactly? npm... I can fix things in npm.

Okay, so here's a bug, and we don't know where the bug is. We call it to-do and make it a priority. This is actually very urgent, this is blocking people on Windows. Let's call it high; we're very careful with urgent. Hmm, this is still going through the node_modules. It's actually good, maybe, that I'm not doing it in parallel, because the machine is already breathing heavily. Okay, let me read the other messages first that are going on. Back end, okay, yep. Very good, just more people coming in. Hello, hi pyro, I've seen you around our Discord, that's very cool. Nice. "We'll get Open Assistant to every PC in 2024." Yes, if there are still PCs.

All right, so let's get into the GitHub issues. This is the meat. Yes, I am a manager now. Look at this: all issues, all from Open Assistant. Okay. "Let's follow up." "Yes, I can do
that." That's awesome, that means I don't have to do something. Thank you for doing it. "Try supervised fine-tuning." "Sure, I can do that." Look, I mean, that is perfect. Again, fixed, I don't need to look at this.

"Download transcripts of Khan Academy." Okay. We always look at whether stuff is categorized correctly, and this seems to be categorized correctly. "Will the Creative Commons ShareAlike license for videos be a problem? Might entangle the transcripts." Yep, that's something we have to pay attention to. I'm not sure how we should manage that; we should just keep it in mind. As soon as someone actually does something here, once we check in the code for this, we will also have to make sure that we're aware of the licenses. This person is working on it, but so far no one has done something, so I think that's fine. All right. These are my days now: I work for the startup during the day, and then I go home and I do that.

"Collapse texts that are too long in tasks." Yeah, with a small screen it may happen that the text... Proposed solution: have a max amount of characters, collapse the text if more is present. Well, sure. So this is a front-end issue. It's not part of the project yet; we have a project board if you want to go look. It's to-do, it's a medium priority I would say, it's a small task, and yes, it's indeed a good first issue for someone who wants to get into the project.

"Sounds like a normie job." I am a normie, I'm sorry. "You can transcribe videos using Whisper, but we are already complicating the project." Yes, there's a lot of data gathering going on that is not necessarily part of the MVP. The MVP is really just going to be to collect human prompts.

"Add REST API endpoints to manage users, to view and edit users, user IDs." Okay, that's by Andreas, that's very thorough. Some endpoints, implement data, okay, good. So this is back end. It's
probably not a good first issue because it needs a bit of internal knowledge.

"Did you check the new embedding from OpenAI?" I did. I got their email that they're now better and cheaper, so that's good. They must feel a bit of competitive pressure. "What does the REST in REST API mean?" You know, I knew that at some point. What does REST in REST API stand for? Representational State Transfer, good, James was faster. I once had a job interview when I was still studying. They had this interview day, and I thought I knew a lot, and then they caught me off guard. They were like, "What does ACID stand for in ACID transactions?" I'm like, oh.

What are we doing? We are being a manager over here. We go to the notifications, we look at a notification, then we decide what to do. "Harvest Stack Exchange Q&A data." Stack Exchange, after reading the discussion on how to build... "Keep in mind that the main goal is task diversity. We'd need to focus on that when scraping any source, but Stack Exchange in itself..." Okay, that's fairly good. There's no assignment yet. That could be a good first issue; you don't need to know too much about the project to investigate scraping data. That's a large task, I would say. Scraping always takes longer than one thinks.

"What are those noises?" It could be the fan of the MacBook. I'll try to sit further away. Let's see, do I have noise cancellation on here? I do have noise suppression. Or are there other noises?

"I'm running experiments for supervised fine-tuning for a 175B model with adapters from school. How should I organize the work? Should I join the supervised fine-tuning issue or another one?" I guess you can make your own if you do something very specific. I'm not sure if you can assign labels; otherwise I'll assign it the ML label. Currently a user called sotiris is gathering a bit what's happening on the ML side, because it's wild, the
ML side of this is wild. "I just joined, is there a TL;DR?" If you go to the repository linked above, there is all the information you need. "They meant the chewing noise." I'm chewing chocolate, yes. I need calories, I haven't eaten yet today.

"Create a UI for messages API endpoints: user, global message tree views." That is crazy, all these people doing this. Priority... it's probably a bit larger. Well, it depends who does it; it would be large for me. "Thank you. TL;DR: we are making Jarvis." Actually, we are making Jarvis, yes. "Will there be a recording of this live stream?" Hopefully, if I don't say something insanely stupid.

"Write a quick test suite." Well, we should probably not call this live stream "coding," because I'm not doing a bunch of coding. I'll do some; if I'm through the issues, we can do some coding. "Write the quick tests," yep, "quickly test the branch," yes, that's very good. We need some manual tests as well. To-do. For some reason testing is the thing that people don't immediately grab, unlike almost any other issue. No, actually, I'm wrong, there are definitely some people who take testing issues.

"Basic Redis setup." Very cool. Some merge conflicts, merging is blocked. Okay, wait, I need to do something here. "Do you think it's necessary to..." "Removed the back-end label, added Redis, linting, Prettier, removed local volume." Cool. All right, so I'll re-review this. Look at that.

"Why does OpenAI specifically mention that they are not accessing the web for information retrieval? That's obviously the next step." I don't know, but it seems like OpenAI is very concerned about negative PR. Maybe they are correct in being super worried, super paranoid about this, but they want to make sure that this thing doesn't ever do anything that's out of the ordinary or could land them in hot water. Now imagine this thing could actually surf the
web. You could probably steer it in a direction by making some kind of website, putting something in there, and instructing the thing so that it finds your website, or something like this. It's just a lot more dangerous than a hermetic system. But I agree, things will go there.

"Do you have any opinion on the S4 architecture?" I think it's a viable architecture if you need longer-range dependencies, but I think the dependencies themselves are not as, let's say, powerful; they're not as dynamic. So I would guess it's a bit weaker than attention in terms of the complexity of things that it can do, but it's better at incorporating longer-range dependencies. "Worried about the public perception," yes. "Check the Discord." I will later. I could be there all day, I need to batch it. But it's really for work coordination. You can be part of the project at any time; the GitHub is a great place, and so on.

Okay, so we have Redis. Good. Well, now this doesn't seem to work: Redis data, but there is no volume anymore. "Set up Redis," yeah. Here there's no volume. The Ansible probably doesn't work, but this one maybe. Let's check that out together. So this pulls up a Redis image, always restarting, which I guess is good. Port exposing, that's fine for now, we have a firewall. Checking, that's good. Command: server with a config file. We mount the config file at that place, and we have an insights host. That seems reasonable. This is the config file. Now for the Ansible we probably don't want this, but we're going to want the config file. Okay. "Could you remove the Ansible part so we can get this merged? I'll take care of it later." Or I'll take that myself and do it later, because I don't think it's going to work with the Ansible part. "Unable to run pre-commit." All right, we already dealt with that. That sucks.

"I think the project
needs better documentation. I can do that, create an issue on that, and ask people what I should write there." I think you can just go ahead and make a pull request, and we can discuss in the pull request what might be good, better, and so on. We have to see that it's a bit streamlined, right? It's not like any additional documentation is fine, because that also increases the noise and the amount of things people need to read. But in general I think more helpful documentation is better, so I think that will go pretty well.

"Accessing the web could be implemented in the training process." Of course it can, but we will have to do that in a second step. The first step is really to get this chat assistant going. "Google showed that they could use information retrieved from the web." Yes, absolutely, you can do that. "I see more negative reactions from people than I expected about AI in general. OpenAI's paranoia is like a valid concern." I guess, especially since the regulators are itching. The European Union is on this as we speak. You can always count on them to completely poop the party of every new thing that comes across. "Sam Altman made a comment, gradually, to not alarm." Maybe. Also really convenient that it means they don't release the model and make more money.

Okay. "Which bot permissions are required: document in the bot's README." That's a very good issue for people who want to... probably medium, because we already have our bot. Right. This never shrinks, does it? This list never shrinks. Okay. "We can stay with gzip JSON Lines files; we can convert into other formats easily." Okay, so I'll assign this person. Thank you very much for taking this on.

"What are you thinking about as the base model?" Depends. We are investigating multiple things, like T5 on the smaller side and like the
really big models on the larger side. For the really big models, the idea would be to then distill them. But the small models might also be helpful. I'm keeping one eye on something like CodeGen, or however that's called, coding models, because OpenAI said that their new davinci models are based on Codex models, which I find really surprising.

"I want to build some basic assistant. What are the best options, given that I'm limited to eight gigabytes?" Well, it's not going to be easy. There's no straightforward way to build an assistant right now, except if you want a chatbot for some very specific task. You can do that, but then you can also just program it. No worries, that's cool.

"... to the Discord bot README." Good. "Adding popover to flag text with labels." That's very cool. Fozzy is on it, ABD is on it, very cool, people are on it, nothing for me to do. Thank you very much. I will add it... oh no, this is a pull request. I usually add issues to the project board. "Clean up all ESLint warnings," PR 199.
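Distillation, named here as the plan for the very large models, trains a small student model to match a large teacher model's output distribution. The stream does not say which distillation objective the project would use, so the following is only a sketch of one common choice, a KL divergence between temperature-softened teacher and student distributions; the function names and the default temperature are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In an actual training loop this loss would be computed per token over the vocabulary and minimized with respect to the student's parameters only, usually alongside the ordinary next-token loss.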
ESLint warnings, cool. Let's assign. There's a branch, 199. Oh, nice. I am a secretary. This is actually cool, it's absolutely great that so many people want to contribute. I'm more than happy to be full-time secretary for this.

"I'm working on pulling transcripts. YouTube channel, can do this for Khan Academy. What channels do you want? What format?" I think people discussed this in the issue, maybe something like Karpathy did for the Lex podcasts. But I think raw text files would also be fine, since we're not really interested in coordinating it with the source material, like with the pictures. The same data set might be useful for a future project to train some sort of a video CLIP or something like this, but for us right now we just kind of want the text. So, up to you.

"You mentioned the plan with large models is to distill them. Is it imperative that we distill them, or do we consider offloading, distributed?" We can consider many things. Distilling means that you take a large model and you try to transfer its capabilities to a smaller model. So you use the larger one as the teacher model and the smaller one as the student model, and that usually works okay-ish. We can think of other things with the larger model, but I really would like this to result in something that people can run in some reasonable way, so that it's not hosted on some supercluster. Because once you host it on some supercluster, it also costs considerable money to do inference on it, and we can't just bleed money. It's very cool that people want to help here, and I'm pretty sure with some donations we could also go somewhere, but we can't just bleed money for inference. That means we'd have to start making it a paid service just to cover the cost, cover the maintenance, and so on, and we'll probably
have to hire people to keep it maintained, since it's now a paid service, so people actually expect quality and reliability, which makes the price even higher because you now have to pay these people. It just seems easier to make a model that you can push out to people as the model, and then they can find ways of running it. That's my opinion.

"There's a new... contrastive search." I've seen that, that's very cool. "You think our normie computers are up to it yet?" No. The first iteration of this will not run on a normie computer, but it will hopefully be in the ballpark, such that people with better-than-normie computers can take it, play with it, compress it, and make it run on more normie computers, the same as happened with Stable Diffusion.

"I played with the end-to-end demo, saw a bug on the page: submitted a rating twice." Very good bug report, thank you for catching this. Very nice. Okay. All right, I've requested a review again from Andreas, I hope that's going to happen at some point. "For users to link their Discord account after registering with email is quite a hassle." Yeah, I get it. "allowDangerousEmailAccountLinking is true." Doesn't React have some property that says dangerouslySetInnerHTML or something like this?

"What kind of VRAM would it be able to run on?" I'm not sure yet, we don't know yet. We're collecting the human data. "I'm looking at the high-level protocol architecture. Am I right to say the protocol would ultimately help in fine-tuning and training as well as the actual assistant tasks?" Sure, the protocol can help in the assistant tasks as well, but it's really for data collection for now. Also, it's too complicated. I had a brain fart when I made this, I thought all the front ends should... Okay, very cool, thank you for figuring this out. There's nothing to do here, I guess, until someone picks it up.

"Is it possible these algorithms
like DeepSparse on language models like ChatGPT would help lower the computational..." Yes, it would. I guess it would just take longer; you can trade off with some sort of offloading. Detox, yes, nice. Look at that, pre-commit ran, now we're good to go. A big part of my job is to yell at people to run pre-commit. It's a good tool, but you have to install it once in your repo.

"I added my essay... for data documentation..." Oh sorry, augmentation, it should be data augmentation. Oh, okay, it turns an essay into a set of instructions. "I also did some much-needed work on docs, added a file about data augmentation so anyone can check what that means and how they can contribute. I also moved the prompting guide to docs, as I think it should be there, which makes it easier to find. Added another notebook for essay revision." Very cool. Okay, does this have a label? I'll assign the label documentation, even though it also has notebooks. Actually, documentation and ML. Thank you.

All right, so where are we at? "Data augmentation is a technique we can use to get better data faster." I want that, I want better data faster. That's amazing. "Using machine learning models to analyze long data and compress it into instructions." That's actually smart. "Contribute: we can write a short Python script to use this model from Hugging Face to analyze the text." Examples of what you can do, and example implementations, very cool. To contribute instructions, that's a notebook. Essay revision, that's a notebook. Essay instructions is a notebook that takes an essay and generates instructions on how to generate that essay. "Will be very useful for data collecting for the model." I agree. Okay, that all seems reasonable. Cool. Perfect. I have no comments. Awesome, thank you very much, it's very productive. Okay, pre-commit is still failing.
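The essay-instructions idea described here (take an existing essay and derive the instructions that would have produced it, yielding instruction/response training pairs) can be sketched as follows. The notebook itself is not shown on stream, so everything below is an assumption for illustration: the function names, the record fields, and the `first_sentence` stand-in, which replaces the Hugging Face model a real pipeline would call to summarize each paragraph.

```python
from typing import Callable, Dict, List


def essay_to_examples(essay: str,
                      summarize: Callable[[str], str]) -> List[Dict[str, str]]:
    """Turn each paragraph of an essay into an (instruction, response) pair."""
    paragraphs = [p.strip() for p in essay.split("\n\n") if p.strip()]
    examples = []
    for paragraph in paragraphs:
        # The summary of the paragraph becomes the instruction that "asks for" it.
        instruction = "Write a paragraph that covers: " + summarize(paragraph)
        examples.append({"instruction": instruction, "response": paragraph})
    return examples


def first_sentence(text: str) -> str:
    """Stand-in summarizer (assumption): just return the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."
```

For example, `essay_to_examples(essay, first_sentence)` returns one record per paragraph. Passing the summarizer as a parameter keeps the sketch runnable without any model download while leaving room to plug in a real one.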
let's see. Better data faster... no, song by Daft Punk, yep. All right, prettier is not happy and the file fixer is not happy. Good. "Could you run pre-commit run --all-files to make the linter happy?" All right. Now, before, yes, we run something here. Let's see whether that ran. That's good to merge. Merging. Nice, perfect, amazing, that's a merged pull request, I love those.

"Use ChatGPT to make ChatGPT." That is the singularity, no?

Fixing linter warnings. Okay, I don't have to do something here, I think. Checks have passed, but code owner review is required. "Once checks have passed we can probably activate ESLint itself, not just next lint."

Supervised data, that's a big issue. Uh-huh, transcriptions, mm-hmm. Better-than-nothing model. "We can use knowledge demonstration as context for human annotators." Yes, that's correct. I'll read this issue later, because I don't really know what to do with it right now; it's a bit... not really an issue.

"Write a bot-to-backend communication test." "The issue could benefit from a contract test. You should test that each call satisfies the OpenAPI schema inputs; that way, changes to the input contract are automatically caught by this test." Yes. "Hard-coding the response model and endpoints for these tests creates more maintenance burden." Yeah. "To put up the backend you'll need the database and Redis."

"What would be the overhead of these tests? If I change the API in the backend and change the bot's API client accordingly, such that everything works, do I also have to change something in the contract test? If not, that would be really cool and definitely a good way to go. If yes, it's a bit more tricky, as it might mean big overhead." Also: "I think we could use the existing Docker Compose setup. There is a pytest plugin for this, and I think we already have code to create some sample data." Okay, it would be really cool. I don't want to discourage this; this seems like a very good way to go, to do one of these con... I've just never
heard of contract tests. But if they really work with the schema, then why not?

"Harvest Stack Exchange data from scraping or online dumps." That might prove useful. "Here's an example." Thank you. "Download transcripts of Khan Academy." Yes. "Working on this for another project, here's some code, modified code from... You'll also need to get all YouTube video IDs." Okay. "Google is constantly upgrading transcription; I suspect all the pins, it might be out of date quickly." Yes, that is very good input, thank you.

"Collapse texts that are too long." "That won't solve the problem: if you have a fixed text with text wrap in two lines there will be..." Okay. "I thought of something like overflow hidden with text-overflow: ellipsis or something like that." Tricky. "Character limit seems fine."

"What step are we on right now?" Step... I don't know. What's step three? Step one is steal underwear, step three is profit. I guess some of you might be too young to remember this one. "Anyone having issues with the Docker Compose?" What's the issue with the Docker Compose? "Yo, what's step two?"

"Supervised fine-tuning: check parameter-efficient fine-tuning for large models." "Based on Discord conversation, it might be that training adapters for much larger models would result in a better model than training a small model fully." That's correct. "How to check: we already have an issue for supervised training, so we can borrow a setup to test with two public models available." Yeah. "Take a big model, fine-tune all parameters; take a large model and train with the following methods; and then check how it affects the quality." If the large model is better, but it might be better to still... Alternatives, yes. It's working. Extreme quantization, very cool. Yeah, BLOOM is, as far as I can tell... Thank you very much for the... agree, the plan is...

We absolutely need to know how much the difference between small and large models is. In my limited view, OPT seems more suitable of the two,
both because of the license, and I also believe performance-wise it is a bit better. Correct me if I'm wrong. Nice. Yes, I agree with the rocket.

Okay, labels: ML, excellent. Projects: this one. Priority... given how much stuff I've assigned medium priority, I want to assign this low priority, which doesn't mean it's not to do, it's just that we don't necessarily need it for the first milestone. But I'll up the priority once it comes closer to being important.

"Isn't BLOOM a 175B-parameter model?" Yes, it's very big, like your mama.

Docker, ah okay, Docker Compose: "the Compose file is invalid", what, "service backend has neither an image nor a build..." Wait, wait, wait. Oh well, dos2unix ran and there are no changed files, so that issue must be something else. Wait, okay: docker compose up --build. All good, all good. Do we... no. Docker Compose, yes, it says "unknown flag", but... I think someone here... there's something with Docker Compose: run it without the dash, so docker compose instead of with the dash. Oh no, you ran it without the dash. Unknown parameter build, "unknown flag: build". The same person complained here? Oh no. "Anyone having issues with..." There's something wrong in Docker and Docker Compose. Well, at the danger of uttering the meme: but it works on my machine. But it does.

"The creator of Ark thinks he's not a good programmer." Well, what is a good programmer? If you get stuff done, then you're good for many things. Then there is the aspect of getting stuff done in ways that are future-proof, which makes it a bit better. And then there is also writing things in a way that is future-proof and compatible with how other people write things. That's kind of the next step, I would say, because writing code in a team is different, you have to make a few compromises.

Yes, okay, perfect. Yes, that is amazing. Has to be
different Docker and Docker Compose versions. Yeah, Docker Compose, as far as I know, is now included in Docker if you write it without the dash, like docker compose as two words. So I'm not sure what the problem is, but here at least it builds. What's my Docker version? "'compose' is not a docker command": yeah, it might be a bit old, or I'm not sure. "So I'm on 20.10.21 on a Mac, macOS." Thank you.

"Update the prompting guide: provide accurate, reliable information using credible sources and references as appropriate; avoid providing vague or incomplete responses." Very cool. Yes, that seems very reasonable, thank you. Okay, need to approve running pre-commit, and I'm going to guess pre-commit will yell, because it always does.

"Why need a team when you have large language models?" "We can hopefully compare the performance by testing both, the suggestion on... they also have..." Yes, agree. "This is what a typical tech manager looks like."

Okay, "unable to run, just did a fresh copy." Hmm, that sucks. From where? Hmm. "I think this is the problem: WSL is using the Windows npm. You can install..." Yes, this might actually be... this is a very nice analysis, because I'm pretty sure this means that in this file the line endings are maybe wrong.

"You think human beings are stochastic parrots?" Yeah, I think the most important thing that comes out of this whole language model research isn't necessarily how good the language models are, but that humans probably aren't doing much more than interpolating things they've seen before. So yeah, I'm fairly sure.

Okay, this is good, merge it. Nice, excellent, this is a pleasure. Okay, we have a merge commit. Now we can actually think of doing something. So, Andreas has sent around... so here in bot we can look at that. We can work, so let's go: work. "Complete the task." No, I don't want to type, I just want to work. "Check your DMs." What? Look at this, this is cool. "Rank initial prompt: rank the following tasks from best to worst." Okay, what do I do
now? I accept the task. "Please type your response here." Okay, wait: prompt one, prompt... I'm having trouble. "Could you clean my room please", that's the best prompt, so five is the best, definitely. Two is the next best one. I'll just go by length: one, three, and four then. Okay, does that work? "Task complete. Would you like another task?" Yes, of course I would like another task. "Rank user reply", and here it says "rank the following tasks". That's already a bug. "Rank the following tasks from best to worst." I can file a bug. I could just fix it, but I can also file a bug. Not a pull request, an issue, I want an issue. Yes, new issue.

"Is the RLHF system automated from Discord?" That will not... we collect the data with Discord and then we do the reinforcement learning somewhere.

"But each human mind has a character sheet with a different training history." That's true, yes, but it's still the case that humans probably just do that, right? They just interpolate.

"What's your take on binary feedback versus preference ranking?" Binary feedback is also preference ranking, but if you think of... wait, let me make this issue real quick. Task says "rank user replies" but description says "rank tasks". Okay, this is a bug, it's part of the bot. It's probably a good first issue, because I suppose it's not gonna be that hard to do. That is medium; it's not a blocker and it's tiny, but still, it's an issue. Good.

"Task timed out." What? Give me... come on, give me task, give me task. Okay, I should be able to write "work" here, no? Next task, maybe? No, it's just hanging. Okay, we can make an issue for that: "I should be able to ask the bot for work in the DMs. So far I can ask for work in the..." Actually no, because our main problem with the Discord bot is that we almost need you to go to the DMs to do the work, because otherwise it's too much noise, but we still want to keep the sort of community aspect alive. So if people
go in here and request work, at least other people see... hmm, no, they can't, because it's deleting my messages. Okay, well, then it makes sense. Okay: "I should be able to ask the bot for work in DMs. So far I have to go to the channel to ask for work, but then it transfers me to the DMs. If the task times out but I'd like another one, I have to go back to the channel. It would be more convenient if I could just also type 'work' in the DMs." Right, that doesn't work, no? Yeah, I didn't do it. Okay, all right, we'll assign it. To do... it's medium priority, small task. Nice, good.

"Why open VS Code when I can get someone else to do it?" Well, I watched a talk on managing open source, and it said essentially that even for the small stuff, try to make issues to get other people, because a small issue is also a chance for someone who wants to participate but doesn't feel like taking on a big task, right, "because then I'm a blocker if I don't figure it out", or maybe they feel they're not capable enough. Small tasks are really good for getting people on board who want to help, who want to do something, and just getting them on board and through the motions. Maybe they'll see a bit of the code, and once they've gone through it once, they'll feel more comfortable with the code base, or at least the part that they saw. They'll have to do a dev setup to test it out, maybe. So I think it's right to do that: even the small tasks, sort of give them away. "On it." All right.

Let's go back, let's make some work. "Check your DMs." Of course. "User reply: send the next message in the conversation as if you were the user." "Hi." Assistant: "Hey buddy, how can I serve you?" Okay, so this is a conversation.

"When do you think we can start data labeling?" Hopefully next week. It's a bit ambitious, but hopefully
next week. We are full steam, flamethrower open, working to get this up.

Yeah: "Send the next message in the conversation as if you were the user." "Hey buddy, how can I serve you?" Okay, so accept. "Please type your response here." "Where can I get pizza at this time of day?" "Task completed. Would you like another task?" Yes. Okay, "rank assistant reply: rank the following tasks from best..." It still says tasks. I would guess that should go into the same bug. Hopefully it also happens for reply too. Good, accept. I'm unsure. Oh, rank assistant reply: I'm not sure how to interpret this. We don't have the conversation. I think there's already an issue for that: we don't have the conversation that happened beforehand. We're supposed to rank these, but we don't know what the conversation before is, so how? Okay, let's just rank them like this. "Task completed. Would you like another task?" Yes. Rank initial prompt, we already did that. Oh no, this is different. Wait, hmm.

"Hey, just trying out the bot. I have already made a few issues on GitHub, just a question: is it possible..." No wait, that goes to backend. "The data that is used for the tasks is not correctly filtered. For example, I did a few ranking tasks and entered things like 'five', which is fine, but then I got more ranking tasks and I saw the string 'five' as one of the replies of the initial prompts..." Timed out, well, thank you. "...as one of the initial prompts I was supposed to rank, but just because I entered that, it's not an initial prompt or any session message. Is this a bug, or is the test data just random strings?" Attach the picture. This is in the backend channel, because it's probably a backend issue.

"Got Docker Compose to work." Nice, good job.

"At the end of 2023, Sam Altman has armies of capitalist walker bots fighting our General Yannic's bots, invoking all the arrays versus the tensor."

"What's your opinion on the forward-forward algorithm by Geoff Hinton?" I'm
halfway through the paper. I'm getting through papers really slowly because I'm working on this so much, but so far he's hedging a lot, right? He's saying, well, we've tried it on MNIST, it's just good for this and this and this, we'll see. Usually Hinton has ideas and small tests and so on, but a lot of things work as an idea and in small tests and then underwhelm when you scale them up. So it remains to be seen whether it's really something new and viable or not.

"Where do the weights come from?" We will train them, we are going to train them.

Okay: work, DMs. "Rank the following from best to worst." Okay, I mean, this generally seems to work, no? "Task completed." Okay, what if I enter... "Send the next message." "How can I help you?" "Go away." Okay, yes. Okay, what if I just do this: "invalid response". Nice. So let's try one that is not in there: two, four, five. "Invalid response", nice, good. Two, three: "invalid response". One, two, three, four, five: "task completed". Nice, good.

Wait, we are going to get the weights at the gym, obviously. "Where's the training money come from?" This might seem strange, but money is not that much of an issue for this project. There are enough people interested and enough organizations who might want to sponsor.

"Try an extra comma at the end." Comma, oh yes, good job, excellent idea. More... aha, accept. Hopefully it's a ranking task. Initial prompt: "ask the assistant", initial prompt. "Write an email to my landlord asking them to reduce rent." Good. Rank, nice. So this, like this... oh no wait, I need to accept. Okay, "invalid response". See? "It should say why it's invalid, in nice clear language." That might be a bit harder, because we probably validate it against some regex and it's hard to say, but yes, I agree, that might be UX. "What about negative one and zero?" Negative one, zero, negative one, one, zero, two: it works, it
works. "SELECT * FROM", and zero. See, duplicate numbers, we tried that, nope. "Will it be named ChadGPT?" Nice. What about one plus zero, two: "invalid response". One, comma, comma, two: "invalid response". Look, 0.5 and, no, 1.5 and 2: it works, it actually works. You can't fool it. Null, two: no, can't fool it. Zero, zero, two: got response. "DROP TABLE", let's go. Drop table: "invalid response". See, doesn't work. Zero, zero, one, two. This is AGI, you can't fool it. "Try a longer response." One, comma, two, response... 10e-1, two: no. We could just go look up what the code is, but so far, I think, yeah. One, two, three, four, five, seven, eight, nine: no, we tried longer ones, we did it through nine. You can't just put a regex into regex validation. No, we're good, we tried every possible thing.

Do you know this joke? There's this joke that a QA engineer comes into a bar and asks for one beer, asks for 99 beers, asks for 9999 beers, asks for Na
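The ranking inputs probed in the stream ("2,4,5", "1,,2", a trailing comma, "drop table") suggest what a strict validator for a ranking reply might look like. The sketch below is an assumption, not the bot's actual code: the stream only guesses that the bot validates against some regex, and it shows a few odd inputs being accepted. The function name `parse_ranking` and the comma-separated format are hypothetical. It accepts a reply only if it is a permutation of 1..n.

```python
from typing import List, Optional


def parse_ranking(reply: str, n_items: int) -> Optional[List[int]]:
    """Parse a reply like "5,2,1,3,4" into a ranking, or return None if invalid.

    A valid ranking mentions every item number from 1 to n_items exactly once.
    """
    parts = [p.strip() for p in reply.split(",")]
    if len(parts) != n_items:
        return None  # too few or too many entries (also catches a trailing comma)
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None  # rejects "", "-1", "0.5", "1+0", "null", "drop table"
    ranking = [int(p) for p in parts]
    if sorted(ranking) != list(range(1, n_items + 1)):
        return None  # rejects duplicates and out-of-range numbers
    return ranking


if __name__ == "__main__":
    for attempt in ["5,2,1,3,4", "2,4,5", "1,,2", "1,2,3,4,5,", "drop table"]:
        print(repr(attempt), "->", parse_ranking(attempt, 5))
```

Returning `None` keeps the sketch short; to address the "it should say why it's invalid" request from chat, each early return could instead carry a short reason string.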

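The contract-test suggestion discussed earlier in the stream (check that each bot-to-backend call satisfies the OpenAPI schema, so that contract changes are caught automatically) can be illustrated with a very small, dependency-free check. Everything named here is an assumption for illustration: `TASK_SCHEMA` is a made-up schema fragment, not the project's real API, and `contract_violations` handles only required fields and a few primitive types. A real setup would more likely validate against the generated OpenAPI document with a dedicated library.

```python
from typing import Any, Dict, List

# Hypothetical response schema, written in the style of an OpenAPI schema object.
TASK_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["id", "type"],
    "properties": {
        "id": {"type": "string"},
        "type": {"type": "string"},
        "prompts": {"type": "array"},
    },
}

# Mapping from schema type names to Python types (only the few used here).
_PY_TYPES = {"string": str, "array": list, "object": dict}


def contract_violations(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the payload conforms."""
    problems = []
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name in payload and not isinstance(payload[name], _PY_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


if __name__ == "__main__":
    print(contract_violations({"id": "t1", "type": "rank", "prompts": []}, TASK_SCHEMA))
    print(contract_violations({"id": 1}, TASK_SCHEMA))
```

Because the check is driven by the schema rather than by hard-coded expectations, changing the schema changes what the test enforces, which is the low-maintenance property the chat was asking about.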
Original Description

Chatting & Coding

This video teaches how to fine-tune and deploy large language models using various tools and techniques, with a focus on open-source collaboration and data protection. It covers topics such as retrieval augmented generation, diffusion models, and chat model development.

Key Takeaways

Fine-tune a 175B model with adapters
Scrape data from Stack Exchange
Build a UI for messages API endpoints and user global message tree views
Create a test suite
Do manual testing
Use FAISS with cosine similarity on 768-dim embeddings for sub-100ms retrieval
Run pre-commit for code management
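As a rough illustration of the cosine-similarity retrieval named in the takeaways, here is a brute-force sketch in plain Python. It deliberately does not use FAISS, and it makes no claim about the 768-dimension or sub-100ms figures; the function names `normalize` and `top_k` are hypothetical. With FAISS, a common way to get the same ranking is an inner-product index (`IndexFlatIP`) over L2-normalized vectors, since the inner product of unit vectors equals their cosine similarity.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(
    query: Sequence[float], corpus: Sequence[Sequence[float]], k: int = 3
) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) for the k corpus vectors closest to query."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(corpus):
        # Dot product of two unit vectors is their cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k([1.0, 0.1], corpus, k=2))
```

Brute force is linear in the corpus size, which is fine for small collections; an index library becomes worthwhile once the corpus is large enough that scanning every vector per query is too slow.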

💡 The video highlights the importance of fine-tuning and prompting for large language models, as well as the need for open-source collaboration and data protection.
