Finetuning a Creative Writing Coach in GPT-3 - Part 2

David Shapiro · Beginner · 🧠 Large Language Models · 4y ago

Skills: Fine-tuning LLMs 95% · LLM Foundations 90% · Prompt Craft 80% · Multimodal LLMs 70%

Key Takeaways

This video continues building a fine-tuned creative writing coach on GPT-3. It iterates on the prompt (first-person expert framing, explicit story markers, an instruction repeated at the end, and the compliment sandwich), generates feedback completions for 388 stories collected from r/WritingPrompts, and prepares the JSONL training data, with small Python scripts driving each step.

Full Transcript

What's up, everybody? David Shapiro here for another video. Before I get started, I wanted to show you some numbers. I'm still in awe; I don't understand why this is happening, but it is. View count is going up exponentially, watch time is going up exponentially, and subscribers are going up exponentially, so whatever I'm doing, you guys like it, and I'm going to keep doing it. Also, audience: let's see, where is it? Age and gender. Okay, so this is fascinating to me. It's roughly two-thirds male, one-third female. It used to be a little more balanced, but it's becoming slightly less balanced. Most of the audience is younger than me. I'm 36, and most of y'all are younger than me, so this is great; these are exactly the people I wanted to reach. Why? Because y'all are the next generation. Whatever I don't finish, you all have to pick up the torch and carry it.

Anyway, with that being said, we are going to finish our creative writing coach today. I started this because of an idea from Reddit, where someone said it would be really great to have a chatbot or something that provides professional feedback about tone and style, not just corrections to grammar. And I was like, got you covered. So let's do a quick review of what we did last time. Let me just go to the folder of our creative writing coach.

Okay, so the first thing: you need data when you're doing this kind of thing. You can generate synthetic data. You could ask GPT-3 to generate lots of fake stories, and you could even ask it to include errors. There's a problem with this, and I don't fully understand the math, but part of the problem is that these large language models tend to go towards the average. People on the community forum have noticed this: it's kind of
the average of what humans produce, but what GPT-3 produces is still slightly inhuman, so you're going to get much more variance, much more variability, when you get data from real humans. So I just wrote this quick script to download a bunch of stories from Reddit, because there is a subreddit called WritingPrompts. Here, let me just show you: reddit.com/r/WritingPrompts. There we go.

Let me show you what this looks like, in case I didn't show you last time. Basically, someone posts a writing prompt, and then all the top-level comments except for the first one are stories. They vary in length: some of them are just short poems, some of them are basically novellas. So this is a great place to get training data for this.

What I did was write a script that first gets all the top posts from that subreddit. This little string here says get the top posts for the last month, but put it in JSON format, so it's data. This is what it looks like. If you remove the .json part and it's just t=month, this says top of everything for the last month. I did that because the top ones are going to have more responses. You see there are 193 comments, 162 comments, but we only wanted the top-level comments. So we get the top posts of the last month, that's this page, and then from within each of those we get the comments. We get the first layer of children, and then we say for comment in comments, and we get the data and the body of that comment. We didn't recurse through, because we don't want all replies; we only wanted the top-level replies. It would get the bot comment, so we'd get rid of the bot, but it would get this one, which is a story, and it would get
this one, which is a story, and so on and so forth. So that's what we did. After cleaning it up, we downloaded 388 stories of varying lengths. I think I already deleted some of the ones that were too long; there were ones that were like 18 kilobytes. This one is 10,000 characters long, which is probably a little too long. Let's see how many tokens that turns out to be. So 10,000 characters is almost exactly 2,300 tokens. Okay, so that can fit in the current Davinci model, text-davinci-002, because the maximum length is 4,000 tokens. So it's roughly four to one, because 10,000 characters is 2,300 tokens. I'll update my ratio, because I had it as three to one.

Okay, so that's where we're at; that's what we did last time. The other thing that we did was I started working on the prompt. Let's see, we had prepare prompts, generate completions... I don't think I got this far yet. Prompts and, whoops, come back, slow down, and completions. Okay, we did try to generate some completions, but I didn't like them. That was the problem. So this was as far as we got last time.

Now that I'm up to speed, let me show you. Here's a short script that just generates the completions. I ran a few and I was not happy with it. The advice was good enough, but it tended to give it in a list, and I don't really like that. I want to see my feedback written more like a paragraph, so let's say "write one or two paragraphs."

Okay, so we're at the point of prompt engineering. What I've written here is "adopt the persona," because once you tell GPT-3 what it is, you say, you know, I am this. Adopt the persona of a professional creative writing editor. We can probably even shorten
this and simplify it and just say, I am a professional creative writing editor, read the following story and provide me... Okay, because I wanted to write "you." So, yeah, actually, I think we do have to say "adopt the persona of," because another problem with the feedback it gave is that it talks about the author in the third person. But I realized that if you're using this tool, you want it to be giving feedback to you. It's not talking about your work in an abstract, third-party way; you want it to be saying, you did this, do this better. So I want it to give feedback to me, and you have to be very mindful of point of view. As a fiction writer, I'm very aware of point of view, but understanding how large language models handle point of view is also really critical for developing ACOGs, artificial cognitive entities. We'll get into that in another video, but point of view: critical.

So, "adopt the persona of a professional creative writing editor." The model now thinks, okay, I am a creative writing editor; that's its point of view. "Read the following story and provide me detailed feedback." So it says this is what I am, and I'm talking to you, so you're framing the whole conversation. "To improve the prose, hold me to the highest literary standards. Your feedback should be open-ended and include examples or suggestions."

And then I added some framing here. The reason I did this is because sometimes it got confused and would just continue writing the story. So I add "story," like the story starts here, then you add the story, and then "end story." And then I added a bunch of white space. Actually, let's just do two. If you use just a single line of white space, sometimes that just looks like a paragraph break, but if you do two or three, that clearly signifies a break. In your brain it says, oh, this is a whole new
section, but GPT-3 also learns that that means it's a new section. From the large language model's perspective, it doesn't actually see space; it just sees characters. So it sees \n\n\n, or sometimes it's going to be \r\n\r\n, where \r is carriage return, which brings the cursor back to the beginning of the line, and \n is newline. If it sees three \n's, it says, okay, this is a whole break, whereas if it just sees two, it's like, oh, that's just a paragraph break. In this case, when I have three vertical white spaces, that's actually four newlines total. So that tells it, okay, this is a new section, what am I going to do?

Okay, so then the final instruction. One thing that I found, especially for these instruct-series prompts, is that if you give it the instructions at the beginning and then remind it of the instructions at the end, you tend to get really good results. You'll notice that this is my standard format: I give it the framing, this is what you're about to read, this is a story, this is what I want you to do. What that does is prime the model. In case you don't know this about GPT-3 and large language models, they have an internal state, and that internal state is cued up by the prompt. That internal state is represented by an embedding, and an embedding is a particular kind of vector: it's a long series of numbers. So what you're doing is charging it up; you're priming it to have the correct internal embedding. It's just the same as what happens to a human. If I say, imagine that you're making a peanut butter and jelly sandwich, I gave you instructions, and now you have an internal state in your head. And so GPT-3 is no different, so
we have to have a theory of mind to understand GPT-3's mind, right? And because humans have a theory of mind, we can anthropomorphize large language models. Imagine that you just randomly grab someone off the street and you're giving them instructions: that's how you have to write GPT-3 prompts.

Okay, so now that I've explained why it looks the way it does, let's give this a quick test. Let's delete these. They're already saved in GitHub, so I don't mind deleting them, and they weren't the best anyway. So then, generate completions. Oh, and one other thing I need to share: what I've started doing is breaking up the process into smaller and smaller steps. I preload the prompts so that I can just feed them into GPT-3 one at a time later. But then we also need to prepare the prompts. And then I added this little bit here. One thing I noticed was that if the story was too long and got cut off, my creative writing coach didn't realize that. It said, this story ended abruptly, and I'm like, oh yeah, that's because it got cut off. So if we add something to say "story truncated due to length," the creative writing editor should know, oh, okay, I didn't get the whole story, but what I did get is good.

Okay, so let's run that real quick. cd, and we're going to the creative writing coach folder, then python prepare prompts. This just runs in a second or two. Let's go to our prompts folder; you can see these were just updated. Oh, sorry, my dog's outside, I need to let him in. Be right back.

Okay, and we're back. Sorry about that. So, we just prepared the prompts. Let me close some of these extraneous tabs. Go away, cancel, do not save. Okay, too much noise. All right, prompts. So now we've updated the prompt
where it says adopt the persona of a professional creative writing editor, and so on and so forth. "Your feedback should be open-ended and include examples or suggestions." And then: "Now give me detailed professional editor feedback with suggestions and examples to improve prose. Write one or two paragraphs."

So I made a joke in the YouTube comments that basically you have to use neuro-linguistic programming with GPT-3. Neuro-linguistic programming was invented in, what, the 60s by a psychiatrist, and it was basically the idea that the way you speak and the way you frame things to yourself will change the way your brain works. In terms of humans, it has been largely discredited. The most popular proponent of neuro-linguistic programming is Tony Robbins, who has been on the TED stage, so make of that what you will. But basically, if you say "I am an expert" rather than "I am an idiot," however you frame something might change your cognition. The funny thing is, how you frame something absolutely changes the way GPT-3 thinks about it. If I open this with "I am an idiot," GPT-3 will act like an idiot. If you say "I am an expert, I am a creative writing expert," it will act like it.

Actually, I wonder if we should update the prompt again, now that I'm talking through this. I'm telling it what to do. It's hard, because it's a matter of who is "me" if you use "I" and "me." I'm telling it, give me feedback. So I'm telling it what it is, but I'm wondering if it would be better as: I need to read the following and give you feedback. I think there's a better way of framing this. Let's say: I am a creative writing expert. I have been a professional editor for 30 years. I am going to read the following short story and provide you detailed feedback to improve your prose. I will hold you to the highest literary
standards, and my feedback will be open-ended and include examples or suggestions.

Okay, so by framing it this way, remember, GPT-3 was trained on writing, so I'm basically just saying, imagine that a helpful Redditor is writing this. If a Redditor introduces himself, I will read your work and I will provide you feedback, this is what they might say. So I'm wondering if framing it this way will be even better. Also, it is adopting an identity. I'm not telling it to adopt an identity; just through neuro-linguistic programming, this is what I am, this is what I have done. Okay, let's see: I will now give you one or two paragraphs of detailed feedback to improve your prose and style. Okay, this is cool. Let's see how this works. I've never written a prompt quite like this, but now that I'm talking through it, I think this will be pretty good.

Okay, so let's do the prepare prompts again, and let's make sure they got updated with the new format. "I am a creative writing expert." Excellent. "I will now give you one or two paragraphs of detailed feedback to improve your prose and style." Okay, so it should talk to you, right? No, don't... go away. Whoops, don't save. Okay, so now that we've got that, we're going to do python generate. We're going to generate some completions and see how it goes. I'll just pop them open as it's running; I'll let it generate like five and then I'll cancel it.

Okay, it's still doing the list sometimes. One thing about the instruct series is that they're very finely crafted to do lists; they're really good at generating lists. I really wish it wouldn't do that, though, because it looks like two of these three were lists. Come on, I said generate paragraphs. Okay, this one is good: one issue I see in this story. So this is the feedback that it would give you. To the dude or girl who asked
for this on Reddit, this is what we're aiming for. One issue I see in this story is that there is a lot of telling rather than showing. For example, in the second paragraph, the author tells us that Kate is considering abandoning her inheritance, rather than showing us through her actions or thoughts. In the third paragraph, the author tells us that the factory is full of shadows, rather than showing us through description. Oh, that's good. That's really good. I'm a member of a writing feedback group, and this is perfect. Maybe what I'll do is take my group members' stories, plug them into this, and just read this to them. Another issue is that the story is somewhat choppy and disjointed. There is a lot of description, but it is often abrupt and does not flow smoothly from one sentence to the next. Finally, I would recommend using more active and precise language throughout the story. For example, rather than saying Kate fumbled with her phone for a flashlight, the author could say Kate searched her phone for a flashlight. Okay. Either way, this would make the writing more lively and engaging. That's possible.

Okay. You have a great opening sentence that really draws the reader in. I love the description of the statue turning to rust; it's really evocative and sets the scene well. The dialogue between the entity and Cyrus is great; it's chilling and really gets the reader invested in the story. The final paragraph is a great way to end the story. Okay, so we want it to be more critique. That's good, you do want to provide positive feedback, but we need to do a little more prompt engineering.

So let's open the prompt. I will now give you one or two paragraphs of detailed feedback. I wonder if it's the "detailed feedback." Let's change "detailed" to "critical," because I suspect that keyword "detailed" is the problem: when you ask for details, it's like, oh, I need to list it out.
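
As a rough illustration of the prompt layout discussed so far (first-person expert framing up front, the story between explicit markers, a truncation note, and the instruction repeated at the end), here is a minimal Python sketch. It is not the script from the video: the names `build_prompt`, `PERSONA`, `FINAL_INSTRUCTION`, and `MAX_STORY_CHARS`, the exact marker strings, and the 6,000-character cap are assumptions for illustration, and the prompt wording is paraphrased from the transcript.

```python
# Hypothetical prompt builder; names, markers, and the cap are illustrative assumptions.
MAX_STORY_CHARS = 6000  # assumed limit, chosen to keep each prompt to a few kilobytes

PERSONA = (
    "I am a creative writing expert. I have been a professional editor for 30 years. "
    "I am going to read the following short story and provide you detailed feedback "
    "to improve your prose. I will hold you to the highest literary standards, and my "
    "feedback will be open-ended and include examples or suggestions."
)

FINAL_INSTRUCTION = (
    "I will now give you one or two paragraphs of critical feedback "
    "to improve your prose and style."
)


def build_prompt(story: str, max_chars: int = MAX_STORY_CHARS) -> str:
    """Wrap a story between the persona framing and the repeated final instruction."""
    story = story.strip()
    truncated = len(story) > max_chars
    if truncated:
        story = story[:max_chars].rstrip()

    parts = [PERSONA, "STORY:", story]
    if truncated:
        # Tell the model the story was cut off so it does not critique an "abrupt ending".
        parts.append("(STORY TRUNCATED DUE TO LENGTH)")
    parts.append("END STORY")
    parts.append(FINAL_INSTRUCTION)

    # Three newlines between parts read as a section break rather than a paragraph break.
    return "\n\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Kate fumbled with her phone for a flashlight."))
```

Keeping the wording in two constants means a change such as swapping "detailed" for "critical" is a one-line edit, after which every prompt file can be regenerated in one pass.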
So if I'm saying one or two paragraphs of critical feedback to improve your prose and style, you know, let's do this. I also might switch to an older model, because the original instruct beta tends to be a little better with creativity. So let's try this, and if it does the lists again, we'll go back to instruct beta and see how that works. Okay, wait, wait, wait, cancel. I need to do the prepare prompts. cd creative writing coach, python prepare prompts. Check to make sure the prompts look right, delete the completions. "I will provide you critical feedback." Yes, critical. Okay, cool. And then, oops, python generate.

Okay, it's still doing lists. Let's give it a couple more completions. It is super fixated on doing lists, but the feedback is much more detailed. It seems like it's kind of bouncing back: it's still giving me numbered lists, but they seem to be in more complete paragraphs. So let's take a look at these. Whoops, completions; sorry, I was like, what am I looking for? Okay, so that's a paragraph, that's a paragraph, that's a... okay, so these are just numbered paragraphs. It's kind of gross, but we'll go with it. And this one about Kate: be vivid and detailed, show us, show us, show us. So this feedback is very consistent for this particular story.

Oh, I thought of something else. One thing that you're supposed to do with feedback, especially creative writing feedback, because it's so personal, is the compliment sandwich. You're supposed to open with a compliment and close with a compliment, which is basically: this is what you did well, then you provide the harsh part in the middle, and then you say, but also, I really liked it. So let's see if we can get that. Provide you critical feedback... "I will also commend you where you did well." That should be good. I will now give you one or
two paragraphs of critical feedback to improve your prose and style. I will also use the compliment sandwich method of feedback. Okay, so let's see if it can follow that. All right, so we do cd, whoops, not that one, python prepare prompts, python generate completions. First let's go delete the old ones and clear those out. It would overwrite them, because all it's doing is giving the completion the same file name as the prompt, but I want to delete them so I don't accidentally get confused. All right, make sure of the prompt: "I will also commend you where you did well." Okay, compliment sandwich, because we want this to be good. All right, generate completions.

While this is running: a cardinal rule of thumb... oh, that's good, that looks good. Okay, a cardinal rule of thumb with automation is, yes, it might feel tedious to get into the details to make sure it's perfect, but once you do automation right (okay, we've got a few), it doesn't matter if you're doing it 100 times or 10 million times. If you've got your automation correct, it's infinitely reproducible. That's why I'm so meticulous in these videos, and that's why most of the time is just laying the groundwork, because once you get the right prompts and the right data, that's most of the battle won. After that, you just let it run, and then you fine-tune your model, and it's good.

Okay, so I think we've nailed it. I think we got three for three. This was the prompt, so we can close that one. Let's see the first one. For example, the sentence "still in the high of graduating together with masters from" could be rewritten as "still riding high." The story has good potential, but there are a few areas that could be improved. For example, the dialogue could be more natural, less stiff. Additionally, the story could be
proofread for grammar errors. Overall, this is a strong story with good potential. That's not too bad.

Very strong, immediately sets a tense and suspenseful tone. I love the descriptions. However, I felt the story lost a bit of momentum in the middle, when Kate is exploring the factory. It might help to focus on one or two key images or scenes, rather than trying to describe the whole factory in detail. The ending of the story is very effective, and the image of the door slamming shut is particularly chilling. I think you did a great job building suspense and creating a sense of unease in the reader. Yes. So the reason that you also want to commend an author for what they did well is because it's not always obvious what they did well. When you tell someone, you did this well, it tells them: keep that, don't throw that out, but fix this other thing. So you have to label the feedback as both good and bad.

The first thing I noticed was that your story has excellent potential, but there are a few areas that could use improvement. For example, your use of description could be more concise and vivid. In particular, I would recommend using more concrete images and specific details to bring your setting and characters to life. Additionally, your dialogue could be strengthened by adding more natural-sounding conversation and making sure that each character's voice is distinct. That's a big problem that a lot of authors have, including myself. Overall, I think that if you focus on these areas, your stories will be even more engaging and enjoyable to read. Perfect.

Okay, I think this is good. We're three for three. I like this. So let me just do a quick git commit so we can preserve everything we've got here. git status, git add all, git commit -am "really in love with this prompt and results, running the full thing now." Okay, git push. And now let's do python generate completions, so we'll let that run in the
background. There's another script that I need to write, which is actually preparing the JSON, or rather I need to fix it, so I'll do this while the rest is running. So, for file in files. I actually got this wrong: I don't need the prompts here, I need the stories. So we'll have the story and the completion. Let's change this to story equals, and open the file from stories. So we're going to get the story, and then the prompt is going to be the story plus newline, newline, and then what I usually do is put the demarcation tag in all caps, because then it's very obvious that this is the end: PROFESSIONAL FEEDBACK. And then add a space, so the completion will be a space plus the completion. I think that should be it. creativewritingcoach.jsonl. Okay, let's make sure this is running. Yes, good.

And then the last thing we're going to need is the actual fine-tune script, so let me copy my fine-tune script from another project. The fine-tune .py file... whoops, not recursive summarizer; creative writing coach. Then we'll edit this with Notepad++, and we're changing the name to creativewritingcoach.jsonl. Okay, so that should be good.

This is going to be a lot of data. Also, one thing that might happen is that this bombs on one of the prompts. It shouldn't, because I have the prompt preparation limiting the length. If you look here in the prompts, the maximum size is six kilobytes, which is 6,000 characters, so you can see that it is very much constrained. One of the biggest risks is that if a prompt is too long, that can cause it to fail. What I don't have in my script right now, and I should add this, is a way to resume, because if I get a bad output or a bad response, this script will bomb in the middle and
then I'm going to need to write something to just skip the ones that are already done, because you don't want to regenerate everything. Well, no, I do have a retry thing in the GPT-3 completion function. Where is it? In generate completions, right here, where I have while True, try and except. So this will catch some errors and it won't bomb, but if the prompt is just too long, it'll fail five times in a row and then exit that loop. Also, if there are any formatting issues that cause errors in file handling, that will also cause it to bomb. But so far so good. We've got 388 stories to do, and we're sitting pretty at 19, so this is going to take a little while to finish. I'm going to pause the video, and we'll get back once it's time to do the data prep and fine-tuning.

Okay, not all is well in the valley. I'm watching it run, and I noticed that some of the completions are big. Most of them are one kilobyte. You take a look at one: your story has a lot of potential, but there are some areas that could use some improvement, etcetera, etcetera. That's fine. But then let's look at the bigger ones. The first one that's two kilobytes says, the compliment sandwich is a method of blah blah, the following constructive... Okay, I didn't ask it to tell me what the compliment sandwich was. So in this case, all we have to do is just delete what we don't want to see. That's fine. This one: the opening of your story, that's fine. First, I will commend you on something you did well in the story. So it's like, yeah, we already did that.

I think I know what I can do, though. I think if I change the prompt... feedback... so let's do that. That should fix it, because now it's just saying, okay, this is what I'm going to do: feedback, feedback time. And the reason that I can do that, I've done this deliberately, is that in my scripts I have it so, see, for
each file it opens the prompt... oh wait, no, that's not going to work, right? Because I separated the process. Darn. Oh, actually, I know what I can do. I updated that, so we can get sneaky: creative writing coach, python prepare prompts. So now all the prompts are going to be updated behind the scenes. There we go, now that should be fixed. If you're in the middle of a long run and you see some things, you don't necessarily want to interrupt it. If you only have five or ten things that you want to fix before it finishes doing 300, you don't have to stop and restart.

So, got that fixed. All I did was do this for the prompt, because what it did was cue off of this and just continue explaining. With the colon, it didn't realize, okay, feedback starts here. But now I've made it explicit: feedback starts here, this is what I'm going to do, feedback. Okay, so get rid of that. This is an example of a good one: you did a good job of setting the scene. Okay, one suggestion. The opening of your story is strong, that's good. First, I want to commend you. Okay, so those are ones that I had already fixed.

So let's go back to completions and sort by size, and open all of them that are two kilobytes or larger, because those are the ones that are more likely to be problematic. The opening of your story is good. Okay, that's fine. The opening is strong, that's good. It added some dashes; don't need that. Praxis knew... okay, so this one is an example of what I said, where it just continued the story for some reason. I have no idea why that happened; GPT-3 just came up with its own idea. So these ones, you know, like, Praxis knew, without even the smallest nagging doubt. The former heart of industry seems so fragile in the dark. Oh, okay, wait. This is a great opening line: the metal once revered as the backbone of
American society. This is great description in the setting. Okay, this is copying it too much. I don't want it to just call each thing out. What it's probably doing here is copying the style of Reddit, where you would quote something and respond to it directly. We don't want that style, though, so we're going to hold this one open, because we're going to change that. The treaty was a fragile one. Yep, and this is just repeating the story back to us; this one is the story, written again. So this is also the quoting thing. Positive: I like the opening. Yep, okay, so we don't want this file. The compliment sandwich... that's fine, so we'll just delete this. The opening paragraph is very evocative, it sets the scene well; overall, this is a strong story, but with a few revisions it could be even better. Opening scene is very effective. Okay. Opening paragraph is well written in general. Okay, I don't like the format of this one. I guess we can close this, since this is good. That's right, I need to close them if they're correct. Your opening paragraph is good; overall, interesting characters. So basically I'm just going to keep the ones that need to be repeated open. That one's good, and that one's good. Okay, so we've got one, two, three, four, five, six, seven that aren't good. When you have this much data, you often don't need to fix them; you just delete the bad ones. So what I'll do is wait for this to be completely finished. We're at 107 out of 388.

Let's see how much this is costing, because some of you have questions about token cost. So each of these completions... 25 cents, 20 cents, 22, 36. This is actually going to be pretty expensive. Okay, that's fine. How much am I at already? Ow. Oh, this is going to be a really expensive job. This is why I was waiting. I started this project a while ago, and I was like, I'm about out of tokens for the month, so I'm going to wait
till June. So we're on June 1st, and yeah, I'm burning tokens. So on average, this is about the first, what is that, 90 or so. So 90 cost $6.23, so 388 divided by 90 times $6.23: this is going to be about $27 just to generate the fine-tuning data. This is another reason why I truncated it on length, because I don't want to spend that much money. But it'll be good, we'll get good results this way. Also, let's see, a 1.2-megabyte fine-tuning job was like $40. This is probably going to be bigger, so this is going to hurt. How much is this already? Oh, that's not so bad. Okay, maybe this will be a smaller fine-tuning job. The novel writing one was killer, because there were 200 samples and each prompt was really long. Let me show you how big that was. AutoMuse: the data here is 1.3 megabytes total. That was on, let's see, 5/23 was when it was last edited, so let's go to June and I can show you how much that was. I'm just doing this while it's running, answering questions that you guys have had. Let's see, 5/23, come on, let that load. Let's see how much we have now. May 23, fine-tuning, one request. So the novel writer at 1.3 megabytes was a $38.61 fine-tune job. That's just so that you know how much this costs.

Let's see how much data we're at right now for the creative writing coach. We're at 128 items, and the size is 115 kilobytes. Okay, so that's roughly a third. How many files did I have? 388 divided by 128, so yeah, almost exactly a third. Times 115 kilobytes, so that'll be about 348 kilobytes total if it averages out. And then this one was like 1,300 kilobytes, so 1,300... actually no, crap, 348 divided by 1,300, so this is just over a quarter of the size. We multiply that by, what was it, $38.61, so we'll expect this fine-tuned job
to be about $10. So generating the training data is very expensive, but hopefully the fine-tune job will be much cheaper. I could be wrong about all this; we'll see how it turns out. Anyways, we're about a third of the way done, so I'll go ahead and pause the video again. Do a quick refresh. Yeah, so you see how there haven't been too many more additional large completions. Yep, "You have a great imagination," good. I don't like that one, I don't like that one. So what we'll do is, some of these are salvageable, we just need to clean them up a little bit, but then once we do that, we'll delete the ones that we don't like, just because, again, we've got a plethora of data. If something's not good, just delete it, because we want to have that consistency. One thing that happens with fine-tuning is that you can kind of smooth out rough edges, because it'll take the average. If you have a handful of aberrations in the fine-tuning data, it's okay, which is why I often don't check the fine-tuning data. But in this case I want to show you that this is one thing you can do to clean it up and get more consistent responses. Okay, I'm going to pause the video now and we'll be back in just a second.

Okay, we're back. I chickened out because this was getting expensive, but we do have 202 completions. I've already taken the liberty of deleting the ones that were too big, too long, or didn't follow the correct format. If you're an expert in fine-tuning, or if you're familiar with it, you might be like, "Dave, if you're going to use one prompt to generate all of your data, what's the point of fine-tuning?" The point of fine-tuning is that you can look for those aberrations and remove them and clean up the formatting, because the purpose of fine-tuning is to get very consistent results. Fine-tuning usually reduces creativity, but it
increases consistency. You can also embed multiple prompts: in other examples of fine-tuning experiments, you'll notice that I use several kinds of data or several kinds of prompts and put them all into a single dataset. So fine-tuning can allow you to combine different kinds of tasks and different kinds of data into a single model, or it can allow you to get very consistent results, so essentially you get the same behavior every time. Yep, so we've got 202 examples. The biggest one is three kilobytes; most of them are one kilobyte. So let's just grab the top several. Okay, "Your opening paragraph is intriguing," so that's fine. "First, I want to commend you," great. All right, we'll remove that because it doesn't need the bar. "The first thing I noticed about your story is that you have a lot of interesting ideas," great. "First thing," "The opening sentence is great," "The opening is very strong," "The opening is well written," "Let me say I enjoyed reading your story," "very strong," "effective in setting the scene," "well written and engaging," "opening paragraph is evocative." Remove this. "Overall, this is an interesting story." "The start of the story is very strong." "Your story has potential." "First thing I noticed," "present tense throughout," okay, cool. So those all look good. Let's get the next chunk.

So basically what I'm doing is auditing the data. This is a technique in data science where you don't look at every single one of them; you just eyeball a few of them to make sure that they look fine. We're removing any artifacts that we don't want, and again, what this will do is fine-tune our model to be very consistent. Okay, "The opening of your story is very strong," "Your story is well plotted," et cetera, et cetera, "You have a lot of potential." All right, so I still found a couple. So basically what I'm going to do is keep going through these chunks until I don't find any that need correction, and
it looks like maybe many of them do. "The opening," okay, "The beginning," "Your story," "Your story," "Firstly." What I might do is just add a find and replace for this artifact, because it seems like it's going to continue popping up. So what I'll do is a find and replace to remove that when I do the prepare data. Yeah, it's just going to keep popping up. Okay, so here's how we handle that. If there's a consistent artifact like that, when you have the format JSONL script, what you do is you say, okay, for this, and then the completion, so we'll say the completion equals open file dot replace, and then we'll add that, and we'll say replace that with nothing. And then we'll do a, what was it, strip, yeah, so that will remove any excess whitespace. So that should clean that up. That should be good, and I think we're ready to run this. So let me jump back over here and we'll do python... oh yeah, here's what we need to do: for file in files. So basically, here's what's going to happen. We're starting with the stories. Actually, no, we should start the other way; we should start with completions, because the completions are a subset of the stories, and so we need to match a completion back to its story. So we're going to get the list of completions. We'll say completion equals this, and then story equals, oh yeah, we need to move this up, it's going to be stories and the file. Okay. Since we've deleted some of the completions, and the completions are a subset of stories, we know that the story is going to be there. But if we enumerate all the stories, there's no guarantee the completion will be there. So this should be right; it should not bomb out. One rule of thumb is you shouldn't use try/except in Python to compensate for bad code or bad data. I use it because the API might be unreliable, right? But if you do a try/except here,
you might end up with an entire block of bad data. I've done that before. One rule of thumb is that errors are there for a reason and you want to see them. If I did this wrong, if I hadn't swapped completions and stories, then it would error out, and I want to see that, because I don't want to make assumptions about the quality of the data. Okay, so format JSONL. Okay, it did not bomb, so that's good. Let's go and look at our creative writing coach JSONL. It's 800 kilobytes, so if I had let this finish, where it would have been almost twice as big, this would have been like a $60 fine-tune. Why did it say empty inside? I wonder if that was the story. Okay, so here's what we're going to do. There's the prompt; it just starts with the story, that's fine. And then you see right here at the end: new line, new line, "professional feedback," then the completion, "The opening of your story is very strong." That's exactly what we want to see. Prompt: "A girls' trip. A trip to remember. That was an understatement." Okay, so we go here, and then let's find "professional feedback"; that'll make it easier to see where the demarc is. Okay: "The beginning of the story is very promising, and I like the idea of two friends going on a road trip. However, the story quickly loses momentum in the middle." Great. Find next: yep, "You have a great eye for detail," et cetera, et cetera. "The way you use rust as a metaphor for the decay of civilization," excellent.

Oh yeah, so the fact that GPT-3 can understand metaphors is great, because neuroscientists don't even understand how human brains process metaphors. So we've been able to recreate a quirk of biological intelligence, and we don't even know how it works biologically, and we don't know how the machine has been able to recreate it. I just wanted to point that out for people who are suspicious of what GPT-3 is capable of: it's smarter than you think. Okay, the fine-tuning data looks good, so
I'm going to go ahead and run the fine-tune job, and then we'll test it and call it a day. Let's see, python fine... do I not have the fine-tune script in here? I copied the wrong one, I guess. Oh, I know what I did: I copied the wrong one from AutoMuse. There we go, this should be the correct one. Where did I save it? Maybe I accidentally deleted it. Okay, so we'll get creativewritingcoach.jsonl, update the file name, update the tag. Okay, that should be good. Python fine-tune: uploaded, pending. Okay, one last pause, and then we will come back once this job is ready.

We're back. It's finally done. Let me show you how much this cost, because I know you guys frequently ask this. In total it was 252 requests; some were deleted, some were testing, and we ended up with 202. The fine-tuning file was 800 kilobytes, and that was $24. Now, moment of truth, let's see how it went. Go back to the playground, and let's go ahead and get this model loaded. It should be creative writing coach, there we go. Then let's go to WritingPrompts and just grab one off the top of the stack, because I'm not going to write one and I'm not going to ask GPT-3 to write one. It would be a little perverse if we asked GPT-3 to write a story and then critique itself, although you can do that; GPT-3 can be adversarial against itself. Okay, so we'll give it the story, which is 800 tokens, and then, let's see, what was the demarc that I used? I think it was like "professional feedback" or something, so let me look. Yeah, "professional feedback." So we add some new lines at the end and then we let it go. This might take a second to load, because with fine-tuned models OpenAI sometimes takes a minute to load the model into its brain. Waiting, waiting, and drink some tea. All right, I'm going to pause it until this finishes loading; not going to make you wait. Okay, success. It says the opening of the story is very strong, you immediately set the scene and introduce the conflict, I especially like the way
you describe the war maiden and the demon. The war maiden is very brave and determined, while the demon is sly and manipulative. However, I feel like the middle of the story drags a bit. The dialogue between the war maiden and the demon feels a bit stilted and unnatural. You also introduce a lot of new characters, which makes it difficult to follow the plot. The ending of the story is also very strong. The twist is unexpected and very clever. Overall, I think you have a lot of potential as a writer. With a bit of work, I think you could produce some really great work. Keep up the good work.

And then it started confabulating. Fine-tuned models often do this, and I'm not quite sure why, where it just starts repeating itself at the end. The medical fine-tuned bot that I did, the reduced-confabulation one, did the same exact thing, where in one of the examples it just started repeating "HIV" over and over, like it really wanted you to know about HIV. So it might be that what we need to do is add an end-of-text token and then use that as a stop with these fine-tuned models. I'm not sure. It also might get better with more data, because I'm using the bare minimum, 200 samples. So anyways, it worked. I hope you liked this video. Like and subscribe, and tell a friend.
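
The data-preparation step described in the transcript (start from the surviving completions, look up each one's story, strip a recurring artifact, and write prompt/completion pairs) can be sketched roughly as below. This is an illustration and not the author's script: the folder layout, the `.txt` extension, the exact demarcation string, the artifact string, and the `STOP` token are all assumptions. Appending a stop token to every completion is one way to try the end-of-text idea floated at the end of the video.

```python
import json
from pathlib import Path

# All of these values are hypothetical placeholders, not taken from the video.
DEMARC = "\n\nPROFESSIONAL FEEDBACK:"  # assumed form of the prompt/completion separator
ARTIFACT = "---"                       # assumed recurring artifact to strip
STOP = " END"                          # assumed end-of-text marker


def build_records(stories_dir, completions_dir, artifact=ARTIFACT, stop=STOP):
    """Pair each surviving completion with its story (completions are a subset)."""
    records = []
    for comp_path in sorted(Path(completions_dir).glob("*.txt")):
        # Deliberately no try/except: a missing story should raise so that
        # bad data is noticed instead of silently skipped.
        story = (Path(stories_dir) / comp_path.name).read_text(encoding="utf-8")
        completion = comp_path.read_text(encoding="utf-8")
        completion = completion.replace(artifact, "").strip()
        records.append({
            "prompt": story.strip() + DEMARC,
            "completion": " " + completion + stop,
        })
    return records


def write_jsonl(records, out_path):
    """Write one JSON object per line."""
    with open(out_path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```

If a marker like `STOP` is baked into the training completions, the same string would be passed as the stop sequence when calling the fine-tuned model, so generation ends there instead of running on and repeating itself. Whether that fully cures the repetition seen in the video is untested here.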

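The cost estimates worked out aloud in the transcript are simple linear extrapolations. A small sketch using only the figures quoted there ($6.23 for the first 90 completions, 115 KB for 128 items, and $38.61 for the 1,300 KB novel-writing job) reproduces them; the function name is hypothetical, and real pricing depends on token counts, not file size.

```python
def extrapolate(observed_total, observed_count, target_count):
    """Scale an observed total linearly to a larger number of items."""
    return observed_total * target_count / observed_count


# Figures quoted in the transcript.
generation_cost = extrapolate(6.23, 90, 388)   # dollars to generate all 388 completions
dataset_kb = extrapolate(115, 128, 388)        # projected size in KB at 388 items
finetune_cost = 38.61 * dataset_kb / 1300      # scaled from the 1,300 KB novel job

print(f"generation: ${generation_cost:.2f}")
print(f"dataset:    {dataset_kb:.0f} KB")
print(f"fine-tune:  ${finetune_cost:.2f}")
```

The same proportion applied to the final 800 KB training file gives 38.61 × 800 / 1300 ≈ $23.76, close to the $24 reported at the end of the video. The earlier $10 projection was low presumably because the 115 KB figure did not include the stories that end up in each prompt.
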
Original Description

The Kickstarter for my Post-Labor Economics book is live! https://www.kickstarter.com/projects/daveshap/labor-zero

This video shows how to fine-tune a creative writing coach in GPT-3. It covers generating feedback completions for real stories with a single prompt, auditing and cleaning the resulting data, formatting it as JSONL, estimating token and fine-tuning costs, and testing the fine-tuned model in the playground. Viewers can follow the same steps to get more consistent output from their own fine-tuned models.

Key Takeaways

Write a script to download stories from the r/writingprompts subreddit
Extract top-level comments as stories from the subreddit
Gather data from real humans to increase variability
Fine-tune a language model using GPT-3 and Python
Use prompt engineering to improve model performance
Combine different kinds of tasks and data into a single model
Remove artifacts from the data
Fine-tune the model to be consistent
Use find and replace to remove consistent artifacts
Avoid try-except blocks around data preparation so errors from bad data stay visible; reserve them for unreliable API calls
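
The manual audit shown in the video (sort completions by size, open the largest, and look for ones that quote the story back) could be partly automated along these lines. This is a rough sketch under assumptions: the folder layout and `.txt` extension are hypothetical, the 2 KB threshold mirrors the cutoff mentioned in the transcript, and the "quotes story" check is a crude heuristic that only looks for the story's opening characters inside the completion.

```python
from pathlib import Path


def flag_for_review(completions_dir, stories_dir, min_bytes=2048, quote_len=60):
    """Return (filename, reason) pairs for completions worth eyeballing."""
    flagged = []
    # Largest files first, matching the sort-by-size pass in the video.
    paths = sorted(Path(completions_dir).glob("*.txt"),
                   key=lambda p: p.stat().st_size, reverse=True)
    for path in paths:
        text = path.read_text(encoding="utf-8")
        story = (Path(stories_dir) / path.name).read_text(encoding="utf-8")
        opening = story[:quote_len]
        if path.stat().st_size >= min_bytes:
            flagged.append((path.name, "large"))
        elif opening and opening in text:
            flagged.append((path.name, "quotes story"))
    return flagged
```

Flagged files still need a human look; the point is only to narrow down which ones to open first.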

💡 Fine-tuning a language model can significantly improve its performance and consistency, but it requires careful consideration of the data and prompts used to train the model.
