Skeleton-of-Thought: Building a New Template from Scratch
Key Takeaways
This video shows how to build a new LangChain Template from scratch, based on the research paper "Skeleton of Thought" from Tsinghua University and Microsoft Research. The template first generates a skeleton of short bullet points, then expands each bullet point with a separate LLM call. Because the points are independent of one another, LangChain Expression Language can run the expansion calls in parallel, which speeds up the overall response.
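The two-stage idea summarized above can be sketched in plain Python. This is only an illustration of the pattern, not the template's actual code: `fake_llm`, the `SKELETON:` and `POINT:` prompt prefixes, and `skeleton_of_thought` are hypothetical names invented here, and a thread pool stands in for the parallelism that the video gets from LangChain Expression Language.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call so the sketch runs offline."""
    if prompt.startswith("SKELETON:"):
        return "1. Identify the root cause\n2. Encourage open communication"
    # Any other prompt is treated as a request to expand a single point.
    return "expanded(" + prompt.split("POINT:", 1)[1].strip() + ")"


def skeleton_of_thought(question: str, llm=fake_llm) -> list[str]:
    # Stage 1: a single call produces a short numbered skeleton.
    skeleton = llm(f"SKELETON: {question}")
    points = [
        line.split(". ", 1)[1]
        for line in skeleton.splitlines()
        if ". " in line
    ]
    # Stage 2: the points do not depend on each other, so they can be
    # expanded concurrently. pool.map returns results in input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda point: llm(f"POINT: {point}"), points))
```

With a real model, the wall-clock time of stage 2 would be roughly that of the slowest single expansion rather than the sum of all of them, which is where the speed-up comes from.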
Full Transcript
All right, so today we are going to implement a new template, and we're going to implement it for this new paper that just came out: "Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding." I saw this on Twitter, I forget where, but I thought it was really cool, and it plays to some strengths of LangChain and of LangChain Expression Language. Namely, it's going to use multiple LLM calls, it's going to break things down into small chunks, and then a bunch of those LLM calls are going to be run in parallel. With LangChain Expression Language we can do that fairly easily.
So we're going to create a template for this. As a reminder of what templates are: they're a really easy way to get started with any application. We have a bunch of predefined templates here, but we're going to create a new one, and we should have instructions on how to add one. All right, so let's create a new template. If you look here, I'm in my workspace, in the LangChain templates directory: langchain template new skeleton-of-thought. There we go. So I now have this example template that was just set up. I can go inside it. Let's find it; where are we... skeleton-of-thought. Let's git add everything in here and git checkout a Harrison skeleton-of-thought branch. All right, just checking out a clean branch.
We've added a bunch of stuff in here. We can see that we have a really simple README, so I need to add a description here, I'll do that later; environment variables, I'll do that later; and then a bunch of predefined documentation. There's a pyproject.toml with some dependencies. We'll use OpenAI for this; maybe we'll go into how to do this with a different language model, it should be pretty easy. And then the main part here is just the chain. We have a really simple dummy chain here, but of course what we're going to do is implement this new paper, Skeleton-of-Thought.
So what is this paper? I'll post a link to it in the YouTube description as well, but basically it's a really cool idea. If we take a look here: if you get a question like "What are the most effective strategies for conflict resolution in the workplace?", a normal LLM response would just generate this answer sequentially. What Skeleton-of-Thought does is generate a skeleton, really short bullet points, and then expand each bullet point. The reason I like this is that it speeds things up, so it's really fast, which is good. It also speaks a little bit to using the language model to plan and then execute on things. Here each execution is just expanding with another language model call, but you could easily imagine that each execution could write a whole paragraph, or a whole research report for each bullet point, or actually take actions. Importantly, the reason this is helpful, and that you can do this as opposed to a ReAct-style agent or something, is that all of these are independent: you can do one without doing two, and you can do two without doing three, or two without doing one. That makes it really easy to parallelize them, which leads to the speed-up. If you're doing things that rely on the results of previous steps, that may not be as good a fit for this.
So let's see. Okay, so this is prompt one, this is prompt two. That's great, it looks like they're already in there. Okay, so we've got one, and this is being used to generate the skeleton, so let's copy
this, the skeleton generator template. There's some weird newline stuff here; I'll just fix that now. That seems fine. And then skeleton generator prompt, and we're just going to do from_template from this template. We're going to import this, because now we're going to create our chain. Our chain's going to be really simple. The first chain is the one generating the skeleton, so it's going to be this prompt, and we're going to pass this into ChatOpenAI, and then we're just going to parse out the string. We're probably going to need to do other things, we're going to need to parse out the bullet points because we're going to want to work with those, but for now we can just do this.
Let's test it out. Do they have a good example in the paper? All right, so this is the example they give in the paper; we'll use that. Some weird stuff happens when copying things over. Okay, let's try that out. Let's print this out. Just to make sure, we're also logging things to LangSmith. LangSmith is the debugging and logging tool that we've built on the side. What I've done before is basically copy-paste; I'm using my dev account, but you should use the regular LangSmith account, it's smith.langchain.com. You can sign up for access there. It is on a waitlist; if you don't have access, shoot me a DM on Twitter or LinkedIn and we can give you access pretty easily. Once you do that, I've just exported these three variables; you want to pop in your API key there. I've already done that, but you should do that, and then what that does is trace everything that we do.
So here let's run the skeleton-of-thought chain with python, and it should print it out here. Okay, so this spits out the skeleton: 10 bullet points. Okay, that's a lot. We can also see that it should show up here. So here we have this RunnableSequence. We can see it's really simple, it's
just a prompt template, an LLM, and an output parser. What I really like about LangSmith is that it will let you open things up in a playground. So here, if we wanted to play around with the prompt for whatever reason, let's say we wanted to change this to have, I don't know, two to three points, then we do that and we can rerun it, and here we can see that it just generates two points. So this is really useful for debugging things, especially when they're in chains. It's also useful for seeing what's going on under the hood, which right now doesn't really matter because it is quite simple, but when it comes time to add in the next part, the point-expanding stage, it will come in quite handy.
So let's take this point-expanding stage and do point expander template. All right, we're going to add this in. Okay, so now this takes in a few different variables. Let's make this point expander prompt: ChatPromptTemplate.from_template of the point expander template. So this takes in question; skeleton, which is what we just generated; point index; and then point skeleton.
Okay, so what we're going to want to do is... let's see if there are any good diagrams here. I guess the main diagram. Wow, okay, there's a big appendix for this. What's in this appendix? Oh, okay, so they have a bunch of different prompts for different models, which is really cool. We'll probably just do a basic one-shot prompt. Oh, that's cool: it looks like they have some routing between just a regular response and one that needs a skeleton, so they're determining whether this is even needed. Anyway, long appendix; I might read that in more detail later. But the main idea is that we're going to want to take each of these bullet points and then expand it. So in our prompt template we have question, skeleton, point index,
point skeleton. So it looks like this is the question, this is the skeleton; the point skeleton would be here, point index would be one, point skeleton would be there. I think they actually have code... yeah, there we go, they do have some code. Let's see if we can find where they have prompts in there. Oh, it's probably in here, skeleton of thought. What I'm looking for is a parser that'll turn this into a list, but we can probably do that ourselves. Okay, this isn't really that helpful at all.
Okay, so what we're going to want to do is write a snippet that constructs a list of things that we're going to expand. It's going to create a list of things from this output, and each element in that list should be a dictionary with four things: the question, the skeleton, the skeleton point, and the skeleton index. What that's going to look like is, first, from langchain.schema.runnable import RunnablePassthrough. All right, let's rename this to just skeleton generator chain. Okay, so we're going to create our final chain as RunnablePassthrough.assign. We already have question that comes in, so now what we really want is skeleton, and we're going to pass skeleton as just the skeleton generator chain. So this is going to add a new variable called skeleton, and it's going to be the result of calling this. Great.
Then, so this will be a single dictionary with question and skeleton, and we're going to want to turn this into a list of things, and then we're going to have our point expander. Actually, yeah, let's try the point expander chain: point expander prompt, ChatOpenAI, StrOutputParser. Cool, let's try that. Let's ignore this for a moment. Let's just pretend that we got to the point where we had the question, we had this as skeleton... oh, that's not that skeleton. Let's pretend we had this as skeleton, point index as one, point skeleton... let's do this. Cool. Weird. All right, so for whatever reason the variables had some spaces in them. I don't like that. Did it not save? Ah, there we go. Okay, so it expanded the first one. That doesn't actually expand it that much. Whatever, we can fix that with some prompt engineering later on if we really want to. So we've got this point expander thing working well.
Now what we really want is to write a Python function that parses a single numbered list into a list of dictionaries, with each element in the list having two keys: an index key for the index in the numbered list, and a point key for the content. They look something like this. And here we had prompted it with the one, so I'm going to add that in there, and I'll show how we can do something to just get that. So when it generated it, it didn't have the one in front of it, but I'm adding it to start. Oh, interesting, it looks like it's using Code Interpreter. Okay, cool. We will wait for this to finish. In the meantime we can see what was going on with the point expander solution. So this was the point
expander. We can see that this is what the prompt ended up looking like by the time it got to the LLM, so this is the fully formatted thing. We're actually missing a "1." or something there, I think. Okay, cool, that's interesting to know. So basically this doesn't generate the one at the start, because if you notice, we have it here. So what I'm actually going to do is add something here that's just a lambda on x that puts the one in front, so that's just adding it to the response. That'll make it a little bit more convenient to pass around.
Go back to OpenAI. That looks reasonable to me. Interesting that ChatGPT is getting updated every day, it seems. All right, let's copy this. Okay, so this parses the list. What I also want to do is create list elements. I want something that takes in the output of this, which is a dictionary that has question and skeleton, and then creates things that can be passed into the point expander, which takes question, skeleton, point index, point skeleton. So I'm going to pretend that's input. I'm going to do skeleton equals input skeleton, then numbered list equals parse numbered list of skeleton. I'm going to change these to match those things, and then for el in numbered list: el skeleton equals skeleton (why is that getting highlighted? weird), el question equals input question, and return numbered list. Cool.
So now I can add this in here, and I can do create list elements. Cool. Then I can do point expanded chain... should be expander, let's fix that... point expander chain, dot map. Then this is going to get back a list of expanded things, and what I want to do is combine those into a final answer. So let's just write something like def final answer, expanded points, final answer equals... or actually, we don't even need this, we can just do a lambda on x that joins x. That should return
just a string, because it's getting a list of strings back. Let's try some stuff now. Delete everything except question. Yeah, let's run it and see what happens. Oh, okay, that was pretty fast, faster than I expected. So it generates all of this... well, let's see what's going on under the hood. This is how it generates all of this, and if we expand this... okay, there we go, a lot more LLM calls. All these yellow blocks here are LLM calls, so it actually made a lot of LLM calls under the hood; it was just done very fast. If you look at them in sequence, this is the first one, and it generates the skeleton, and then this is mapping over each of them. This is expanding the first one, identifying the root cause; this is expanding the second one, encouraging open communication; blah blah blah. Then we get to the final answer, which takes in all of these and gets back this output.
If we wanted to change this output in some way, let's show how we can do that. Say we wanted to format this a little bit nicer; right now this is just joining them with that. As you can maybe guess, we could just write a simple function like def get final answer, which we started to do before, but I wanted to make sure it was working. Turns out it was working; it's quite simple actually. Final answer string: let's use "Here's a comprehensive answer." Then for el in expanded list... and you know what, let's add the index like that, with enumerate of expanded list, and return final answer string. Drop that in there, run it again. So now we should get a slightly nicer formatted final answer. So yeah, here, and it's zero-indexed; we can fix that by going like this. But the point is we get a nicely formatted thing: "Here's a comprehensive answer." If we go back here, we can see a new thing popped up: input is this question, output is this
"Here's a comprehensive answer." Okay, so this is actually helpful, this is actually pretty cool. What we can see going on is that the first one just starts with "is", and that's because we have this prompt where we ask it to basically continue from the first one. So what we actually want to do is add a new element into this thing, which we can call expanded answers, and then we can pass it here. So basically what's happening is our expanded answers are continuing from here. So what we actually want to do is take this thing and then take this thing and add this part in there. So what that's going to look like now is we're going to have something that has question, skeleton, and then expanded answers. Or actually, I know what we're going to do. We're going to reverse this. That's fine. What we're actually going to do is change the point expander chain, and we're going to do a RunnablePassthrough.assign here. We're going to have continuation there, and then we're going to do a lambda on x that returns... what are we calling it? We're calling it the point skeleton, plus the continuation. Okay, so let's run this, see if it works, and then I'll explain what it does if it does work.
Okay, so now we can see that each bullet point is more of an actual sentence. We've still got some weird things going on there, so let's debug this a little bit more. So here we have this map. Now each of these is basically generating this answer and then appending it to this. And the weird thing that's happening is that there isn't a space, but there should be a space. The reason there isn't a space is because... why isn't there a space? Well, let's see. Oh, I mean, there's just no space there. Okay, so what we can do is, in case there is a space, we'll strip, then we'll add in a space. So if we run this now... okay, that looks good. There's proper spacing, they're all full sentences, and it looks like that should work.
And if we look at what's going on under the hood, we have the full thing here. We can expand it out, we can see all the calls that are getting made, we can jump into any place, debug it, hop into a playground, and try it out with different language models as well. One of the cool things that we did is we actually worked with Fireworks and Google to have some free models here. Let's see how Llama 2 13B does. Hm, all right, so Llama 13B is not amazing at this. What about Mistral? Mistral is a pretty good model. All right, I'm probably not using the right tokens to prompt Mistral correctly, so let's stop doing that. But basically, yeah, we've added Skeleton-of-Thought as a template to LangChain.
Oh, one thing that we can do now is we can actually see this in action
so, let's just delete this. Let's check pyproject.toml: skeleton of thought, chain, yep, cool, that's all right. What we can do is, I think we can just do langchain serve from here. "Could not import module defined in pyproject.toml." So I think we need to do pip install -e . in here to install this module. Now if we do langchain serve... okay, so I'm on Pydantic 2; we really should make this Pydantic 1 by default. But now if I go to this /playground... okay, so because I'm using this thing, it can't automatically infer what the inputs are correctly. So I'm going to import BaseModel, define a class ChainInput that's a BaseModel, and then do with_types with input type ChainInput. I don't know if I have to type the output or not. Cool, okay, so we get this. "What are the most effective strategies for conflict resolution in the workplace?" Yeah, so we can see the intermediate steps jump up a lot, and we get back our answer. This also automatically logs things to LangSmith, so this is the one that we just ran very recently. But now we have a playground to play around with this as well, so if we wanted to share this with anyone, we can spin up a little playground, this is just served by FastAPI, and do it this way. Okay, I think that's really all I have now. Thank you.
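The helper functions dictated in the transcript are not shown on the page, so the following is a minimal reconstruction under stated assumptions rather than the code from the video. The function names follow what is said aloud (parse numbered list, create list elements, get final answer); the dictionary keys `point_index` and `point_skeleton` are assumed from the prompt variables mentioned, and `join_continuation`, the regular expression, and the exact output formatting are hypothetical.

```python
import re

# Matches lines such as "1. Identify the root cause" (assumed skeleton format).
_NUMBERED_LINE = re.compile(r"^[ \t]*(\d+)\.[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def parse_numbered_list(text: str) -> list[dict]:
    """Turn a numbered list into one dict per point."""
    return [
        {"point_index": int(number), "point_skeleton": point}
        for number, point in _NUMBERED_LINE.findall(text)
    ]


def create_list_elements(inputs: dict) -> list[dict]:
    """Build one prompt input per skeleton point.

    `inputs` is assumed to hold the "question" and "skeleton" keys that the
    first stage of the chain produces.
    """
    skeleton = inputs["skeleton"]
    elements = parse_numbered_list(skeleton)
    for element in elements:
        element["skeleton"] = skeleton
        element["question"] = inputs["question"]
    return elements


def join_continuation(point_skeleton: str, continuation: str) -> str:
    """Prefix the model's continuation with the point it continues from.

    Both parts are stripped and joined with a single space, mirroring the
    spacing fix described in the transcript.
    """
    return point_skeleton.strip() + " " + continuation.strip()


def get_final_answer(expanded_list: list[str]) -> str:
    """Format the expanded points as a numbered answer (1-indexed)."""
    final_answer = "Here's a comprehensive answer:\n\n"
    for index, expanded in enumerate(expanded_list, start=1):
        final_answer += f"{index}. {expanded}\n\n"
    return final_answer
```

In the video these pieces sit between the skeleton generator chain and the mapped point expander chain; here they are plain functions so they can be checked in isolation.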
Original Description
In this video we will build a new LangChain Template from scratch. The template will be based on a recent research paper out of Tsinghua University and Microsoft Research - "Skeleton of Thought".
This will cover how to use LangChain Expression Language to easily compose and parallelize LLM calls, how to use LangSmith to debug during development, and how to use LangServe to quickly get a production ready endpoint created.
Key Links:
- Skeleton-of-Thought Template: https://github.com/langchain-ai/langchain/tree/master/templates/skeleton-of-thought