PAL : Program-aided Language Models with LangChain code

Sam Witteveen · Intermediate ·🧠 Large Language Models ·3y ago

Key Takeaways

The video explores Program-aided Language Models (PAL) and its implementation in LangChain, providing code examples and demonstrations of its capabilities.
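As a rough illustration of what "program-aided" means here, the sketch below runs a hand-written stand-in for the kind of Python program the model is prompted to produce for the tennis-ball question discussed in the video. The helper name `run_generated_program` and the `solution()` convention are assumptions for illustration; no language model or LangChain call is involved.

```python
# Hand-written stand-in for a model-generated program (assumption: in a real
# PAL setup this string would come back from the language model).
GENERATED_PROGRAM = '''
def solution():
    """Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each."""
    tennis_balls = 5
    bought_balls = 2 * 3
    answer = tennis_balls + bought_balls
    return answer
'''


def run_generated_program(source: str):
    """Execute the program text and return whatever solution() returns."""
    namespace = {}
    exec(source, namespace)  # the interpreter, not the model, does the math
    return namespace["solution"]()


result = run_generated_program(GENERATED_PROGRAM)
print(result)  # 11
```

Calling `exec` on model output runs arbitrary code, so anything beyond a toy like this would want some form of sandboxing.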

Full Transcript

Okay, in this video I'm going to go through PAL, which is Program-aided Language Models. I'm going to go quickly through the paper, some of its key points, and the website, and then I'm going to show you the code for implementing and using this in LangChain.

This paper comes out of CMU, which has a very good NLP team, and these are the authors; a number of authors have worked on it. This really is an interesting paper in that it goes along with this new theme: rather than just training models to get bigger and better at things, you can also use in-context learning and few-shot prompting to get these models to do something with an external tool that sits outside the actual language model.

A lot of this is used for math. Math has been an issue for language models for a long time; they've had problems doing even simple things like addition or multiplication with large numbers, and a lot of that has to do with the way they're trained. If you think about it, they will learn a lot of things with small numbers in certain scenarios, because they see that a lot in their training data, but they don't really learn the logic of how to do multiplication or division, or even multiple operations in the same equation.

So what this proposes is that, rather than just outputting the answer, why don't we output a Python program, and this Python program can then be used to actually calculate the answer. It can be run with a Python REPL (read-evaluate-print loop), and that gives us the answer. It's interesting that they're using the Codex model; they've found that the code models work better, which makes sense because they're trying to generate code here. The goal is to offload the solution to a Python interpreter, right? This is what
they're doing. If you look at the idea here of what's been done in the past, the Chain of Thought work was really interesting when it came out last year. The idea there was that just by asking the language model to give us its step-by-step thinking, it could give us the answer, and it turned out that this worked a certain amount of the time, certainly much better than just asking for the answer. You would often see things like "let's think about this step by step"; a lot of those are prompting for Chain of Thought.

What this is doing, by contrast, is getting the model to output a set of variables which we can then use. So we can take Roger's tennis balls: tennis_balls equals 5, which is just a Python assignment. Two cans of three tennis balls each means bought_balls is 2 times 3, and then the answer is tennis_balls plus bought_balls. Doing things like this allows the Python interpreter to actually do the math part. The language model no longer has to do the math; it's just creating the function that is then going to be run in Python itself.

The paper is very interesting, and I encourage you to go and have a look at it. They focus on certain types of math and certain types of questions. You'll see that they use the GSM8K benchmark, which I'm pretty sure stands for Grade School Math 8K; these are basically math word problems. That's one of the datasets, and there's another one, GSM-Hard, which they also use. Going through the paper, it talks a little about how they craft the prompts and how they set those prompts up to then predict a Python function. Compared to the baselines, this does far better. There are some really nice implementation points in here, showing us that
they're using greedy decoding with a language model temperature of zero, and they're using the Codex code-davinci-002 model. You'll see when we get to the code that we're going to use the same model that they use in the paper.

Let's have a look at the website. They have this nice website, and I'll put the links in the description for all of this. You can see that not only does this do well, but compared to the other ways of doing this kind of thing it does very well. Direct prompting doesn't do very well at all, with Chain of Thought prompting we see quite a big jump on a lot of these, and then PAL is an even bigger jump on a lot of them. These are the various datasets that they've tested on: GSM8K is the one I've just been talking about, and on GSM-Hard you can see really big jumps. GSM8K was one that people used a lot for comparing big models, and you can see the improvement just from adding this, because we're taking the heavy lifting out of the model and putting it into the actual code and functions that we're using.

All right, some sample outputs. They've got a bunch of them here that you can look at, which show you the question, how Chain of Thought handles it, and then how PAL turns it into a Python function. I'll show you in the code that you might sometimes want to change the prompts, so this could be where you get more examples to use in prompts. For the in-context learning, you could come through here and look for the kind of thing that you particularly want to do. They've got a whole bunch of standard ones in here. They've got GSM-Hard, which often has a lot more to do with things like floats and some different types of questions. You're going to see we've got the colored
objects questions. LangChain actually has a colored objects prompt, so it has its own colored objects PAL chain; you'll see that in the code in a second. Then we've got things like date understanding. The LangChain prompts don't seem to do that well with this, so it's something I would look at adding as a prompt to the work you're going to do in LangChain, to see how it does with these kinds of things. Another one that it perhaps doesn't do as well on is these repeat copy tasks, where you're asking it to do a specific task; we'll go through some of these in the code.

So have a look at the paper. The researchers have done a really good job here; it's a pretty easy paper to read, and it allows you to understand what's actually going on in the code, so that if you want to change some of this, or experiment with exporting Python functions for other things, you'll be able to. All right, let's jump into the code.

Okay, we've gone through the paper, so let's jump in and have a look at the code. It turns out that LangChain has a PAL tool, or rather a PAL chain, that we can use. This is basically bringing it in, and it's quite simple to do. You'll notice that we're not using the normal text-davinci-003 model or a ChatGPT model; we're actually using one of the code models, to make it like the paper as we go through this. I've put up some examples here that you can play with. I'll look at some from the paper and some from other papers to see how they work, and then I'll show you some examples of things that don't work as well.

Setting up the chain is pretty simple. There are three different versions of it: the main one we're going to start out with is from_math_prompt, and then I'll show you the one for colors and another one for step-by-step stuff as well. Okay,
we've brought this in, and we're setting verbose=True so that we can see what's actually going on as we go through this. Here we've got a question from one of the Flan papers, from memory. It's a very simple sort of thing, and the big models can do this straight out, right? A lot of the time they will use a Chain of Thought prompt to get better at it, but smaller models, and even some of the bigger Flan-T5 models, certainly struggle with this kind of thing. So here we've got: the cafeteria had 23 apples; if they used 20 for lunch and bought six more, how many apples do they have?

Sure enough, when we pass this question into the PAL chain and run it, we can see that it has converted what we've got and extracted the variables: the initial apples is 23, then apples used and apples bought. It has then constructed a very simple math equation for this, wrapped it all in Python, and run that Python, so the final output is nine, right, because 23 minus 20 plus 6 equals nine.

If we look at that, this is the question that I put in, and this is the actual template that we're using. Like I always say, you should look at the templates yourself so that you can learn from them. Looking at it here, it's a bit hard to see, so what I'm going to do is copy it over and paste it in here so that we can read the full template.

For a start, you'll notice that the template is quite long. This fits with why, up here, we set the large language model to 512 tokens: we're going to have a long prompt going in, and using PAL will definitely use a decent number of tokens. What we're doing here is in-context learning; we're just giving it examples of a question, such as: Olivia has $23, she bought five bagels for $3 each, how much money does she
have left? Then we're giving it the Python function that it should generate, so this would be a solution in Python. It should generate the output, and it should always make the variables relate to the actual words in the problem, and sure enough, it's gone through and done that. Notice that we're not giving it the answer, because we're going to take this and actually run it in Python. We don't want the language model to give us the answer; we just want it to give us the Python function, which we can then run.

So we've got one example there, and another one. One of the big things I want you to take away from this is just how many examples they're giving for the in-context learning here: one, two, three, four, five, six, seven, eight. So there are eight examples for it to learn from before it starts. Then we just pass in the question that we've got, tell it "solution in Python", and it generates a function which we then take and run.

Looking at this, you'll notice that it always initializes the variables with names that relate to the problem, and it does the math in there. You'll see that most of the math in these particular examples is reasonably simple, and this could be the reason why it gets some things wrong, which I'll show you later on. You could even adjust this prompt a little more if you're going to do some examples around time or around other things.

So that's the actual prompt. Let's go on and look at some of the examples from the datasets used in the paper. Remember GSM8K: these are grade school math questions. Here we've got a question, and it's running it, and it's working. It's generating a Python function with variables that match the question, it's putting each of them in the right place, and then it's exporting
and then running that, giving us the answer. We've got multiple ones here, and I encourage you to go through the Colab yourself, change these up, and play with it. I've taken some of these from the paper's web page that we looked at a little earlier, so some of these examples are from there, where we've got the GSM8K outputs and various other outputs to look at. If you go through them, you can see that it can do integer math, float math, and a variety of different things.

One kind of example that I find very interesting is the final set, the repeat copy examples. This is where the question asks it to do something with language. In this one, it's "say the letters of the alphabet in capital letters, but only the odd ones." Another one is "repeat cheese seven times; every third, say whiz", so cheese, cheese, whiz, cheese, whiz, cheese. If we look at running this in LangChain, it doesn't do quite as good a job. We're getting cheese with no spaces in there, but okay, let's forgive that; we've got lots of cheeses, and then we've got the whizzes right at the end. It's probably getting this wrong because there are no examples like this in the actual prompt. You could go and change the prompt, or maybe we'll look at doing that in a future video; if people are interested, please comment and let me know. The one about the letters it does get right, though. You can see here "say the letters of the alphabet in capital letters, but only the odd ones"; these are the answers from the paper, and sure enough it gets it right.

Another thing I thought would be interesting is to take a site that's made for children to learn this stuff. It has a whole bunch of different questions, so you can put these in there, and it
tells you what grade they are; I think this goes from third grade up to eighth grade. Some of them are very easy, but some of them do become challenging. I've put a bunch of those in there that you can play around with to see which ones it gets, and it gets most of them right.

This is an interesting one: interpreting remainders. We want to see whether it can work out what's left over when we're doing division, and sure enough, it does the division and then it works out the remainder, using the modulo operator. Then there are mixed operations, doing multiple things at once, and it seems to do pretty well with those, as well as some things related to percentages. This one is a little more like reasoning, where it's asked to do some calculations and then you have to say which group is bigger, and you can see it has written a whole bunch of conditional logic in the function to work this out. These are worth having a play with, and you could think about where you could use them in various places.

Then we get some failures. It doesn't do very well with time. This one is: if I wake up at 7:00 a.m. and it takes an hour and a half to get ready and walk to school, what time will I get to school? Obviously the answer here would be 8:30, but it says eight, so it looks like it's getting some things wrong in there. Again, I think this is because there's nothing like this in the prompt, so maybe you could improve it by playing around with the prompt. Another one it seemed to get wrong was ratios. This one is looking for a ratio where the difference between these is 12; I'll let you go through and read it yourself. The difference between 21 and 9 would be 12, but it hasn't done a great job there.

Okay, the colored objects. You can see
this is the second kind of PAL chain in LangChain. This one is specifically for working with colored objects, and I've played around with it a little. Here we've got: on the desk there are two blue booklets, two purple booklets, one purple hat, and two yellow pairs of sunglasses; if I remove all the pairs of sunglasses, how many purple items remain? So we would expect to see three here, from the two purple booklets and the one purple hat. It's interesting to see the logic of how it does it. It makes a list and adds entries for each of these objects, so we've got two for those, one for our purple hat, and so on. Then it removes all the sunglasses, counts the number of purple things, and gives us the right answer.

I'm not sure exactly what you would use this one for, but my guess is that there are certain times when colors are going to be really important. You could probably play around with this prompt to make it about something like the number of widgets. If you're building some kind of chatbot that's doing, say, an accounting task, where people say "we had 15 people buy widget A, 27 buy widget B, 36 buy widget C, seven people returned widget A, three people returned widget B", you could probably change this prompt to fit your very specific example so that it gets the math right for something like that.

The final one is this idea of intermediate steps. What this does is break down what it was doing into the different steps so that we can see them. What I've done here is just copy the output of the intermediate steps, and we can see that it has broken the function down into parts. If you were getting something wrong, you could come in here, do a run-through with the breakdown of the intermediate steps, and then use that to decide: should I try and
adjust the prompt, or something like that?

So this PAL approach is not perfect, and it won't work all the time, but it's amazing that it works as much as it does. Before this paper, people were generally relying on things like Chain of Thought and just expecting the large language model to return the answer. It's a really cool idea, and I think it's one we're going to see used a lot more in the future, where people use a Python function to do something and get the language model just to write that Python function based on a particular natural language query.

Anyway, hopefully this was useful to you. If you have any questions, please put them in the comments; I always try to go through and answer comments. If this was useful to you, please click subscribe and let me know what kind of videos you would like to see going forward. All right, thanks for watching.
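The transcript suggests that failures such as the wake-up-time question might be reduced by adding a similar worked example to the few-shot prompt. The sketch below shows one way such an example could be written and assembled into a prompt. The helper names (`format_example`, `build_prompt`, `run_program`), the exact prompt layout, and the second question are assumptions for illustration; this is not LangChain's actual template, and no model is called.

```python
from datetime import datetime, timedelta

# The time question from the video, paired with a hand-written program that a
# prompt author could add as an extra few-shot example (hypothetical).
QUESTION = (
    "If I wake up at 7:00 a.m. and it takes an hour and a half to get ready "
    "and walk to school, what time will I get to school?"
)
PROGRAM = '''def solution():
    wake_up = datetime(2023, 1, 1, 7, 0)  # arbitrary date; only the time matters
    getting_ready = timedelta(hours=1, minutes=30)
    arrival = wake_up + getting_ready
    return arrival.strftime("%H:%M")
'''


def format_example(question: str, program: str) -> str:
    """Render one question/program pair in a question-then-code layout."""
    return f"Q: {question}\n\n# solution in Python:\n\n\n{program}\n"


def build_prompt(examples, new_question: str) -> str:
    """Join the few-shot examples and leave the final solution open for the model."""
    shots = "\n\n".join(format_example(q, p) for q, p in examples)
    return f"{shots}\n\nQ: {new_question}\n\n# solution in Python:\n\n\n"


def run_program(program: str):
    """Run a program string with the datetime helpers it relies on in scope."""
    namespace = {"datetime": datetime, "timedelta": timedelta}
    exec(program, namespace)
    return namespace["solution"]()


prompt = build_prompt(
    [(QUESTION, PROGRAM)],
    "If a movie starts at 6:15 p.m. and runs for 2 hours, when does it end?",
)
arrival_time = run_program(PROGRAM)
print(arrival_time)  # 08:30
```

Running the example program before adding it to a prompt is a cheap way to confirm that the few-shot example itself is correct.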

Original Description

This video goes through the paper Program-aided Language Models and shows how it is implemented in LangChain and what you can do with it. To access all the code please use the Colab link below. Let me know in the comments what other videos you would like to see.

Colab: https://drp.li/cdhf2
PAL Paper website: https://reasonwithpal.com/
PAL Paper: https://arxiv.org/abs/2211.10435
120 Math Word Problems: https://www.prodigygame.com/main-en/blog/math-word-problems/

My Links:
Twitter - https://twitter.com/Sam_Witteveen
Linkedin - https://www.linkedin.com/in/samwitteveen/

Github:
https://github.com/samwit/langchain-tutorials
https://github.com/samwit/llm-tutorials

00:00 PAL Paper
02:35 PAL example prompts in the paper
04:52 Paper Results vs CoT & Direct prompting
05:37 Sample Outputs
07:34 Code Walkthrough
09:35 PAL LangChain PromptTemplate
11:57 GSM8k examples
12:55 Repeat Copy examples
14:06 Word Math Problems
16:00 Colored Objects

#LangChain #BuildingAppswithLLMs #python

This video teaches how to implement Program-aided Language Models using LangChain, providing code examples and demonstrations of its capabilities. It covers the PAL paper, example prompts, and code walkthroughs.

Key Takeaways
  1. Access the PAL paper and LangChain code
  2. Understand the concept of Program-aided Language Models
  3. Implement PAL using LangChain
  4. Test and evaluate PAL performance
💡 Program-aided Language Models can improve LLM accuracy on math word problems and similar reasoning tasks by offloading the computation to a Python interpreter, and LangChain provides a convenient framework for implementing and testing PAL.
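For the "test and evaluate" takeaway, one possible approach is a small harness that runs each generated program and compares its result with the expected answer. The sketch below uses hand-written stand-ins for model output based on three questions from the video. The program texts, the `CASES` structure, and the helper names are assumptions for illustration; in particular, the third program is a hypothetical reconstruction of the wrong "eight" answer, since the video does not show the code behind it.

```python
# (name, program text, expected answer); programs are hand-written stand-ins.
CASES = [
    ("cafeteria apples", '''def solution():
    apples_initial = 23
    apples_used = 20
    apples_bought = 6
    return apples_initial - apples_used + apples_bought
''', 9),
    ("purple items", '''def solution():
    objects = []
    objects += [("booklet", "blue")] * 2
    objects += [("booklet", "purple")] * 2
    objects += [("hat", "purple")] * 1
    objects += [("sunglasses", "yellow")] * 2
    objects = [obj for obj in objects if obj[0] != "sunglasses"]
    return len([obj for obj in objects if obj[1] == "purple"])
''', 3),
    # Expected answer is 8.5 (hours, i.e. 8:30); this program drops the half hour.
    ("arrival hour", '''def solution():
    wake_up_hour = 7
    hours_to_get_ready = 1
    return wake_up_hour + hours_to_get_ready
''', 8.5),
]


def run_program(source: str):
    """Execute a program string and return the value of solution()."""
    namespace = {}
    exec(source, namespace)
    return namespace["solution"]()


def evaluate(cases):
    """Return (name, answer, passed) for each case; errors count as failures."""
    results = []
    for name, source, expected in cases:
        try:
            answer = run_program(source)
        except Exception as error:
            answer = f"error: {error}"
        results.append((name, answer, answer == expected))
    return results


results = evaluate(CASES)
accuracy = sum(passed for _, _, passed in results) / len(results)
for name, answer, passed in results:
    print(f"{name}: {answer} ({'ok' if passed else 'wrong'})")
print(f"accuracy: {accuracy:.2f}")
```

A per-case breakdown like this makes it easier to see which question types (here, time arithmetic) might need extra examples in the prompt.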

Chapters (10)

0:00 PAL Paper
2:35 PAL example prompts in the paper
4:52 Paper Results vs CoT & Direct prompting
5:37 Sample Outputs
7:34 Code Walkthrough
9:35 PAL LangChain PromptTemplate
11:57 GSM8k examples
12:55 Repeat Copy examples
14:06 Word Math Problems
16:00 Colored Objects