Prompt Engineering for Code Generation

Automata Learning Lab · Beginner ·🧠 Large Language Models ·2y ago

Key Takeaways

The video demonstrates prompt engineering for code generation using large language models, specifically GPT-4, to create a simple Python function that performs operations between two numbers. It showcases the use of tools like GPT-4, CHP API, and unittest library to generate, execute, and test the generated code.

Full Transcript

in this video we're going to be looking at a code generation use case uh for prompt engineering now the problem we're going to be trying to T tackle is the problem of creating a simple python function that calculates uh uh some operation between two numbers all right and the reason why we Define a very simple task is because we're not really concerned about the code that's being generated but the process to use large language models to generate that code and how that can come about leveraging those steps that we've discussed previously so task definition to generate python code that can you know produce a function that calculates that takes in two integers and and a string that represents an operation and then Returns the result of that operation okay so for our evaluation metric uh we're going to be using test cases so essentially here there are many ways that we could do this and there are many different ways that we could evaluate the uh the quality let's say of the code generated however for to simplify things what I want is just to generate code to test it against some uh pretty welldefined set of test cases where if that function or that piece of code passes those test cases that means that we can trust that you know that function will be reliable this is not absolutely perfect obviously and there are many there are many ways that we could expand on top of this but but what I want to introduce is this concept of developing code by having test cases that have you know example inputs and outputs and then testing the code generated by some large language model by running the function generated by that model against the test cases predefined all right so in this case uh I'm going to set a set of test cases where the functions called calculate so it takes in uh a number like five and three and then uh the add symbol and the output the expected output is eight and you the same thing for a few examples and here we could have many many many examples and many different conditions where we want to see the function uh work well like for example uh handling other types of potential errors and exceptions Etc however to simplify things we're just going to have the test cases to be some simple use some simple examples all right and now we generate the prompts right and we're going to have two prompts one for the system message of the model and one for the prompt that will generate the code so for the system message of the model we're going to want a python code generation engine and we're going to say you'll be fed prompts with code descriptions or half finish code and generate the appropriate python code for the problem task described all right and the prompt to generate the code will be generate this entire python function and then I start the function and I continue the completion of the function by having these uh you know HTML style btics which uh is an approach that's known as structure prompting we've looked at this a little bit when we've discussed um output indicators so this is just a very fancy type of output indicator where I exemplify the structure in which I want the response to to be and this is actually uh inspired by a very famous tweet by Riley Goodside who is known um as the first ever prompt engineer and I really like some of his tweets I like how he thinks about prompt engineering you should check him out at atg Goodside if I'm not mistaken and this was inspired by his initial tweet uh for structured generation of python code using large language models so here I'm saying one line doc string for a python function to perform my arithmetic operations then a whitespace then some code and then the return statement for that function so the only python code inside of this uh prompt is this line and we want to generate some code out of that now this is an approach that I like because uh when we know what we want we can start the process for the model to simplify the completion remember when we talked about task specification that we want the prompt to constrain the model towards fulfilling the task right so this is what I'm trying to do here so this is going to be the prompt that we have have and then for the experimentation process it's going to be very simple so I have here the setup for the uh for calling the CHP API so I I input the system message for the model I input The Prompt and I get the answer and here I'll be using GPT 4 now notice here that we're going to be doing a simplified version and and we're not going to go through the process of setting parameters at the beginning and having an entire prompt engineering experiment Pipeline with tables and so on because we just want to see how the code generation would differ in terms of how we would inspect that the output is correct and you know potentially how an iteration would look like so this is going to be a very simplified example so I generate the python code right and if I print it with markdown this is what I get here's the complete python function as requested and I have here the function that uh was returned by the model all right so it's pretty good I'd say it's it looks pretty nice however ever you notice that this is marked down right which means that we need to extract just the python code from here there are many ways to do this uh we saw in the video about structured outputs that we could use pantic in link chain they have output parcel specialized for extracting python code from um uh from the outputs of a model like GPT chpt however to simplify things we're just going to do a very simple reject extraction of that python code and when we do that this is what it looks like if so I let's run this again so I'm running this not live because this is recorded but I'm running this right now and all right perfect so we get that function and now I'm going to generate the python code I'm going to get the python code and I'm going to just print it here so that you take a look so now we got just the python code perfect and now I'm going to execute the python code so so I'm going to be using Python's EAC method to execute the code that was generated by the model now remember this is for demonstration purposes and executing untrusted code is something that you have to be very careful about because that can lead to issues because the model might generate some code that you might not want to run on your machine so usually we would do this in an isolated environment or in a sandbox environment and there are many different approaches on how to set that up however just to simplify things this is just a simple calculation I'm going to be running this code right now and when I run the code it means that now I will have access in the environment of my uh jup notebook access to this function called calculate so when I run this I can now without having specified in a traditional jupter notebook cell I can run this code and test it to see if it's working so as you can see it's working great and it's making the calculation that I wanted so I don't need this one anymore and now that I have that function executed and implemented my environment I can actually um uh do a unit test to check that the um the function passes the test cases that I defined remember this is the evaluation metric this is what I use to define whether or not this is acceptable for use or not so when I run this I get a mistake because I didn't import unit test so let's import unit test perfect so uh when I run these test when I run this test I get that it works great and it passes my tests so this would be the criteria for me to say okay this is perfect this is actually uh running the test correctly now the way to iterate and evolve on this would be to if I want to add functionality I would add tests and then prompt the model to generate that function and then test it against the test that I prepared as the evaluation metric for that functionality and then incorporate that so have a model that can you know refactor the code and add it and add more stuff and then run a set of tests make sure that they pass the tests and so on and so forth until a point where I can you know make a pi push Etc uh so obviously um there's a lot of more complexity that we can add to that and there are a couple of papers that are really interesting one is called uh Aid driven development that came out earlier this year and another one that's being uh very popular recently is Alpha codium that essentially try to look at what an AI driven development pipeline would look like and it's extremely interesting because it builds on top of these simple ideas of having models to generate code and then having this code run against tests and then having a pipeline for checking that the the the code is appropriate and so on and so forth so I definitely recommend you check those papers out and that's it for this video uh on the next video we're going to be looking at a fun demo on how to understand research paper using uh prompt engineering so see you there cheers

Original Description

In this video, we delve into a fascinating code generation use case for prompt engineering. We tackle the task of creating a simple Python function that performs operations between two numbers using large language models. By setting up clear test cases, we evaluate the effectiveness of the generated code and ensure its reliability. We also touch on the iterative process of refining and expanding functionality through continuous testing. This insightful breakdown is perfect for anyone wanting to understand how to leverage prompt engineering for code generation. Thanks for watching! Cheers! 📚 Chapters: 00:00 - Introduction to Code Generation Use Case for Prompt Engineering 00:06 - Task Definition: Creating a Simple Python Function 00:17 - Importance of Process Over Code Generation 00:30 - Task Details: Inputs and Outputs of the Function 00:47 - Evaluation Metric: Using Test Cases 01:00 - Simplifying the Evaluation Process 01:31 - Developing Code with Test Cases 02:03 - Example Test Cases for the Function 02:28 - Generating Prompts 02:36 - System Message Setup for the Model 02:52 - Structure of the Code Generation Prompt 03:09 - Structure Prompting and Riley Goodside's Inspiration 03:54 - Explanation of the Code Inside the Prompt 04:12 - Using the CHP API and GPT-4 Setup 04:50 - Generating Python Code and Extracting It 05:29 - Printing and Executing the Generated Code 06:09 - Running the Generated Code 07:00 - Unit Testing to Validate the Generated Function 07:32 - Iteration and Evolution of the Code Using Tests 08:46 - Papers on AI-Driven Development Pipeline 09:24 - Conclusion and Preview of the Next Video 🔗 Links: - Source code: https://github.com/EnkrateiaLucca/oreilly-prompt-eng/blob/main/notebooks/3.2-code-generation-use-case.ipynb - Subscribe!: https://www.youtube.com/channel/UCu8WF59Scx9f3H1N_FgZUwQ - Tiktok: https://www.tiktok.com/@enkrateialucca?lang=en - Twitter: https://twitter.com/LucasEnkrateia - LinkedIn: https://www.linkedin.com/in/lucas-soares-96
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Automata Learning Lab · Automata Learning Lab · 0 of 60

← Previous Next →
1 A Quick Tutorial on NLP Basics
A Quick Tutorial on NLP Basics
Automata Learning Lab
2 Automating your Digital Morning Routine with Python
Automating your Digital Morning Routine with Python
Automata Learning Lab
3 Exploring Problem Solving with Python and Jupyter Notebook #1
Exploring Problem Solving with Python and Jupyter Notebook #1
Automata Learning Lab
4 Summarize Papers with Python and GPT-3
Summarize Papers with Python and GPT-3
Automata Learning Lab
5 An Experiment Tracking Tutorial with Mlflow and Keras
An Experiment Tracking Tutorial with Mlflow and Keras
Automata Learning Lab
6 Automating Google Forms Submissions with Python
Automating Google Forms Submissions with Python
Automata Learning Lab
7 Productivity Tracking With Python and the Notion API
Productivity Tracking With Python and the Notion API
Automata Learning Lab
8 When your Machine Learning Model Fails Do This ;p
When your Machine Learning Model Fails Do This ;p
Automata Learning Lab
9 Machine Learning Tip#1 Practical Deep Learning Course
Machine Learning Tip#1 Practical Deep Learning Course
Automata Learning Lab
10 Machine Learning Tips: Deep Learning Monitor
Machine Learning Tips: Deep Learning Monitor
Automata Learning Lab
11 Machine Learning Tips#5 MLOPs specialization in Coursera #machinelearning
Machine Learning Tips#5 MLOPs specialization in Coursera #machinelearning
Automata Learning Lab
12 Automatically Changing Desktop Wallpaper with Python and the Nasa Image API
Automatically Changing Desktop Wallpaper with Python and the Nasa Image API
Automata Learning Lab
13 Building an Image Classifier to Filter Out Unused Images From Your Photo Album with Machine Learning
Building an Image Classifier to Filter Out Unused Images From Your Photo Album with Machine Learning
Automata Learning Lab
14 Automating VS Code Snippets with Python
Automating VS Code Snippets with Python
Automata Learning Lab
15 How to Set Up a Machine Learning Environment with Conda and Pip-Tools
How to Set Up a Machine Learning Environment with Conda and Pip-Tools
Automata Learning Lab
16 9 Google Search Tips for Machine Learning
9 Google Search Tips for Machine Learning
Automata Learning Lab
17 Thinking Tools
Thinking Tools
Automata Learning Lab
18 Automating Car Search with Python and Data Science
Automating Car Search with Python and Data Science
Automata Learning Lab
19 Generating Images from Text with Stable Diffusion and Hugging Face
Generating Images from Text with Stable Diffusion and Hugging Face
Automata Learning Lab
20 A Practical Introduction to Data Science using the Spaceship Titanic Dataset from Kaggle
A Practical Introduction to Data Science using the Spaceship Titanic Dataset from Kaggle
Automata Learning Lab
21 Jiu Jitsu App with Python and Streamlit
Jiu Jitsu App with Python and Streamlit
Automata Learning Lab
22 2 Apps for Coding In The Ipad Pro
2 Apps for Coding In The Ipad Pro
Automata Learning Lab
23 From Tensorflow to Pytorch?
From Tensorflow to Pytorch?
Automata Learning Lab
24 Building an Audio Transcription App with OpenAI Whisper and Streamlit
Building an Audio Transcription App with OpenAI Whisper and Streamlit
Automata Learning Lab
25 Productivity Tracking with Python Short Summary
Productivity Tracking with Python Short Summary
Automata Learning Lab
26 Automating Expense Reports with Python
Automating Expense Reports with Python
Automata Learning Lab
27 ChatGPT, Angry Pandas and AI Code
ChatGPT, Angry Pandas and AI Code
Automata Learning Lab
28 7 Strategies To Learn Anything Using ChatGPT
7 Strategies To Learn Anything Using ChatGPT
Automata Learning Lab
29 Building a Thought Summarization App with Whisper and GPT3
Building a Thought Summarization App with Whisper and GPT3
Automata Learning Lab
30 Visualize a Neural Net Learning Polynomial Functions
Visualize a Neural Net Learning Polynomial Functions
Automata Learning Lab
31 Automating Notion with Python
Automating Notion with Python
Automata Learning Lab
32 Pose Tracking for Jiu Jitsu - Update #jiujitsu #machinelearning
Pose Tracking for Jiu Jitsu - Update #jiujitsu #machinelearning
Automata Learning Lab
33 Update to my Pose Tracking for Jiu Jitsu Project #machinelearning #jiujitsu #ai #deeplearning
Update to my Pose Tracking for Jiu Jitsu Project #machinelearning #jiujitsu #ai #deeplearning
Automata Learning Lab
34 ChatGPT API Released by OpenAI
ChatGPT API Released by OpenAI
Automata Learning Lab
35 ChatGPT API Response Format #machinelearning #ai #datascience
ChatGPT API Response Format #machinelearning #ai #datascience
Automata Learning Lab
36 Beyond Stable Diffusion with Composer | Automata Learning Lab Paper Series #1
Beyond Stable Diffusion with Composer | Automata Learning Lab Paper Series #1
Automata Learning Lab
37 Beyond Diffusion Models with Composer #machinelearning #ai
Beyond Diffusion Models with Composer #machinelearning #ai
Automata Learning Lab
38 Machine Learning for Jiu Jitsu
Machine Learning for Jiu Jitsu
Automata Learning Lab
39 Prompt Engineering Basics #machinelearning #gpt4 #chatgpt
Prompt Engineering Basics #machinelearning #gpt4 #chatgpt
Automata Learning Lab
40 Visual ChatGPT: Integrating Images with ChatGPT Paper Series#2
Visual ChatGPT: Integrating Images with ChatGPT Paper Series#2
Automata Learning Lab
41 Visual ChatGPT #machinelearning #ai #artificialintelligence
Visual ChatGPT #machinelearning #ai #artificialintelligence
Automata Learning Lab
42 LERF - Language Embeddings + NERF for Querying 3D Spaces #machinelearning #ai
LERF - Language Embeddings + NERF for Querying 3D Spaces #machinelearning #ai
Automata Learning Lab
43 Summarize Papers with Python and ChatGPT
Summarize Papers with Python and ChatGPT
Automata Learning Lab
44 Large Language Models can use Tools Now! #artificialintelligence #machinelearning #ai
Large Language Models can use Tools Now! #artificialintelligence #machinelearning #ai
Automata Learning Lab
45 Sparks of AGI in GPT4? #machinelearning #ai #agi #artificialintelligence
Sparks of AGI in GPT4? #machinelearning #ai #agi #artificialintelligence
Automata Learning Lab
46 Toolformer: LLMs can use Tools! #chatgpt #llms #gpt4 #gpt3 #artificialintelligence
Toolformer: LLMs can use Tools! #chatgpt #llms #gpt4 #gpt3 #artificialintelligence
Automata Learning Lab
47 Talking to Your Notes with LangChain #artificialintelligence #llms #gpt4 #chatgpt
Talking to Your Notes with LangChain #artificialintelligence #llms #gpt4 #chatgpt
Automata Learning Lab
48 How to Talk to a PDF using LangChain and ChatGPT
How to Talk to a PDF using LangChain and ChatGPT
Automata Learning Lab
49 Query Your Own Notes With LangChain
Query Your Own Notes With LangChain
Automata Learning Lab
50 HuggingGPT #machinelearning #artificialintelligence #huggingface #gpt4 #chatgpt
HuggingGPT #machinelearning #artificialintelligence #huggingface #gpt4 #chatgpt
Automata Learning Lab
51 Do as I Can Not as I Say Paper #artificialintelligence #llms #reinforcementlearning
Do as I Can Not as I Say Paper #artificialintelligence #llms #reinforcementlearning
Automata Learning Lab
52 Automating Anki Flashcards with OpenAI and GPT-4
Automating Anki Flashcards with OpenAI and GPT-4
Automata Learning Lab
53 Building A PDF Summarization App with  Gradio and LangChain
Building A PDF Summarization App with Gradio and LangChain
Automata Learning Lab
54 Auto-GPT #artificialintelligence #gpt4 #llms #autogpt
Auto-GPT #artificialintelligence #gpt4 #llms #autogpt
Automata Learning Lab
55 DocGPT - Chat with Github #artificialintelligence #gpt4 #chatgpt
DocGPT - Chat with Github #artificialintelligence #gpt4 #chatgpt
Automata Learning Lab
56 LLMs for Research and Planning #artificialintelligence #gpt4 #llms
LLMs for Research and Planning #artificialintelligence #gpt4 #llms
Automata Learning Lab
57 How I Use ChatGPT for Interactive Language Learning
How I Use ChatGPT for Interactive Language Learning
Automata Learning Lab
58 Building an Audio Transcription App with Gradio and Whisper
Building an Audio Transcription App with Gradio and Whisper
Automata Learning Lab
59 Summarizing and Querying Multiple Papers with LangChain
Summarizing and Querying Multiple Papers with LangChain
Automata Learning Lab
60 Mojo - The New AI Programming Language?
Mojo - The New AI Programming Language?
Automata Learning Lab

This video teaches the basics of prompt engineering for code generation using large language models, including how to craft effective prompts, generate and execute code, and test and evaluate the results. It also showcases the use of various tools and technologies, such as GPT-4 and unittest library, to support the code generation process.

Key Takeaways
  1. Generate Python code using GPT-4
  2. Extract Python code from markdown output using regex
  3. Execute generated Python code using Python's EAC method
  4. Run unit tests on generated code using unittest library
  5. Imports unit test to fix mistake
  6. Adds tests and prompts model to generate code and test it
  7. Refactors code and adds functionality
  8. Runs tests to ensure code is correct
💡 The use of unit tests as an evaluation metric for code generation can significantly improve the quality and reliability of the generated code.

Related AI Lessons

I Asked ChatGPT to Fix My Life. It Couldn’t — Until I Changed One Thing
Learn how to effectively use AI like ChatGPT to improve your life by changing your approach
Medium · AI
I Asked ChatGPT to Fix My Life. It Couldn’t — Until I Changed One Thing
Learn how to effectively use ChatGPT to solve personal problems by changing your approach
Medium · ChatGPT
Claude Sonnet 5 Is Here: Why It Might Replace Your Opus Subscription
Learn about Claude Sonnet 5, a new AI model that offers near-flagship performance at a lower price, and its potential to replace Opus subscriptions
Medium · Programming
Introducing Claude Sonnet 5 on AWS: Anthropic’s most capable Sonnet model
Learn about Claude Sonnet 5, Anthropic's most advanced Sonnet model, now available on AWS, and how it delivers top-tier intelligence for coding, agents, and professional tasks
AWS Machine Learning

Chapters (21)

Introduction to Code Generation Use Case for Prompt Engineering
0:06 Task Definition: Creating a Simple Python Function
0:17 Importance of Process Over Code Generation
0:30 Task Details: Inputs and Outputs of the Function
0:47 Evaluation Metric: Using Test Cases
1:00 Simplifying the Evaluation Process
1:31 Developing Code with Test Cases
2:03 Example Test Cases for the Function
2:28 Generating Prompts
2:36 System Message Setup for the Model
2:52 Structure of the Code Generation Prompt
3:09 Structure Prompting and Riley Goodside's Inspiration
3:54 Explanation of the Code Inside the Prompt
4:12 Using the CHP API and GPT-4 Setup
4:50 Generating Python Code and Extracting It
5:29 Printing and Executing the Generated Code
6:09 Running the Generated Code
7:00 Unit Testing to Validate the Generated Function
7:32 Iteration and Evolution of the Code Using Tests
8:46 Papers on AI-Driven Development Pipeline
9:24 Conclusion and Preview of the Next Video
Up next
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)
Watch →