Prompt Engineering for Code Generation

Automata Learning Lab · Beginner ·🧠 Large Language Models ·2y ago

Skills: Prompt Craft90%LLM Engineering80%Advanced Prompting80%Fine-tuning LLMs70%Agent Foundations60%

Key Takeaways

The video demonstrates prompt engineering for code generation using large language models, specifically GPT-4, to create a simple Python function that performs operations between two numbers. It showcases the use of tools like GPT-4, CHP API, and unittest library to generate, execute, and test the generated code.

Full Transcript

in this video we're going to be looking at a code generation use case uh for prompt engineering now the problem we're going to be trying to T tackle is the problem of creating a simple python function that calculates uh uh some operation between two numbers all right and the reason why we Define a very simple task is because we're not really concerned about the code that's being generated but the process to use large language models to generate that code and how that can come about leveraging those steps that we've discussed previously so task definition to generate python code that can you know produce a function that calculates that takes in two integers and and a string that represents an operation and then Returns the result of that operation okay so for our evaluation metric uh we're going to be using test cases so essentially here there are many ways that we could do this and there are many different ways that we could evaluate the uh the quality let's say of the code generated however for to simplify things what I want is just to generate code to test it against some uh pretty welldefined set of test cases where if that function or that piece of code passes those test cases that means that we can trust that you know that function will be reliable this is not absolutely perfect obviously and there are many there are many ways that we could expand on top of this but but what I want to introduce is this concept of developing code by having test cases that have you know example inputs and outputs and then testing the code generated by some large language model by running the function generated by that model against the test cases predefined all right so in this case uh I'm going to set a set of test cases where the functions called calculate so it takes in uh a number like five and three and then uh the add symbol and the output the expected output is eight and you the same thing for a few examples and here we could have many many many examples and many different conditions where we want to see the function uh work well like for example uh handling other types of potential errors and exceptions Etc however to simplify things we're just going to have the test cases to be some simple use some simple examples all right and now we generate the prompts right and we're going to have two prompts one for the system message of the model and one for the prompt that will generate the code so for the system message of the model we're going to want a python code generation engine and we're going to say you'll be fed prompts with code descriptions or half finish code and generate the appropriate python code for the problem task described all right and the prompt to generate the code will be generate this entire python function and then I start the function and I continue the completion of the function by having these uh you know HTML style btics which uh is an approach that's known as structure prompting we've looked at this a little bit when we've discussed um output indicators so this is just a very fancy type of output indicator where I exemplify the structure in which I want the response to to be and this is actually uh inspired by a very famous tweet by Riley Goodside who is known um as the first ever prompt engineer and I really like some of his tweets I like how he thinks about prompt engineering you should check him out at atg Goodside if I'm not mistaken and this was inspired by his initial tweet uh for structured generation of python code using large language models so here I'm saying one line doc string for a python function to perform my arithmetic operations then a whitespace then some code and then the return statement for that function so the only python code inside of this uh prompt is this line and we want to generate some code out of that now this is an approach that I like because uh when we know what we want we can start the process for the model to simplify the completion remember when we talked about task specification that we want the prompt to constrain the model towards fulfilling the task right so this is what I'm trying to do here so this is going to be the prompt that we have have and then for the experimentation process it's going to be very simple so I have here the setup for the uh for calling the CHP API so I I input the system message for the model I input The Prompt and I get the answer and here I'll be using GPT 4 now notice here that we're going to be doing a simplified version and and we're not going to go through the process of setting parameters at the beginning and having an entire prompt engineering experiment Pipeline with tables and so on because we just want to see how the code generation would differ in terms of how we would inspect that the output is correct and you know potentially how an iteration would look like so this is going to be a very simplified example so I generate the python code right and if I print it with markdown this is what I get here's the complete python function as requested and I have here the function that uh was returned by the model all right so it's pretty good I'd say it's it looks pretty nice however ever you notice that this is marked down right which means that we need to extract just the python code from here there are many ways to do this uh we saw in the video about structured outputs that we could use pantic in link chain they have output parcel specialized for extracting python code from um uh from the outputs of a model like GPT chpt however to simplify things we're just going to do a very simple reject extraction of that python code and when we do that this is what it looks like if so I let's run this again so I'm running this not live because this is recorded but I'm running this right now and all right perfect so we get that function and now I'm going to generate the python code I'm going to get the python code and I'm going to just print it here so that you take a look so now we got just the python code perfect and now I'm going to execute the python code so so I'm going to be using Python's EAC method to execute the code that was generated by the model now remember this is for demonstration purposes and executing untrusted code is something that you have to be very careful about because that can lead to issues because the model might generate some code that you might not want to run on your machine so usually we would do this in an isolated environment or in a sandbox environment and there are many different approaches on how to set that up however just to simplify things this is just a simple calculation I'm going to be running this code right now and when I run the code it means that now I will have access in the environment of my uh jup notebook access to this function called calculate so when I run this I can now without having specified in a traditional jupter notebook cell I can run this code and test it to see if it's working so as you can see it's working great and it's making the calculation that I wanted so I don't need this one anymore and now that I have that function executed and implemented my environment I can actually um uh do a unit test to check that the um the function passes the test cases that I defined remember this is the evaluation metric this is what I use to define whether or not this is acceptable for use or not so when I run this I get a mistake because I didn't import unit test so let's import unit test perfect so uh when I run these test when I run this test I get that it works great and it passes my tests so this would be the criteria for me to say okay this is perfect this is actually uh running the test correctly now the way to iterate and evolve on this would be to if I want to add functionality I would add tests and then prompt the model to generate that function and then test it against the test that I prepared as the evaluation metric for that functionality and then incorporate that so have a model that can you know refactor the code and add it and add more stuff and then run a set of tests make sure that they pass the tests and so on and so forth until a point where I can you know make a pi push Etc uh so obviously um there's a lot of more complexity that we can add to that and there are a couple of papers that are really interesting one is called uh Aid driven development that came out earlier this year and another one that's being uh very popular recently is Alpha codium that essentially try to look at what an AI driven development pipeline would look like and it's extremely interesting because it builds on top of these simple ideas of having models to generate code and then having this code run against tests and then having a pipeline for checking that the the the code is appropriate and so on and so forth so I definitely recommend you check those papers out and that's it for this video uh on the next video we're going to be looking at a fun demo on how to understand research paper using uh prompt engineering so see you there cheers

Original Description

In this video, we delve into a fascinating code generation use case for prompt engineering. We tackle the task of creating a simple Python function that performs operations between two numbers using large language models. By setting up clear test cases, we evaluate the effectiveness of the generated code and ensure its reliability. We also touch on the iterative process of refining and expanding functionality through continuous testing. This insightful breakdown is perfect for anyone wanting to understand how to leverage prompt engineering for code generation. Thanks for watching! Cheers! 📚 Chapters: 00:00 - Introduction to Code Generation Use Case for Prompt Engineering 00:06 - Task Definition: Creating a Simple Python Function 00:17 - Importance of Process Over Code Generation 00:30 - Task Details: Inputs and Outputs of the Function 00:47 - Evaluation Metric: Using Test Cases 01:00 - Simplifying the Evaluation Process 01:31 - Developing Code with Test Cases 02:03 - Example Test Cases for the Function 02:28 - Generating Prompts 02:36 - System Message Setup for the Model 02:52 - Structure of the Code Generation Prompt 03:09 - Structure Prompting and Riley Goodside's Inspiration 03:54 - Explanation of the Code Inside the Prompt 04:12 - Using the CHP API and GPT-4 Setup 04:50 - Generating Python Code and Extracting It 05:29 - Printing and Executing the Generated Code 06:09 - Running the Generated Code 07:00 - Unit Testing to Validate the Generated Function 07:32 - Iteration and Evolution of the Code Using Tests 08:46 - Papers on AI-Driven Development Pipeline 09:24 - Conclusion and Preview of the Next Video 🔗 Links: - Source code: https://github.com/EnkrateiaLucca/oreilly-prompt-eng/blob/main/notebooks/3.2-code-generation-use-case.ipynb - Subscribe!: https://www.youtube.com/channel/UCu8WF59Scx9f3H1N_FgZUwQ - Tiktok: https://www.tiktok.com/@enkrateialucca?lang=en - Twitter: https://twitter.com/LucasEnkrateia - LinkedIn: https://www.linkedin.com/in/lucas-soares-96

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Automata Learning Lab · Automata Learning Lab · 0 of 60

← Previous Next →

A Quick Tutorial on NLP Basics

A Quick Tutorial on NLP Basics

Automata Learning Lab

Automating your Digital Morning Routine with Python

Automating your Digital Morning Routine with Python

Automata Learning Lab

Exploring Problem Solving with Python and Jupyter Notebook #1

Exploring Problem Solving with Python and Jupyter Notebook #1

Automata Learning Lab

Summarize Papers with Python and GPT-3

Summarize Papers with Python and GPT-3

Automata Learning Lab

An Experiment Tracking Tutorial with Mlflow and Keras

An Experiment Tracking Tutorial with Mlflow and Keras

Automata Learning Lab

Automating Google Forms Submissions with Python

Automating Google Forms Submissions with Python

Automata Learning Lab

Productivity Tracking With Python and the Notion API

Productivity Tracking With Python and the Notion API

Automata Learning Lab

When your Machine Learning Model Fails Do This ;p

When your Machine Learning Model Fails Do This ;p

Automata Learning Lab

Machine Learning Tip#1 Practical Deep Learning Course

Machine Learning Tip#1 Practical Deep Learning Course

Automata Learning Lab

Machine Learning Tips: Deep Learning Monitor

Machine Learning Tips: Deep Learning Monitor

Automata Learning Lab

Machine Learning Tips#5 MLOPs specialization in Coursera #machinelearning

Machine Learning Tips#5 MLOPs specialization in Coursera #machinelearning

Automata Learning Lab

Automatically Changing Desktop Wallpaper with Python and the Nasa Image API

Automatically Changing Desktop Wallpaper with Python and the Nasa Image API

Automata Learning Lab

Building an Image Classifier to Filter Out Unused Images From Your Photo Album with Machine Learning

Building an Image Classifier to Filter Out Unused Images From Your Photo Album with Machine Learning

Automata Learning Lab

Automating VS Code Snippets with Python

Automating VS Code Snippets with Python

Automata Learning Lab

How to Set Up a Machine Learning Environment with Conda and Pip-Tools

How to Set Up a Machine Learning Environment with Conda and Pip-Tools

Automata Learning Lab

9 Google Search Tips for Machine Learning

9 Google Search Tips for Machine Learning

Automata Learning Lab

Automata Learning Lab

Automating Car Search with Python and Data Science

Automating Car Search with Python and Data Science

Automata Learning Lab

Generating Images from Text with Stable Diffusion and Hugging Face

Generating Images from Text with Stable Diffusion and Hugging Face

Automata Learning Lab

A Practical Introduction to Data Science using the Spaceship Titanic Dataset from Kaggle

A Practical Introduction to Data Science using the Spaceship Titanic Dataset from Kaggle

Automata Learning Lab

Jiu Jitsu App with Python and Streamlit

Jiu Jitsu App with Python and Streamlit

Automata Learning Lab

2 Apps for Coding In The Ipad Pro

2 Apps for Coding In The Ipad Pro

Automata Learning Lab

From Tensorflow to Pytorch?

From Tensorflow to Pytorch?

Automata Learning Lab

Building an Audio Transcription App with OpenAI Whisper and Streamlit

Building an Audio Transcription App with OpenAI Whisper and Streamlit

Automata Learning Lab

Productivity Tracking with Python Short Summary

Productivity Tracking with Python Short Summary

Automata Learning Lab

Automating Expense Reports with Python

Automating Expense Reports with Python

Automata Learning Lab

ChatGPT, Angry Pandas and AI Code

ChatGPT, Angry Pandas and AI Code

Automata Learning Lab

7 Strategies To Learn Anything Using ChatGPT

7 Strategies To Learn Anything Using ChatGPT

Automata Learning Lab

Building a Thought Summarization App with Whisper and GPT3

Building a Thought Summarization App with Whisper and GPT3

Automata Learning Lab

Visualize a Neural Net Learning Polynomial Functions

Visualize a Neural Net Learning Polynomial Functions

Automata Learning Lab

Automating Notion with Python

Automating Notion with Python

Automata Learning Lab

Pose Tracking for Jiu Jitsu - Update #jiujitsu #machinelearning

Pose Tracking for Jiu Jitsu - Update #jiujitsu #machinelearning

Automata Learning Lab

Update to my Pose Tracking for Jiu Jitsu Project #machinelearning #jiujitsu #ai #deeplearning

Update to my Pose Tracking for Jiu Jitsu Project #machinelearning #jiujitsu #ai #deeplearning

Automata Learning Lab

ChatGPT API Released by OpenAI

ChatGPT API Released by OpenAI

Automata Learning Lab

ChatGPT API Response Format #machinelearning #ai #datascience

ChatGPT API Response Format #machinelearning #ai #datascience

Automata Learning Lab

Beyond Stable Diffusion with Composer | Automata Learning Lab Paper Series #1

Beyond Stable Diffusion with Composer | Automata Learning Lab Paper Series #1

Automata Learning Lab

Beyond Diffusion Models with Composer #machinelearning #ai

Beyond Diffusion Models with Composer #machinelearning #ai

Automata Learning Lab

Machine Learning for Jiu Jitsu

Machine Learning for Jiu Jitsu

Automata Learning Lab

Prompt Engineering Basics #machinelearning #gpt4 #chatgpt

Prompt Engineering Basics #machinelearning #gpt4 #chatgpt

Automata Learning Lab

Visual ChatGPT: Integrating Images with ChatGPT Paper Series#2

Visual ChatGPT: Integrating Images with ChatGPT Paper Series#2

Automata Learning Lab

Visual ChatGPT #machinelearning #ai #artificialintelligence

Visual ChatGPT #machinelearning #ai #artificialintelligence

Automata Learning Lab

LERF - Language Embeddings + NERF for Querying 3D Spaces #machinelearning #ai

LERF - Language Embeddings + NERF for Querying 3D Spaces #machinelearning #ai

Automata Learning Lab

Summarize Papers with Python and ChatGPT

Summarize Papers with Python and ChatGPT

Automata Learning Lab

Large Language Models can use Tools Now! #artificialintelligence #machinelearning #ai

Large Language Models can use Tools Now! #artificialintelligence #machinelearning #ai

Automata Learning Lab

Sparks of AGI in GPT4? #machinelearning #ai #agi #artificialintelligence

Sparks of AGI in GPT4? #machinelearning #ai #agi #artificialintelligence

Automata Learning Lab

Toolformer: LLMs can use Tools! #chatgpt #llms #gpt4 #gpt3 #artificialintelligence

Toolformer: LLMs can use Tools! #chatgpt #llms #gpt4 #gpt3 #artificialintelligence

Automata Learning Lab

Talking to Your Notes with LangChain #artificialintelligence #llms #gpt4 #chatgpt

Talking to Your Notes with LangChain #artificialintelligence #llms #gpt4 #chatgpt

Automata Learning Lab

How to Talk to a PDF using LangChain and ChatGPT

How to Talk to a PDF using LangChain and ChatGPT

Automata Learning Lab

Query Your Own Notes With LangChain

Query Your Own Notes With LangChain

Automata Learning Lab

HuggingGPT #machinelearning #artificialintelligence #huggingface #gpt4 #chatgpt

HuggingGPT #machinelearning #artificialintelligence #huggingface #gpt4 #chatgpt

Automata Learning Lab

Do as I Can Not as I Say Paper #artificialintelligence #llms #reinforcementlearning

Do as I Can Not as I Say Paper #artificialintelligence #llms #reinforcementlearning

Automata Learning Lab

Automating Anki Flashcards with OpenAI and GPT-4

Automating Anki Flashcards with OpenAI and GPT-4

Automata Learning Lab

Building A PDF Summarization App with Gradio and LangChain

Building A PDF Summarization App with Gradio and LangChain

Automata Learning Lab

Auto-GPT #artificialintelligence #gpt4 #llms #autogpt

Auto-GPT #artificialintelligence #gpt4 #llms #autogpt

Automata Learning Lab

DocGPT - Chat with Github #artificialintelligence #gpt4 #chatgpt

DocGPT - Chat with Github #artificialintelligence #gpt4 #chatgpt

Automata Learning Lab

LLMs for Research and Planning #artificialintelligence #gpt4 #llms

LLMs for Research and Planning #artificialintelligence #gpt4 #llms

Automata Learning Lab

How I Use ChatGPT for Interactive Language Learning

How I Use ChatGPT for Interactive Language Learning

Automata Learning Lab

Building an Audio Transcription App with Gradio and Whisper

Building an Audio Transcription App with Gradio and Whisper

Automata Learning Lab

Summarizing and Querying Multiple Papers with LangChain

Summarizing and Querying Multiple Papers with LangChain

Automata Learning Lab

Mojo - The New AI Programming Language?

Mojo - The New AI Programming Language?

Automata Learning Lab

This video teaches the basics of prompt engineering for code generation using large language models, including how to craft effective prompts, generate and execute code, and test and evaluate the results. It also showcases the use of various tools and technologies, such as GPT-4 and unittest library, to support the code generation process.

Key Takeaways

Generate Python code using GPT-4
Extract Python code from markdown output using regex
Execute generated Python code using Python's EAC method
Run unit tests on generated code using unittest library
Imports unit test to fix mistake
Adds tests and prompts model to generate code and test it
Refactors code and adds functionality
Runs tests to ensure code is correct

💡 The use of unit tests as an evaluation metric for code generation can significantly improve the quality and reliability of the generated code.

🔒 Pro feature: Ask AI to explain this lesson →

More on: Prompt Craft

View skill →

Build Hour: Prompt Caching

Build Hour: Prompt Caching

Advanced Prompt Engineering Course

Advanced Prompt Engineering Course

Organizing Your AI Prompts with Jinja Templates with ChatGPT & OpenAI

Organizing Your AI Prompts with Jinja Templates with ChatGPT & OpenAI

Automata Learning Lab

Creating a Game Prototype with Amazon Q and Amazon Bedrock (Prompt Engineering on AWS)

Creating a Game Prototype with Amazon Q and Amazon Bedrock (Prompt Engineering on AWS)

Switch from ChatGPT to Claude in 5 Minutes (Without Losing Your Memory)

Switch from ChatGPT to Claude in 5 Minutes (Without Losing Your Memory)

Create End to End AI Chatbot using Lovable.dev in 5 Mins!

Create End to End AI Chatbot using Lovable.dev in 5 Mins!

Related AI Lessons

I Asked ChatGPT to Fix My Life. It Couldn’t — Until I Changed One Thing

Learn how to effectively use AI like ChatGPT to improve your life by changing your approach

I Asked ChatGPT to Fix My Life. It Couldn’t — Until I Changed One Thing

Learn how to effectively use ChatGPT to solve personal problems by changing your approach

Medium · ChatGPT

Claude Sonnet 5 Is Here: Why It Might Replace Your Opus Subscription

Learn about Claude Sonnet 5, a new AI model that offers near-flagship performance at a lower price, and its potential to replace Opus subscriptions

Medium · Programming

Introducing Claude Sonnet 5 on AWS: Anthropic’s most capable Sonnet model

Learn about Claude Sonnet 5, Anthropic's most advanced Sonnet model, now available on AWS, and how it delivers top-tier intelligence for coding, agents, and professional tasks

AWS Machine Learning

Chapters (21)

Introduction to Code Generation Use Case for Prompt Engineering

0:06 Task Definition: Creating a Simple Python Function

0:17 Importance of Process Over Code Generation

0:30 Task Details: Inputs and Outputs of the Function

0:47 Evaluation Metric: Using Test Cases

1:00 Simplifying the Evaluation Process

1:31 Developing Code with Test Cases

2:03 Example Test Cases for the Function

2:28 Generating Prompts

2:36 System Message Setup for the Model

2:52 Structure of the Code Generation Prompt

3:09 Structure Prompting and Riley Goodside's Inspiration

3:54 Explanation of the Code Inside the Prompt

4:12 Using the CHP API and GPT-4 Setup

4:50 Generating Python Code and Extracting It

5:29 Printing and Executing the Generated Code

6:09 Running the Generated Code

7:00 Unit Testing to Validate the Generated Function

7:32 Iteration and Evolution of the Code Using Tests

8:46 Papers on AI-Driven Development Pipeline

9:24 Conclusion and Preview of the Next Video

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar (LLM Eng)