NEW WizardCoder Python 34B LLM is AMAZING!!!

1littlecoder · Intermediate · 🧠 Large Language Models · 2y ago

Skills: LLM Foundations 90% · Fine-tuning LLMs 80% · LLM Engineering 70%

Key Takeaways

The WizardCoder Python 34B model achieves a 73.2 pass@1 score on HumanEval, surpassing the score OpenAI initially reported for GPT-4 but not the 82 measured on GPT-4's latest API. The model is fine-tuned from Code Llama, which is itself built on Llama 2, and is available for download on Hugging Face.
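
For context on the 73.2 figure: pass@1 is, roughly, the share of benchmark problems for which a generated solution passes the unit tests. Below is a minimal sketch of the commonly used unbiased pass@k estimator, assuming `n` samples are generated per problem and `c` of them pass. The function names and the sample counts are illustrative only; this is not the WizardCoder team's evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when n - c < k, which makes the estimate 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results, k: int = 1) -> float:
    """Average pass@k over all problems, expressed as a percentage."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical results: (samples generated, samples that passed) per problem.
example = [(1, 1), (1, 0), (1, 1), (1, 1)]
print(benchmark_score(example, k=1))  # 75.0
```

With one sample per problem, pass@1 reduces to the plain fraction of problems solved, which is how a score like 73.2 can be read.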

Full Transcript

The WizardCoder Python 34-billion-parameter model is the latest model to join the family of WizardCoder models. The model has been in the news because there have been claims that it has knocked GPT-4 out of the park. That is partly true and partly not true, and in this video I'm going to explore why. We're also going to explore why this model has been discussed so much.

The first thing to start with: WizardCoder is a family of models. Every time a new model or a new dataset comes in, WizardCoder manages to update their existing model and stay at the top. Now WizardCoder has got its latest version, a Python-specific 34-billion-parameter model based on Code Llama. They've used Code Llama and fine-tuned their existing WizardCoder model on top of it, and that model is said to be beating GPT-4.

How is it beating GPT-4? Before I even get into the model itself, let's first clear this up. The news is that the WizardCoder Python 34-billion-parameter model achieved 73.2 on HumanEval pass@1, and that surpasses GPT-4. But how it surpasses GPT-4 is something you need to keep in mind. When GPT-4 was launched, OpenAI reported a particular HumanEval score, and that score was 67. If 67 is GPT-4's score, then you know for sure that WizardCoder and a lot of other models have beaten GPT-4. But is that the entire story? No. When the WizardCoder team tried to replicate the benchmark with the latest GPT-4 API, it scored 82. So has the WizardCoder Python 34-billion-parameter model scored more than 82? No, absolutely not. It has not scored more than 82.
It has surpassed what OpenAI reported at the start. So there are two things here, and let me make them clear. When OpenAI launched GPT-4, there was a score that they reported, and WizardCoder and a bunch of other models have overtaken it. When these researchers calculated the benchmark score for GPT-4 with the latest API, it scored 82, which is way above every open-source model that is available.

Having said this, if you set aside the headline that the WizardCoder 34-billion-parameter Python model has beaten OpenAI's GPT-4, it is still the best open Python or coding model that you could see. The primary reason is that it has been fine-tuned on Llama 2, or rather on the Code Llama model that was released a couple of days back. The much more promising aspect of this entire piece is that this is a 34-billion-parameter model. This is not a 70-billion-parameter model, and it is not a mixture of experts. It is a single 34-billion-parameter model that is doing extremely well on the HumanEval benchmark.

Now, if you ask me whether I trust every single benchmark that is available, I would say probably not. I don't trust benchmarks. I would expect people to put the models in the hands of other people for them to try out, and that is exactly what the team has done. They have uploaded the model to the Hugging Face Model Hub, and the checkpoint is available for you to download directly. You can go there and click "Files and versions". There are no terms of service and there is no form to fill in. If you have a big enough GPU, and I think you need a really good GPU, then you can try this model out directly.

I've heard from a couple of places like Hacker News that people feel this model and every other Code Llama derivative are doing really well, and they're happy with the performance they've gotten when they compare it with GPT-4. Unfortunately, I could not make the comparison myself, because the Gradio application was running forever. I waited a long time to see if the task would finish, but it didn't. So if you happen to run this model locally, please let me know what you think of it.

The fact that we have an open-source model that is way above every other open-source model is, first of all, a great advantage. This open-source model scored 73.2, which is a really good score. So the overall point is very simple: the WizardCoder Python 34-billion-parameter model is one of the best open-source models that you could use today, and it scores much higher than the Code Llama Python model. I think that's the power of fine-tuning here. But does it beat GPT-4? No, it doesn't beat GPT-4 in its current form. The fact that it comes close to GPT-4 is quite an amazing achievement.

Now that we have cleared up these questions, if you want to use this model yourself, all you have to do is download it from Hugging Face. I think this clears up the entire buzz around WizardCoder, or any other model, beating GPT-4. Whenever you see news that some model has beaten GPT-4, make sure you check which version of GPT-4 they are talking about, which benchmark they used, how they trained the model, and whether there has been data leakage. These are good things for you to check. These researchers have very honestly put out that they got 82 and 72.5 when they tested it themselves with the latest API, and I think that is commendable. I really want to appreciate them for that. The fact that the model is available as open source for you to try out is another cherry on top.

I just wanted to release this video and explain why everybody has been talking about beating GPT-4 and why it may not be entirely true. If you have any questions, let me know in the comment section. I'm definitely looking forward to trying out this model, comparing it with GPT-4, and giving you some insights. See you in another video. Happy prompting!
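
On the "big enough GPU" remark, a back-of-the-envelope estimate helps. The sketch below only counts the bytes needed to hold 34 billion weights at a few common precisions. It ignores activations, the KV cache, and framework overhead, so real requirements are higher. The precision table and the function name are assumptions made for illustration, not figures from the video.

```python
# Parameter count taken from the model name (34B).
NUM_PARAMS = 34e9

# Bytes used to store one weight at each precision (illustrative selection).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>10}: ~{weight_memory_gb(NUM_PARAMS, size):.0f} GB")
```

Even at half precision the weights alone come to roughly 68 GB, which is more than a single consumer GPU typically offers; quantized variants are the usual way around that.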

Original Description

WizardCoder-Python-34B-V1.0, which achieves 73.2 pass@1 and surpasses GPT4 (2023/03/15), ChatGPT-3.5, and Claude2
Two Evals - https://twitter.com/WizardLM_AI/status/1695396881218859374?s=20
WizardCoder Python on HF Model Link - https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0
❤️ If you want to support the channel ❤️ Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder

The WizardCoder Python 34B model achieves a high score on the HumanEval benchmark, but whether it beats GPT-4 depends on which GPT-4 score is used for comparison. The model is fine-tuned from Code Llama and is available for download on Hugging Face. This video explains the model's performance and its implications for the AI community.

Key Takeaways

Download the WizardCoder Python 34B model from Hugging Face
Fine-tune the model for specific tasks
Evaluate the model's performance on the HumanEval benchmark
Compare the model's performance with other LLMs such as GPT-4
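
The download step above can be sketched with the `huggingface_hub` library. This is a minimal sketch, assuming `huggingface_hub` is installed and that the repository id from the link in the video description still resolves, which the code does not check. `model_page` and `download_checkpoint` are hypothetical helper names, and given the parameter count the download is likely to be tens of gigabytes.

```python
# Repository id taken from the Hugging Face link in the video description.
REPO_ID = "WizardLM/WizardCoder-Python-34B-V1.0"


def model_page(repo_id: str = REPO_ID) -> str:
    """URL of the model page, where 'Files and versions' lists the checkpoint."""
    return f"https://huggingface.co/{repo_id}"


def download_checkpoint(repo_id: str = REPO_ID, local_dir=None) -> str:
    """Download every file in the repository and return the local folder path."""
    # Imported lazily so the helpers above work without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


print(model_page())
# Uncomment to start the (very large) download:
# path = download_checkpoint(local_dir="./wizardcoder-python-34b")
```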

💡 How the WizardCoder Python 34B model compares with GPT-4 depends on the benchmark and on which GPT-4 API version is measured. Fine-tuning and open release of the checkpoint are key factors in its success.
