Code your own YouTube AI assistant in Python
Key Takeaways
This video demonstrates how to build a Python workflow that extracts knowledge from YouTube videos using AssemblyAI's LeMUR and Anthropic's Claude 3.5 Sonnet large language model.
Full Transcript
In this video, we're going to build a Python workflow to answer questions from YouTube videos automatically. Why is this useful, you might ask? Well, firstly, you'll be able to save time by finding key information from the video. Number two, you'll be able to learn efficiently by quickly grasping or extracting the main points from the video's content. Thirdly, you'll be able to boost your productivity by automating the process of content research, which typically takes days or weeks to perform. You'll see that this project condenses cutting-edge AI into a few lines of Python code. And so, without further ado, let's dive in.

All right, so here is the question answering of a YouTube video using AssemblyAI's LeMUR model. You can follow along in this Jupyter notebook, and the link to it will be provided in the video description. We're going to use AssemblyAI for processing and analyzing the audio data, and its documentation will be consulted while building this Python workflow.

The schematic of what we're building today can be summarized in this illustration. Essentially, we're going to take a YouTube video by providing its URL, and then download the audio file using the yt-dlp Python library. Once we have the audio file, we're going to read it in using AssemblyAI, which will convert the audio file into a text transcript. The transcript will then be used as an input to the large language model, which is packaged as the LeMUR model by AssemblyAI. Then we're going to take in the question prompt as an input and generate the output, which is the answer to the question being asked. Under the hood, we're using Claude 3.5 Sonnet.

So I think we're ready to begin. Firstly, you want to sign up for an account. As I have already signed up, I'm able to access the API key, so I just click on Copy API key right here. Then, in the Colab notebook, you'll be able to put all of your API keys in the secrets management here. If you click on it, it will be expanded, and you're going to see that I have all of my API keys conveniently accessible here in Colab. I'm going to activate the API key for AssemblyAI, and the instructions for using the API key are described here.

So let's begin. Let's install the prerequisite libraries. Here we're going to install yt-dlp, which allows you to download the YouTube audio file, and we're also going to install assemblyai. You might notice that in prior videos I have already made tutorials on using AssemblyAI for transcribing audio files. Before, it was API access to the AssemblyAI platform, but for this tutorial we're doing that using the Python library from AssemblyAI. Here we're going to load the API key into aai.settings.api_key, and that will allow us to access the model.

Next, we're going to import the yt_dlp module and define a custom function that will allow us to download an MP3 audio file. Let's do that. As input, we're going to put in the URL of the YouTube video. Here, let me show you, is a YouTube video of Steve Jobs' Stanford commencement address in 2005. The video is 15 minutes long. We're going to put the URL in and then run this custom function, which will download the audio, and we're saving it locally here. Let's have a look. After a short moment, it's downloading. All right, it's finished. Let's have a look here in the directory. This is the audio file. It's an MP3 file, 20.7 megabytes.

Let's proceed to extracting the video title. So that's the video title. Firstly, we're going to generate the video title text, and then we're going to use that as an input, and then you'll be able to hear the audio that was downloaded directly in Colab. So please note that this is for educational purposes only. [Music] Okay, and so you're going to see that it works.

Now we're going to proceed to processing and analyzing the audio. In order to perform the question answering of the YouTube video, first we're going to transcribe the audio file, meaning that we're going to take the MP3 audio file and convert it into text format, which is the transcript. We're going to do that using the Transcriber class from the AssemblyAI Python library, saving it as the transcriber variable, and then using that together with the transcribe method to generate the transcript. Let's run it first. The transcript, along with the prompt, will then be used as input to the LeMUR model. Let's have a look at what the transcript looks like. It's probably an object. Yep, so it's an object.

The prompt that we're going to use, which is the question prompt, is: "What are the five key messages that Steve Jobs wanted to convey in the speech?" So these two will be used as inputs. Here we're going to use the lemur.task method on the transcript object that we created a few moments ago, with the prompt question and the transcript as the input. Let's run it, and in a few seconds it should generate the result, which will be the answer. There are other parameters that you could also try out, like the max output size, which relates to the length of the output response, and you could also play around with the temperature, which allows the large language model to be more creative in generating the response.

Let's have a look at the result. After the output, you'll be able to see the number of input tokens and output tokens that have been used: 275 output tokens and 2,956 input tokens for the 15-minute video. Now we're going to print the response, which is result.response. These are the recommendations that Steve Jobs gave in his video. The five key messages are: connect the dots, love what you do, learn from setbacks, live each day as if it were your last, and follow your heart and intuition. He closed the video by saying, "Stay hungry, stay foolish." So that's a pretty good summary of the video, and you can see that in only a few seconds you'll be able to get a grasp of the contents of the video.

Imagine that you have more than one video, say 10 or 100 videos, that you're going to use as a starting point for your research. You could harness this very simple workflow to help you out: you could compile hundreds of videos and then consolidate all of the lessons learned into a single corpus of text.

This code cell will allow you to format the response. Let's have a look at the output again. I'm just going to copy this and print it below, and it should wrap the words. There you go. You don't have to scroll left or right; the entire text will be conveniently word wrapped instead of being on the same line. And let's say that you want to delete the generated response from the AssemblyAI server. You could do that by using the purge_request_data method. Just run it, and it will be deleted from the server.

Let's have a look at other models that you could try out. Currently you're going to see that it has Basic, Claude 2, Claude 3.5, Claude 3, and the one that we've used is Claude 3.5 Sonnet, and there's also Mistral 7B as well. You're going to see here that in only a few lines of code you can generate the response output, which is the key messages from the Steve Jobs video.

Let's say that we want to have another prompt, so we're going to write a short blog of 500 words, and we're going to use that as the input. Let's have a look at the output tokens: 659, so it's much more than the previous one. Let's have a look at the blog. All right, so it's more or less expanding the key messages. That's the title of the blog, that's the introductory paragraph, and then here are some of the paragraphs on connecting the dots, loving what you do, and it also goes on to summarize the key messages here, along with a concluding paragraph. So this is pretty cool.

All of the references that you can use if you have any questions are provided here. This is the link to the LeMUR model, here's the specific page on asking questions about your audio data, and there's also the one on processing audio files. If you like this type of video, please check out the Data Professor YouTube channel, so click here to go to the Data Professor YouTube channel.

As you can see, in only a few lines of code and a very simple workflow, you'll be able to go from a YouTube video URL to audio file to transcript and then to the generated response by providing a simple question prompt. This is the beginning; you could think of it as starter code for you to build something much more complicated. Let me know in the comment section down below how you're going to build out your very own workflow. This Jupyter notebook is provided in the video description. Thanks for watching until the end of the video. If you watched this far, please drop a fire emoji so that we know that you're the real one. And as always, remember to hit that subscribe button, turn on notifications, and also share with your friends. And as always, the best way to learn data science or AI is to do data science or AI.
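The transcript mentions two follow-up ideas without showing code: word-wrapping a long response, and asking the same question of many videos to build a single corpus of text. The block below is a minimal sketch of both, not the notebook's actual code. The names `wrap_response`, `build_corpus`, and `answer_fn` are hypothetical; `answer_fn` stands in for whatever function takes a video URL and a prompt and returns the generated answer.

```python
import textwrap
from typing import Callable, Iterable


def wrap_response(text: str, width: int = 80) -> str:
    """Word-wrap each line of text so long answers need no sideways scrolling."""
    # Wrap line by line so existing line breaks in the answer are preserved.
    return "\n".join(textwrap.fill(line, width=width) for line in text.split("\n"))


def build_corpus(
    video_urls: Iterable[str],
    prompt: str,
    answer_fn: Callable[[str, str], str],
) -> str:
    """Ask the same question of every video and join the answers into one text.

    answer_fn is a hypothetical hook: it receives (url, prompt) and returns
    the answer for that video, for example by downloading, transcribing,
    and querying the language model.
    """
    sections = []
    for url in video_urls:
        answer = answer_fn(url, prompt)
        sections.append(f"Source: {url}\n{wrap_response(answer)}")
    return "\n\n".join(sections)
```

Passing the answering step in as a function keeps the loop independent of any particular transcription or language model service, so it can be tried out with a stub before any API credits are spent.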
Original Description
In this video, we're building a Python workflow that helps you extract knowledge from any YouTube video.
In a nutshell, the general workflow includes:
1. Extracting and downloading the audio from a YouTube video
2. Transcribing the audio into text form
3. Answering questions about the video using LeMUR from AssemblyAI, where under the hood Anthropic's Claude 3.5 Sonnet is used as the large language model.
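The three steps above can be sketched roughly as follows. This is an illustrative outline rather than the code from the linked repository: the function names `build_ydl_options`, `download_audio`, and `ask_video` are made up for this example, the yt-dlp options shown are one common way to request MP3 audio (conversion assumes ffmpeg is installed), and the AssemblyAI calls assume a version of the `assemblyai` Python SDK that still exposes LeMUR with the `claude3_5_sonnet` model option.

```python
from pathlib import Path


def build_ydl_options(output_dir: str = ".") -> dict:
    """Options asking yt-dlp for the best audio stream, converted to MP3."""
    return {
        "format": "bestaudio/best",
        # Output template: <output_dir>/<video title>.<extension>
        "outtmpl": str(Path(output_dir) / "%(title)s.%(ext)s"),
        # Audio extraction to MP3 requires ffmpeg on the system.
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
        "quiet": True,
    }


def download_audio(url: str, output_dir: str = ".") -> str:
    """Step 1: download the audio track and return the path to the MP3."""
    import yt_dlp  # third-party: pip install yt-dlp

    with yt_dlp.YoutubeDL(build_ydl_options(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        # prepare_filename gives the pre-conversion name; swap in .mp3.
        return str(Path(ydl.prepare_filename(info)).with_suffix(".mp3"))


def ask_video(audio_path: str, prompt: str, api_key: str) -> str:
    """Steps 2 and 3: transcribe the audio, then ask LeMUR a question."""
    import assemblyai as aai  # third-party: pip install assemblyai

    aai.settings.api_key = api_key
    transcript = aai.Transcriber().transcribe(audio_path)
    result = transcript.lemur.task(
        prompt,
        final_model=aai.LemurModel.claude3_5_sonnet,
    )
    return result.response


# Example usage (needs network access, ffmpeg, and a valid API key):
# audio = download_audio("<YouTube video URL>")
# print(ask_video(audio, "What are the five key messages of the speech?", "<API key>"))
```

The third-party imports sit inside the functions so the option-building helper can be inspected without either library installed.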
AssemblyAI has generously provided API credits for the tutorial and has agreed to provide $50 in free credits to viewers of this video:
🔑 Get your AssemblyAI API key https://www.assemblyai.com/?utm_source=youtube&utm_medium=influencer&utm_campaign=dataprofessor_aug24
📖 AssemblyAI Docs: https://www.assemblyai.com/docs/?utm_source=youtube&utm_medium=influencer&utm_campaign=dataprofessor_aug24
🐙 Code https://github.com/dataprofessor/assemblyai/
✨ Read Blog https://dataprofessor.beehiiv.com/p/i-coded-a-youtube-ai-assistant-that-boosted-my-productivity
----------------------------
Support my work:
👪 Join as Channel Member:
https://www.youtube.com/channel/UCV8e2g4IWQqK71bbzGDEI4Q/join
✉️ Newsletter http://newsletter.dataprofessor.org
📖 Join Medium to Read my Blogs https://data-professor.medium.com/membership
☕ Buy me a coffee https://www.buymeacoffee.com/dataprofessor
Recommended Resources
📚 Books https://kit.co/dataprofessor
😎 Taro (Tech Career Mentorship) https://www.jointaro.com/r/dataprofessor/
📜 Google Data Analytics Professional Certificate https://click.linksynergy.com/deeplink?id=PNeWWakF7rI&mid=40328&murl=https%3A%2F%2Fwww.coursera.org%2Fprofessional-certificates%2Fgoogle-data-analytics
🤔 Interview Query https://www.interviewquery.com/?ref=dataprofessor
🖥️ Stock photos, graphics and videos used on this channel https://1.envato.market/c/2346717/628379/4662
Subscribe:
🌟 Coding Professor https://www.youtube.com/channel/UCJzlfIoF8nmWqJIv_iWQVRw?sub_confirmation=1
🌟 Data Professor https://www.youtube.com/dataprofessor?sub_confirmation=1
Disclaimer:
Recomm