Sentiment Analysis with Transformers in Python

NeuralNine · Beginner ·🧠 Large Language Models ·2y ago

Key Takeaways

The video demonstrates sentiment analysis using HuggingFace transformers in Python, specifically with the DistilBERT model and a dataset from Kaggle.

Full Transcript

What is going on, guys? Welcome back. In this video today, we're going to learn how to easily do sentiment analysis using a BERT model from Hugging Face. So let us get right into it. [Music]

All right, so we're going to do sentiment analysis in Python today using a Transformer model from Hugging Face. In particular, we're going to use this one here, distilbert-base-uncased-finetuned-sst-2-english, and we're going to use it through the Transformers package in Python. It comes with a tokenizer and a model, and we can use it to classify text as either positive or negative regarding the sentiment.

Now, whatever data you want to use for this video is up to you. You can use a Kaggle data set, you can use your own data, you can use Amazon reviews, movie reviews, news articles, or, in my case here, I'm going to use the top 20 Play Store app reviews, in particular the Dropbox CSV file, just so I have some text that I can use. The important thing is that you have a list of texts that you want to analyze in terms of the sentiment.

Once you have that, we can open up a notebook here, an IPython notebook, so a Jupyter notebook basically, but you can also do that in an ordinary Python file. What we need to install are a couple of packages using pip: first of all transformers, then torch, numpy, and pandas. These are the packages that we need for this video today, so if you don't have them on your system, make sure you install them.

What we're going to start with is we're going to say import pandas as pd, and we're going to load the data frame into our code. So let me just zoom in a little bit here. We're going to do a read_csv, and the CSV is going to be the Dropbox CSV file in my case here. You can see we have review IDs, we have some content, and we have a score. Now, the score can be used for comparison, because five is usually very positive and one is usually very negative. So if they're the same most of the time, so if you get a positive for the fives and a negative for the ones, it seems to work quite well.

What we're going to do here now, since we have a couple of rows, is I'm just going to limit this for demonstration purposes to a sample size of just 200. So we're going to say df = df.sample(200), and then we basically have the same structure, but we only have 200 rows. Now let me maybe do that again so we have some, yeah, I mean, it's fine. Okay.

All right, so this is our data frame now. What we're going to do to actually do the classification is we're going to say from transformers import pipeline, so we're going to model this as a pipeline, and from transformers we're going to import Distil, come on, where's the auto-completion when I need it? Okay, I'm going to type this myself then: DistilBertTokenizer, [Music] DistilBertForSequenceClassification.

Then we're going to do the following. We're going to say tokenizer equals DistilBertTokenizer.from_pretrained, so we're going to load the pre-trained model from Hugging Face, which is going to be exactly this thing here. You basically just have to copy this and you can paste it here as a string. I don't even think that we need this; we can just provide this, I think, and then it also works. Then we're going to do the same thing now for the model itself. The model is going to be from_pretrained, but it's not going to be the tokenizer, it's going to be DistilBertForSequenceClassification.

Now the pipeline: we're going to call it nlp, and we're going to say it's a pipeline that has the name sentiment-analysis, and what we're going to say is that the model is the model and the tokenizer is the tokenizer, like this. Then all you have to do, basically, to do the classification is you have to get the list of texts. So texts is going to be equal to df.content.values, and I think it should be a list, a Python list. Then we just have to say results equals nlp applied to the texts, like this, and then it's going to do some work, probably with a GPU.

Once it's done, we can go ahead and say for text, result, and score in zip, and now we're going to zip the texts, we're going to zip the results, and we're going to zip the df.score.values, just so we have a comparison, and we're going to print this information. So text is text, and then we can copy this: result and score. There you go.

What you can see here is, for each of the 200 comments, we have the text, we have the result containing a label, being positive for example, and a score, which is the confidence as far as I know. And you can see, for example, let's go to an obvious one: "lose my old photos collection automatically password reset problem please Dropbox team help for recover my account". Okay, that's not very good English, but it's obviously a negative review. "Really impressive I thought" and so on, it's going to be a positive one.

Okay, so to get a better overview maybe of this, let's go ahead and add this as a row, so let's take the label of the result and add it to each row in the data frame. So we're going to say df['sentiment'] is going to be equal to, basically, r['label'], so this is a list comprehension, for r in result, our results, plural here, and that should actually be it. There you go.

So you can see you get a negative review, a positive review, and most of the time it seems to be pretty accurate. Even something like "why can't I edit EXO files": it doesn't really contain any negative word per se, so it doesn't say it's bad or horrible or unsatisfying, it asks a question, but somehow it recognizes that this is negative, and it's also true. You can also see, whatever this means, "menab" is positive, and it also has a score of five. "Cracked" is negative. So it seems to be very accurate when it comes to that. So this is how easily you can do sentiment analysis using Transformer models from Hugging Face.

So that's it for today's video. I hope you enjoyed it and hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below. And of course, don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye.
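
The workflow described in the transcript can be sketched roughly as follows. This is a minimal sketch, not the exact code typed in the video: the function names (build_classifier, classify_texts, label_reviews) are hypothetical, the model ID is taken from the model URL in the video description, and the column name "content" follows what the transcript mentions. The truncation=True argument and the fixed random seed are additions that the video does not show.

```python
# Model ID as it appears in the model URL of the video description.
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"


def build_classifier():
    """Build the sentiment pipeline from an explicit tokenizer and model.

    The imports are local so that the helper below can be used (and tested)
    without transformers/torch installed. The model weights are downloaded
    from the Hugging Face Hub on first use.
    """
    from transformers import (
        DistilBertForSequenceClassification,
        DistilBertTokenizer,
        pipeline,
    )

    tokenizer = DistilBertTokenizer.from_pretrained(MODEL_ID)
    model = DistilBertForSequenceClassification.from_pretrained(MODEL_ID)
    return pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)


def classify_texts(texts, classifier):
    """Return a (label, score) tuple for every text.

    `classifier` is any callable that behaves like the pipeline: it takes a
    list of strings and returns one {"label": ..., "score": ...} dict each.
    """
    texts = [str(t) for t in texts]
    # truncation=True is an addition here: it cuts off inputs that are
    # longer than the model's maximum sequence length.
    results = classifier(texts, truncation=True)
    return [(r["label"], float(r["score"])) for r in results]


def label_reviews(csv_path, sample_size=200, seed=0):
    """Load the reviews CSV, sample rows, and attach predicted sentiment."""
    import pandas as pd

    # Assumes the CSV has a "content" column, as described in the transcript.
    df = pd.read_csv(csv_path).dropna(subset=["content"])
    df = df.sample(n=min(sample_size, len(df)), random_state=seed).copy()

    labeled = classify_texts(df["content"].tolist(), build_classifier())
    df["sentiment"] = [label for label, _ in labeled]
    df["confidence"] = [score for _, score in labeled]
    return df
```

Calling label_reviews("Dropbox.csv") would then return the sampled data frame with two extra columns. Keeping classify_texts independent of pandas and transformers makes it easy to try the logic with a stand-in classifier before downloading the model.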

Original Description

In this video we learn how to do sentiment analysis with HuggingFace transformers in Python.
Model: https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english
Dataset: https://www.kaggle.com/datasets/odins0n/top-20-play-store-app-reviews-daily-update?select=Dropbox.csv


This video shows how to perform sentiment analysis in Python with the Hugging Face transformers library. It loads a DistilBERT model that has already been fine-tuned on SST-2, wraps it in a sentiment-analysis pipeline, and applies it to a sample of 200 Dropbox reviews from a Kaggle dataset of Play Store app reviews. The predicted labels are then compared informally against each review's star score. The video does not fine-tune or deploy a model.

Key Takeaways
  1. Install transformers, torch, numpy, and pandas with pip
  2. Load the reviews CSV from Kaggle with pandas and sample 200 rows
  3. Load the pre-trained DistilBERT tokenizer and model with from_pretrained
  4. Build a sentiment-analysis pipeline from the model and tokenizer
  5. Run the pipeline on the list of review texts
  6. Add the predicted label to the data frame and compare it with the star score
💡 A model that is already fine-tuned for sentiment, such as this DistilBERT checkpoint, can classify text as positive or negative without any additional training.
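
The transcript suggests using the star score as a rough check on the predictions, since five-star reviews are usually positive and one-star reviews usually negative. A minimal sketch of such a check is shown below. The function name rating_agreement is hypothetical, and the thresholds are an assumption not stated in the video: 4 and 5 stars count as positive, 1 and 2 as negative, and 3-star reviews are skipped as ambiguous.

```python
def rating_agreement(scores, labels):
    """Share of clearly rated reviews whose predicted label matches the stars.

    Assumption (not from the video): 4-5 stars are treated as positive,
    1-2 stars as negative, and 3-star reviews are ignored.
    """
    matches = 0
    counted = 0
    for score, label in zip(scores, labels):
        if score == 3:
            continue  # ambiguous rating, skip it
        expected = "POSITIVE" if score >= 4 else "NEGATIVE"
        counted += 1
        matches += expected == label
    # Return NaN when there is nothing to compare.
    return matches / counted if counted else float("nan")


# Small made-up example: three of the four counted reviews agree.
example_scores = [5, 1, 3, 4, 2]
example_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE", "NEGATIVE"]
print(rating_agreement(example_scores, example_labels))
```

This is only a sanity check: star ratings and review text do not always agree, so a low value points to reviews worth reading rather than to a definite model error.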
