Sentiment Analysis with Transformers in Python
Key Takeaways
The video demonstrates sentiment analysis in Python with the Hugging Face transformers library, specifically with a DistilBERT model fine-tuned on SST-2 and Dropbox app reviews from a Kaggle dataset.
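As a rough sketch of the workflow summarized above, the following wraps the tokenizer, model, and pipeline setup in one function and pulls the predicted labels out of the results. The model id is taken from the model link in the video description. The function names `build_classifier` and `label_texts` are illustrative and not from the video, and the transformers import is deferred so that the label helper can be tried with any callable that returns pipeline-style dictionaries.

```python
from typing import Callable, Iterable, List

# Model id from the video description link (Hugging Face Hub).
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"


def build_classifier(model_id: str = MODEL_ID):
    """Load tokenizer and model, then wrap both in a sentiment-analysis pipeline."""
    # Deferred import: transformers (and torch) are only needed when this runs.
    from transformers import (
        DistilBertForSequenceClassification,
        DistilBertTokenizer,
        pipeline,
    )

    tokenizer = DistilBertTokenizer.from_pretrained(model_id)
    model = DistilBertForSequenceClassification.from_pretrained(model_id)
    return pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)


def label_texts(texts: Iterable[str], classify: Callable) -> List[str]:
    """Run the classifier on a list of texts and keep only the predicted labels."""
    # The pipeline returns one dict per text with "label" and "score" keys.
    results = classify([str(t) for t in texts])
    return [r["label"] for r in results]
```

With the packages installed, `nlp = build_classifier()` followed by `label_texts(["Really impressive"], nlp)` would download the model on first use and return one label per input text.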
Full Transcript
What is going on, guys, welcome back. In this video today we're going to learn how to easily do sentiment analysis using a BERT model from Hugging Face, so let us get right into it. [Music]

All right, so we're going to do sentiment analysis in Python today using a Transformer model from Hugging Face. In particular we're going to use this one here, DistilBERT base uncased fine-tuned SST-2, and we're going to use it through the transformers package in Python. It comes with a tokenizer and a model, and we can easily use it to classify text as either positive or negative regarding the sentiment.

Now, whatever data you want to use for this video is up to you. You can use a Kaggle dataset, you can use your own data, you can use Amazon reviews, movie reviews, news articles, or, in my case here, I'm going to use the Top 20 Play Store App Reviews and in particular the Dropbox CSV file, just so I have some text that I can use. The important thing is that you have a list of texts that you want to analyze in terms of sentiment.

Once you have that, we can open up a notebook here, an IPython notebook, so a Jupyter notebook basically, but you can also do that in an ordinary Python file. We need to install a couple of packages using pip: first of all transformers, then torch, numpy, and pandas. These are the packages that we need for this video today, so if you don't have them on your system, make sure you install them.

What we're going to start with is import pandas as pd, and we're going to load the data frame into our code. Let me just zoom in a little bit here. We're going to do a read_csv, and the CSV is going to be the Dropbox CSV file in my case here. You can see we have review IDs, we have some content, and we have a score. Now, the score can be used for comparison, because five is usually very positive and one is usually very negative, so if the two agree most of the time, that is, if you get
a positive for the fives and a negative for the ones, it seems to work quite well. Since we have quite a few rows, I'm just going to limit this for demonstration purposes to a sample size of 200, so we're going to say df = df.sample(200), and then we basically have the same structure but with only 200 rows. Let me maybe do that again so we have some... yeah, I mean, it's fine.

All right, so this is our data frame now. To actually do the classification, we're going to say from transformers import pipeline, so we're going to model this as a pipeline, and from transformers we're also going to import, come on, where's the autocompletion when I need it, okay, I'm going to type this myself then: DistilBertTokenizer [Music] and DistilBertForSequenceClassification.

Then we're going to do the following. We're going to say tokenizer equals DistilBertTokenizer.from_pretrained, so we're going to load the pre-trained model from Hugging Face, which is going to be exactly this thing here. You basically just have to copy the model name and paste it here as a string. I don't even think that we need this; I think we can just provide this and then it also works. Then we're going to do the same thing for the model itself: the model is also going to use from_pretrained, but this time not on the tokenizer, on DistilBertForSequenceClassification.

Now the pipeline: we're going to call it nlp, and we're going to say it's a pipeline with the name sentiment-analysis, where the model is the model and the tokenizer is the tokenizer, like this. Then all you have to do to run the classification is get the list of texts, so texts is going to be equal to the df.content
values, and I think it should be a list, a Python list. Then we just have to say results = nlp(texts), like this, and it's going to do some work, probably with a GPU. Once it's done, we can go ahead and say for text, result, and score in zip, and we're going to zip the texts, the results, and the df.score values, just so we have a comparison, and we're going to print this information: text is text, and then we can copy this for result and score. There you go.

What you can see here is that for each of the 200 comments we have the text, the result containing a label, such as POSITIVE, and a score, which is the confidence as far as I know. Let's go to an obvious one, for example: "lose my old photos collection automatically password reset problem please Dropbox team help for recover my account". Okay, that's not very good English, but it's obviously a negative review. "Really impressive, I thought..." and so on is going to be a positive one.

Okay, so to get a better overview of this, let's take the label of each result and add it to each row in the data frame. We're going to say df["sentiment"] is going to be equal to r["label"] for r in results, plural here, so this is a list comprehension, and that should actually be it. There you go.

You can see you get a negative review, a positive review, and most of the time it seems to be pretty accurate, even for something like "why can't I edit EXO files", which doesn't really contain any negative word per se. It doesn't say it's bad or horrible or unsatisfying, it asks a question, but somehow the model recognizes that this is negative, and that's also true. You can also see that, whatever this means, "menab" is positive and also has a score of five, and "cracked" is negative, so it seems to be very accurate when it comes to that. So this is how easily you can do sentiment analysis using
transformer models from Hugging Face.

So that's it for today's video. I hope you enjoyed it and hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below, and of course don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye.
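The comparison between predicted labels and star ratings described in the transcript can be made a little more systematic. The sketch below is not from the video: it assumes that ratings of 4 or 5 count as positive and 1 or 2 as negative, skips 3-star reviews as ambiguous, and assumes the labels are the strings POSITIVE and NEGATIVE seen in the output. The function names `expected_label` and `agreement_rate` are hypothetical.

```python
from typing import Iterable, Optional


def expected_label(score: int) -> Optional[str]:
    """Map a star rating to the label we would expect (assumed thresholds)."""
    if score >= 4:
        return "POSITIVE"
    if score <= 2:
        return "NEGATIVE"
    return None  # 3 stars: treated as ambiguous and skipped


def agreement_rate(labels: Iterable[str], scores: Iterable[int]) -> float:
    """Share of non-ambiguous reviews where the predicted label matches the rating."""
    pairs = [
        (label, expected_label(score))
        for label, score in zip(labels, scores)
    ]
    # Keep only reviews whose rating maps to a clear expected label.
    comparable = [(label, exp) for label, exp in pairs if exp is not None]
    if not comparable:
        return 0.0
    matches = sum(1 for label, exp in comparable if label == exp)
    return matches / len(comparable)
```

After the sentiment column has been added, something like `agreement_rate(df["sentiment"], df["score"])` would give a single number for how often the model and the star ratings agree, rather than judging by eye.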
Original Description
In this video we learn how to do sentiment analysis with HuggingFace transformers in Python.
Model: https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english
Dataset: https://www.kaggle.com/datasets/odins0n/top-20-play-store-app-reviews-daily-update?select=Dropbox.csv
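For the dataset linked above, loading and sampling might look like the sketch below. It assumes the file has been downloaded as `Dropbox.csv` into the working directory and that it has a `content` column, as shown in the video. The `is_usable_text` filter and the fixed `seed` are additions for illustration, since missing or blank reviews would otherwise be passed to the model and the sample would change on every run.

```python
def is_usable_text(value) -> bool:
    """True for non-empty strings; False for NaN, None, or blank content."""
    return isinstance(value, str) and bool(value.strip())


def load_reviews(path: str = "Dropbox.csv", n: int = 200, seed: int = 42):
    """Read the reviews CSV, drop unusable rows, and return a sample of n rows."""
    # Deferred import: pandas is only needed when the file is actually loaded.
    import pandas as pd

    df = pd.read_csv(path)
    # Keep only rows whose review text is a non-blank string.
    df = df[df["content"].map(is_usable_text)]
    # Never ask for more rows than exist; fixed seed keeps the sample repeatable.
    sample = df.sample(n=min(n, len(df)), random_state=seed)
    return sample.reset_index(drop=True)
```

Calling `df = load_reviews()` and then `texts = df["content"].tolist()` would give a plain Python list that lines up row by row with the `score` column.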