Chat Bot With PyTorch - NLP And Deep Learning - Python Tutorial (Part 2)

Patrick Loeber · Beginner · 🧬 Deep Learning · 6y ago

Key Takeaways

This video tutorial demonstrates how to build a simple chatbot using PyTorch and Deep Learning, covering basic Natural Language Processing (NLP) techniques such as tokenization, stemming, and bag of words. The tutorial utilizes tools like NLTK, PyTorch, and NumPy to create a chatbot model and train it using a dataset.

Full Transcript

Hey guys, welcome back to the second part of our chatbot tutorial. In this part we are going to create the actual training data. We already created the nltk_utils to tokenize and to stem our words, so let's continue and apply this to the data that we have.

We will create a new file, and let's call this train.py. Here we want to load our JSON file, so we need the json module. Then we can say with open, and it's called intents.json, in write mode, or no, read mode, sorry, as f. We say intents = json.load(f), and then we can print our intents. Let's clear this and run the training file to see if we have this loaded. So this is working.

Then we want to create our training data as I showed you in the first part. We want to apply tokenization, then lowering and stemming, then we also exclude the punctuation characters, and then we apply the bag of words. For this we need to collect all of the words, so let's do this. Let's create empty arrays first, or empty lists. We say all_words equals an empty list, and we also want to collect all the different patterns and also know which different tags they have, so we create an empty list for the tags, and we also create an empty list which we call xy, which will later hold both our patterns and the tags.

Now we want to loop over our intents. We say for intent in intents, and this is, as we can see, a JSON, or now it's a dictionary, a Python object. In the very beginning we have the intents key, and then we only have one array with all the different tags and patterns and responses. So we say for intent in intents with the key intents, and then we get the tag by saying this is intent with the key tag, as in the JSON file, so the tag key. We will append this to our tags array, so we say tags.append(tag). Then we want to loop over all the different patterns. This again is an array with the different patterns, so we loop over this. We say for pattern in intents
with the key patterns, and then we have this pattern. Then we want to apply tokenization. We already implemented a utility function in the last part, so we simply have to import this. We say from nltk_utils import tokenize, and let's already import the stemming function and the bag of words function. Now what we want to do is tokenize our patterns, so we say w = tokenize(pattern). Then we want to put this into the all_words array, so we say all_words.extend(w). We are not using append but extend, because this again is an array and we don't want to put an array of arrays here, so we extend it. Then we will also put the tokenized pattern and the corresponding label into our xy list, so we say xy.append, and here as a tuple we use w and the tag. So this will then know the pattern and the corresponding tag, and then we are done with collecting these.

Now if we go back to our pipeline: after tokenization we also want to lower and stem the words and exclude punctuation characters, so let's do this. Let's define some ignore_words, and here for example we don't want a question mark or an exclamation mark or a dot, and let's also not use a comma. Then we apply list comprehensions. Let's simply print all_words to see if this is working. Now if I clear this and run this, we see we still have an error here: for pattern in intents... oh yeah, here we only have to use intent, for each single intent, and then the patterns. Let's run this again, and then we see we get all the different words, so they have been tokenized.

Now let's apply stemming. Let's say all_words equals, again, a list, and we stem each word for w in all_words. We also want to exclude the ignore words, and we can very easily do this with list comprehension too, so we say if w not in ignore_words. Now let's clear this and run this again to see if this is working. So, um, I still have
this from the last part, so we don't need this anymore. Now here we see we have all the words in lower case, and for example here we see that the ending got chopped off, so stemming works too.

Now let's sort these words. Let's say all_words equals sorted, and we also only want unique words, so we can simply convert this to a set. This is a nice little trick to remove duplicate elements, and then the sorted function will return a list again. Let's do this with the tags too: tags equals sorted, and then a set from the tags, so this will have unique labels. I don't think that this is necessary, but it's better to do it. Let's print the tags here to see if this is working. Let's run this, and we see we have all the different tags: delivery, funny, goodbye, greeting, items, payments, and thanks.

Now what we want to do is create the training data. For this we continue in our pipeline and now create the bag of words. Let's create a list with our X_train data, so this is an empty list, and then y_train equals an empty list. This will be the tags, or the associated number for each tag, and in the X we put all the bags of words. We will loop over our xy array that we have here. We can unpack this tuple, since we put a tuple here with the pattern and the tag, so we say for pattern, or let's call this pattern_sentence, and tag in xy. Now what we want to do is create a bag of words by calling the function bag_of_words, and we can see we already implemented the definition. This will get the tokenized sentence, so this is exactly the pattern sentence, which is already tokenized, since here we applied tokenization, and then it needs all_words. Then we append this to our training data, so X_train.append(bag). We still have to implement this function. And for the y data, this will be our labels. For this we use the tags, and then tags.index(tag). So for example
if we print the tags, yes, here we still have the tags. Now if the tags are in this order and we have the tag delivery, then this will give us the label 0, and for funny this will give us the label 1, and so on. So we have numbers for our labels, and then we put this into our y_train, so y_train.append(label). Here we have to be careful: sometimes you also want this as a so-called one-hot encoded vector, but in this case we are using PyTorch, and later we are going to use the cross-entropy loss, and this doesn't want it as one-hot. Here we only want to have the class labels. So it's called CrossEntropyLoss, which we will see in the third part, and that's why we don't have to care about one-hot encoding here. We only put in the label for this pattern. After this we want to convert this to a NumPy array, so we import numpy as np, and then we say X_train equals a NumPy array based on this X_train list, and the same with y_train: y_train equals np.array(y_train). So now we have the training data.

Now we still have to implement the bag_of_words function. We didn't do this in the last part, so let's do this now. Here let me copy and paste an example for you again. What we have is our tokenized sentence with our new and incoming words, so hello, how, are, you, and then we have all_words, so here we already collected all the words based on the patterns that we looked up here. This is just a small example. Then we look at each word in the sentence, and if it is available in the words array then we put a 1 here. For example we have hello, so we put a 1 at the position where hello is. We also have you here, so we put a 1 at the position where you is. We don't have are and we don't have how in this example, so all the rest of the positions will be 0. This is how the bag of words works.

So now let's do this. This will get a tokenized sentence, and in the training pipeline we also applied the stemming for the all
words array, so let's do the same for the tokenized sentence. Let's do this, and let me close this here. Now we want to call the stemmer for each word in the tokenized sentence, so we use list comprehension again and say tokenized_sentence equals the stemming function of our word w, for w in tokenized_sentence. Now we applied the stemming, and then we create a bag and initialize it with zero for each word. So like this: we have all_words, and then we create an array with the same size but only with zeros. We can do this with NumPy, so we need NumPy here too: import numpy as np. Then we say bag equals np.zeros with the size of the length of the words, or it's called all_words here, and let's also define a data type; this should be np.float32. Then we loop over our all_words, so we say for index and word in enumerate(all_words). This will give us both the index and the current word. Then we check if this word is in our tokenized sentence, and if so it will get a 1, so we say bag at this index equals 1.0, as a float, and then we return the bag.

Let's try this out. Let's say our sentence equals this one, so this is already tokenized, then our words are these words, and then our bag equals the bag_of_words function with the sentence first and the words second, and then we print the bag. Now let's clear this and run python nltk_utils.py, and we see we get the same array as I showed you here, so this is working. Let's remove this again, and then we are done with this file.

Let's head back to our training file. As a last thing in this part, I want to create a PyTorch dataset from this training data. Let's import some things that we need for PyTorch: we import torch, we import torch.nn as nn, and we say from torch.utils.data import Dataset and DataLoader. If you haven't installed PyTorch already and don't know what these are, then please have
a look at my beginner course, because there I explain all of these things. Now down here let's create a new dataset. We have to create a class and call this ChatDataset, and this must inherit Dataset. We have to implement the __init__ function, which will only get self, and here we store self.n_samples, which is the length of X_train. Then we store the data, so we say self.x_data equals just our X_train array and self.y_data equals our y_train array. Then we also have to implement the __getitem__ function with self and the index, so that we can later access the dataset with an index, and here we return self.x_data of this index and self.y_data of this index as a tuple. Then we also define the __len__ method with self, and here we simply return self.n_samples. So now we have our ChatDataset.

Let's create this: dataset equals ChatDataset. Then we also want to create a DataLoader from this, so we say train_loader equals DataLoader, and as dataset it gets this dataset, then batch_size equals batch_size. So let's define some hyperparameters here. Oh, I have to put this here. Let's say hyperparameters, and here we say batch_size equals, let's say, eight in this example, and then we use this here. We also say shuffle equals True for our training, and in my case I say num_workers equals 2. This is just for multithreading or multiprocessing. On Windows especially, I think this might raise an error, so try to set this to zero in your case if you get an error here. In my case I'm using 2; this makes the loading a little bit faster. And yeah, so now we have our ChatDataset.

So why did I implement this as a PyTorch dataset now and create a DataLoader? This is just because then we can automatically iterate over this and get batch training. So that's it for part two, and then in part 3 we
will implement the actual PyTorch model and the training loop. So see you next time.
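
The bag_of_words function dictated in the transcript can be sketched roughly as follows. This is a minimal sketch, not the author's exact file: the stem helper is a simplified stand-in (lowercasing only) for the NLTK-based stemmer from Part 1, and the sample sentence and word list are illustrative values modeled on the hello/you example described above.

```python
import numpy as np


def stem(word):
    # Stand-in for the NLTK stemmer from Part 1 (assumption): lowercasing only.
    return word.lower()


def bag_of_words(tokenized_sentence, all_words):
    """Return a float32 vector with 1.0 wherever a known word occurs in the sentence."""
    sentence_words = {stem(w) for w in tokenized_sentence}
    bag = np.zeros(len(all_words), dtype=np.float32)
    for idx, word in enumerate(all_words):
        if word in sentence_words:
            bag[idx] = 1.0
    return bag


# Illustrative inputs (assumption): an already tokenized sentence and a small vocabulary.
sentence = ["hello", "how", "are", "you"]
words = ["hi", "hello", "I", "you", "bye", "thank", "cool"]
bag = bag_of_words(sentence, words)
print(bag)
```

Only "hello" and "you" appear in both the sentence and the vocabulary, so only those two positions are set. The vector records presence, not how often a word occurs.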
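
The ChatDataset and DataLoader setup described at the end of the transcript might look like the sketch below. Two things here are assumptions made to keep the example self-contained: the X_train and y_train arrays are small made-up stand-ins for the real training data, and the dataset takes them as constructor arguments, whereas the transcript's version reads them from the surrounding script. num_workers is set to 0, the fallback the transcript suggests when worker processes cause errors.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

# Stand-in training data (assumption): 10 one-hot "bags" over 5 words, 3 tags.
X_train = np.eye(5, dtype=np.float32)[np.arange(10) % 5]
y_train = np.arange(10) % 3


class ChatDataset(Dataset):
    def __init__(self, x_data, y_data):
        self.n_samples = len(x_data)
        self.x_data = x_data
        self.y_data = y_data

    def __getitem__(self, index):
        # Lets the loader fetch one (bag_of_words, label) pair by index.
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.n_samples


# Hyperparameters
batch_size = 8

dataset = ChatDataset(X_train, y_train)
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=0)

# Iterating yields batches of tensors; the last batch may be smaller.
for words, labels in train_loader:
    print(words.shape, labels.shape)
```

With 10 samples and a batch size of 8, one pass over the loader yields a batch of 8 and a batch of 2.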

Original Description

In this Python Tutorial we build a simple chatbot using PyTorch and Deep Learning. I will also provide an introduction to some basic Natural Language Processing (NLP) techniques.

1) Theory + NLP concepts (Stemming, Tokenization, bag of words)
2) Create training data
3) PyTorch model and training
4) Save/load model and implement the chat

Resource: This tutorial was inspired and adapted from the following article: "Contextual Chatbots with Tensorflow": https://chatbotsmagazine.com/contextual-chat-bots-with-tensorflow-4391749d0077

✅ Write cleaner code with Sourcery, instant refactoring suggestions in VS Code & PyCharm: https://sourcery.ai/?utm_source=youtube&utm_campaign=pythonengineer *
📚 Get my FREE NumPy Handbook: https://www.python-engineer.com/numpybook
📓 Notebooks available on Patreon: https://www.patreon.com/patrickloeber
⭐ Join Our Discord: https://discord.gg/FHMg9tKFSN

If you enjoyed this video, please subscribe to the channel!

NLTK: https://www.nltk.org
You can find the code on GitHub: https://github.com/patrickloeber/pytorch-chatbot
PyTorch Beginner Course: https://www.youtube.com/playlist?list=PLqnslRFeH2UrcDBWF5mfPGpqQDSta6VK4
Please check out my website to see all tutorials: https://www.python-engineer.com

You can find me here:
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber

Icons:
https://fontawesome.com/icons/comments
https://fontawesome.com/icons/robot

#PyTorch #NLP #DeepLearning

* This is a sponsored or an affiliate link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏

This video tutorial teaches how to build a simple chatbot using PyTorch and Deep Learning, covering basic NLP techniques. The tutorial provides a step-by-step guide on how to create a chatbot model and train it using a dataset. By following this tutorial, viewers can learn how to apply NLP techniques and build a simple chatbot.

Key Takeaways

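Load the intents JSON file
Tokenize each pattern
Exclude punctuation characters
Collect all words and the (pattern, tag) pairs
Lowercase and stem the words
Sort the list of words
Remove duplicate words by converting the list to a set
Create a bag of words: a vector with a 1 at the position of every known word that appears in the tokenized sentence, and 0 elsewhere
Convert tags to integer class labels rather than one-hot vectors
Apply the same stemming to each new incoming tokenized sentence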
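
The preprocessing steps listed above, up to the integer labels, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the tutorial's code: the intents data is a hypothetical miniature in the shape the video describes, and tokenize and stem are simple stand-ins (a regular expression and lowercasing) for the NLTK-based helpers from Part 1.

```python
import json
import re

# Hypothetical miniature intents data (assumption); the real intents.json is larger.
INTENTS_JSON = """
{"intents": [
  {"tag": "greeting", "patterns": ["Hi", "Hello, how are you?"], "responses": ["Hey!"]},
  {"tag": "goodbye", "patterns": ["Bye", "See you later!"], "responses": ["Goodbye!"]}
]}
"""


def tokenize(sentence):
    # Stand-in for the NLTK tokenizer (assumption): words and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)


def stem(word):
    # Stand-in for the NLTK stemmer (assumption): lowercasing only.
    return word.lower()


intents = json.loads(INTENTS_JSON)
ignore_words = ["?", "!", ".", ","]

all_words, tags, xy = [], [], []
for intent in intents["intents"]:
    tag = intent["tag"]
    tags.append(tag)
    for pattern in intent["patterns"]:
        w = tokenize(pattern)
        all_words.extend(w)  # extend, not append, to keep one flat list
        xy.append((w, tag))  # tokenized pattern with its tag

# Stem, drop punctuation, remove duplicates, and sort.
all_words = sorted(set(stem(w) for w in all_words if w not in ignore_words))
tags = sorted(set(tags))

# Each label is the index of the tag in the sorted tag list (no one-hot encoding).
y_train = [tags.index(tag) for _, tag in xy]

print(all_words)
print(tags)
print(y_train)
```

Because the tags are sorted alphabetically, "goodbye" gets label 0 and "greeting" gets label 1 in this small example.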

💡 The key insight from this tutorial is that building a simple chatbot using PyTorch and Deep Learning requires a basic understanding of NLP techniques such as tokenization, stemming, and bag of words.
