Building a Reddit Keyword Research Chatbot
Key Takeaways
This video demonstrates how to crawl Reddit with Watson Discovery for keyword research, query the crawled documents with natural language search, and surface the results through a Watson Assistant chatbot. Both services are provisioned from the IBM Cloud catalog, and no coding is required.
Full Transcript
Have you ever struggled to work out what to write about, what to blog about, or even what YouTube videos to create? Well, today we're going to solve that. In today's tutorial we're going to go through how to perform keyword research using Watson Discovery. Now, if you don't know how to code, this tutorial is perfect for you: it's a code-free example of how to scrape data off Reddit to do some keyword research. Then we're going to take that same data and integrate it into Watson Assistant, so we have a full-blown pipeline to improve our keyword research. Let's get to it.

The first thing that we want to do is get some data. Given our challenge is to create a keyword research chatbot, we obviously need some data to do keyword research with. In this particular case we're going to be working with the Python subreddit, so we need to get some data from there. I'm currently at reddit.com, and from there we can just go to Python, and you can see we've got the Python subreddit here.

To collect this data we're going to use Watson Discovery. Watson Discovery has an inbuilt web scraper that allows us to collect some of the information that we've got on this page. The Watson Discovery service is available in the IBM Cloud catalog, which can be accessed by going to cloud.ibm.com and selecting Catalog. In this case we just need to go to Catalog, navigate to Services, then AI, and then head on over to Discovery. Discovery is the first part of the equation; we're going to be using Watson Assistant later on, but we'll come back to that. So let's select Discovery and then choose the Lite plan, which is more than enough for now. The Lite plan will allow you to scrape up to a thousand web pages per month; in this case we've got a thousand documents per month, more than enough. So just select Lite
and hit Create. If we scroll down to our services, you can see that we've got a service there. This is the one that we want. You can see it's still provisioning, so as soon as that's finished provisioning we'll be able to access that service.

OK, so we can see that our Discovery service has now been provisioned and is showing up as active. Great. Let's choose that service, and from here we just need to select Launch Watson Discovery. This takes us to a workspace dashboard, and from here we can do a whole bunch of good stuff. Watson Discovery is a really powerful cognitive search engine, but it also gives you the ability to collect data. In this case we want to connect a data source, set up with our current plan for now, and hit Continue.

We're going to use the web crawl feature. Because we're crawling Reddit, all we need to do is grab some data off Reddit. So let's choose Web Crawl and then grab a URL; in this case it's reddit.com/r/Python, and we paste that in here. If you wanted to crawl a bunch of different websites, you could do that as well. I'm just going to add Reddit, but if you wanted to add others, for example artificial intelligence, machine learning, and a whole bunch of others, you can just add the additional URLs that you want to crawl. You can also specify how often you want Discovery to crawl: once a week, once a day, once an hour, or once a month. Be careful how much you're crawling, as you obviously don't want to slam a website, but once a week should be fine. We can also choose the language of the content that we're going to extract, because as soon as Watson Discovery has pulled in a document it's going to start processing it and doing some natural language processing to extract pieces of information from that data.

Now there's one last thing that we want to do before we trigger our web crawler, and we
want to exclude some specific pages from this crawl. Rather than pulling absolutely everything from Reddit, we want to exclude some pages that tend to make our data a little bit fuzzy. The first thing that we want to do is specify the maximum number of hops. This is the number of links that Discovery is going to follow: if it finds a link within a page and goes to that page, that's one hop; if it finds a link within that page and goes to that page, that's another hop. In this case we only really want our high-level web pages, so we're going to set the number of hops to one.

We're also going to exclude some pages. There's a whole bunch of URLs that are going to give us repetitive results, and we want to exclude those. I've gone ahead and grabbed the list of these that you can just type in: the first one is search, the second one is flair, then top, hot, new, and rising, and that should be about it. Any page that has one of these inside its URL is going to be excluded from our web crawl. We can then hit Submit and hit Save and Sync. All things going well, we should be able to crawl some data from the Reddit site.

Now, this might take a little while for the crawl to complete; as soon as it's done you'll start to see results here. While that's happening, let's rename our Discovery environment. We can just change this from web crawl blah blah blah to Discovery Web Crawl Reddit Python and hit Save. [inaudible]

All right, and we're back. You can see that Discovery has now retrieved some results from Reddit: we've got 17 documents, and we've also got some entities and concepts that have been extracted. We're not going to spend too long discovering Watson Discovery, but I will quickly show you how to query. If we select the search button we can search for documents. Say we wanted to look at all the examples of articles that had Python in them: we can just type in Python, and this uses a natural language
query to go through our Discovery documents and bring back some results. You can see we've got a whole bunch of Python results: particle physics in Pygame, small Python script, trained a deep learning model. So it's returned a whole bunch of results that it's gone and grabbed from Reddit. This gives you a bit of a heads-up as to how to perform some keyword research: you start to get a better understanding of the trending topics on a specific Reddit page.

Now, that doesn't conclude today's tutorial; we also need to integrate this into a chatbot. To do that we'll go back to our IBM Cloud account and hit Catalog, and, remember I said before, we're going to be using Watson Assistant. So we select Services, then AI, and then Watson Assistant. We're going to be using a specific type of skill within our assistant called a search skill. The search skill is available in the Plus Trial or the Plus and Premium plans, so if you don't have a service already, just select the Plus Trial and hit Create. I've already got one created, so let's step over into that. If you've just created your assistant and spun up a new service, you'll automatically be redirected to this Manage page. From here all you need to do is select Launch Watson Assistant.

Now that we're in our assistant, all we need to do is hit Create assistant. We're going to call this particular assistant Reddit Keyword Research, then hit Create assistant. From here it's pretty easy: all we need to do is associate our Discovery web search with our assistant. To do that, select Add search skill, create a new skill, and call it Reddit Search; let's call it Reddit Python Search, and hit Continue. Because we've already got our Watson Discovery instance set up, all we need to do is choose it from here. Remember our instance was called Discovery-7q, so we just need to select that, and you
can see our Discovery Web Crawl Reddit Python environment, which we named up here, is automatically showing up. We just need to have that selected and hit Configure. On this page we just need to configure what we want our results to look like when we query our chatbot. You can see that it's pretty much pre-set up: it's extracting our title and setting that as the title, it's extracting the body and setting that as the body, and it's also extracting the URL and including that as well. So every time you query this search skill you're going to get a result that looks a little bit like this, which gives you a better heads-up as to what keywords you need to be targeting. For this we're just going to hit Create.

Now that we've created our search skill, we're going to step back into it and hit Try it, and from here we can test out how well our results show up. Say the first thing we wanted to query is the obvious one, Python. All we need to do is type in Python, and what's happening is our chatbot is now going out to our Watson Discovery service and getting results from our Python web search. You can see that we're returning some results that we had within our Python documents. So that's Python, a pretty obvious one. What about machine learning?

So using Discovery we've been able to crawl our data from Reddit and get some results, and then by integrating it into Watson Assistant using a search skill we've been able to surface those keyword research results to our users and to yourself. If you wanted to, you could then go and integrate this into a web page or into a client tool, to allow them to get some ideas on keyword research going forward.

That about wraps up this tutorial. Thanks so much for tuning in, guys. Hopefully you found it useful. If you liked the video, be sure to like, share and subscribe, and if you've got any questions
at all, be sure to drop a comment in the comments below. Peace.
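
The two crawl settings described in the transcript, a hop limit of one and a list of URL fragments to exclude, can be pictured with a short Python sketch. This is not how Watson Discovery is implemented; it is a toy model under stated assumptions. The `links` dictionary stands in for fetching pages, and the post and user URLs in it are hypothetical. Only the fragment list and the hop limit come from the video.

```python
from collections import deque

# Fragments named in the transcript; a URL containing any of them is skipped.
EXCLUDED_FRAGMENTS = ["search", "flair", "top", "hot", "new", "rising"]
MAX_HOPS = 1  # follow links at most one step away from the start URL


def is_excluded(url, fragments=EXCLUDED_FRAGMENTS):
    """Return True if the URL contains any excluded fragment."""
    return any(fragment in url for fragment in fragments)


def crawl(start_url, links, max_hops=MAX_HOPS):
    """Breadth-first walk over a link graph, honoring both crawl rules.

    `links` maps a URL to the URLs found on that page (a stand-in for
    actually downloading and parsing each page).
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    kept = []
    while queue:
        url, hops = queue.popleft()
        kept.append(url)
        if hops == max_hops:
            continue  # hop limit reached: do not follow this page's links
        for target in links.get(url, []):
            if target not in seen and not is_excluded(target):
                seen.add(target)
                queue.append((target, hops + 1))
    return kept


# Hypothetical link graph for illustration only.
START = "https://www.reddit.com/r/Python/"
POST = "https://www.reddit.com/r/Python/comments/a1/particle_physics_in_pygame/"
LINKS = {
    START: [
        POST,
        "https://www.reddit.com/r/Python/top/",
        "https://www.reddit.com/r/Python/search?q=flask",
    ],
    POST: ["https://www.reddit.com/user/example/"],
}

if __name__ == "__main__":
    for page in crawl(START, LINKS):
        print(page)
```

With this toy graph, only the start page and the one post are kept: the `top` and `search` links are filtered out, and the user page sits two hops away. Note that plain substring matching is blunt; a fragment such as `top` would also exclude a post whose URL happens to contain "laptop".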
Original Description
Working out what to write about can be a nightmare.
This applies if you're creating YouTube videos, blogging, or creating just about any type of content.
This doesn't need to be so hard.
This video runs through how to scrape data from Reddit to perform better keyword research. I'll go through:
- How to use web scraping to get data from Reddit using Watson Discovery
- How to set up a basic chatbot that integrates with Watson Discovery
- How to use Watson Assistant to query your Reddit data
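
The last step in the list above, querying the crawled Reddit data, returns each result as a title, a body, and a URL. A minimal Python sketch of that result shape follows. It is an illustration only: the case-insensitive substring match is a simple stand-in for Discovery's natural language query, and the `Document` class, the `search` and `to_card` helpers, the body text, and the example.com URLs are all hypothetical. The two titles are ones mentioned in the video.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """One crawled page, reduced to the three fields the search skill shows."""
    title: str
    body: str
    url: str


def search(documents, query):
    """Return documents whose title or body contains the query text.

    Substring matching is a stand-in for a real natural language query.
    """
    needle = query.lower()
    return [
        doc for doc in documents
        if needle in doc.title.lower() or needle in doc.body.lower()
    ]


def to_card(doc, body_chars=80):
    """Shape a document as a title/body/URL card, trimming long bodies."""
    return {"title": doc.title, "body": doc.body[:body_chars], "url": doc.url}


# Hypothetical documents: titles from the video, bodies and URLs invented.
DOCS = [
    Document("Particle physics in Pygame", "A small Python demo.", "https://example.com/1"),
    Document("Trained a deep learning model", "Notes on a machine learning project.", "https://example.com/2"),
]

if __name__ == "__main__":
    for hit in search(DOCS, "machine learning"):
        print(to_card(hit))
```

Scanning the titles that come back for a topic such as "Python" or "machine learning" is the keyword research step: recurring subjects suggest what to write about next.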
Want to learn more about it all?
Watson Assistant: https://www.ibm.com/cloud/watson-assistant/
Watson Discovery: https://www.ibm.com/cloud/watson-discovery
Get FREE stuff!
Free Trials: https://cloud.ibm.com/catalog/services/watson-assistant
Oh, and don't forget to connect with me!
LinkedIn: https://www.linkedin.com/in/nicholasrenotte
Facebook: https://www.facebook.com/nickrenotte/
GitHub: https://github.com/nicknochnack
Happy coding!
Nick
P.s. Let me know how you go and drop a comment if you need a hand!
Music: Ark · Ship Wrek · Zookeepers (https://youtu.be/rgxfky2vqw4)