Streaming from Reddit - Python Reddit API Wrapper (PRAW) tutorial p.3
Key Takeaways
This video tutorial demonstrates how to use the Python Reddit API Wrapper (PRAW) for streaming data from Reddit, handling exceptions, and setting up live applications such as bots and alerts. It covers the basics of streaming comments and submissions, searching for comments by ID, and using PRAW to stream content from subreddits.
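As a rough illustration of the streaming idea summarized above, here is a minimal sketch that reads a fixed number of items from a stream that never ends on its own. The helper `first_items` and the offline stand-in `fake_stream` are hypothetical names invented for this example; only `subreddit.stream.comments()` and `subreddit.stream.submissions()` come from the video.

```python
from itertools import count, islice
from types import SimpleNamespace


def first_items(stream, attribute, limit):
    """Collect `attribute` from the first `limit` items of a possibly endless stream."""
    return [getattr(item, attribute) for item in islice(stream, limit)]


def fake_stream():
    """Offline stand-in (hypothetical) for a PRAW stream: it never ends on its own."""
    for n in count(1):
        yield SimpleNamespace(body=f"comment {n}", title=f"submission {n}")


print(first_items(fake_stream(), "body", 3))
print(first_items(fake_stream(), "title", 2))

# Against the live API the same helper would be called as, for example:
#   first_items(subreddit.stream.comments(), "body", 3)
#   first_items(subreddit.stream.submissions(), "title", 2)
# where `subreddit` is the PRAW subreddit object used throughout the video.
```

The stand-in is only there so the sketch runs without credentials or network access; a real stream blocks while it waits for new content, which is why the helper stops after `limit` items instead of looping forever.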
Full Transcript
What's going on, everybody? Welcome to part three of the Python Reddit API Wrapper, or PRAW, tutorial series. In this video, what we're going to be talking about is streaming from Reddit. Up to this point we've basically done everything historically, as if you were viewing Reddit. But as you may know, sometimes when you post stuff on Reddit you'll get an immediate reply back from a bot, or even from some of the automods and stuff like that. You might also want to keep a database up to date, or maybe you're trying to set up an alert for something. Who knows, there are all kinds of reasons why you might want to stream Reddit rather than looking back historically. Also, for a lot of applications, streaming is going to be less API-call intensive than it would be to keep making API calls constantly.
So let's go ahead and cover how to actually stream. It's actually super simple: you just add a .stream in front of everything. In this case we're going to continue with subreddit, but we're just starting some new text: for comment in subreddit.stream.comments. You can also stream other things; in case I forget to mention it, you could even do it on a specific submission if you wanted. Anyway, for comment in subreddit.stream.comments, what do we want to do? We could print... actually, I'm trying to decide how I want to do this. Let's say we want to grab a parent ID, and that'll be the string version of comment.parent().
Basically, what I want to do is grab all the replies to comments, so we need a parent ID, and there might not be one, because I actually think top-level comments don't have a parent ID; that is the thread. So I might have been incorrect in my statements before. Anyway, let's just encase this in a try/except Exception as e, and we're going to pass. People are going to be so mad. So that's the parent ID. Now let's say, I don't know, submission... I hate to use the word submission, because that's like a thread... original equals reddit.comment. This will let us search for a specific comment by ID, so we can search for it by the parent ID. Cool, so we've got the original comment, and then we can just print it out. Let's do print "Parent", print original, and actually, original again would be a PRAW object, so we need to do .body on that. And then print "Reply", and the reply would be comment.body.
We could be a little better, actually: rather than except Exception as e, let's do praw.exceptions.PRAWException as e. We're a little better, but not much. So now we can go ahead and stream these comments. I think that's probably good enough. We might have to change the Python subreddit, but we'll see. Some of these probably had no parents, is my guess. Did we print parent before? Hm, it's weird that we would get to parent and it would be totally empty, because you wouldn't have gotten an ID, I wouldn't think. Or are they really empty? No, it's just that it takes time to generate... no, those are empty. It's kind of odd; I'm not quite positive why a parent would be empty like that. Anyway, we're getting a parent here: "shouldn't be hard for you to provide 40 character string, blah blah blah," and then you've got your reply: "I doubt you'd be able to crack it even with an EC2 instance." Okay, how big of an EC2 instance, bro? Anyway, welcome to Reddit. So that's how you could stream in comments.
Now, one thing I'll bring up that I didn't realize out of the gate, or a couple of things. First of all, don't forget that basically every time you've got a function call, that's another strike against your total query limit, your API call limit, which is an extremely unfortunate 30 API calls per minute. Boo, that sucks. Now, initially you might think it's actually much larger than 30, especially if you use it a little bit, stop, and then use it a little bit again. It's almost like, I don't know, maybe they average it over the course of an hour or something like that. But initially you'll get a huge spike and then it slowly will level out to an average (almost knocked over my coffee) of like 30 requests a minute. I forget where that is; it's somewhere in the docs. Apparently I had to Google it to find that limit, but there is that limit, which honestly is pretty low, especially when you don't provide the parent ID. I can understand why you wouldn't want to supply the parent content or something like that, but seriously, if it has a parent ID, give it to me. So that's kind of a bummer; all requests are equal. Not happy about that. But anyway, this here is an API call, and this here is an API call, because you've got to go find out that parent ID. This should not be an API call, but it is. And then the stream, I don't know how frequently it sends you new data, but it's somewhat frequently.
Just to show you this, though, I'm going to go ahead and comment this out... I swear I hit all three, but apparently I didn't... and even this. In fact, I'm just going to delete all this, so print comment.body. Oops, it's not a function call; try again, bro. Okay, so this is just streaming comments. If I let it go just as fast as it wants... okay, it's actually kind of stopped. That's probably because the Python subreddit is kind of a slow-moving subreddit, so let's change this to something faster: news. I expect... no, what? This is news to me. Oh man, I'm super tempted to just pass entirely. Let's go to politics. That's the kind of error I don't want to debug right now, sorry guys. Let's try again. Maybe because I'm... what? No. This is news to me; I've streamed this without any problems. What the hell is going on? I'm just going to do the worst thing I could do, man. Let's see if that still does it. Is this like an IDLE issue? I don't know. Anyway, if someone has a solution to that error, let me know. That was disgusting. Anyway, I encased it in a full exception just to let this tutorial continue on its roll.
Anyway, this is streaming content from politics now. If I went back to news... I'm pretty sure news is a much larger subreddit than politics. Let me check real quick. So, 22,000 people in news right now; politics has, drumroll please... wow, 33,000. It's actually bigger than news. Interesting. Okay, so that's streaming all of the content as it comes in. It tends to stream in a cluster, so it looks like maybe it's just one update every few seconds, but if you're paying full attention, it's actually a cluster of a bunch. I don't even know how many this is, but let's see, okay, this ends with these... wow, thanks for helping me prove my point, guys. Okay, that one bulked up pretty high. Okay, it's actually not too many. When I've streamed this in the past, it seemed like it was coming in as like 20 at a time, and now it's kind of early in the morning, so maybe this is not the time for Reddit. Come back around 6:00 p.m. Eastern or something and Reddit might even be down.
Anyway, closing out: not only can you do a stream of comments, you can also stream the actual submissions. For example, we could say subreddit.stream and then submissions. This one might be a little slower. Oops, it's not going to be comment... well, I guess it would be... no, we'd have to do title. So we're just hitting a bunch of exceptions and we're passing them silently, because we are good programmers, right? For submission in subreddit.stream.submissions, print submission.title. Let's at least sort of fix our problem there. Save and run. Whoa, okay. So when you go to do this stream, it'll kind of populate back for you a little bit, and then it goes. Now I'm hoping to see an update here. Surely somebody's going to submit something to politics. You guys can do it. I guess not. Anyway, just know it's a possibility.
So I think that's it for the Reddit API wrapper. If you have any questions, comments, or concerns on it, feel free to let me know. Otherwise, what I planned to do with this is I wanted to make a chatbot. I was going to stream everything from Reddit to a database, but Reddit did not appear to be a fan of what I was doing, so I have to come up with another means. Now, just for the sake of clarity, the PRAW tutorial is over, but if you are interested in pulling a ton of data from Reddit and you don't want to make Reddit mad, you can search for, let's say, "1.7 billion Reddit comments," something like that, if you just Google that. So here we go: there's this post that was made two years ago. This guy pulled... I don't know how he did it without making Reddit angry, but somehow... every publicly available Reddit comment, for research. I don't know who this person is, Stuck_In_the_Matrix. This might be someone affiliated with... look at that karma... someone affiliated with Reddit, I don't know. Someone can feel free to educate me. Anyway, somehow they have all the Reddit comments. At this point it was up to 2015, so it's 1.7 billion comments.
Now, if you scroll on down, somewhere in here someone also posts that they have it all on BigQuery, Google's BigQuery. You can click on this comment here, and boom, someone is telling us they loaded everything onto BigQuery. Now if we go to BigQuery, wait for it... oh, I was going to check to see if we got any updates; whatever. You can see, starting at 2005 we've got a table, but it actually goes all the way to June 2017. So I'm under the impression that this is going to keep updating; maybe it'll be lagged by two months, I'm not really sure. Anyway, it's even more data, and Reddit has only grown. Reddit is one of the top 10 websites; sometimes it comes in and out of the top 10, but it's a huge website. So as it's growing, probably 2017, 2016, maybe 2015 has just as much data as 2005 to at least 2010 probably does.
Anyway, all the data is here on BigQuery. What I was showing you initially was that you can also download the torrent of 2005 to 2015. Actually, I believe it's not 2005; I think it goes back to 2007. I might be wrong somewhere. Now, this was loaded onto BigQuery. If I go back one more time... yeah, it's here. And then he says that it's 1.7 billion comments, and it's like one terabyte or two terabytes. In here, I swear he says it's only one terabyte, but then he says here he wanted to get a DigitalOcean box with 2 terabytes of bandwidth. Oh, I guess so people could download it, yeah, because it's like 150 gigabytes. Somewhere here it has the extracted size; I think it's a terabyte. Anyway, you can torrent that, grab that, and just have a ball, or grab just a month of that data if you wanted. Just throwing that out there for anybody who's interested, so you don't run into the same problems as I ran into. So I think that's it. Questions, comments, leave them below. Otherwise, till the next tutorial.
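The parent-and-reply loop built in the video can be sketched roughly as follows. This is not the video's exact code: it swaps the `str(comment.parent())` step for the `parent_id` attribute that PRAW comments carry (a fullname such as `t1_...` for a comment or `t3_...` for a submission), so top-level comments can be skipped without a lookup. The names `parent_comment_id`, `reply_pairs`, and `FakeReddit` are hypothetical, and the fake objects exist only so the sketch runs offline.

```python
from types import SimpleNamespace


def parent_comment_id(comment):
    """Return the bare ID of the parent comment, or None for a top-level comment."""
    kind, _, bare_id = comment.parent_id.partition("_")
    return bare_id if kind == "t1" else None


def reply_pairs(reddit, comments, error_type=Exception):
    """Yield (parent_body, reply_body) for each streamed reply to another comment."""
    for comment in comments:
        parent_id = parent_comment_id(comment)
        if parent_id is None:
            continue  # the parent is the thread itself, so there is no comment to fetch
        try:
            # reddit.comment(...) looks a comment up by ID, as in the video
            pair = (reddit.comment(parent_id).body, comment.body)
        except error_type as error:
            print(f"skipping {comment.id}: {error}")
            continue
        yield pair


class FakeReddit:
    """Offline stand-in (hypothetical) that mimics reddit.comment(id)."""

    def __init__(self, bodies):
        self._bodies = bodies

    def comment(self, comment_id):
        return SimpleNamespace(body=self._bodies[comment_id])


demo_comments = [
    SimpleNamespace(id="c1", parent_id="t3_abc", body="top-level comment"),
    SimpleNamespace(id="c2", parent_id="t1_p1", body="a reply"),
    SimpleNamespace(id="c3", parent_id="t1_gone", body="reply to a missing parent"),
]
demo_pairs = list(reply_pairs(FakeReddit({"p1": "the original"}), demo_comments, KeyError))
print(demo_pairs)
```

With a real client, the call would look something like `reply_pairs(reddit, subreddit.stream.comments(), praw.exceptions.PRAWException)`, matching the narrower exception the video settles on. Logging the skipped comment instead of a bare `pass` keeps failures visible, which may have helped with the unexplained error hit later in the recording.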
Original Description
At this point, we've got basically all we could need for historical parsing, but what about live applications? Maybe you want to keep a database up to date, set up alerts for topics, or even create a live bot that actually responds to comments. In these cases, you probably wouldn't want to be constantly pinging subreddits for changes; you'd rather have them streaming live, which we can also do with the Python Reddit API Wrapper.
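One of the live uses mentioned above, alerts for topics, could be sketched along these lines. The function name `matching_titles`, the keyword list, and the sample titles are all made up for illustration; the only PRAW pieces assumed are a submission stream and the `title` attribute shown in the video.

```python
from types import SimpleNamespace


def matching_titles(submissions, keywords):
    """Yield titles from a submission stream that mention any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    for submission in submissions:
        title = submission.title
        if any(keyword in title.lower() for keyword in lowered):
            yield title


# Offline stand-ins (hypothetical) so the sketch runs without Reddit credentials.
demo_submissions = [
    SimpleNamespace(title="Senate passes new budget"),
    SimpleNamespace(title="Local bakery wins award"),
    SimpleNamespace(title="BUDGET talks stall again"),
]
demo_alerts = list(matching_titles(demo_submissions, ["budget"]))
print(demo_alerts)
```

Against the live API this would be driven by something like `matching_titles(subreddit.stream.submissions(), ["budget"])`. Since the stream first backfills some recent items, as the video observes, an alert would likely want to ignore that initial batch; newer PRAW releases appear to offer a `skip_existing=True` argument on the stream for this, though that is worth checking against the installed version.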
Text tutorials and sample code: https://pythonprogramming.net/streaming-python-reddit-api-wrapper-praw-tutorial/
https://twitter.com/sentdex
https://www.facebook.com/pythonprogramming.net/
https://plus.google.com/+sentdex