Web Scraping in Python: Tools, Techniques, and Legality | Real Python Podcast #12
Do you want to get started with web scraping using Python? Are you concerned about the potential legal implications? What are the tools required and what are some of the best practices? This week on the show we have Kimberly Fessel to discuss her excellent tutorial created for PyCon 2020 online titled “It’s Officially Legal so Let’s Scrape the Web.”
We discuss getting started with web scraping and cover tools and techniques. Kimberly gives advice on finding elements inside the HTML and on techniques for cleaning your data. She also notes a recent change to the legal landscape regarding scra…
Chapters (23)
Introduction
1:31 Kimberly’s background and Metis Data Science Bootcamp
2:19 NLP and work in advertising
3:27 Changes in the legality of web scraping
6:12 What are good projects for web scraping?
6:56 Tools to start web scraping
7:51 How to find the elements you want?
9:00 How much HTML should you know?
10:49 Inspecting elements in the browser
14:30 What are good sites to practice on?
16:20 Pausing between requests
19:02 Saving as you go
20:54 Real Python Video Course Spotlight
21:55 Navigating the DOM
23:10 Data cleaning and formatting
28:26 Dynamic sites and Selenium
32:16 Scrapy
33:55 PyOhio 2020
35:40 Transition out of academia
38:40 What are you excited about in the world of Python?
41:05 What do you want to learn next in Python?
48:00 What is a less known Python tip or trick?
49:17 Thanks and Goodbye