Web scraping in Python (Part 3): Building a dataset

Data School · Beginner · 🛠️ AI Tools & Apps · 8y ago
This is part 3 of an introductory web scraping tutorial. In this video, we'll create a structured dataset from a New York Times article using Python's Beautiful Soup library.

Watch the 4-video series: https://www.youtube.com/playlist?list=PL5-da3qGB5IDbOi0g5WFh1YPDNzXw4LNL

== RESOURCES ==
Download the Jupyter notebook: https://github.com/justmarkham/trump-lies
New York Times article: https://www.nytimes.com/interactive/2017/06/23/opinion/trumps-lies.html
Beautiful Soup documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/

== DATA SCHOOL VIDEOS ==
Machine learning with scikit…
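As a rough illustration of what "building a structured dataset" with Beautiful Soup can look like, here is a minimal sketch. It does not reproduce the notebook's code or the article's real markup: the sample HTML, the `record` and `note` class names, the placeholder text, and the `build_records` helper are all assumptions made up for this example. Only `BeautifulSoup`, `find_all`, `find`, `get_text`, and `.contents` are actual Beautiful Soup features.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page of repeated records.
# The tag layout and class names are assumptions, not the real article markup.
SAMPLE_HTML = """
<div>
  <span class="record"><strong>Jan. 21 </strong>"First example statement." <span class="note"><a href="https://example.com/source-1">(First example note.)</a></span></span>
  <span class="record"><strong>Jan. 23 </strong>"Second example statement." <span class="note"><a href="https://example.com/source-2">(Second example note.)</a></span></span>
</div>
"""


def build_records(html):
    """Turn each repeated <span class="record"> into a (date, statement, note, url) tuple."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for item in soup.find_all("span", attrs={"class": "record"}):
        # The date sits inside the <strong> tag.
        date = item.find("strong").get_text(strip=True)
        # The statement is the bare text node right after the <strong> tag.
        statement = item.contents[1].strip().strip('"')
        # The note and its source URL both come from the <a> tag.
        link = item.find("a")
        note = link.get_text(strip=True).strip("()")
        url = link["href"]
        records.append((date, statement, note, url))
    return records


records = build_records(SAMPLE_HTML)
for row in records:
    print(row)
```

A list of tuples like this is a convenient intermediate form because each tuple maps onto one row of a table, which is presumably why a tabular export (the topic of Part 4) follows naturally from it.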

Playlist

Uploads from Data School · Data School · 59 of 60

1 Setting up Git and GitHub
2 Navigating a GitHub Repository - Part 1
3 Forking a GitHub Repository
4 Creating a New GitHub Repository
5 Copying a GitHub Repository to Your Local Computer
6 Syncing Your GitHub Fork
7 Allstate Purchase Prediction Challenge on Kaggle
8 Troubleshooting: Updates Rejected When Pushing to GitHub
9 Hands-on dplyr tutorial for faster data manipulation in R
10 ROC Curves and Area Under the Curve (AUC) Explained
11 Going deeper with dplyr: New features in 0.3 and 0.4 (tutorial)
12 What is machine learning, and how does it work?
13 Setting up Python for machine learning: scikit-learn and Jupyter Notebook
14 Getting started in scikit-learn with the famous iris dataset
15 Training a machine learning model with scikit-learn
16 Comparing machine learning models in scikit-learn
17 Data science in Python: pandas, seaborn, scikit-learn
18 Selecting the best model in scikit-learn using cross-validation
19 How to find the best model parameters in scikit-learn
20 How to evaluate a classifier in scikit-learn
21 What is pandas? (Introduction to the Q&A series)
22 How do I read a tabular data file into pandas?
23 How do I select a pandas Series from a DataFrame?
24 Why do some pandas commands end with parentheses (and others don't)?
25 How do I rename columns in a pandas DataFrame?
26 How do I remove columns from a pandas DataFrame?
27 How do I sort a pandas DataFrame or a Series?
28 How do I filter rows of a pandas DataFrame by column value?
29 How do I apply multiple filter criteria to a pandas DataFrame?
30 Your pandas questions answered!
31 How do I use the "axis" parameter in pandas?
32 How do I use string methods in pandas?
33 How do I change the data type of a pandas Series?
34 When should I use a "groupby" in pandas?
35 How do I explore a pandas Series?
36 How do I handle missing values in pandas?
37 What do I need to know about the pandas index? (Part 1)
38 What do I need to know about the pandas index? (Part 2)
39 How do I select multiple rows and columns from a pandas DataFrame?
40 Machine Learning with Text in scikit-learn (PyCon 2016)
41 When should I use the "inplace" parameter in pandas?
42 How do I make my pandas DataFrame smaller and faster?
43 How do I use pandas with scikit-learn to create Kaggle submissions?
44 More of your pandas questions answered!
45 How do I create dummy variables in pandas?
46 How do I work with dates and times in pandas?
47 How do I find and remove duplicate rows in pandas?
48 How do I avoid a SettingWithCopyWarning in pandas?
49 How do I change display options in pandas?
50 How do I create a pandas DataFrame from another object?
51 How do I apply a function to a pandas Series or DataFrame?
52 Getting started with machine learning in Python (webcast)
53 Q&A about Machine Learning with Text (online course)
54 Your pandas questions answered! (webcast)
55 Machine Learning with Text in scikit-learn (PyData DC 2016)
56 Write Pythonic Code for Better Data Science (webcast)
57 Web scraping in Python (Part 1): Getting started
58 Web scraping in Python (Part 2): Parsing HTML with Beautiful Soup
59 Web scraping in Python (Part 3): Building a dataset
60 Web scraping in Python (Part 4): Exporting a CSV with pandas