Accelerating Pandas with NVIDIA's cuDF: Basic Statistical Analysis and Data Cleaning (Ep. 06)

Python Tutorials for Digital Humanities · Beginner · 📰 AI News & Updates · 10mo ago
In this episode, we continue our series on NVIDIA's cuDF, a CUDA-accelerated DataFrame library with a Pandas-like API. We run basic statistical analysis on a large dataset of 4.3 million newspaper articles to demonstrate the advantages of GPU acceleration. By comparing CPU and GPU performance, we show how tasks such as word counts and text length calculations can be sped up dramatically on an NVIDIA RTX 5000 GPU. We also walk through essential data cleaning techniques to improve data quality.
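As a rough illustration of the kind of analysis described above, here is a minimal sketch, not the code from the video. It assumes a hypothetical `text` column and a tiny stand-in dataset rather than the 4.3 million articles, and it falls back to Pandas when cuDF is not installed so it can run on a CPU-only machine. The helper names `text_stats` and `drop_bad_rows` are made up for this example.

```python
import time

try:
    import cudf as xdf  # GPU DataFrame library (needs an NVIDIA GPU and CUDA)
except ImportError:
    import pandas as xdf  # CPU fallback so the sketch still runs


def text_stats(df, column="text"):
    """Return basic length statistics for a text column (hypothetical helper)."""
    chars = df[column].str.len()           # characters per article
    words = df[column].str.count(r"\S+")   # runs of non-whitespace, used as a word count
    return {
        "rows": len(df),
        "mean_chars": float(chars.mean()),
        "mean_words": float(words.mean()),
        "max_words": int(words.max()),
    }


def drop_bad_rows(df, column="text"):
    """Remove rows whose text is missing or only whitespace (hypothetical helper)."""
    df = df.dropna(subset=[column])
    keep = df[column].str.strip().str.len() > 0
    return df[keep].reset_index(drop=True)


# Tiny made-up stand-in for the newspaper dataset; the column name is an assumption.
articles = xdf.DataFrame(
    {"text": ["The quick brown fox", "   ", None, "GPU acceleration helps", ""]}
)

clean = drop_bad_rows(articles)

# Time only the statistics step, which is the part compared on CPU and GPU.
start = time.perf_counter()
stats = text_stats(clean)
elapsed = time.perf_counter() - start

print(stats, f"({elapsed:.4f}s)")
```

On data this small the GPU offers no benefit; any speedup would only show up at a scale closer to the dataset used in the episode. The same functions accept either a Pandas or a cuDF DataFrame, which makes it straightforward to time both paths on identical input.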

Chapters (7)

0:00 Introduction to cuDF and Video Overview
0:47 Exciting Hardware Setup for the Series
2:06 Loading and Preparing the Dataset
3:55 Performing Statistical Analysis on CPU
5:20 Accelerating Analysis with GPU
8:54 Identifying and Cleaning Bad Data
14:31 Conclusion and Next Steps