This INCREDIBLE trick will speed up your data processes.
In this video we discuss the best way to save off data as files using python and pandas. When you are working with large datasets there comes a time when you need to store your data. Most people turn to CSV files because they are easy to share and universally used. But there are much better options out there! Watch as Rob Mulla, Kaggle grandmaster, discusses some alternative ways of saving data files: pickle, parquet and feather files. I run some benchmarks to show that you can save time, space and keep the important metadata about your files in the process!
Timeline
00:00 Intro
00:49 Creating our Data
02:08 CSVs
04:39 Setting dtypes for CSVs
06:15 Pickle Files
07:16 Parquet ❤️
09:07 Feather
10:31 Other Options
11:02 Benchmarking
12:19 Takeaways
12:43 Outro
Code Gist: https://gist.github.com/RobMulla/738491f7bf7cfe79168c7e55c622efa5
Follow me on twitch for live coding streams: https://www.twitch.tv/medallionstallion_
Other Videos:
Speed up Pandas: https://www.youtube.com/watch?v=SAFmrTnEHLg
Efficient Pandas Dataframes: https://www.youtube.com/watch?v=u4_c2LDi4b8
Inroduction to Pandas: https://www.youtube.com/watch?v=_Eb0utIRdkw
Exploritory Data Analysis Video: https://www.youtube.com/watch?v=xi0vhXFPegw
Audio Data in Python: https://www.youtube.com/watch?v=ZqpSb5p1xQo
Image Data in Python: https://www.youtube.com/watch?v=kSqxn6zGE0c
* Youtube: https://youtube.com/@robmulla?sub_confirmation=1
* Discord: https://discord.gg/HZszek7DQc
* Twitch: https://www.twitch.tv/medallionstallion_
* Twitter: https://twitter.com/Rob_Mulla
* Kaggle: https://www.kaggle.com/robikscube
#python #code #datascience #pandas
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Rob Mulla · Rob Mulla · 12 of 60
1
2
3
4
5
6
7
8
9
10
11
▶
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
A Gentle Introduction to Pandas Data Analysis (on Kaggle)
Rob Mulla
Exploratory Data Analysis with Pandas Python
Rob Mulla
7 Python Data Visualization Libraries in 15 minutes
Rob Mulla
Kaggle competition starter notebook walkthrough
Rob Mulla
Kaggle Competitions: A Beginner's Guide to Winning
Rob Mulla
Jupyter Notebook Complete Beginner Guide - From Jupyter to Jupyterlab, Google Colab and Kaggle!
Rob Mulla
Audio Data Processing in Python
Rob Mulla
Complete Data Science Project!
Rob Mulla
Make Your Pandas Code Lightning Fast
Rob Mulla
Image Processing with OpenCV and Python
Rob Mulla
Speed Up Your Pandas Dataframes
Rob Mulla
This INCREDIBLE trick will speed up your data processes.
Rob Mulla
Complete Guide to Cross Validation
Rob Mulla
Easy Python Progress Bars with tqdm
Rob Mulla
Economic Data Analysis Project with Python Pandas - Data scraping, cleaning and exploration!
Rob Mulla
Python Sentiment Analysis Project with NLTK and 🤗 Transformers. Classify Amazon Reviews!!
Rob Mulla
Get Started with Machine Learning and AI in 2023
Rob Mulla
The Trick to Get Unlimited Datasets
Rob Mulla
Video Data Processing with Python and OpenCV
Rob Mulla
Object Detection in 10 minutes with YOLOv5 & Python!
Rob Mulla
Pandas for Data Science #shorts
Rob Mulla
Object Detection in 60 Seconds using Python and YOLOv5 #shorts
Rob Mulla
Machine Learning for Facial Recognition in Python in 60 Seconds #shorts
Rob Mulla
Time Series Forecasting with XGBoost - Use python and machine learning to predict energy consumption
Rob Mulla
Detect Text in Images with Python - pytesseract vs. easyocr vs keras_ocr
Rob Mulla
Solving an Impossible Riddle with Code
Rob Mulla
Do these Pandas Alternatives actually work?
Rob Mulla
Time Series Forecasting with XGBoost - Advanced Methods
Rob Mulla
Data Science Uncut - Data Shootout Kaggle Competition (Aug 1 2022 Stream)
Rob Mulla
Kaggle Dataset Creation from Scratch- Data Science Uncut (Aug 10 2022)
Rob Mulla
Chess Board Computer Vision AI - Data Science Uncut (Sep 7, 2022)
Rob Mulla
25 Nooby Pandas Coding Mistakes You Should NEVER make.
Rob Mulla
DEFCON Hacking AI CTF Solution on Kaggle - Data Science Uncut Sep 11, 2022
Rob Mulla
More Chessboard Computer Vision AI - Data Science Uncut - Sep 13
Rob Mulla
Medallion Data Science Live Stream
Rob Mulla
Community Kaggle Competition Overview - Corn Classification (
Rob Mulla
Deep Learning Image Classification - Corn Kernels - Data Science Uncut
Rob Mulla
OpenAI Whisper Demo: Convert Speech to Text in Python
Rob Mulla
Yolov7 Custom Object Detection in Python Tutorial - Chess Piece Detection
Rob Mulla
Live Kaggle Coding - Enzyme Stability Prediction - Data Science Uncut Sep, 27 2022
Rob Mulla
Finding Chess Cheaters with Python! - Data Science Uncut Livestream
Rob Mulla
Data Science Uncut - Kaggle Community Competition & Chess Data Analysis - Oct 4, 2022
Rob Mulla
Flight Delay Dataset Creation (Data Science Uncut)
Rob Mulla
5 Reasons to Kaggle #shorts
Rob Mulla
♟️ Data Science - Chess Data Analysis
Rob Mulla
EXTREME PYTHON & DATA SCIENCE LIVE STREAM
Rob Mulla
What is Clustering in ML?
Rob Mulla
What is K-Nearest Neighbors?
Rob Mulla
LIVE CODING: Flight Data Exploration with Pandas & Python
Rob Mulla
Kaggle Survey vs. Twitter Sentiment
Rob Mulla
If Top Chess.com Players were STOCKS - Live Coding Data Anaylsis Stream
Rob Mulla
Data Visualization BATTLE!
Rob Mulla
LIVE CODING: Stocks & Sentiment Analysis
Rob Mulla
Progress Bar in Python with TQDM
Rob Mulla
Flight Cancellation Data Analysis
Rob Mulla
Synthetic Dataset Creation for Machine Learning - Blender and Python
Rob Mulla
The Ultimate Coding Setup for Data Science
Rob Mulla
Dataset Creation SPEED RUN - Live Coding With Python & Pandas
Rob Mulla
Data Wrangling with Python and Pandas LIVE
Rob Mulla
Forecasting with the FB Prophet Model
Rob Mulla
More on: Python for Data
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
I Tried to Find Out How Close I Am to the CEO of Roblox. The Answer Was Three.
Medium · Data Science
The Dying Symphony of Nature :
How climate change silences Cultures, Species, and Nature.
Medium · Data Science
Student Mental Health Analytics: An Interactive Dashboard in R Shiny
Medium · Data Science
Building a US choropleth in Python with plotly express, using a real fragrance dataset
Dev.to · ahmad-khan-97
Chapters (11)
Intro
0:49
Creating our Data
2:08
CSVs
4:39
Setting dtypes for CSVs
6:15
Pickle Files
7:16
Parquet ❤️
9:07
Feather
10:31
Other Options
11:02
Benchmarking
12:19
Takeaways
12:43
Outro
🎓
Tutor Explanation
DeepCamp AI