Data Engineering
ETL pipelines, data warehousing, streaming, orchestration and lakehouse architecture
Skills in this topic
5 skills
ETL Basics
beginner
Write a Python ETL pipeline with pandas
Workflow Orchestration
intermediate
Build a DAG in Airflow with sensors and operators
Streaming Data
intermediate
Produce and consume Kafka topics with Python
Data Warehousing
intermediate
Model a star schema with dbt
Lakehouse Architecture
advanced
Manage ACID transactions on a data lake with Delta Lake
Reddit r/learnprogramming
🔄 Data Engineering
⚡ AI Lesson
1w ago
I’m looking for advice from people who have handled very large Excel/CSV imports in production systems.
Current requirement from my client: upload 3 Excel sheets. One sheet contains 150k+ rows; another contains 40k+ rows. Data needs to be inserted into multiple relat

Medium · AI
🔄 Data Engineering
⚡ AI Lesson
3w ago
Why We Let AI Design Our ETL Pipelines — but Never Run Them
ETL systems are uncompromisingly literal, and that is precisely why they age poorly.

Medium · Programming
🔄 Data Engineering
⚡ AI Lesson
1mo ago
The Complete Framework to Design ETL Pipelines in Interviews
A Decision-Tree Approach to Cracking Senior Data Engineering System Design Rounds. Published in Towards Data Engineering.

Medium · Programming
🔄 Data Engineering
⚡ AI Lesson
1mo ago
Building a High-Throughput ETL System in Python
How I Combined Pandas, Dask, and SQLAlchemy for Speed and Reliability. Published in Top Python Libraries.

Medium · AI
🔄 Data Engineering
⚡ AI Lesson
1mo ago
The Data Engineering Part 2: Building Your First Production Data Pipeline
From raw data to real-time dashboards: a hands-on walkthrough of modern pipeline architecture using Kafka, Spark, dbt, and Airflow, plus…
DeepCamp AI