Data Engineering

ETL pipelines, data warehousing, streaming, orchestration and lakehouse architecture

55 lessons
Skills in this topic
ETL Basics (beginner): Write a Python ETL pipeline with pandas
Workflow Orchestration (intermediate): Build a DAG in Airflow with sensors and operators
Streaming Data (intermediate): Produce and consume Kafka topics with Python
Data Warehousing (intermediate): Model a star schema with dbt
Lakehouse Architecture (advanced): Manage ACID transactions on a data lake with Delta Lake
All Reads (23) · Articles (11) · Blog Posts (7) · Tutorials (5)
Data Engineering Terminology Made Easy for Data Scientists, ML Engineers, and Anyone Who’s Ever…
Medium · Machine Learning 🔄 Data Engineering ⚡ AI Lesson 2w ago
Nobody told you what any of this means in simple language, until now.
From Experimental Notebooks to Production: A Data Engineer’s perspective of Scaling Data Science…
Medium · Machine Learning 🔄 Data Engineering ⚡ AI Lesson 3w ago
Originally published at harishkesavarao.github.io
What I Learnt Implementing a Medallion Architecture from Scratch on Databricks Using Washington…
Medium · Machine Learning 🔄 Data Engineering ⚡ AI Lesson 3w ago
9 hard-won lessons from a real end-to-end lakehouse build, EVLytics. Published in Towards Data Engineering.
The Complete Framework to Design ETL Pipelines in Interviews
Medium · Data Science 🔄 Data Engineering ⚡ AI Lesson 1mo ago
A Decision-Tree Approach to Cracking Senior Data Engineering System Design Rounds. Published in Towards Data Engineering.
The Hidden Complexity of Data Engineering in Regulated Industries (And What It Taught Me About…
Medium · Python 🔄 Data Engineering ⚡ AI Lesson 1mo ago
Strict data formats and compliance requirements teach you more about clean software design than any course. Here is what working in a…
I Stopped Fixing Broken Parsers at 3 AM, Here’s How We Outsourced Our DOM Extraction
Medium · Python 🔄 Data Engineering ⚡ AI Lesson 1mo ago
It’s 3:00 AM on a Tuesday. Your PagerDuty alert is ringing.
Modernizing Data Ingestion: An Async PostgreSQL Pipeline with Psycopg 3
Medium · DevOps 🔄 Data Engineering ⚡ AI Lesson 1mo ago
Orchestrating high-performance migrations using asynchronous architectures and memory-safe processing.
The Data Engineering Part 2: Building Your First Production Data Pipeline
Medium · Machine Learning 🔄 Data Engineering ⚡ AI Lesson 1mo ago
From raw data to real-time dashboards: a hands-on walkthrough of modern pipeline architecture using Kafka, Spark, dbt, and Airflow, plus…
Building a Resilient Lakehouse on AWS with Terraform and Medallion Architecture
Medium · DevOps 🔄 Data Engineering ⚡ AI Lesson 1mo ago
In modern Data Engineering, a project's success is not measured solely by the effectiveness of a Python script. The real differentiator…