Columnar Storage and Query Optimization

Free to audit on Coursera

Coursera · Beginner · Data Analytics & Business Intelligence · 2mo ago
Every data professional writes SQL queries, but few understand why some queries take seconds and others take minutes on the same data. The answer lies beneath the surface: in how data is stored, how query engines read that data, and how columnar formats like Parquet fundamentally change the game for analytics performance. This course gives you that understanding.

You will start from the foundations: how computers store and read data, how SQL operations access data internally, and what distinguishes row-based storage from columnar storage. From there, you will explore modern columnar formats (Parquet, ORC), work with DuckDB as your primary analytics query engine, and learn to read execution plans to diagnose exactly where queries slow down. Each concept is reinforced through hands-on demonstrations that you can follow along with on your own setup.

By the end of this course, you’ll be able to:

- Explain how computers store data, distinguish between row-based and columnar storage, and identify when columnar formats provide a performance advantage.
- Work with Parquet and ORC file formats, compare them to CSV, and query columnar data using DuckDB.
- Read and interpret SQL query execution plans using EXPLAIN, and diagnose performance bottlenecks in analytical workloads.
- Apply real-world query optimization techniques including column pruning, filter pushdown, partitioning, data skipping, and before-vs-after performance comparison.

This course is designed for a diverse audience:

- Data Analysts who want to understand why their queries are slow
- Junior Data Engineers building foundational storage knowledge
- BI Professionals moving into performance engineering or platform roles
- SQL Developers who want to go beyond writing queries to understanding how queries execute internally

Basic computer literacy is helpful. No prior SQL experience is required, though familiarity with basic statements will help you move faster.

Stop guessing why queries are slow. Start understa
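
The difference between row-based and columnar layouts, and two of the techniques named in the description (column pruning and data skipping), can be illustrated with a small standard-library Python sketch. It is not taken from the course materials: the toy table, the function names, and the chunk size are all hypothetical, and real engines such as DuckDB and formats such as Parquet implement these ideas in far more sophisticated ways.

```python
# Hypothetical toy table: (order_id, region, amount). Not from the course.
rows = [
    (1, "EU", 10.0),
    (2, "US", 25.0),
    (3, "EU", 5.0),
    (4, "US", 40.0),
]


def total_amount_row_store(records):
    """Row-based layout: each record is kept together, so summing one
    field still walks every full record."""
    return sum(record[2] for record in records)


# Columnar layout: each column is kept as its own contiguous list.
columns = {
    "order_id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def total_amount_column_store(cols):
    """Column pruning: only the one column the query needs is read."""
    return sum(cols["amount"])


def build_chunks(values, chunk_size):
    """Split a column into chunks and record min/max statistics per chunk."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        chunks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return chunks


def count_greater_than(chunks, threshold):
    """Data skipping: ignore chunks whose max shows no value can match.

    Returns (matching value count, number of chunks actually scanned).
    """
    matched = 0
    scanned = 0
    for chunk in chunks:
        if chunk["max"] <= threshold:
            continue  # statistics prove this chunk has no matches
        scanned += 1
        matched += sum(1 for v in chunk["values"] if v > threshold)
    return matched, scanned


amount_chunks = build_chunks(columns["amount"], chunk_size=2)
print(total_amount_row_store(rows), total_amount_column_store(columns))
print(count_greater_than(amount_chunks, 30.0))
```

Both totals agree, but the columnar version never touches `order_id` or `region`. In the filter example, the per-chunk maximum lets the scan skip the first chunk entirely, which is the same idea that min/max statistics serve in columnar file formats.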