Data Storage and Queries

External: Coursera Courses ↗ · Coursera

Open Course on External: Coursera

Free to audit · Opens on External: Coursera

Data Storage and Queries

Coursera · Beginner ·🔄 Data Engineering ·3mo ago

Key Takeaways

Develops a Christian formation course using Methodist doctrine and practices

Original Description

In this course, you will learn about the raw ingredients and processes that are used to physically store data on disk and in memory. You’ll explore different storage systems, including object, block, and file storage, as well as databases, that are built on top of these raw ingredients. You’ll also get a chance to use the Cypher language to query a Neo4j graph database, and perform vector similarity search, a key feature behind generative AI and large language models. You will explore the evolution of data storage abstractions, from data warehouses, to data lakes, and data lakehouses, while comparing the advantages and drawbacks of each architectural paradigm. With hands-on practice, you will design a simple data lake using Amazon Glue, and build a data lakehouse using AWS LakeFormation and Apache Iceberg. In the last week of this course, you’ll see how queries work behind the scenes, practice writing more advanced SQL queries, compare the query performance in row vs column-oriented storage, and perform streaming queries using Apache Flink.
Watch on External: Coursera ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Related AI Lessons

How I built the OSS alternatives directory: GitHub ETL, Turso, and the UPSERT trap I hit
Learn how to build a data pipeline for an open-source alternatives directory using GitHub ETL, Turso, and Claude Haiku summaries
Dev.to · MORINAGA
Apache Iceberg in Production: Compaction, Catalogs, and the Pitfalls Nobody Warns You About
Learn how to use Apache Iceberg in production, including compaction, catalogs, and common pitfalls to avoid, to improve data engineering workflows
Dev.to · Gabriel Henrique
Your First Task as a Data Engineer in a New Company? Make the ETL Pipeline Testable
As a new data engineer, make the ETL pipeline testable to ensure data quality and reliability
Towards Data Science
From DataStage and Informatica to Databricks Medallion Architecture: Why Migration Is More Than Code Conversion
Learn how to migrate legacy ETL systems like DataStage to modern architectures like Databricks Medallion, and why it's more than just code conversion
Dev.to · Amit Kumar Singh
Up next
A Moment Frozen in Time | Arnav Iyengar | TEDxJenks Youth
TEDx Talks
Watch →