Data Management with Databricks: Big Data with Delta Lakes

Free to audit · Opens on Coursera

Coursera · Intermediate · 📊 Data Analytics & Business Intelligence · 1mo ago
In this 2-hour guided project, "Data Management with Databricks: Big Data with Delta Lakes", you will work alongside the instructor to achieve the following objectives:

1. Create Delta tables in Databricks and write data to them. Gain hands-on experience setting up and managing Delta tables, a data storage format optimized for performance and reliability.
2. Transform a Delta table using Python and query it with SQL to build a dashboard. Learn how to apply Python-based transformations to Delta tables and use SQL queries to extract the insights needed for a Supply Chain dashboard.
3. Use Delta Lake's merge operation and version control to update Delta tables efficiently. Explore how the merge operation performs upserts and other data updates, and learn how to use Delta Lake's built-in version control to track and access previous versions of a Delta table.

Working through a real-world business scenario, you will use Databricks to build an end-to-end data pipeline that integrates several JSON data files and applies transformations, producing insights and analysis-ready data.

This intermediate-level guided project is designed for data engineers who build data pipelines for their companies using Databricks. To succeed, you need prior experience writing Python scripts, including importing libraries, setting up variables, manipulating data frames, and using functions. You should also be familiar with writing SQL queries that aggregate, filter, and join tables.
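
The three objectives above can be sketched roughly as follows. This is a minimal illustration, not the course's actual notebook: the table names (`shipments`, `shipment_updates`), the key column `shipment_id`, and both JSON paths are hypothetical, and `spark` is assumed to be the SparkSession that a Databricks notebook provides. The helper functions only build Delta Lake SQL strings, so they can be inspected without a cluster.

```python
def merge_sql(target: str, source: str, key: str) -> str:
    """Build a Delta Lake MERGE (upsert) statement keyed on one column."""
    return (
        f"MERGE INTO {target} AS t "
        f"USING {source} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


def history_sql(table: str) -> str:
    """Build the statement that lists a Delta table's commit history."""
    return f"DESCRIBE HISTORY {table}"


def version_sql(table: str, version: int) -> str:
    """Build a time-travel query that reads an earlier table version."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def run_pipeline(spark, json_path: str, updates_path: str):
    """Sketch of the end-to-end flow; requires a Spark session with Delta Lake.

    All table names, columns, and paths here are assumptions for illustration.
    """
    # Objective 1: load JSON and write it out as a Delta table.
    raw = spark.read.json(json_path)
    raw.write.format("delta").mode("overwrite").saveAsTable("shipments")

    # Objective 2: a simple Python-side transformation, then a SQL query.
    deduped = spark.table("shipments").dropDuplicates(["shipment_id"])
    deduped.write.format("delta").mode("overwrite").saveAsTable("shipments")
    row_count = spark.sql("SELECT COUNT(*) AS n FROM shipments")

    # Objective 3: upsert new records, then look at history and an old version.
    spark.read.json(updates_path).createOrReplaceTempView("shipment_updates")
    spark.sql(merge_sql("shipments", "shipment_updates", "shipment_id"))
    history = spark.sql(history_sql("shipments"))
    first_version = spark.sql(version_sql("shipments", 0))
    return row_count, history, first_version
```

In a Databricks notebook, calling `run_pipeline(spark, ...)` with two real JSON paths would exercise all three steps. Each overwrite and merge adds a new entry to the table history, which is what makes the `VERSION AS OF` query meaningful.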
