Big Data Analysis with Scala and Spark

Free to audit on Coursera

Coursera · Beginner · 🌐 Frontend Engineering · 3mo ago

Skills: ML Pipelines 90% · RAG Basics 60%

Key Takeaways

Covers big data analysis using Scala and Spark, including functional concepts and the data parallel paradigm

Original Description

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written in Scala.

In this course, we'll see how the data parallel paradigm can be extended to the distributed case, using Spark throughout. We'll cover Spark's programming model in detail, being careful to understand how and when it differs from familiar programming models, like shared-memory parallel collections or sequential Scala collections. Through hands-on examples in Spark and Scala, we'll learn when important issues related to distribution, like latency and network communication, should be considered and how they can be addressed effectively for improved performance.

Learning Outcomes. By the end of this course you will be able to:

- read data from persistent storage and load it into Apache Spark,
- manipulate data with Spark and Scala,
- express algorithms for data analysis in a functional style,
- recognize how to avoid shuffles and recomputation in Spark.

Recommended background: You should have at least one year of programming experience. Proficiency with Java or C# is ideal, but experience with other languages such as C/C++, Python, JavaScript, or Ruby is also sufficient. You should have some familiarity with using the command line.

This course is intended to be taken after Parallel Programming: https://www.coursera.org/learn/parprog1.
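The learning outcome about avoiding shuffles can be made concrete with a small sketch. It is not taken from the course and does not use the Spark API: it models a cluster with plain Scala collections, so it runs without Spark on the classpath. The names `ShuffleSketch`, `Partition`, `groupThenSum`, and `combineThenSum` are hypothetical. The two strategies are loosely analogous to Spark's `groupByKey` followed by a sum, and to `reduceByKey`, which combines values within each partition before the shuffle.

```scala
object ShuffleSketch {
  // Assumption: a "cluster" is modeled as a list of partitions,
  // each holding key/value pairs. Real Spark RDDs are not used here.
  type Partition = Seq[(String, Int)]

  // Sum the values of each key within one collection of pairs.
  private def sumByKey(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (key, kvs) => key -> kvs.map(_._2).sum }

  // Strategy 1 (groupByKey-like): move every record, then sum.
  // Returns the per-key sums and the number of records "sent over the network".
  def groupThenSum(parts: Seq[Partition]): (Map[String, Int], Int) = {
    val shuffled = parts.flatten
    (sumByKey(shuffled), shuffled.size)
  }

  // Strategy 2 (reduceByKey-like): sum inside each partition first,
  // then move only one partial sum per key per partition.
  def combineThenSum(parts: Seq[Partition]): (Map[String, Int], Int) = {
    val partials = parts.map(p => sumByKey(p).toSeq)
    val shuffled = partials.flatten
    (sumByKey(shuffled), shuffled.size)
  }

  // Illustrative data only: two partitions of word/count pairs.
  val partitions: Seq[Partition] = Seq(
    Seq("spark" -> 1, "scala" -> 1, "spark" -> 1),
    Seq("spark" -> 1, "spark" -> 1, "scala" -> 1)
  )
}
```

On the sample data both strategies produce the same sums, but the first moves 6 records while the second moves 4. The gap grows as keys repeat more often within a partition, which is why combining locally before a shuffle tends to reduce network traffic.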
