PySpark: Apply & Evaluate Predictive ML Models

Free to audit

Coursera · Intermediate · 📊 Data Analytics & Business Intelligence · 3mo ago

This intermediate-level course teaches learners to apply, analyze, and evaluate machine learning models with PySpark, the Python API for Apache Spark's distributed computing framework. It is designed for data professionals who already know Python and basic ML concepts, and it covers practical implementation of regression, classification, and unsupervised clustering.

In Module 1, learners construct linear and generalized regression models, apply ensemble regressors such as Random Forests, and evaluate predictive performance with metrics like RMSE and R-squared. The module concludes with an in-depth look at logistic regression for binary classification tasks.

Module 2 builds on these foundations to cover multi-class classification with multinomial logistic regression and decision trees. Learners also evaluate ensemble models such as Random Forests for robust classification, and explore K-Means clustering for unsupervised learning problems.

Each concept is reinforced with guided PySpark code demonstrations, predictive workflows, and practical evaluations on large datasets. By the end of the course, learners will be able to design, execute, and critically assess machine learning models in PySpark for scalable data analytics solutions.
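The two regression metrics named for Module 1, RMSE and R-squared, can be illustrated with a minimal plain-Python sketch. This is not course material: the function names `rmse` and `r_squared` and the toy values are hypothetical, chosen only to show what each metric measures.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values for illustration only (not course data).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

print("RMSE:", rmse(actual, predicted))
print("R-squared:", r_squared(actual, predicted))
```

In PySpark itself, the same metrics are available from `pyspark.ml.evaluation.RegressionEvaluator` by setting `metricName="rmse"` or `metricName="r2"`, so a hand-rolled version like this one is mainly useful for checking intuition on small samples.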

What You'll Learn

Apply and evaluate predictive machine learning models using PySpark
