Building Machine Learning Models on Databricks
This course will teach you how to build and train traditional machine learning models using the Databricks Machine Learning Runtime, with MLflow managing the end-to-end machine learning lifecycle.
What you'll learn
Training, evaluating, and deploying machine learning models are now routine in many organizations, and having the right environment around this process is often what sets a company apart from its competitors. The Databricks Machine Learning Runtime, along with MLflow to manage your experiments, runs, and models, makes training and hyperparameter tuning of your models simple and intuitive.
In this course, Building Machine Learning Models on Databricks, you will learn to build and train regression and classification models using the scikit-learn framework.
First, you will load, explore, and process your data using Databricks notebooks, and use Bamboolib for no-code data analysis and transformations.
Next, you will create experiments and track your model’s parameters and metrics using runs, and compare runs using the MLflow UI.
After that, you will build and train regression and classification models using the gradient boosting algorithms in the XGBoost framework. You will also productionize and serve your models using Classic MLflow Model Serving and perform real-time inference using your deployed models.
Finally, you will learn how to use the Hyperopt tool for hyperparameter tuning of your models, and how to run hyperparameter tuning in a distributed fashion on a Spark cluster using the SparkTrials class.
When you are finished with this course, you will have the skills and knowledge to build and train traditional machine learning models on Databricks using MLflow to manage your machine learning workflow.
Table of contents
- A Quick Overview of scikit-learn 3m
- Demo: Loading, Exploring, and Preprocessing Data 4m
- Demo: Creating an Experiment and Run 4m
- Demo: Autologging to Track Model Metrics 8m
- Demo: Creating Multiple Runs and Comparing Runs 5m
- Demo: Using Loaded Model for Predictions 3m
- Demo: Using Bamboolib for Data Exploration and Transformation 6m
- Demo: Autologging to Track Metrics for a Classification Model 5m
- Demo: Registering Models and Managing Stage Transitions 5m
- Demo: Classic Inferencing Using a REST Endpoint 7m
- An Overview of XGBoost 5m
- Demo: Inferring Model Signature and Logging Models 6m
- Demo: Autologging XGBoost Model Runs 6m
- Machine Learning Using Apache Spark 3m
- Demo: Loading Data into a Delta Table 3m
- Demo: Training a Model Using a Spark ML Pipeline 8m
- Demo: Training an XGBoost Model Using a Spark Pipeline 2m
- Demo: Using Cross Validation to Find the Best Model Hyperparameters 5m