Scaling scikit-learn Solutions

by Janani Ravi

This course covers the important considerations for scikit-learn models in improving prediction latency and throughput; specific feature representation and partial learning techniques, as well as implementations of incremental learning, out-of-core learning, and multicore parallelism.

What you'll learn

Even as the number of machine learning frameworks and libraries increases rapidly, scikit-learn is retaining its popularity with ease. scikit-learn makes the common use-cases in machine learning - clustering, classification, dimensionality reduction and regression - incredibly easy.

In this course, Scaling scikit-learn Solutions you will gain the ability to leverage out-of-core learning and multicore parallelism in scikit-learn.

First, you will learn considerations that affect latency and throughput in prediction, including the number of features, feature complexity, and model complexity.

Next, you will discover how smart choices in feature representation and in how you model sparse data can improve the scalability of your models. You will then understand what incremental learning is, and how to use scikit-learn estimators that support this key enabler of out-of-core learning.

Finally, you will round out your knowledge by parallelizing key tasks such as cross-validation, hyperparameter tuning, and ensemble learning.

When you’re finished with this course, you will have the skills and knowledge to identify key techniques to help make your model scalable and implement them appropriately for your use-case.

Table of contents

Course Overview
2mins

About the author

A problem solver at heart, Janani has a Masters degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Ready to upskill? Get started