Scaling scikit-learn Solutions

by Janani Ravi

This course covers the important considerations for improving the prediction latency and throughput of scikit-learn models, specific feature representation and partial learning techniques, and implementations of incremental learning, out-of-core learning, and multicore parallelism.

Try for free

Get this course plus top-rated picks in tech skills and other popular topics.

$29.00

per month after 10 day trial

Your 10 day Standard free trial includes

Expert-led courses

Keep up with the pace of change with thousands of expert-led, in-depth courses.

For teams

Give up to 50 users access to our full library including this course free for 30 days

Course info

Rating: (17)
Level: Advanced
Updated: Jan 31, 2020
Duration: 2h 54m

What you'll learn

Even as the number of machine learning frameworks and libraries increases rapidly, scikit-learn is retaining its popularity with ease. scikit-learn makes the common use cases in machine learning (clustering, classification, dimensionality reduction, and regression) incredibly easy.

In this course, Scaling scikit-learn Solutions, you will gain the ability to leverage out-of-core learning and multicore parallelism in scikit-learn.

First, you will learn considerations that affect latency and throughput in prediction, including the number of features, feature complexity, and model complexity.
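The difference between bulk and atomic prediction latency can be sketched as follows. This is a minimal illustration on synthetic data, not the course's own demo: the `Ridge` model, the dataset sizes, and the helper names `bulk_latency` and `atomic_latency` are all assumptions chosen for the example.

```python
import time

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic data; the sizes are arbitrary choices for illustration.
X, y = make_regression(n_samples=2000, n_features=50, random_state=0)
model = Ridge().fit(X, y)


def bulk_latency(estimator, X):
    """Seconds per instance when all rows are predicted in a single call."""
    start = time.perf_counter()
    estimator.predict(X)
    return (time.perf_counter() - start) / X.shape[0]


def atomic_latency(estimator, X, n_calls=200):
    """Mean seconds per call when rows are predicted one at a time."""
    start = time.perf_counter()
    for i in range(n_calls):
        estimator.predict(X[i:i + 1])  # 2D slice holding a single row
    return (time.perf_counter() - start) / n_calls


bulk = bulk_latency(model, X)
atomic = atomic_latency(model, X)
print(f"bulk:   {bulk * 1e6:.2f} microseconds per instance")
print(f"atomic: {atomic * 1e6:.2f} microseconds per instance")
```

Each `predict` call carries fixed overhead such as input validation, so the per-instance cost of atomic prediction is typically higher than that of bulk prediction. The exact numbers depend on the machine and the model.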

Next, you will discover how smart choices in feature representation and in how you model sparse data can improve the scalability of your models. You will then understand what incremental learning is, and how to use scikit-learn estimators that support this key enabler of out-of-core learning.
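As a rough sketch of incremental learning on sparse features, the example below feeds mini-batches to an estimator through `partial_fit`. The toy corpus, the `mini_batches` helper, and the choice of `HashingVectorizer` with `SGDClassifier` are assumptions made for illustration; in a real out-of-core setting each batch would be read from disk or a stream.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical toy corpus standing in for data too large to fit in memory.
docs = [
    "great movie loved it", "terrible plot boring film",
    "wonderful acting great story", "awful boring waste of time",
    "loved the wonderful cast", "terrible awful script",
]
labels = np.array([1, 0, 1, 0, 1, 0])


def mini_batches(texts, targets, batch_size):
    """Yield consecutive (texts, targets) chunks, imitating a data stream."""
    for start in range(0, len(texts), batch_size):
        stop = start + batch_size
        yield texts[start:stop], targets[start:stop]


# HashingVectorizer is stateless: it needs no fit pass over the full corpus
# and always returns a sparse matrix with a fixed number of columns.
vectorizer = HashingVectorizer(n_features=2**10, alternate_sign=False)
clf = SGDClassifier(random_state=0)
all_classes = np.array([0, 1])  # partial_fit must be told every class up front

for batch_docs, batch_labels in mini_batches(docs, labels, batch_size=2):
    X_batch = vectorizer.transform(batch_docs)
    clf.partial_fit(X_batch, batch_labels, classes=all_classes)

X_new = vectorizer.transform(["great wonderful film", "boring awful plot"])
pred = clf.predict(X_new)
print(pred)
```

Only one mini-batch is held in memory at a time, which is what makes this pattern usable when the full dataset does not fit in RAM.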

Finally, you will round out your knowledge by parallelizing key tasks such as cross-validation, hyperparameter tuning, and ensemble learning.
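A minimal sketch of these three cases, assuming synthetic data and arbitrary model choices, uses the `n_jobs` parameter that scikit-learn exposes on `cross_val_score`, `GridSearchCV`, and `RandomForestClassifier`. The value `n_jobs=2` is only an example; `-1` requests all available cores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic data; sizes are arbitrary.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Cross-validation: the folds are fitted and scored by separate workers.
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=4, n_jobs=2
)

# Hyperparameter tuning: parameter/fold combinations are spread across workers.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=4,
    n_jobs=2,
)
search.fit(X, y)

# Ensemble learning: the trees of the forest are built in parallel.
forest = RandomForestClassifier(n_estimators=50, n_jobs=2, random_state=0)
forest.fit(X, y)

print(scores.mean(), search.best_params_, len(forest.estimators_))
```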

When you’re finished with this course, you will have the skills and knowledge to identify key techniques that help make your model scalable and to implement them appropriately for your use case.

Course Overview

1min

Course Overview 2m

Understanding Strategies for Computational Scaling

33mins

Observing the Factors Affecting Prediction Latency

47mins

Module Overview 1m
Demo: Measuring Bulk and Atomic Prediction Latencies for Different Models 7m
Demo: Influence of Number of Features on Bulk Prediction Latency 5m
Optimizations to Improve Prediction Latency 7m
Optimizations to Improve Prediction Throughput 2m
Demo: Observing the Influence of Model Complexity 8m
Demo: Using Optimized Libraries and Reducing Validation Overhead 3m
Demo: Training Models Using Dense and Sparse Input Representation 6m
Demo: Prediction with Sparse Data and Memory Profiling 6m
Module Summary 1m
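The dense versus sparse comparison named in the demos above can be approximated with the sketch below. It is not the course's code: the matrix shape, the 1% density, the `csr_nbytes` helper, and the use of `SGDClassifier` are assumptions for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Synthetic data with roughly 1% non-zero entries; the shape is arbitrary.
X_sparse = sparse.random(2000, 500, density=0.01, format="csr", random_state=0)
X_dense = X_sparse.toarray()
y = rng.integers(0, 2, size=2000)


def csr_nbytes(matrix):
    """Bytes held by the three arrays that back a CSR matrix."""
    return matrix.data.nbytes + matrix.indices.nbytes + matrix.indptr.nbytes


dense_bytes = X_dense.nbytes
sparse_bytes = csr_nbytes(X_sparse)
print(f"dense:  {dense_bytes / 1e6:.2f} MB")
print(f"sparse: {sparse_bytes / 1e6:.2f} MB")

# The same estimator accepts either representation.
clf_sparse = SGDClassifier(random_state=0).fit(X_sparse, y)
clf_dense = SGDClassifier(random_state=0).fit(X_dense, y)
pred_sparse = clf_sparse.predict(X_sparse[:5])
pred_dense = clf_dense.predict(X_dense[:5])
```

The memory saving grows with sparsity. For data that is mostly non-zero, the index arrays of a sparse format can make it larger than the dense array.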

Implementing Scaling of Instances Using Out-of-core Learning

33mins

Module Overview 1m
Streaming Data 4m
Incremental Learning for Large Datasets 7m
Demo: Preparing Text Data for Out-of-core Learning 6m
Demo: Using Partial Fit to Perform Out-of-core Learning 5m
Demo: Visualizing Latencies and Accuracies 5m
Demo: Using the Passive Aggressive, Perceptron, and BernoulliNB Classifiers 4m
Module Summary 1m

Implementing Multicore Parallelism in scikit-learn

37mins

Module Overview 1m
Parallelizing Computation Using Joblib 5m
Demo: Introducing Joblib 4m
Demo: Running Concurrent Workers Using Joblib 5m
Demo: Cross Validation Using Concurrent Workers 4m
Demo: Integrating Joblib with Dask ML 3m
Demo: Grid Search with Concurrent Workers 3m
Demo: Preparing Data for Multi-label Classification 8m
Demo: Performing Multi-label Classification 4m
Module Summary 1m
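For readers unfamiliar with Joblib, the basic pattern for running concurrent workers looks roughly like this. The example is a generic sketch using `joblib.Parallel` and `joblib.delayed`; the workload (`sqrt` over a small range) and `n_jobs=2` are placeholders, not taken from the course.

```python
from math import sqrt

from joblib import Parallel, delayed

# delayed(sqrt)(x) captures the call without running it. Parallel dispatches
# the captured calls to n_jobs workers and returns results in input order.
results = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
print(results)
```

scikit-learn relies on Joblib internally, which is why many estimators and utilities accept an `n_jobs` argument.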

Autoscaling of scikit-learn with Apache Spark

19mins

Module Overview 1m
Integrating Apache Spark and scikit-learn 5m
Demo: Working with Spark Using spark-sklearn 7m
Demo: Working with Spark Using scikit-spark 5m
Summary and Further Study 2m

About the author

Janani Ravi

Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani finally decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing ...

