Building Machine Learning Solutions with scikit-learn


Author: Janani Ravi

This skill teaches learners how to build machine learning solutions using Python and the widely used scikit-learn package, including the application of classification, regression, and clustering.

What you will learn:

  • Design and implement common machine learning solutions using scikit-learn
  • Evaluate and validate scikit-learn machine learning models

Prerequisites

  • Data Literacy
  • Data Analytics Literacy
  • Statistics
  • Python programming

Beginner

Experience the machine learning workflow as implemented in scikit-learn, and use that workflow to build simple classification, regression, and clustering models.

Building Your First scikit-learn Solution

by Janani Ravi

May 2, 2019 / 2h 7m

Description

Even as the number of machine learning frameworks and libraries increases on a daily basis, scikit-learn retains its popularity with ease. scikit-learn makes the common use cases in machine learning - clustering, classification, dimensionality reduction, and regression - incredibly easy.

In this course, Building Your First scikit-learn Solution, you'll gain the ability to identify the situations where scikit-learn is exactly the tool you are looking for, and also those situations where you need something else.

First, you'll learn how scikit-learn's niche is traditional machine learning, as opposed to deep learning or building neural networks. Next, you'll discover how seamlessly it integrates with core Python libraries. Then, you'll explore the typical set of steps needed to work with models in scikit-learn. Finally, you'll round out your knowledge by building your first scikit-learn regression and classification models.

When you're finished with this course, you'll have the skills and knowledge to identify precisely the situations when scikit-learn ought to be your tool of choice, and how best to leverage its formidable capabilities.
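The typical set of steps for working with models in scikit-learn - load data, split, fit, predict, score - can be sketched in a few lines. This is an illustrative example, not code from the course; the diabetes dataset, the split ratio, and the choice of LinearRegression are assumptions made for brevity.

```python
# A minimal pass through the scikit-learn workflow: estimator objects
# expose the same fit / predict / score interface across model types.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()            # pick an estimator
model.fit(X_train, y_train)           # learn parameters from training data
predictions = model.predict(X_test)   # predict on unseen data
r2 = model.score(X_test, y_test)      # R-squared on the held-out split
```

The same four steps apply largely unchanged if you swap LinearRegression for a classifier or a clusterer, which is much of scikit-learn's appeal.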

Table of contents
  1. Course Overview
  2. Exploring scikit-learn for Machine Learning
  3. Understanding the Machine Learning Workflow with scikit-learn
  4. Building a Simple Machine Learning Model with scikit-learn

Building Classification Models with scikit-learn

by Janani Ravi

Jun 28, 2019 / 2h 34m

Description

Perhaps the most ground-breaking advances in machine learning have come from applying machine learning to classification problems.

In this course, Building Classification Models with scikit-learn, you will gain the ability to enumerate the different types of classification algorithms and correctly implement them in scikit-learn.

First, you will learn what classification seeks to achieve, and how to evaluate classifiers using accuracy, precision, recall, and ROC curves.

Next, you will discover how to implement various classification techniques such as logistic regression, and Naive Bayes classification.

You will then understand other more advanced forms of classification, including those using Support Vector Machines, Decision Trees and Stochastic Gradient Descent.

Finally, you will round out the course by understanding the hyperparameters that these various classification models possess, and how these can be optimized.

When you’re finished with this course, you will have the skills and knowledge to select the correct classification algorithm based on the problem you are trying to solve, and also implement it correctly using scikit-learn.
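As a sketch of what evaluating a classifier with these metrics looks like in practice - an illustrative example only, with the breast cancer dataset and logistic regression chosen as assumptions for brevity:

```python
# Logistic regression on a binary problem, evaluated with accuracy,
# precision, recall, and ROC AUC.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, roc_auc_score,
)

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000)   # raise max_iter so the solver converges
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)    # of predicted positives, how many are right
rec = recall_score(y_test, y_pred)        # of actual positives, how many were found
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
```

ROC AUC uses predicted probabilities rather than hard labels, which is why it takes the `predict_proba` output instead of `y_pred`.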

Table of contents
  1. Course Overview
  2. Understanding Classification as a Machine Learning Problem
  3. Building a Simple Classification Model
  4. Performing Classification Using Multiple Techniques
  5. Hyperparameter Tuning for Classification Models
  6. Applying Classification Models to Images and Text Data

Building Regression Models with scikit-learn

by Janani Ravi

Jun 28, 2019 / 2h 42m

Description

Regression is one of the most widely used modeling techniques and is much beloved by everyone ranging from business professionals to data scientists. Using scikit-learn, you can easily implement virtually every important type of regression with ease.

In this course, Building Regression Models with scikit-learn, you will gain the ability to enumerate the different types of regression algorithms and correctly implement them in scikit-learn.

First, you will learn what regression seeks to achieve, and how the ubiquitous Ordinary Least Squares algorithm works under the hood.

Next, you will discover how to implement other techniques that mitigate overfitting, such as Lasso, Ridge, and Elastic Net regression.

You will then understand other more advanced forms of regression, including those using Support Vector Machines, Decision Trees, and Stochastic Gradient Descent.

Finally, you will round out the course by understanding the hyperparameters that these various regression models possess, and how these can be optimized.

When you are finished with this course, you will have the skills and knowledge to select the correct regression algorithm based on the problem you are trying to solve, and also implement it correctly using scikit-learn.
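A rough sketch of OLS next to its regularized variants, compared by cross-validated score - the diabetes dataset and the alpha values are illustrative assumptions, not tuned settings from the course:

```python
# Ordinary Least Squares alongside Ridge, Lasso, and Elastic Net.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),                       # L2: shrinks coefficients
    "lasso": Lasso(alpha=0.1),                       # L1: can zero coefficients out
    "elastic": ElasticNet(alpha=0.1, l1_ratio=0.5),  # blend of L1 and L2
}

# Mean 5-fold cross-validated R-squared for each model.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
```

Regularization matters most when features are many or collinear; on a small, clean dataset the four scores tend to be close.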

Table of contents
  1. Course Overview
  2. Understanding Linear Regression as a Machine Learning Problem
  3. Building a Simple Linear Model
  4. Building Regularized Regression Models
  5. Performing Regression Using Multiple Techniques
  6. Hyperparameter Tuning for Regression Models

Building Clustering Models with scikit-learn

by Janani Ravi

Apr 24, 2019 / 2h 33m

Description

Clustering is an extremely powerful and versatile unsupervised machine learning technique that is especially useful as a precursor to applying supervised learning techniques like classification.

In this course, Building Clustering Models with scikit-learn, you will gain the ability to enumerate the different types of clustering algorithms and correctly implement them in scikit-learn.

First, you will learn what clustering seeks to achieve, and how the ubiquitous k-means clustering algorithm works under the hood. Next, you will discover how to implement other techniques such as DBSCAN, mean-shift, and agglomerative clustering. You will then understand the importance of hyperparameter tuning in clustering, such as identifying the correct number of clusters into which your data ought to be partitioned. Finally, you will round out the course by implementing clustering algorithms on image data - an especially common use case.

When you are finished with this course, you will have the skills and knowledge to select the correct clustering algorithm based on the problem you are trying to solve, and also implement it correctly using scikit-learn.
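A brief sketch of k-means next to DBSCAN on synthetic data - the blob parameters, cluster count, and `eps` value are illustrative choices for this generated data, not recommendations:

```python
# k-means requires the cluster count up front; DBSCAN infers it from density.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, DBSCAN
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)

km = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = km.fit_predict(X)
sil = silhouette_score(X, labels)   # closer to 1 means better-separated clusters

db = DBSCAN(eps=0.8, min_samples=5).fit(X)   # no cluster count needed
```

The silhouette score is one way to judge a choice of k: computing it for several candidate values and picking the peak is a common tuning strategy.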

Table of contents
  1. Course Overview
  2. Building a Simple Clustering Model in scikit-learn
  3. Performing Clustering Using Multiple Techniques
  4. Hyperparameter Tuning for Clustering Models
  5. Applying Clustering to Image Data

Intermediate

Build sophisticated neural network models, apply dimension reduction techniques, and combine model approaches.

Building Neural Networks with scikit-learn

by Janani Ravi

Aug 19, 2019 / 1h 56m

Description

Even as the number of machine learning frameworks and libraries increases on a daily basis, scikit-learn retains its popularity with ease. The one domain where scikit-learn is distinctly behind competing frameworks is in the construction of neural networks for deep learning.

In this course, Building Neural Networks with scikit-learn, you will gain the ability to make the best of the support that scikit-learn does provide for deep learning.

First, you will learn precisely what gaps exist in scikit-learn's support for neural networks, as well as how to leverage constructs such as the perceptron and multi-layer perceptrons that are made available in scikit-learn. Next, you will discover how perceptrons are just neurons with step activation, and multi-layer perceptrons are effectively feed-forward neural networks. Then, you'll use scikit-learn estimator objects for neural networks to build regression and classification models, working with numeric, text, and image data. Finally, you will use Restricted Boltzmann Machines to perform dimensionality reduction on data before feeding it into a machine learning model.

When you're finished with this course, you will have the skills and knowledge to leverage every bit of support that scikit-learn currently has to offer for the construction of neural networks.
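As a sketch of the multi-layer perceptron estimator in action - the digits dataset, hidden layer sizes, and iteration cap here are illustrative assumptions, not the course's own configuration:

```python
# An MLP classifier on the bundled 8x8 digit images (flattened to 64 features).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler().fit(X_train)   # MLPs train far better on scaled inputs
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(scaler.transform(X_train), y_train)

acc = clf.score(scaler.transform(X_test), y_test)
```

Note that the estimator exposes the same fit/predict/score interface as any other scikit-learn classifier; the neural-network machinery is hidden behind it.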

Table of contents
  1. Course Overview
  2. Introducing Neural Networks in scikit-learn
  3. Implementing Regression and Classification Using Neural Networks in scikit-learn
  4. Implementing Text and Image Classification Using Neural Networks in scikit-learn
  5. Implementing Dimensionality Reduction Using Restricted Boltzmann Machines in scikit-learn

Reducing Dimensions in Data with scikit-learn

by Janani Ravi

Apr 18, 2019 / 2h 29m

Description

Dimensionality Reduction is a powerful and versatile machine learning technique that can be used to improve the performance of virtually every ML model. Using dimensionality reduction, you can significantly speed up model training and validation, saving both time and money, as well as greatly reduce the risk of overfitting.

In this course, Reducing Dimensions in Data with scikit-learn, you will gain the ability to design and implement an exhaustive array of feature selection and dimensionality reduction techniques in scikit-learn.

First, you will learn the importance of dimensionality reduction, and understand the pitfalls of working with data of excessively high dimensionality, often referred to as the curse of dimensionality.

Next, you will discover how to implement feature selection techniques to decide which subset of the existing features we might choose to use, while losing as little information from the original, full dataset as possible.

You will then learn important techniques for reducing dimensionality in linear data. Such techniques, notably Principal Components Analysis and Linear Discriminant Analysis, seek to re-orient the original data using new, optimized axes. The choice of these axes is driven by numeric procedures such as Eigenvalue and Singular Value Decomposition.

You will then move to dealing with manifold data, which is non-linear and often takes the form of swiss rolls and S-curves. Such data presents an illusion of complexity, but is actually easily simplified by unrolling the manifold. Finally, you will explore how to implement a wide variety of manifold learning techniques including multi-dimensional scaling (MDS), isomap, and t-distributed Stochastic Neighbor Embedding (t-SNE). You will round out the course by comparing the results of these manifold unrolling techniques with different datasets, including images of faces and handwritten data.

When you’re finished with this course, you will have the skills and knowledge of Dimensionality Reduction needed to design and implement ways to mitigate the curse of dimensionality in scikit-learn.
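As a small sketch of axis re-orientation with PCA - an illustrative example only, with the digits dataset and the two-component target chosen as assumptions:

```python
# PCA projects 64-dimensional digit images onto the two axes that
# capture the most variance.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()  # variance kept by the two axes
```

The manifold techniques the course covers, such as `sklearn.manifold.TSNE`, follow the same fit_transform pattern but optimize for preserving local neighborhoods rather than variance.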

Table of contents
  1. Course Overview
  2. Getting Started with Feature Selection in scikit-learn
  3. Dimensionality Reduction in Linear Data
  4. Dimensionality Reduction in Non-linear Data

Employing Ensemble Methods with scikit-learn

by Janani Ravi

Aug 12, 2019 / 2h 15m

Description

Even as the number of machine learning frameworks and libraries increases on a daily basis, scikit-learn retains its popularity with ease. In particular, scikit-learn features extremely comprehensive support for ensemble learning, an important technique to mitigate overfitting.

In this course, Employing Ensemble Methods with scikit-learn, you will gain the ability to construct several important types of ensemble learning models.

First, you will learn how decision trees and random forests are ideal building blocks for ensemble learning, and how hard voting and soft voting can be used in an ensemble model. Next, you will discover how bagging and pasting can be used to control the manner in which individual learners in the ensemble are trained. Finally, you will round out your knowledge by utilizing model stacking to combine the output of individual learners.

When you're finished with this course, you will have the skills and knowledge to design and implement sophisticated ensemble learning techniques using the support provided by the scikit-learn framework.
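A sketch of soft voting over three heterogeneous learners - the estimator choices and parameters are illustrative assumptions, not the course's own configuration:

```python
# Soft voting averages predicted class probabilities across estimators;
# "hard" voting would take a majority vote on the predicted labels instead.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

voter = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("logreg", LogisticRegression(max_iter=5000)),
    ],
    voting="soft",
)
voter.fit(X_train, y_train)
acc = voter.score(X_test, y_test)
```

The random forest inside the ensemble is itself an ensemble - bagged decision trees - which is why trees make such natural building blocks here.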

Table of contents
  1. Course Overview
  2. Understanding Ensemble Learning Techniques
  3. Implementing Ensemble Learning Using Averaging Methods
  4. Implementing Ensemble Learning Using Boosting Methods
  5. Implementing Ensemble Learning Using Model Stacking

Advanced

Select the appropriate model for your business problem and data, and evaluate the effectiveness of that model.

Preparing Data for Modeling with scikit-learn

by Janani Ravi

Aug 12, 2019 / 3h 41m

Description

Even as the number of machine learning frameworks and libraries increases on a daily basis, scikit-learn retains its popularity with ease. scikit-learn makes the common use cases in machine learning - clustering, classification, dimensionality reduction, and regression - incredibly easy.

In this course, Preparing Data for Modeling with scikit-learn, you will gain the ability to appropriately pre-process data, identify outliers, and apply kernel approximations.

First, you will learn how pre-processing techniques such as standardization and scaling help improve the efficacy of ML algorithms. Next, you will discover how novelty and outlier detection is implemented in scikit-learn. Then, you will understand the typical set of steps needed to work with both text and image data in scikit-learn. Finally, you will round out your knowledge by applying implicit and explicit kernel transformations to transform data into higher dimensions.

When you're finished with this course, you will have the skills and knowledge to identify the correct data pre-processing technique for your use case and detect outliers using theoretically robust techniques.
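As a sketch of standardization next to a simple outlier detector - the generated data, the planted outlier, and the choice of IsolationForest are illustrative assumptions:

```python
# Standardize features to zero mean / unit variance, then flag outliers.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=5.0, size=(100, 2))
X[0] = [200.0, 200.0]   # plant one obvious outlier

standardized = StandardScaler().fit_transform(X)        # zero mean, unit variance
flags = IsolationForest(random_state=0).fit_predict(X)  # -1 marks outliers
```

Standardization matters because many estimators (SVMs, k-means, neural networks) implicitly assume features on comparable scales.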

Table of contents
  1. Course Overview
  2. Preparing Numeric Data for Machine Learning
  3. Understanding and Implementing Novelty and Outlier Detection
  4. Preparing Text Data for Machine Learning
  5. Preparing Image Data for Machine Learning
  6. Working with Specialized Datasets
  7. Performing Kernel Approximations

Scaling scikit-learn Solutions

by Janani Ravi

Oct 30, 2019 / 2h 55m

Description

Even as the number of machine learning frameworks and libraries increases rapidly, scikit-learn is retaining its popularity with ease. scikit-learn makes the common use-cases in machine learning - clustering, classification, dimensionality reduction and regression - incredibly easy.

In this course, Scaling scikit-learn Solutions, you will gain the ability to leverage out-of-core learning and multicore parallelism in scikit-learn.

First, you will learn considerations that affect latency and throughput in prediction, including the number of features, feature complexity, and model complexity.

Next, you will discover how smart choices in feature representation and in how you model sparse data can improve the scalability of your models. You will then understand what incremental learning is, and how to use scikit-learn estimators that support this key enabler of out-of-core learning.

Finally, you will round out your knowledge by parallelizing key tasks such as cross-validation, hyperparameter tuning, and ensemble learning.

When you’re finished with this course, you will have the skills and knowledge to identify key techniques to help make your model scalable and implement them appropriately for your use-case.
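The incremental-learning idea can be sketched with `partial_fit`, which scikit-learn estimators that support out-of-core learning expose - the synthetic dataset and chunk size here are illustrative assumptions:

```python
# Out-of-core style training: partial_fit consumes the data in chunks,
# so the full dataset never needs to fit in memory at once.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

clf = SGDClassifier(random_state=0)
classes = np.unique(y)   # all labels must be declared on the first call
for start in range(0, len(X), 1_000):   # stream 1,000-row chunks
    chunk = slice(start, start + 1_000)
    clf.partial_fit(X[chunk], y[chunk], classes=classes)

acc = clf.score(X, y)
```

For multicore parallelism, many scikit-learn utilities such as cross-validation and grid search accept an `n_jobs` parameter to spread work across cores.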

Table of contents
  1. Course Overview
  2. Understanding Strategies for Computational Scaling
  3. Observing the Factors Affecting Prediction Latency
  4. Implementing Scaling of Instances Using Out-of-core Learning
  5. Implementing Multicore Parallelism in scikit-learn
  6. Autoscaling of scikit-learn with Apache Spark

Model Evaluation and Selection Using scikit-learn

by Pluralsight

Nov 22, 2019 / 1h 17m

Description

During the machine learning model building process, you will have to make some important decisions about how to evaluate how well your models perform, as well as how to select the best-performing model.

In this course, Model Evaluation and Selection Using scikit-learn, you will gain the foundational knowledge needed to evaluate and select the best models.

First, you will learn about a variety of metrics that you can use to evaluate how well your models are performing. Next, you will discover techniques for selecting the model that will perform the best in the future. Finally, you will explore how to implement this knowledge in Python, using the scikit-learn library.

When you're finished with this course, you will have the skills and knowledge needed to evaluate and select the best machine learning model from a set of models that you've built.
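A sketch of the two halves of the task - cross-validation for evaluation and grid search for selection. The dataset, estimator, and depth grid are illustrative assumptions:

```python
# Cross-validation produces a distribution of scores; grid search keeps
# the hyperparameters with the best cross-validated score.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Five scores instead of one: more trustworthy than a single split.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)

grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8]},
    cv=5,
)
grid.fit(X, y)
best_depth = grid.best_params_["max_depth"]
```

After selection, `grid.best_estimator_` is already refit on the full dataset and ready to use.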

Table of contents
  1. Course Overview
  2. What Is Model Evaluation and Selection?
  3. Evaluation Methods for Classification Models
  4. Evaluation Methods for Regression Models
  5. Model Selection Techniques
  6. Putting It All Together