Learn how to work with very large datasets without leaving the familiar, rich Python data ecosystem. This course will teach you how to leverage the power of the Dask library to handle data that is too big for standard tools like Pandas or NumPy.
Working with so-called ‘Big Data’ can be a daunting task, and many tools that solve this problem have a very steep learning curve. Moreover, developers familiar with Python may not want to resort to solutions built on another technology stack. In this course, Scaling Python Data Applications with Dask, you will gain the ability to work with very large datasets using a Python-native, approachable tool. First, you will learn how to use Dask when an application written in standard Python stops working because of the growing size of its data. Next, you will discover how Dask works under the hood and what techniques it uses to make processing large datasets possible and accessible in a variety of scenarios. Finally, you will explore how to swap Pandas and NumPy for their Big Data counterparts with practically no changes to your code. When you’re finished with this course, you will have the skills and knowledge of Dask needed to confidently write data applications that scale, using an exclusively Python stack.
Paweł is a software engineer passionate about knowledge sharing. He's especially focused on processing and exploring datasets, big and small, and is always searching for emerging tools that will make working with data simpler in the future.
Course Overview Hi everyone. My name is Pawel Kordek, and welcome to my course, Scaling Python Data Applications with Dask. I am a big data engineer at Farfetch, where I develop data applications helping both the business and the customers. Python, along with tools like Pandas and NumPy, is the tool of choice for many data analysts and scientists who value its ease of use, expressiveness, and ability to quickly iterate on ideas. Unfortunately, most Python tools do not scale to higher data volumes easily. In this course, we are going to discover the Dask library, which helps scale Python applications, especially data-intensive ones, whether on a single machine or many. Some of the major topics that we will cover include use cases and operating principles, ways to scale basic Python applications, scalable versions of the Pandas and NumPy APIs, and ways to scale beyond a single machine. By the end of this course, you will have a solid background and the confidence to write 100% Python data processing applications that can keep up with ever-increasing data volumes. Before beginning this course, you should be familiar with Python and have basic experience with the Pandas and NumPy libraries. I hope you'll join me on this journey to learn Dask with the Scaling Python Data Applications with Dask course, at Pluralsight.
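The “ways to scale basic Python applications” topic mentioned above is commonly approached in Dask through `dask.delayed`. As a small sketch (the `square` and `total` functions are invented for illustration), ordinary Python functions can be turned into lazy tasks that Dask schedules in parallel:

```python
import dask

@dask.delayed
def square(x):
    # An ordinary Python function, made lazy by the decorator.
    return x * x

@dask.delayed
def total(values):
    return sum(values)

# These calls build a task graph instead of running immediately;
# the independent square() tasks can execute in parallel.
lazy_result = total([square(i) for i in range(5)])

# Nothing runs until compute() is called on the graph.
result = lazy_result.compute()
print(result)  # 0 + 1 + 4 + 9 + 16 = 30
```

The same code structure works unchanged whether Dask runs the graph on local threads, local processes, or a distributed cluster, which is what makes this a scaling technique rather than just a parallelism trick.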