Building Data Pipelines with Luigi 3 and Python

by Dan Tofan

Other developers implement data pipelines by putting together a bunch of hacky scripts that, over time, turn into liabilities and maintenance nightmares. Take this course to implement sane and smart data pipelines with Luigi in Python.

What you'll learn

Data arrives from various sources and needs further processing. It's very tempting to reinvent the wheel and write your own library for batch-processing data pipelines, but this results in pipelines that are difficult to maintain. In this course, Building Data Pipelines with Luigi 3 and Python, you’ll learn how to build maintainable data pipelines with Luigi instead. First, you’ll explore how to build your first data pipelines with Luigi. Next, you’ll discover how to configure Luigi pipelines. Finally, you’ll learn how to run Luigi pipelines. When you’re finished with this course, you’ll have the Luigi skills and knowledge for building data pipelines that are easy to maintain.

Course FAQ

What is a data pipeline?

A data pipeline is a series of data processing steps. Data pipelines consist of three components: a source, a processing step or steps, and a destination.
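The three components can be sketched in plain Python, without any pipeline library. This is only an illustration: the sample data and the function names (`source`, `process`, `destination`, `run_pipeline`) are made up for this example and are not taken from the course.

```python
import csv
import io
import json

# Hypothetical sample input; a real source might be a file, an API, or a database.
RAW_CSV = "name,amount\nalice,10\nbob,-3\ncarol,7\n"


def source(text):
    """Source: yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def process(rows):
    """Processing step: parse amounts and keep only positive ones."""
    for row in rows:
        amount = int(row["amount"])
        if amount > 0:
            yield {"name": row["name"].title(), "amount": amount}


def destination(rows, out):
    """Destination: write each row as one JSON line; return the row count."""
    count = 0
    for row in rows:
        out.write(json.dumps(row) + "\n")
        count += 1
    return count


def run_pipeline(text):
    """Wire source -> processing -> destination and return what was written."""
    out = io.StringIO()
    written = destination(process(source(text)), out)
    return written, out.getvalue()
```

Calling `run_pipeline(RAW_CSV)` pushes the three sample rows through the steps in order. Because each step is a generator, rows flow through one at a time rather than being loaded all at once.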

What prerequisites are needed for this course?

Prerequisites for this course are fluency in Python and familiarity with the Linux command line.

What is Luigi in Python?

Luigi is a Python package that helps you build complex pipelines of data-intensive batch jobs. Luigi handles dependency resolution, workflow management, visualization, failure handling, and command-line integration.

What are the benefits of Python?

Python is easy to read, learn, and write. It is also open source, portable, and dynamically typed, and it provides extensive support libraries.

What are data pipelines used for?

Data pipelines are primarily used to automate the process of extracting, transforming, and loading data.

About the author

Dan started programming decades ago on a Spectrum clone and began his professional programming career in 2003. Eager to learn, Dan moved to the Netherlands to study at the University of Groningen. Now, Dan is proud of his PhD thesis on decision making and knowledge acquisition in software architecture, and of about a dozen publications with hundreds of citations. Dan used Microsoft technologies for many years, but migrated gradually to Python, Linux, and AWS to learn more of the computing world.
