Implementing ETL Pipelines on the Microsoft SQL Server Platform
This course will teach you how to use the SQL Server database engine and Integration Services (SSIS) to develop, deploy, and monitor an ETL pipeline with the SQL Server data stack for dimension and fact tables.
What you'll learn
Implementing pipelines for dimensional models is very common in SQL Server with Integration Services. In this course, Implementing ETL Pipelines on the Microsoft SQL Server Platform, you will learn to use Integration Services (SSIS) to create packages to populate dimension and fact tables. Next, you will understand how to use Lookup, Data Transformation, Conditional Split, and OLEDB components to complete a pipeline for ETL processes. Finally, you will learn how to stage data to execute batch update statements for improving update performance of changing data. When you’re finished with this course, you’ll have the skills and knowledge of Implementing ETL Pipelines on the Microsoft SQL Server Platform needed to develop SSIS packages that can be deployed to production and monitored for performance and errors.
Table of contents
- T-SQL Source Query for Fact 2m
- Demo: Extracting Fact Data 3m
- Sourcing Fact Data for Pipeline 2m
- Demo: Source Component for Pipeline 3m
- Lookup Transformation for Surrogate Keys 2m
- Demo: Lookup Transformation 3m
- Missing Dimension Lookup 1m
- Demo: Derived Column for Missing Dimension Lookup 2m
- Stage Updates 2m
- Demo: Stage Updates 8m
- Summary: Loading Data into Fact Tables 1m
Before beginning this course, you will want to be familiar with the SQL Server database engine, including creating tables, indexes, and foreign keys. You should also have SQL Server data tools installed for developing ETL pipelines using Integration Services, or SSIS.
SQL Server Management Studio should be installed for managing the SQL Server instance.
Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network.
An ETL Pipeline refers to a set of processes extracting data from an input source, transforming the data, and loading into an output destination such as a database, data mart, or a data warehouse for reporting, analysis, and data synchronization. The letters stand for Extract, Transform, and Load.
ETL tools break down data silos and make it easy for your data scientists to access and analyze data, and turn it into business intelligence. In short, ETL tools are the first essential step in the data warehousing process that eventually lets you make more informed decisions in less time.