SQL Server: Change Data Capture

Learn how to capture SQL Server table data changes for downstream consumption using the Change Data Capture feature of SQL Server. This course is applicable to developers, DBAs, and architects on SQL Server 2008 onward.
Course info
Rating
(157)
Level
Intermediate
Updated
Mar 27, 2015
Duration
2h 10m
Table of contents
Introduction
Configuring Change Data Capture
Querying Change Data Using Transact-SQL
Administration
Performance Tuning and Optimization
SQL Server 2012 SSIS Components
Description
Course info
Rating
(157)
Level
Intermediate
Updated
Mar 27, 2015
Duration
2h 10m
Description

Of all the methods of tracking and storing changing data in SQL Server, the Change Data Capture (CDC) feature is by far the most flexible and easiest to implement. This course focuses on the implementation of CDC in SQL Server 2008 and higher, and includes topics such as configuration, querying change data, managing schema changes after enabling CDC, CDC feature compatibility, backup and restore considerations, performance tuning and optimization of CDC, and using the SQL Server 2012+ SSIS CDC components to perform ETL/ELT processing of change data into a data warehouse. The course is perfect for developers, DBAs, and architects on SQL Server 2008 onward, who need to capture SQL Server table data changes for downstream consumption.

About the author
About the author

Jonathan Kehayias is a Principal Consultant with SQLskills. He is the youngest person ever to achieve the Microsoft Certified Master - SQL Server 2008 certification and has been a SQL Server MVP for many years.

More from the author
SQL Server: Using SQL Sentry Plan Explorer
Intermediate
2h 4m
26 Dec 2013
More courses by Jonathan Kehayias
Section Introduction Transcripts
Section Introduction Transcripts

Introduction
Hi. This is Jonathan Kehayias of SQLskills. com, and I'm recording the SQL Server: Change Data Capture course for Pluralsight. This is module one, an Introduction to the course. So, in most businesses, there are a lot of applications that are going to have different requirements to be able to identify what types of data changes have occurred within a database for various reasons, and those reasons can include tracking historical changes to the data, auditing changes that happen to the data for security or compliance reasons, synchronizing the data across different disconnected systems, or even mobile devices, implementing an Operational Data Store for real time business reporting, or incremental loading of data into a Data Warehouse, and within SQL Server there are many different techniques that are available for tracking the changes that happen to data, and we're going to talk about change data capture in this course, but we'll also highlight some of the other techniques that exist as well in this introductory module.

Configuring Change Data Capture
Hi. This is Jonathan Kehayias from SQLskills. com, and I'm recording the SQL Server Change Data Capture course for Pluralsight. This is module two, Configuring Change Data Capture. So, within SQL Server, change data capture gives us the ability to capture the row data from DML changes to a database when it's actually enabled for capture. Configuring change data capture has specific requirements, which when met are going to allow individual tables to be configured for the change tracking, or change capture to happen. The options that are available for configuring a table for change capture are going to affect the performance of change data capture, they're going to affect the data that gets collected, and it also affects the security controls, and who can actually access the capture tables that are created as a part of the capture instance being created inside the database. In this module, we're going to take a look at the requirements for enabling CDC, and then once we've enabled it at the database level using sp_cdc_enable_table, and the options that are available for that, and how they affect the performance of change data capture.

Querying Change Data Using Transact-SQL
Hi. This is Jonathan Kehayias from SQLskills. com, and I'm recording the SQL Server Change Data Capture course for Pluralsight. This is module three, Querying Change Data Using Transact-SQL. So after we've setup our database for CDC, and we've configured each of the individual capture instances on all the source tables that we need to track data modifications for, as the change data gets populated, we have to be able to query that for processing with our ETL processes towards our data warehouse, or operational data store for real-time reporting, or to other destination systems where we need to update the data with the changes that have been made. All of the change rows within CDC, and in a capture instance, are going to be identified by the log sequence number, or LSN, that was associated with the original transaction that made the modification to the row, or changed the data inside of the database. Our change tables for our capture instances also include internal metadata columns that can be used to describe the change row, what type of operation was performed, as well as which of the capture columns were actually changed, or configured for the table that we are querying against. In this module, we'll take a look at finding change table metadata, understanding change table columns, change row table-valued functions that exist, and are created when we create our capture instances, and these facilitate being able to query against our change tables to get just the rows that we want to get back out of them, determining change rows that we actually want to process, and determining whether an individual column within the capture instance was modified during an operation.

Administration
Hi. This is Jonathan Kehayias from SQLskills. com, and I'm recording the SQL Server Change Data Capture course for Pluralsight. This is module four, Change Data Capture Administration. So the administration of CDC enabled databases requires a number of considerations that other databases might not require to prevent problems from occurring with the management and administration of the database. One of the key things is understanding how to restore a CDC enabled database in the event of a crash, because you need to make sure that you restore the database with the appropriate options to leave it with CDC enabled at the end of crash recovery when the restore process completes. Properly managing the SQL Agent jobs associated with CDC is also important so that the capture and cleanup of capture instance data occurs. If the capture job is not running, it can pin the transaction log and lead to a log file that grows out of control, and is not appropriately clearing, so you need to make sure that you have the appropriate SQL Agent jobs running when you have CDC enabled databases. Also understanding the compatibility with other features to make sure that you have a proper understanding of the interactions that are going to occur for a CDC enabled database is critical for appropriate administration of the databases. So in this module, we're going to take a look at first the SQL Agent jobs associated with change data capture, the change table cleanup that happens with change data capture, schema change management for making changes or modifications to CDC enabled tables within a database, feature compatibility, and we'll wrap up with our backup and restore considerations, and that would probably be the most important one that I would put my emphasis on as a part of administering CDC enabled databases.

Performance Tuning and Optimization
Hi. This is Jonathan Kehayias from SQLskills. com, and I'm recording the SQL Server Change Data Capture course for Pluralsight. This is module five, Performance Tuning and Optimization. Now the basic configuration, or the default configuration of change data capture is really designed to meet most application performance requirements, but there are scenarios where you might have a higher throughput workload, you have a large volume of changes, or you need minimum latency, and you can adjust, or tune some of the configuration options of change data capture to help with improving the performance. Tuning the performance of CDC may involve changing the source database design, tuning your capture job configuration options, or even modifying the capture instance configuration in certain scenarios. In this module, we'll take a look at different ways that you can tune the performance of change data capture, including storage and filegroup considerations, capture job configuration options, latency considerations, trigger considerations in the source system, whether to use net changes versus all changes, and limiting the capture columns specified when you setup a capture instance for a source table.

SQL Server 2012 SSIS Components
Hi. This is Jonathan Kehayias from SQLskills. com, and I'm recording the SQL Server Change Data Capture course for Pluralsight. This is module six, SQL Server 2012 SSIS Components. Now SQL Server 2012 introduced new SQL Server integration services, or SSIS components for change data capture, and these are targeted at simplifying the process of extracting, and consuming the change data that is being captured within the CDC capture instances. To use the CDC components, it doesn't require that you have a lot of advanced knowledge of SSIS to be able to do basic data migrations, or movement from the source system to the target for further processing. In this module, we're going to take a look at the CDC control task, the CDC source task, the CDC splitter task, and package design considerations around using the change data capture SSIS components for SQL Server 2012 and higher.