SQL Server: Detecting and Correcting Database Corruption

Database corruption happens all the time, all over the world. Learn how to easily and automatically detect corruption, and then how to recover from corruption with the minimum of downtime and data loss using restore or repair, applicable to anyone who is responsible for SQL Server databases, from SQL Server 2005 onward
Course info
Rating
(241)
Level
Intermediate
Updated
Sep 18, 2013
Duration
3h 52m
Table of contents
Introduction
Causes of Database Corruption
Detecting Page Corruption
Consistency Checking
DBCC CHECKDB and Related Commands
Interpreting DBCC CHECKDB Output
Simple Restore Techniques
Simple Repair Techniques
Description
Course info
Rating
(241)
Level
Intermediate
Updated
Sep 18, 2013
Duration
3h 52m
Description

Database corruption happens all the time, all over the world. When it happens in your environment, do you know what to do? Will you realize you have corruption? Written and presented by the person who wrote SQL Server's consistency checking and repair tools, this course will show you what you need to know to detect and recover from most cases of database corruption. The course starts by explaining why timely corruption detection is so important, and then investigates the various causes of database corruption. You'll learn how to configure SQL Server to automatically detect when I/O errors occur, what consistency checks are, the DBCC commands to use to perform consistency checks, and how to configure SQL Server to perform consistency checks regularly. The course then moves on to interpreting the output from consistency checks so you'll know when you have corruption in your environment, plus whether and how the corruption limits your recovery options or not. Finally, the course ends with modules that describe and demonstrate how to recover from corruption using simple restore techniques and simple repair techniques. Packed with a wealth of information and practical, easy-to-follow demonstrations, this course will prepare you to detect and recover from database corruption when it happens to you. The course is applicable for all SQL Server versions from SQL Server 2005 onward, and for anyone responsible for SQL Server databases with any level of experience.

About the author
About the author

Paul is the CEO of SQLskills.com, a Microsoft Regional Director and a SQL Server MVP. He worked on the SQL Server Storage Engine team at Microsoft from 1999 to 2007. Among other things, he wrote DBCC CHECKDB/repair for SQL Server 2005 and was responsible for the Core Storage Engine during SQL Server 2008 development.

More from the author
More courses by Paul Randal
Section Introduction Transcripts
Section Introduction Transcripts

Causes of Database Corruption
Hi, this is Paul Randal from SQLSkills. com and I'm recording this course for Pluralsight. This is the SQL Server Detecting and Correcting Database Corruption course and you're watching Module 2: Causes of Database Corruption. Introduction. As I mentioned in the first module, almost every DBA is going to see corruption at some point in their career, so it's really useful to understand the different causes of corruption and why it's so inevitable that you're going to see corruption at some point. In this module, we're going to talk about what the various causes of corruption are, mostly the I/O subsystem, but there are some other things, what things do not cause corruption and how you can't cause corruption doing things with SQL Server, how corruption can propagate to remote servers under ceratin circumstances, but mostly it doesn't, and then a phenomenon that I call disappearing corruption, what that is, and why you might see it.

Detecting Page Corruption
Hi, this is Paul Randal from SQLskills. com and I'm recording this course for Pluralsight. This is the SQL Server Detecting and Correcting Database Corruption course and you're watching Module 3: Detecting Page Corruption. SQL Server has a bunch of mechanisms built into it that allows it to automatically detect when a problem occurs when it's reading a page from disk, whether that page is corrupt for instance or the I/O just fails. It also has mechanisms that allow it to tell you when problems have occurred rather than just detecting them. And that's what we're going to talk about in this module. We're going to cover torn-page detection, page checksums and how these two are different, the various different kinds of I/O errors and how they're going to appear to you, and then how you can monitor for I/O errors.

Consistency Checking
Hi, this is Paul Randal from SQKskills. com and I'm recording this course for Pluralsight. This is the SQL Server: Detecting and Correcting Database Corruption course, and you're watching Module 4: Consistency Checking. Introduction. Now we've talked about how SQL Server can automatically discover corruption through page protection options, but that only works if the corrupt pages are going to be read from disk. There could be parts of a database that are very rarely used and are corrupt on disk and you're not going to know that. So there is a method you can use to force all the allocated pages in the database to be used, and that's consistency checking. So you need to run regular consistency checking to help SQL Server detect that corruption is there, because remember, the sooner you detect corruption, the more likely you can recover with the minimum of downtime and data loss. So in this module, we're going to talk about consistency checking using backup checksums and frequently asked questions about both of these.

DBCC CHECKDB and Related Commands
Hi, this is Paul Randal from SQLskills. com and I'm recording this course for Pluralsight. This is the SQL Server: Detecting and Correcting Database Corruption course, and you're watching Module 5: DBCC CHECKDB and Related Commands. Introduction. Most of the time when you're going to be running consistency checks you're going to be running DBCC CHECKDB because that's the main consistency checking command, or maybe you're going to be running one of the kind of subsidiary DBCC commands that has a subset of the overall consistency checking functionality that there is inside DBCC CHECKDB. Just at the start of this module, I'd like to give a link to the Books Online for DBCC in general at that bit. ly link there to save me having to reference it all the way through the deck. Now that we've discussed consistency checking in the previous module, in this module we're going to actually look at the DBCC commands themselves. We'll start off with an overview of the main command, DBCC CHECKDB, and then we'll look at the other commands that implement subsets of the checks that CHECKDB does. We'll talk about what consistency checks CHECKDB actually does. We'll talk about the different options that you have for the various CHECK commands and how they change the behavior of those commands, some of the problems that can crop up when you're running consistency checks using DBCC CHECKDB, and then we'll look at how you can break up consistency checks over time to be more efficient with your database maintenance windows.

Interpreting DBCC CHECKDB Output
Hi, this is Paul Randal from SQLskills. com, and I'm recording this course for Pluralsight. This is the SQL Server: Detecting and Correcting Database Corruption course, and you're watching Module 6: Interpreting DBCC CHECKDB Output. Introduction. Anytime you're looking at the output from DBCC CHECKDB, it's really important that you're able to recognize when it's telling you that you've got corruption so that you know you're going to have to do something to solve that corruption problem. It's also useful to be able to tell when certain types of corruptions are going to limit what you're going to be able to do. For instance, corruptions that prevent CHECKDB from running properly, that's going to stop you being able to run repair. Now unfortunately, the output from CHECKDB is kind of cryptic and can be hard to understand, but there are a few tips and tricks that you can use to try and figure out whether it's telling you something that's going to cause CHECKDB not to be able to run for instance. So in this module, what we're going to cover is interpreting DBCC CHECKDB output at a high level, some simple examples of corruptions and how they're going to appear in the output of CHECKDB, and then a restore versus repair decision that you might be able to make.

Simple Restore Techniques
Hi, this is Paul Randal and I'm recording this course for Pluralsight. This is the SQL Server: Detecting and Correcting Database Corruption course, and you're watching module 7, Simple Restore Techniques. Introduction. It's very true that when you're recovering from a disaster, restoring from backups is the absolute best way to avoid data loss, but only as long as you've got the correct backups, the ability to restore right up to the point that the disaster occurred. Now when you are recovering from a diaster, it's really important that you minimize the time that your system is down, so you need to know what the various restore options are and how to use them. With that end in mind, in this module, we're going to cover some of the various restore options that you can use, point-in-time restores, tail-of-the-log backups, why they're important and how you can use them, and then restore sequences.

Simple Repair Techniques
Hi, this is Paul Randal from SQLskills. com, and I'm recording this course for Pluralsight. This is the SQL Server: Detecting and Correcting Database Corruption course, and you're watching Module 8: Simple Repair Techniques. Introduction. Most of the time you're going to be using repair because you don't have the backups necessary to be able to restore with no data loss. It's unlikely unless your service level agreements say that downtime is more important than data loss that you're going to choose repair as a first choice. Now you have to be very careful when you're using repair or when you're manually fixing corruptions as it may involve data loss, and you don't want to make things worse. That means that you're going to have to practice using repair to make sure you know what it's going to do, or definitely practice when you're manually fixing corruptions before doing it for real in production. In this module then, we're going to cover how repair works and a whole bunch of different things about repair, REPAIR_ALLOW_DATA_LOSS and misconceptions around repair, the various repair options, and then how to manually fix a few different kinds of corruptions.