On Database Comparison & Recovery

Where to start when your database is corrupt

4 min readAug 10, 2014

This Monday afternoon I got an all-too-ominous chat from our IT director, “can you come here a sec? I have a favor to ask”. Every developer knows what that means, but we love a challenge, right? Plus NewSpring Web’s unofficial team motto is “Be Optimistic” (we even created a site for it).

Long story short, two databases had gotten out of sync and caused data corruption over several days. Backups were available but we had to go back at least a week to get non-corrupted data.

Questions we immediately started asking:

1) What’s missing from our last good backup?
2) How long will it take to fix?

Question #1 seems easy enough in the age of version control systems like GitHub and Dropbox. But most database providers don’t have change tracking, so how do we find that for a database?

My first thought was to compare counts of each table:

SELECT COUNT(1) FROM [BackupDB].dbo.tblDLDocument
SELECT COUNT(1) FROM [CorruptDB].dbo.tblDLDocument

There’s just a slight problem: SQL Server doesn’t allow you to query a table with corruption issues (pretty sure MySQL doesn’t either).

Enter layer 2 of the DBA toolbox: Sys.Objects & Information_Schema. These tables are like internal indexes on your table structure, size, and code. Even if a table isn’t live, the system index will give you the last known state that table was in. I modified this query I found on Google:

USING [CorruptDB]
DBCC UPDATEUSAGE(0) WITH COUNT_ROWS, NO_INFOMSGS 
 
SELECT DBName = DB_NAME(),
 TableName = so.Name,
 Rows = SUM(CASE WHEN si.IndID IN (0,1) 
   THEN si.Rows ELSE 0 END),
 RowModCtr = MIN(si.RowModCtr) 
FROM dbo.SysObjects so,
 dbo.SysIndexes si 
WHERE si.ID = so.ID AND si.IndID IN (0,1,255) 
AND so.XType = 'U' AND PERMISSIONS(so.ID) <> 0 
GROUP BY so.Name, so.UID, so.ID
ORDER BY TableName

So far so good. Now I know the approximate amount of data affected for each database. DBCC CHECKDB gave me a lot more info, but I won’t go into that — read this article if you’re curious.

Question #2 is the real crossroads in this situation. How important is a week’s worth of data? Should we pay for a 3rd-party tool/developer to recover it? Do we have a paper/email trail to reconstruct that data? What happens if we ______ ?

Side note: we discussed all of these questions with affected parties face-to-face. I’ve learned at least one thing from our senior pastor/leadership @ NewSpring, and that’s to address issues in person: Why Email Doesn’t Work, Be Honest, & Why So Angry?.

Based on the data I pulled, we landed on two solutions: have our system provider’s development team merge the higher priority database and reconstruct data manually in the other.

Not knowing anything about the second database, I quickly needed to get all data that didn’t exist from a week prior. I started writing queries for all 34 tables and then realized something — the columns I was joining on already existed in Information_Schema. So I wrote this:

DECLARE @OldDatabase varchar(50)
DECLARE @NewDatabase varchar(50) 
SELECT @OldDatabase = 'BackupDB'
SELECT @NewDatabase = 'CorruptDB'
 
;WITH tableSchema AS (
 SELECT k.* FROM information_schema.columns c
 INNER JOIN information_schema.key_column_usage k
 ON c.column_name = k.column_name
 AND c.table_name = k.table_name 
), firstPK AS (
 SELECT table_name, column_name as 'primarykey'
 FROM tableSchema
 WHERE ordinal_position = 1
)
SELECT DISTINCT '-- ' + c.table_name + CHAR(13) + 
'SELECT new.*' + CHAR(13) +'FROM [' + @NewDatabase + '].dbo.' +
 c.table_name + ' new ' +
 CHAR(13) + 'LEFT JOIN [' + @OldDatabase + '].dbo.' +
 c.table_name + ' old ' + CHAR(13) + STUFF(( 
 SELECT ' AND new.' + c2.column_name + ' = old.' +
   c2.column_name
 FROM tableSchema c2
 WHERE c2.TABLE_NAME = c.TABLE_NAME
 GROUP BY ' AND new.' + c2.column_name +
 ' = old.' + c2.column_name
 FOR XML PATH('') 
 ), 1, 5, 'ON '
) + CHAR(13) + 'WHERE old.' + f.primarykey + ' IS NULL;' + CHAR(13)
FROM tableSchema c
INNER JOIN firstPK f
ON c.table_name = f.table_name

This matches each table from the corrupt database to the backup based on the primary key(s), and returns those rows that don’t exist in the backup:

/* Returns data from tbl_WorkSheets */
SELECT new.*
FROM [RepairedDB].dbo.tbl_WorkSheets new 
LEFT JOIN [BackupDB].dbo.tbl_WorkSheets old 
ON new.User_ID = old.User_ID 
AND new.wsID = old.wsID 
AND new.wsType = old.wsType
WHERE old.wsID IS NULL;

Once I had the missing rows, I gave it to our department to start reconstructing the rest of the data. I’m not sure whether we fully recovered what was lost, but it was a harsh lesson that we learned from, and hopefully these are queries that you never have to use.

On Database Comparison & Recovery

Where to start when your database is corrupt

Questions we immediately started asking:

Written by David Stevens