Sql-server – Database corruption extends beyond nonclustered index issue in SQL Server

sql serversql-server-2008-r2

Background

I have a server running SQL Server 2008 R2 on the matching MS OS. Over the last few months this server had a RAID drive failure on two separate occasions (which simply required hotswapping a new drive and re-building the RAID), but that did not affect the database itself which is running on an SSD on a different RAID cluster.

1) I have an error in an index where the index value for a column is invalid and does not match the actual data

2) I probably have other errors that aren't being properly diagnosed by DBCC CheckDB, as I had a query fail that DOES NOT use the corrupt index.

Issue 1 (solved by comment here)

Note: this issue is addressed by the comment linked, but is left for context as to issue 2. Plus, I have to drop and re-create the index still;)

This is from DBCC CheckDB

Msg 8951, Level 16, State 1, Line 1
Table error: table 'TidBitData' (ID 1835869607). Data row does not have a matching index row in the index 'I_TBD_Fn' (ID 8). Possible missing or invalid keys for the index row matching:
Msg 8955, Level 16, State 1, Line 1
Data row (1:3867082:9) identified by (Date_Time = '2011-09-04 22:02:49.000' and TidBit_SerialNum = 1232451) with index values 'Filename = '2012 T1 tidbit`1232451 Antioch North_12T1.csv' and Date_Time = '2011-09-04 22:02:49.000' and TidBit_SerialNum = 1232451'.
Msg 8952, Level 16, State 1, Line 1
Table error: table 'TidBitData' (ID 1835869607). Index row in index 'I_TBD_Fn' (ID 8) does not match any data row. Possible extra or invalid keys for:
Msg 8956, Level 16, State 1, Line 1
Index row (1:6267114:32) with values (Filename = '2012 T1 tidbit`1232451 Antâ…©och North_12T1.csv' and Date_Time = '2011-09-04 22:02:49.000' and TidBit_SerialNum = 1232451) pointing to the data row identified by (Date_Time = '2011-09-04 22:02:49.000' and TidBit_SerialNum = 1232451).

The following:
ALTER INDEX I_TBD_Fn on db.schema.TidBitData REBUILD
just changed the index row in error to (1:6630769:43) (from 1:6267114:32) for the same data row (1:3867082:9)
as reported by a re-running of DBCC CheckDB

Issue 2

I initially ran DBCC CheckDB (for issue 1) because the following query

SELECT COUNT(*) FROM 
    (SELECT x, y, COUNT(*) as cnt
     FROM ATable
     WHERE [ADate]>'2014-01-01' group by x, y) as bob

in SMSS failed with

Msg 682, Level 22, State 146, Line 7
Internal error. Buffer provided to read column value is too small. Run DBCC CHECKDB to check for any corruption.

No errors were returned by DBCC CHECKDB pointing to ATable (which is not the TidBitData table)

Current Status

My IT staff pointed me at an MS KB that mentioned my issue 2 with regard to SQL server 2005 (which my database is not). The potentially relevant portion reads:

This problem occurs because temporary tables are cached in SQL Server 2005. The cached temporary tables are reused if the table schema remains the same. However, a cached temporary table may be incorrectly reused even if the table schema has changed.

However, even assuming that the inner query was cached to the TempDB, I wouldn't expect the schema to have changed in either my database or the TempDB. This error no longer presented itself when I changed the outer portion of the query to read

SELECT
    COUNT(*) as XYGroupCount,
    SUM(cnt) as TotalRecordsInRange
FROM ...

I am still concerned that this error ever popped up. As I said, the database schema for neither my tempdb nor my actual database should have changed. I ran DBCC CHECKCONSTRAINTS and DBCC CheckTable (x) WITH ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS for the two tables without error. I then ran

DBCC UpdateUsage(mydb)
DBCC UpdateUsage(tempdb)
DBCC CheckDB(mydb) with ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS
DBCC CheckDB(tempdb) with ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS
DBCC CheckDB(master) with ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS
DBCC CheckDB(msdb) with ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS
DBCC CheckDB(model) with ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS

The DBCC CheckDB(tempdb) appears to fail with warning but without formal error:

DBCC results for 'tempdb'.
DBCC CHECKDB will not check SQL Server catalog or Service Broker consistency because a database snapshot could not be created or because WITH TABLOCK was specified.

All other databases report 0 errors. e.g.

CHECKDB found 0 allocation errors and 0 consistency errors in database 'model'.

Final Thoughts/Questions

Are these issues related to the RAID drive failure? Is there an issue with the RAM?

Best Answer