SQL Server – Mysterious disappearing consistency errors in SharePoint DBs

Tags: corruption, sharepoint, sql-server, sql-server-2008-r2

We have a pre-production SharePoint SQL server which threw an 824 error yesterday evening on the Content DB. As part of my investigation today, I ran DBCC CHECKDB on this DB and it came up clean. Puzzling. So, I ran CHECKDB on all 23 DBs on this server, and two different DBs turned up errors: the Sync DB and the CrawlStore DB.

Then, things started to get really odd. For one, I ran a CHECKTABLE on the table mentioned in the output from the Sync DB, but it came up clean. A CHECKDB against the entire DB also came up clean. I attempted to do the same on the CrawlStore DB, but it kept failing with errors about creating the database snapshot. So, I cloned the DB from a backup set and, of course, CHECKDB ran clean on the clone.

And, in there somewhere, I also got an assertion error.

I'm not sure where to go from here. I'm worried about corruption but I don't have any actual corruption to point to at this time.

System is:

Microsoft SQL Server 2008 R2 RTM, Standard Edition, x64
VMware VM with 2 vCPUs and 8 GB memory on NetApp SAN storage
OS: Win 2008 R2 Standard, 64 bit.

Snapshots are an Enterprise feature so I can't create a snapshot (I actually tried) to test the CHECKDB. Closest I can get is a clone on the one that's complaining about snapshot creation.

I looked at my suspect_pages earlier today. Except for a row from January (also on the content DB but a different page), I have this:

DBName                                                                    file_id  page_id  event_type  error_count
SP2010Dev_Portal_Content_DB                                               1        65543    2           2
SP2010Dev_Portal_Content_DB                                               1        22211    2           3
SP2010Dev_Sync_DB                                                         1        54755    1           2
Search_Service_Application_CrawlStoreDB_8c648a692b62438888fe7154075a7d2b  1        52958    2           2
SP2010Dev_Sync_DB                                                         1        802464   2           1
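
To make the event_type column in those rows easier to read, here is a small Python sketch that decodes each value using the meanings listed in the SQL Server documentation for msdb.dbo.suspect_pages. The names EVENT_TYPES, SUSPECT_PAGES and describe are made up for illustration; the rows are copied from the listing above.

```python
# Meaning of msdb.dbo.suspect_pages.event_type, per the SQL Server documentation.
EVENT_TYPES = {
    1: "823 error, or 824 error other than a bad checksum or torn page",
    2: "bad checksum",
    3: "torn page",
    4: "restored",
    5: "repaired by DBCC",
    7: "deallocated by DBCC",
}

# Rows from the question: (database, file_id, page_id, event_type, error_count)
SUSPECT_PAGES = [
    ("SP2010Dev_Portal_Content_DB", 1, 65543, 2, 2),
    ("SP2010Dev_Portal_Content_DB", 1, 22211, 2, 3),
    ("SP2010Dev_Sync_DB", 1, 54755, 1, 2),
    ("Search_Service_Application_CrawlStoreDB_8c648a692b62438888fe7154075a7d2b", 1, 52958, 2, 2),
    ("SP2010Dev_Sync_DB", 1, 802464, 2, 1),
]


def describe(rows):
    """Return one readable line per suspect page."""
    lines = []
    for db, file_id, page_id, event_type, error_count in rows:
        meaning = EVENT_TYPES.get(event_type, "unknown event type")
        lines.append(f"{db} page ({file_id}:{page_id}): {meaning}, seen {error_count} time(s)")
    return lines


for line in describe(SUSPECT_PAGES):
    print(line)
```

Four of the five rows are event_type 2 (bad checksum), which agrees with the "incorrect checksum" wording of the 824 message; only Sync DB page 54755 is recorded as type 1.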

Errors:

Error: 824, Severity: 24, State: 2.

SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x4d448744; actual: 0xe9b1634d). It occurred during a read of page (1:22211) in database ID 23 at offset 0x0000000ad86000 in file 'G:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Data\SP2010Dev_Portal_Content_DB.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.  
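
The offset in that message can be cross-checked against the page number: SQL Server data pages are 8 KB (8192 bytes), so page N of a file starts at byte N * 8192. A minimal Python sketch of the arithmetic follows; the helper names page_offset and page_from_offset are hypothetical, and the numbers are taken from the 824 message above.

```python
PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def page_offset(page_id: int) -> int:
    """Byte offset of a page within its data file."""
    return page_id * PAGE_SIZE


def page_from_offset(offset: int) -> int:
    """Page number that contains the given byte offset."""
    return offset // PAGE_SIZE


reported_offset = int("0x0000000ad86000", 16)  # offset from the 824 message
print(hex(page_offset(22211)))                 # 0xad86000
print(page_from_offset(reported_offset))       # 22211
```

The reported offset is exactly where page (1:22211) starts, the same page that appears in suspect_pages for the Content DB with event_type 2.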

Failed:(-1073548784) Executing the query "DBCC CHECKDB(N'SP2010Dev_Sync_DB')  WITH NO_INFOMS..." failed with the following error: "Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 7998501791723421696 (type Unknown), page (57600:0). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -10.
Object ID 357576312, index ID 0, partition ID 72057594043301888, alloc unit ID 72057594045792256 (type In-row data): Page (1:54755) could not be processed.  See other errors for details.
CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.
CHECKDB found 0 allocation errors and 1 consistency errors in table 'mms_step_object_details' (object ID 357576312).
CHECKDB found 0 allocation errors and 2 consistency errors in database 'SP2010Dev_Sync_DB'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

Failed:(-1073548784) Executing the query "DBCC CHECKDB(N'Search_Service_Application_CrawlSto..." failed with the following error: "Object ID 453576654, index ID 1, partition ID 72057594055491584, alloc unit ID 72057594058899456 (type In-row data): Page (1:52958) could not be processed.  See other errors for details.
Table error: Object ID 453576654, index ID 1, partition ID 72057594055491584, alloc unit ID 72057594058899456 (type In-row data), page (1:52958). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.
Table error: Object ID 453576654, index ID 1, partition ID 72057594055491584, alloc unit ID 72057594058899456 (type In-row data). Page (1:52958) was not seen in the scan although its parent (1:288931) and previous (1:52957) refer to it. Check any previous errors.
Table error: Object ID 453576654, index ID 1, partition ID 72057594055491584, alloc unit ID 72057594058899456 (type In-row data). Page (1:52959) is missing a reference from previous page (1:52958). Possible chain linkage problem.
CHECKDB found 0 allocation errors and 4 consistency errors in table 'MSSCrawlURL' (object ID 453576654).
CHECKDB found 0 allocation errors and 4 consistency errors in database 'Search_Service_Application_CrawlStoreDB_8c648a692b62438888fe7154075a7d2b'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (Search_Service_Application_CrawlStoreDB_8c648a692b62438888fe7154075a7d2b).". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

Best Answer

I would be confident you have corruption. It sounds like at least one corrupt data page. What do the contents of msdb.dbo.suspect_pages look like? Refer to http://msdn.microsoft.com/en-us/library/ms191301(v=sql.105).aspx for help parsing some of that info.