SQL Server – Cluster Failover Causing DB Inconsistency Error

clustering, sql server

I encounter the following database consistency error whenever I do a cluster 'move' from Server-A to Server-B and then run DBCC CHECKDB on the database on Server-B:

SQL Server detected a logical consistency-based I/O error: incorrect
checksum. It occurs during read of page (1:142) in database ID 4 at
offset 0x0000000011c000 in file
'X:\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\MSDB.mdf'.
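
As a sanity check on the message: SQL Server data pages are 8 KB (8192 bytes), so the byte offset of a page within its file is simply the page number times the page size. The short sketch below works that out for page 142; the helper name `page_offset` is made up for illustration. The result, 0x11c000, agrees with the `11c` digits in the reported offset, so the message points at a real page boundary in the file.

```python
PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def page_offset(page_number: int) -> int:
    """Byte offset of a page within its data file (hypothetical helper)."""
    return page_number * PAGE_SIZE


# Page (1:142) means file ID 1, page number 142
offset = page_offset(142)
print(hex(offset))  # 0x11c000
```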

The exact error varies, and it sometimes affects other database tables 🙁

Both servers (A and B) run Windows Server 2008 R2 Enterprise Edition (SP1). Each server has its own NAS (Sun Storage 7410) and uses Double-Take Availability to maintain a single copy of the database across the servers. Each server also has 2 iSCSI connectors to its NAS for redundancy.

The peculiar thing is that the DB inconsistency error does NOT occur when I change any one of the following:

  1. Move from Server-B to Server-A (DBCC reports no DB error on Server-A)
  2. Disable one of the 2 iSCSI connectors on Server-B in software
  3. Use an internal local hard disk instead of the NAS on Server-B (Server-A still uses its NAS)

Can anyone advise what else I can do to diagnose the problem on Server-B?

Best Answer

After much trial and error, the fault turned out to be the NIC in Server-B. After I replaced it, the cluster move-over works smoothly without any DB inconsistency ;) However, I still do not understand how a NIC can cause DB inconsistency (a 'ping' to the NIC works fine).
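
One plausible explanation, offered as an illustration rather than a diagnosis of this particular card: a ping only exchanges a few small packets, while iSCSI pushes sustained block traffic through the adapter. A NIC that occasionally corrupts data under load could therefore pass a ping test and still damage pages on their way to or from the NAS. When page checksums are enabled, SQL Server stores a checksum with each page and raises the "incorrect checksum" error when a page read back no longer matches it. The sketch below mimics that idea; it assumes CRC32 from `zlib` as a stand-in (SQL Server's actual checksum algorithm is different), and the helper names are hypothetical.

```python
import zlib

PAGE_SIZE = 8192  # same 8 KB page size SQL Server uses


def write_page(data: bytes) -> tuple:
    """Return the page together with the checksum stored alongside it."""
    return data, zlib.crc32(data)


def verify_page(data: bytes, stored_checksum: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(data) == stored_checksum


# Build one 8192-byte page of sample data and record its checksum
page, checksum = write_page(bytes(range(256)) * 32)

# Simulate a single bit flipped somewhere on the storage path
corrupted = bytearray(page)
corrupted[100] ^= 0x01

print(verify_page(page, checksum))              # True
print(verify_page(bytes(corrupted), checksum))  # False
```

The point of the sketch is only that one flipped bit anywhere in the page is enough to fail verification, which is why intermittent corruption on a single network path can surface as a database consistency error.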