We have an AOAG in SQL Server 2014 SP2 CU5 (3 nodes). There is a database with Read Committed Snapshot Isolation level ON. We have a large table compressed. Some of our bigger queries on this table are performed into the secondary.
Then there is a night job on the primary node to reorganize indexes on several tables. When it hits the clustered index of the mentioned table we get the following error:
Transaction aborted when accessing versioned row in table 'xxxx' in database 'yyyy'. Requested versioned row was not found because the readable secondary access is not allowed for the operation that attempted to create the version.
At some point, the big queries were performing the reads with the hint READUNCOMMITTED
. I thought that it was the cause of this error so I removed them. But the error is still there.
Any ideas?
Current setup:
- 02 secondary is on synchronous mode
- 03 secondary on asynchronous mode
Table details
- RowCounts: 122.567.668
- TotalSpaceMB: 18.460
- UsedSpaceMB: 18.238
Definitions:
CREATE TABLE [dbo].[big_table](
[ID] [int] NOT NULL IDENTITY(1, 1),
1 [int] NULL,
2 [datetime] NULL,
3 [int] NULL,
4 [int] NULL CONSTRAINT [DF_ccc_bUnits] DEFAULT ((0)),
5 [money] NULL,
6 [money] NULL,
7 [int] NULL,
8 [int] NULL CONSTRAINT [DF_ccc_MinDays] DEFAULT ((0)),
9 [int] NULL,
10 [int] NULL,
11 [float] NULL,
12 [money] NULL,
13 [int] NULL,
14 [int] NULL,
15 [nvarchar] (200) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
16 [money] NULL,
17 [money] NULL,
18 [int] NULL,
19 [int] NULL,
20 [money] NULL,
21 [money] NULL,
22 [money] NULL,
23 [money] NULL,
24 [money] NULL,
25 [datetime] NOT NULL CONSTRAINT [DFcccadded] DEFAULT (getdate()),
26 [nvarchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
27 [money] NOT NULL CONSTRAINT [DFcccBrf] DEFAULT ((0)),
29 [money] NOT NULL CONSTRAINT [DFcccHB] DEFAULT ((0)),
30 [money] NOT NULL CONSTRAINT [DFcccFB] DEFAULT ((0)),
31 [money] NOT NULL CONSTRAINT [DFcccAllBoards] DEFAULT ((0)),
32 [money] NOT NULL CONSTRAINT [DFcccChildBrf] DEFAULT ((0)),
33 [money] NOT NULL CONSTRAINT [DFcccChildHB] DEFAULT ((0)),
34 [money] NOT NULL CONSTRAINT [DFcccChildFB] DEFAULT ((0)),
35 [money] NOT NULL CONSTRAINT [DFcccChildAllBoards] DEFAULT ((0)),
36 [int] NULL CONSTRAINT [DFcccShow_1] DEFAULT ((0)),
37 [timestamp] NOT NULL,
38 [money] NULL,
39 [money] NULL,
40 [money] NULL,
41 [money] NULL,
42 [money] NULL,
43 [money] NULL,
44 [money] NULL,
45 [money] NULL,
46 [int] NOT NULL CONSTRAINT [DFcccReleaseHour] DEFAULT ((0)),
47 [int] NULL,
48 [int] NULL,
49 [money] NULL,
50 [money] NULL,
51 [float] NULL
) ON [PRIMARY]
WITH (DATA_COMPRESSION = PAGE)
GO
CREATE UNIQUE CLUSTERED INDEX [IXccc] ON [dbo].[big_table] (1, 2) WITH (FILLFACTOR=90, DATA_COMPRESSION = PAGE) ON [PRIMARY]
GO
ALTER TABLE [dbo].[big_table] ADD CONSTRAINT [PKccc] PRIMARY KEY NONCLUSTERED ([ID]) WITH (DATA_COMPRESSION = PAGE) ON [secondary]
GO
CREATE UNIQUE NONCLUSTERED INDEX [IXcccstamp] ON [dbo].[big_table] (36) INCLUDE (1, 2) WITH (FILLFACTOR=100) ON [PRIMARY]
GO
Best Answer
So, after running out of possible solutions we opened a support case to Microsoft. They asked to run a tool to gather some info while the process was running and afterwards they analyzed it. Here is their answer:
We don't have "off business hours", we run 24/7/365. Is not a definitive answer, but at least we know the root cause of this issue. So the approach will be to temporarily change connection string so the task that is failing will read from primary AG node instead of secondary AG node the day the reindex run.