Sql-server – Full-text: lots of FT_MASTER_MERGE waits in SUSPENDED state after multiple full-text indexes created (server hangs)

full-text-searchsql serversql server 2014

We did a test on SQL Server 2014 when we had 10 databases, 100 different schemas in each database, 10 of small (~50 rows) tables in each schema (so 10K tables in total) and we created full-text indexes on all these tables in all these databases simultaneously.

In several minutes we found that SQL Server stopped to accept any connections (except ADMIN:. connection). If we restart the server, we can connect, but in some time it hangs again. After some investigation we found that it is caused by consuming all working threads, and dm_os_tasks and dm_os_waiting_tasks showed us that there is a lot FT_MASTER_MERGE waits in SUSPENDED state. We googled that "Full-text is waiting on master merge operation", but found no more actual info about it.

We tried different full-text catalog configurations: one catalog per DB, one catalog per schema, one catalog per index. Anyway the server hangs with all these suspended tasks.

What's the root cause of the waits, how this can be fixed/mitigated?

And what is the recommended way to enable full-text on such a large amount of tables?

Best Answer

You will need to stagger the operations instead of doing all at once. Connect item does not talk about accepting new connections. But due to this wait threads are not being released (in your case) and new connectoins are not possible.

Ref:

This is a known issue with SQL Server. From the connect item:

This is caused by the way our current job scheduler is setup resulting in multiple master merge operations getting queued up but never signalled. To be clear, this only happens when multiple indexing/reorganize operations are invoked simultaneously - the indexing operation completes just fine and the results are queryable. It's only the master merge that times out and is re-scheduled for a later period of time.

Due to the complexity of the fix we have decided to wait till the next major release before triaging it. At the moment, it is advised to stagger index populations in order to not cause such timeout issues. Please let me know if you have any further questions.