What guidelines should be considered for maintaining full-text indexes?
Should I REBUILD or REORGANIZE the full-text catalog (see BOL)? What is a reasonable maintenance cadence? What heuristics (similar to the 10% and 30% fragmentation thresholds) could be used to determine when maintenance is needed?
(Everything below is simply extra information elaborating on the question and showing what I've thought about so far.)
Extra info: my initial research
There are a lot of resources on b-tree index maintenance (e.g., this question, Ola Hallengren's scripts, and numerous blog posts on the subject from other sites). However, I have found that none of these resources provide recommendations or scripts for maintaining fulltext indexes.
There is Microsoft documentation that mentions that defragmenting the b-tree index of the base table and then performing a REORGANIZE on the full-text catalog may improve performance, but it doesn't touch on any more specific recommendations.
I also found this question, but it is primarily focused on change-tracking (how data updates to the underlying table are propagated into the fulltext index) and not the type of regularly scheduled maintenance that could maximize the efficiency of the index.
Extra info: basic performance testing
This SQL Fiddle contains code that can be used to create a full-text index with
AUTO change tracking and examine both the size and query performance of the index as data in the table is modified. When I run the script's logic on a copy of my production data (as opposed to the artificially manufactured data in the fiddle), here is a summary of the results I am seeing after each data modification step:
Even though the update statements in this script were fairly contrived, this data seems to show that there is a lot to be gained by regular maintenance.
Extra info: Initial ideas
I'm thinking about creating a nightly or weekly task. It seems that this task could perform either a REBUILD or REORGANIZE.
Because the full-text indexes may be quite large (tens or hundreds of millions of rows), I'd like to be able to detect when indexes within the catalog are sufficiently fragmented that a REBUILD/REORGANIZE is warranted. I'm a little bit unclear on what heuristics might make sense for that.