2016-03-21

SQL Server Maintenance: The Overlooked Index

SQL Server Full Text Search

Quite a lot has been written about SQL Server index maintenance. One area that doesn't get much attention is full text index maintenance. Full text indexes can suffer from fragmentation, which can lead to degraded performance. Let's look at one way to address this.


Full-Text Index Structure

As the MSDN documentation tells us, a logical full text index consists of one or more internal tables called fragments. When data from the source table is updated, additional fragments are created. Internally, full text queries need to "read" all of the fragments. As a result, too many fragments can lead to degraded performance.

The query below will identify full text catalogs with fragmented full text indexes. It returns the name of the full text catalog, the schema and table name a full text index is built on, and the number of fragments in the full text index.

SELECT c.Name FullTextCatalogName, OBJECT_SCHEMA_NAME(i.object_id) SchemaName,
 OBJECT_NAME(i.object_id) TableName, COUNT(*) FragmentCount
FROM sys.fulltext_catalogs c
JOIN  sys.fulltext_indexes i
 ON i.fulltext_catalog_id = c.fulltext_catalog_id
JOIN sys.fulltext_index_fragments f
 ON f.table_id = i.object_id 
GROUP BY c.Name, i.object_id, f.table_id 
HAVING COUNT(*) > 1


To improve performance, the multiple fragments can be "merged" into a single fragment. This can be done via ALTER FULLTEXT CATALOG with the REORGANIZE option. The script below creates a stored procedure that iterates over non-system databases and runs some dynamic tsql. Full text catalogs that have at least one full text index with multiple fragments are reorganized.


CREATE PROCEDURE dbo.ReorganizeFullTextCatalogs
AS 
/*
    Purpose: 
    Reorganizes the FullText Catalogs (as needed) on all user databases.
 
    Inputs: None

    History:
    02/25/2014 DMason Created
    http://itsalljustelectrons.blogspot.com/
 
*/
DECLARE @MajorVersion INT

SELECT @MajorVersion = 
 CAST(LEFT(d.ProductVersion, CHARINDEX('.', d.ProductVersion) - 1) AS INT)
FROM ( SELECT CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR) AS ProductVersion ) d

--SQL 2008 or later.
IF @MajorVersion < 10
    RETURN;

--This is the tsql statement that gets executed on each db.
DECLARE @InnerSql NVARCHAR(MAX) 
SET @InnerSql =
    'DECLARE @Tsql NVARCHAR(MAX)
    DECLARE @FtcName SYSNAME
    DECLARE curFtcName CURSOR FAST_FORWARD READ_ONLY FOR
        SELECT DISTINCT FullTextCatalogName
        FROM (
            SELECT c.Name FullTextCatalogName, COUNT(*) FragmentCount
            FROM sys.fulltext_catalogs c
            JOIN  sys.fulltext_indexes i
                ON i.fulltext_catalog_id = c.fulltext_catalog_id
            JOIN sys.fulltext_index_fragments f
                ON f.table_id = i.object_id 
            GROUP BY c.Name, f.table_id 
            HAVING COUNT(*) > 1
        ) TableIndexFragments

    OPEN curFtcName
    FETCH NEXT FROM curFtcName INTO @FtcName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Tsql = ''''ALTER FULLTEXT CATALOG ['''' + @FtcName + ''''] REORGANIZE''''
        PRINT @Tsql
        EXEC (@Tsql)
        FETCH NEXT FROM curFtcName INTO @FtcName
    END

    CLOSE curFtcName
    DEALLOCATE curFtcName'
-------------------------------------------------------
--Iterate through the db's.
DECLARE @Tsql NVARCHAR(MAX) 
DECLARE @DB SYSNAME
DECLARE curDB CURSOR FAST_FORWARD READ_ONLY FOR
    SELECT name
    FROM sys.databases
    WHERE User_Access_Desc = 'MULTI_USER'
    AND State_Desc = 'ONLINE'
    AND Is_Read_Only = 0
    AND Is_In_Standby = 0
    AND name NOT IN ('master', 'tempdb', 'model', 'msdb')
    ORDER BY name

OPEN curDB
FETCH NEXT FROM curDB INTO @DB

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @Tsql = '[' + @DB + ']..sp_executesql N''' + @InnerSql + ''''
    EXEC sp_executesql @Tsql;

    FETCH NEXT FROM curDB INTO @DB
END

CLOSE curDB
DEALLOCATE curDB
GO

Personal Experience

My first encounter with full text indexes and degraded performance was related to an enhancement I made to an aspx page years ago. I wanted all of the search fields to use an AutoComplete AJAX extender to mimic the behavior you see when you type a few letters into the search field on Google.com or Bing.com. A traditional non-clustered index wasn't sufficient for the "Location Address" field, so I settled on a full text index--it worked very well.

After some amount of time (I don't remember how long), performance slowed considerably. I was surprised to find the full text index for "Location Address" had a large number of fragments. I wish I had kept some notes on my findings. I can't remember how may fragments there were, but I'm thinking it was in the 15-20 range. If memory serves me, Orange Co., FL has about 400,000 physical location addresses. The underlying table had one row per location address. Knowing me, the indexed column was probably VARCHAR(100) or VARCHAR(128). This does't seem like a huge amount of data, so I was surprised the full text searches were slow, even with 15-20 fragments. Reorganizing the related full text catalog made a world of difference. Performance improved drastically.


Questions I Haven't Answered Yet

A full text catalog can contain multiple full text indexes. Is there a way to reorganize just a single full text index? Does SQL Server automatically reorganize a full text index that has only one fragment, even though it's not necessary? (Perhaps another blog post is in order.)


Other Thoughts

The set of data I was working with was small. I could reorganize the full text catalog quickly. If you are working with considerably larger sets of data, proceed with caution. If you want to run the stored proc above, start out in a test environment to see how long it takes. In my production environments, I have a maintenance window on the weekends when I run the code. I don't bother during the week.


SHARE