Full-Text Searching in MS SQL Server 2005 Express

SQL Server has a very powerful Full Text indexing capability, which is even better in SQL Server 2005. I use the free editions of Visual Web Developer Express and SQL Server 2005 Express, but these still allow the full capability of full-text searching to be performed. This means that you can have a large database of long text fields and yet find all records containing certain words very quickly just by executing a SQL query.

Note: If you are using SQL Server Express, make sure you have downloaded the version with full-text indexing. There is a smaller version without it, but you must have the full version. If you have already installed the smaller version, you may need to uninstall that and install the full version.

Getting started

  1. Microsoft have a background document on Full Text Search concepts and I suggest that you read that first.
  2. The fundamental fact of full-text indexing is that the database requires extra files called the catalog. You still have the normal .mdb and .ldb files, but you also have a folder called FTData which contains a folder for your catalog, which in turn contains lots of files that make up the full text index. These files are all made for you and kept up to date by SQL Server.
  3. How to make the Catalog
    Now must execute some SQL commands to create the full-text catalog. The catalog itself is a container to put one or more full-text indexes into. So, if you want to be able to search on the text of three of your database tables, you make a single catalog and create three indexes which go into that catalog. Note that if your indexes are going to be really huge you should make a separate catalog for each index, but to get started, stick with one catalog.
    WARNING: Some of the information of the Microsoft website relates to the full version of SQL Server, such as the "Storage folder" that Express doesn't have. Don't panic, you just need to create the catalog by hand, rather than through the Management Interface. Also, you may find out-of-date information about SQL Server 2000, telling you that a table can only have one column full-text indexed. Ignore this, as it is not true under SQL Server 2005.
    In this example I will make a catalog and put two indexes into it. Here is how:
    1. Start SQL Server 2005 Management Studio Express, and connect to your database.
    2. Click the "New Query" icon so that you can start entering SQL queries.
    3. Type:
      use mydatabase
      go
      EXEC sp_fulltext_database 'enable'
      go
      CREATE FULLTEXT CATALOG mycatalog
      go
    4. Click "Execute" to run this SQL statement. This will enable fulltext searching on the database (it is not enabled by default), and then create the catalog, which will be called mycatalog. The catalog is just a folder called "mycatalog" in Microsoft SQL Server\SQL.1\FTData. Note here again that if your data tables are huge, this catalog could get huge too, so may be worth putting on a different disk to the one with the database on it, for performance reasons.
  4. How to make the full-text index
    The rules state that if you want to make a fulltext index on a table, that table must have a "unique, single-column, non-nullable index". Very often your primary key will already be just like this, because all primary keys are unique and non-nullable. However, if your primary key is on more than one column, you have a slight problem. You will need to add an extra column to the table that is unique, and add a new index that is non-nullable on this column. The standard way to do this is to create a new column called "text_id" which is an integer column, and make it an identity with a seed of 1 and an increment of 1. If you do this, and add a new index on just that column, you will have a unique, single-column, non-nullable index on your table. Ok, so it wastes a bit of space, but you have to obey this rule or things don't work.
    If you already have a single-column primary key but it is on a really large field such as a GUID, it is suggested that you also add an int field and use that instead as the key field. This is because the full-text index will contain a copy of all the GUIDs, and so will take up a lot of space.

And now, how to create the fulltext index itself:

    1. If you needed to add a new field, make sure you have created an index on it; otherwise just use the PK.
    2. Execute this SQL:
      CREATE FULLTEXT INDEX ON mydatabase.dbo.mytable
      (
      column_to_be_indexed
      Language 2057
      )
      KEY INDEX myindex ON mycatalog
      WITH CHANGE_TRACKING AUTO

      Here, mytable is the name of the table that we are building the index on, and column_to_be_indexed is the name of the column of text that you want to be able to search on. myindex is the name of the index to use (which is most likely your Primary Key), and mycatalog is the name of the catalog you previously created.
      The "Language 2057" part tells SQL Server which language is used in the texts, such as English or French. This is the really clever part, because SQL Server knows that in English "mice" is the plural of "mouse", so a search on "mouse" will return records that contain "mice" too. In this case I used 2057 which is the code for UK English. Microsoft have a table of different codes for different languages, and you should pick the appropriate one. There is also a code 0X0 which means "no language specified".
    3. Now, to verify that things have worked as planed, start SQL Server Manager interface and right-click on your table and select "Modify". Find the column that you wanted to be indexed and right-click it, and choose "Full text index...". Check that "Columns" is set to the column name you want indexed and check whether "Active" is set to "Yes". If it isn't then the index hasn't yet been built.
    4. You can find the status of your fulltext indexes like this:
      exec sp_help_fulltext_columns mytable
      This shows you which columns in given table are configured for full-text indexing. Also:
      select * from sys.fulltext_indexes
      shows you the status of the current indexes and when they were last updated. The key thing is to make sure that change_tracking_state_desc is set to "AUTO" and that has_crawl_completed = "1". You can also see the time of the last crawl, meaning the time that the index was last updated.
    5. You will have to wait a while for the index to be created. If you have a large table, the CPU usage on your PC will probably go up to 100% while this happens. There is a trouble-shooting section further down this page if you get problems.
  1. Performing a Find on your index
    The SQL SELECT statement to find records is quite simple:
    SELECT mycolumn
    FROM mytable
    WHERE CONTAINS(mycolumn, '"find this"')

    (You must put the search words in double-quotes within the single-quotes needed by SQL Server)
  2. Eliminating Noise Words
    There is a noise file for each language, and this contains simple words like "and" and "or" that are not worth searching for. If your text fields are in HTML or XML, then you should add words like "BR" and "P" to the noise words list.

My experiences with full-text indexes

When you first set up a full text index, SQL server takes a long time to build the catalog, if the tables are large. This really slows down your PC, but full text searching still works during this time, all be it very very slowly. After a while the system sorts itself out and queries become very rapid, even instant.

Troubleshooting Full Text Searching

Finding errors in the Application Events log

On your desktop, right-click "My Computer" and choose "Manage" then "Event Viewer" then "Application". This shows you the Application Event Log, and in there you will see any error messages created by the Full Text indexing service. Usually these errors will redirect you to the server logs themselves for more information.

SQL Server Logfile & Fulltext Index Log-file

Have a look in the logs. I found mine in C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG. The fulltext logs are called SQLFTxxxxxx.LOG. They are pretty difficult to understand, but the answer is in there somewhere!

Try ReBooting

You need to reboot after you install FullText Indexing, and a reboot is always a good place to start when troubleshooting. I had the error message "Error '0x80040e09' occurred during full-text index population" appearing in my logfiles, and no fulltext indexes created at all until I figured out that I needed to reboot.

Commands & Queries that you can type into SQL Server 2005:

How to check the status of the full-text indexes in a catalog:

USE mydatabase;
GO
EXEC sp_help_fulltext_tables mycatalog;
GO

How to enable fulltext search on a table:

USE mydatabase;
GO
ALTER FULLTEXT INDEX ON mytable ENABLE;
GO

How to check if full text indexing is enabled:

USE mydatabase;
GO
SELECT DATABASEPROPERTY('mydatabase', 'IsFullTextEnabled');
GO

Example SQL Commands

Here is an example sql statement, that sets up three columns of a table for fulltext indexing, and sorts out the change tracking state, and starts a full crawl:

use mydatabase
exec sp_fulltext_database 'enable'
go
exec sp_fulltext_catalog 'mycatalog', 'create'
go
exec sp_fulltext_table 'mytable', 'create', 'cmscatalog', 'PK_document'
go
exec sp_fulltext_column 'mytable', 'mycolumn', 'add', '2057'
exec sp_fulltext_column 'mytable', 'anothercolumn', 'add', '2057'
exec sp_fulltext_column 'mytable', 'onemorecolumn', 'add', '2057'
go
exec sp_fulltext_table 'mytable', 'activate'
go
EXEC sp_fulltext_table 'mytable', 'Start_background_updateindex';
go
exec sp_fulltext_table 'mytable', 'start_full'
go

Conclusion

SQL Server 2005 Express offers an excellent facility to allow searching for text in your database. It is both fast and reliable, and uses some clever "language aware" features that would be hard to implement any other way.