A student just asked me the following question: When should I do a full crawl?

Here is the answer!

Reasons to do a full crawl

Reasons for a search services administrator to do a full crawl include:

  • One or more hotfixes or service packs were installed on servers in the farm. See the instructions for the hotfix or service pack for more information.

  • An SSP administrator added a new managed property.

  • To re-index ASPX pages on Windows SharePoint Services 3.0 or Office SharePoint Server 2007 sites.

    Note:

    The crawler cannot discover when ASPX pages on Windows SharePoint Services 3.0 or Office SharePoint Server 2007 sites have changed. Because of this, incremental crawls do not re-index views or home pages when individual list items are deleted. We recommend that you periodically do full crawls of sites that contain ASPX files to ensure that these pages are re-indexed.

  • To detect security changes that were made on a file share after the last full crawl of the file share.

  • To resolve consecutive incremental crawl failures. In rare cases, if an incremental crawl fails one hundred consecutive times at any level in a repository, the index server removes the affected content from the index.

  • Crawl rules have been added, deleted, or modified.

  • To repair a corrupted index.

  • The search services administrator has created one or more server name mappings.

  • The account assigned to the default content access account or crawl rule has changed.
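
The consecutive-failure rule in the list above can be pictured as a per-address counter that resets on any successful crawl. Here is a minimal sketch in Python. Only the threshold of one hundred comes from the text. The class, method, and attribute names are hypothetical and do not reflect how the index server is actually implemented.

```python
from collections import defaultdict

# The threshold comes from the list above; everything else here is hypothetical.
FAILURE_LIMIT = 100


class FailureTracker:
    """Illustrative bookkeeping only, not SharePoint's actual implementation."""

    def __init__(self, limit=FAILURE_LIMIT):
        self.limit = limit
        # Consecutive failed incremental crawls, keyed by crawled address.
        self.consecutive = defaultdict(int)

    def record(self, address, succeeded):
        """Record one incremental crawl result for an address.

        Returns True once the address has failed `limit` times in a row,
        which is the point where its content would be removed from the index.
        """
        if succeeded:
            # Any success breaks the streak.
            self.consecutive[address] = 0
            return False
        self.consecutive[address] += 1
        return self.consecutive[address] >= self.limit
```

A full crawl that succeeds before the counter reaches the limit would reset the streak in this model, which is why it appears in the list as a way to resolve repeated incremental failures.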

The system does a full crawl even when an incremental crawl is requested under the following circumstances:

  • An SSP administrator stopped the previous crawl.

  • A content database was restored from backup.

    Note:

    If you are running the Infrastructure Update for Microsoft Office Servers, you can use the restore operation of the stsadm command-line tool to change whether a content database restore causes a full crawl.

  • A farm administrator has detached and reattached a content database.

  • A full crawl of the site has never been done.

  • The change log does not contain entries for the addresses that are being crawled. Without entries in the change log for the items being crawled, incremental crawls cannot occur.

  • The account assigned to the default content access account or crawl rule has changed.

  • The index is corrupted.

    Depending upon the severity of the corruption, the system might attempt a full crawl if it detects corruption in the index.
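
The circumstances above amount to a simple rule: an incremental request is upgraded to a full crawl if any one of them holds. Here is a minimal sketch of that rule in Python. The `CrawlState` fields and the `effective_crawl_type` function are hypothetical names invented for illustration, not part of any SharePoint API. The sketch also simplifies two points from the list: it treats index corruption as always forcing a full crawl, and it ignores the stsadm option that changes the behavior after a database restore.

```python
from dataclasses import dataclass


@dataclass
class CrawlState:
    """Hypothetical flags, one per circumstance in the list above."""
    previous_crawl_stopped: bool = False
    database_restored: bool = False
    database_reattached: bool = False
    full_crawl_done: bool = True
    change_log_has_entries: bool = True
    access_account_changed: bool = False
    index_corrupted: bool = False  # simplification: the text says "might"


def effective_crawl_type(requested: str, state: CrawlState) -> str:
    """Return the crawl type that would actually run: 'full' or 'incremental'."""
    if requested == "full":
        return "full"
    # Any single circumstance is enough to force a full crawl.
    forced = (
        state.previous_crawl_stopped
        or state.database_restored
        or state.database_reattached
        or not state.full_crawl_done
        or not state.change_log_has_entries
        or state.access_account_changed
        or state.index_corrupted
    )
    return "full" if forced else "incremental"
```

For example, `effective_crawl_type("incremental", CrawlState(full_crawl_done=False))` returns `"full"`, matching the case where a full crawl of the site has never been done.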

You can adjust crawl schedules after the initial deployment, based on the performance and capacity of the servers in the farm and of the servers hosting the content.

Source:  http://technet.microsoft.com/en-us/library/cc262926.aspx#section1

Serge
