Area: Episerver Find
Applies to versions: 12 and higher

Indexing

This topic describes how indexing works in an integration with Episerver Find and Episerver CMS.

How it works

Because the EPiServer.Find.Cms assembly is referenced in the Episerver CMS project, published content is indexed automatically. Content is also reindexed, or deleted from the index, when it is saved, moved, or deleted. Each language version is indexed as a separate document.

Indexing module

The indexing module is an IInitializableModule that handles all indexing triggered by DataFactory events. When content is saved, published, moved, or deleted, the corresponding event sends an index request to the ContentIndexer.Instance object, which performs the indexing.

ContentIndexer.Instance

The ContentIndexer.Instance singleton (located in the EPiServer.Find.Cms namespace) adds support for indexing IContent and UnifiedFile objects. It allows for re-indexing the entire PageTree, specific language branches, and individual content and files. When indexing an IContent object, all page files are also indexed.
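As a rough sketch of reindexing an individual content item, the snippet below loads one item and hands it to the indexer. It assumes that ContentIndexer.Instance exposes an Index method that accepts an IContent instance. The helper class ManualIndexing and its method name are hypothetical, and the content reference is supplied by the caller.

```csharp
using EPiServer;
using EPiServer.Core;
using EPiServer.Find.Cms;
using EPiServer.ServiceLocation;

// Hypothetical helper, not part of the Episerver Find API.
public static class ManualIndexing
{
    public static void ReindexSingleItem(ContentReference contentLink)
    {
        // Load the content item through the standard CMS content loader.
        var loader = ServiceLocator.Current.GetInstance<IContentLoader>();
        var content = loader.Get<IContent>(contentLink);

        // Assumption: ContentIndexer.Instance has an Index(IContent) method.
        ContentIndexer.Instance.Index(content);
    }
}
```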

Invisible mode

A core feature of the ContentIndexer is its ability to work in invisible mode when indexing objects passed by the IndexingModule. In invisible mode, indexing is handled in a separate thread, not in the DataFactory event thread. This way, indexing does not delay the DataFactory event thread and the save/publish action. This is the default behavior. To override it, set ContentIndexer.Instance.Invisible to false.
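A minimal sketch of turning off invisible mode is shown here, setting the property named above from an initialization module. The module name IndexingSettingsInitialization is hypothetical, and the dependency on EPiServer.Web.InitializationModule is an assumption about a suitable point in the startup order for your site.

```csharp
using EPiServer.Find.Cms;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;

// Hypothetical module name; the module dependency is an assumption.
[InitializableModule]
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class IndexingSettingsInitialization : IInitializableModule
{
    public void Initialize(InitializationEngine context)
    {
        // Index in the DataFactory event thread instead of a separate thread.
        ContentIndexer.Instance.Invisible = false;
    }

    public void Uninitialize(InitializationEngine context)
    {
        // Nothing to clean up in this sketch.
    }
}
```

With invisible mode off, a save or publish action waits for the index request to complete, so this setting trades editor responsiveness for immediate index updates.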

Conventions

The ContentIndexer.Instance supports a set of conventions for tweaking how indexing is executed. Examples of these conventions are:

  • controlling which pages are indexed (described below)
  • dependencies between pages

Customizing pages to be indexed

You can control which content to index by passing a verification expression to the ShouldIndex convention. By default, all published content is indexed.

For example, to keep the LoginPageType page type out of the index, pass the ShouldIndex convention a verification expression that evaluates to false for LoginPageType. Register the convention during application startup, such as in the Application_Start method in global.asax.

//using EPiServer.Find.Cms.Conventions;

ContentIndexer.Instance.Conventions
  .ForInstancesOf<LoginPageType>()
  .ShouldIndex(x => false);

To override the default setting, add a convention for PageData and add the appropriate verification expression.

//using EPiServer.Find.Cms.Conventions;
ContentIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ShouldIndex(x => true);
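The verification expression receives the content instance, so it can presumably depend on property values and not only on the type. The following is a sketch under that assumption. SitePageData and its ExcludeFromSearch property are hypothetical names standing in for a base page type in your own project, and IndexingConventions is a hypothetical helper class.

```csharp
using EPiServer.Core;
using EPiServer.Find.Cms;
using EPiServer.Find.Cms.Conventions;

// Hypothetical base class with an editor-controlled flag.
public abstract class SitePageData : PageData
{
    public virtual bool ExcludeFromSearch { get; set; }
}

// Hypothetical helper; call Register() once during application startup.
public static class IndexingConventions
{
    public static void Register()
    {
        // Index a page only when the editor has not flagged it for exclusion.
        ContentIndexer.Instance.Conventions
            .ForInstancesOf<SitePageData>()
            .ShouldIndex(page => !page.ExcludeFromSearch);
    }
}
```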

You can exclude a property from being indexed either by adding a convention for it or by applying the JsonIgnore attribute to it.

//using EPiServer.Find.Cms.Conventions;
ContentIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ExcludeField(x => x.ACL);

//using Newtonsoft.Json;
[JsonIgnore]
public DateInterval Interval { get; set; }

File indexing

For media content (IContentMedia), files with the following MIME types are indexed by default:

  • "text/plain"
  • "application/pdf"
  • "application/postscript"
  • "application/msword"
  • "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

Changing the name or namespaces of page types

If you change the name or namespace of a page type, a mismatch occurs between the types already stored in your index and the new page types. This can cause errors when querying, because the API cannot resolve the correct page type from the type name that the index reports. To fix this and have the new page types reflected in the index, reindex all pages by running the scheduled plugin.

Last updated: Oct 31, 2016