Area: Optimizely Search & Navigation
ARCHIVED This content is retired and no longer maintained. See the latest version here.


This topic describes how content is indexed in an integration with EPiServer 7 CMS. When the EPiServer CMS project references the EPiServer.Find.Cms assembly, published content is indexed automatically. Content is also reindexed, or deleted from the index, when it is saved, moved, or deleted. Each language version is indexed as a separate document.

Indexing Module

The indexing module is an IInitializableModule that handles all indexing triggered by DataFactory events. Whenever content is saved, published, moved, or deleted, the module sends an index request to the ContentIndexer.Instance object, which performs the actual indexing.

ContentIndexer.Instance

The ContentIndexer.Instance singleton, located in the EPiServer.Find.Cms namespace, adds support for indexing IContent and UnifiedFile objects. It can reindex the entire page tree, specific language branches, or individual content items and files. When an IContent object is indexed, all of its page files are indexed as well.

Invisible mode

A core feature of the ContentIndexer is its ability to work in invisible mode when indexing objects passed by the indexing module. In invisible mode, all indexing is handled in a separate thread rather than in the DataFactory event thread. As a result, indexing does not block the DataFactory event thread and therefore does not delay the save or publish action. Invisible mode is the default behavior. To turn it off, set ContentIndexer.Instance.Invisible to false.
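As a minimal sketch of turning invisible mode off, the property can be set once during application startup. The Global class deriving from EPiServer.Global is an assumption based on the usual site template, not something this topic prescribes; only the Invisible property itself comes from the text above.

```csharp
using System;
using EPiServer.Find.Cms;

// Assumption: the site's global.asax code-behind derives from EPiServer.Global.
public class Global : EPiServer.Global
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Run indexing in the DataFactory event thread instead of a separate thread.
        // Save and publish actions then wait until indexing has finished.
        ContentIndexer.Instance.Invisible = false;
    }
}
```

With this setting, a slow indexing request is felt directly by the editor who saves or publishes, so it is probably best reserved for cases where the index must be up to date as soon as the action returns.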

Conventions

The ContentIndexer.Instance supports a set of conventions for tweaking how indexing is executed. Examples of such conventions are controlling which pages are indexed (described below) and dependencies between pages.

Customizing pages to be indexed

It is possible to control which content should be indexed by passing a verification expression to the ShouldIndex convention. By default, all published content is indexed.

For example, if you do not want to index a page type (such as LoginPageType), pass a verification expression that evaluates to false for that page type to the ShouldIndex convention. Do this during application startup, for example in the Application_Start method in global.asax.

C#
//using EPiServer.Find.Cms.Conventions;

ContentIndexer.Instance.Conventions
  .ForInstancesOf<LoginPageType>()
  .ShouldIndex(x => false);

To override the default setting for all pages, add a convention for PageData with the appropriate verification expression.

C#
//using EPiServer.Find.Cms.Conventions;
ContentIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ShouldIndex(x => true);
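A verification expression does not have to be a constant. The sketch below assumes a hypothetical ArticlePage page type with a hypothetical ExcludeFromSearch property that editors can tick; neither name comes from this topic. It also assumes that registration happens once at startup, for example from Application_Start.

```csharp
using EPiServer.Core;
using EPiServer.DataAnnotations;
using EPiServer.Find.Cms;
using EPiServer.Find.Cms.Conventions;

// Hypothetical page type with an editor-controlled opt-out flag.
[ContentType]
public class ArticlePage : PageData
{
    public virtual bool ExcludeFromSearch { get; set; }
}

// Hypothetical helper class; call Register() once during application startup.
public static class FindIndexingConfig
{
    public static void Register()
    {
        // Index an ArticlePage only when the editor has not opted it out.
        ContentIndexer.Instance.Conventions
            .ForInstancesOf<ArticlePage>()
            .ShouldIndex(x => !x.ExcludeFromSearch);
    }
}
```

Because a custom expression overrides the default setting for the type it is registered on, it may be worth checking whether the expression also needs to test that the content is published.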

To exclude a property from being indexed, add a convention for it or mark it with the JsonIgnore attribute.

C#
//using EPiServer.Find.Cms.Conventions;
ContentIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ExcludeField(x => x.ACL);

C#
//using Newtonsoft.Json;
[JsonIgnore]
public DateInterval Interval { get; set; }

File indexing

Files that implement IContentMedia are indexed by default if they have one of these MIME types.

  • "text/plain"
  • "application/pdf"
  • "application/postscript"
  • "application/msword"
  • "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

Changing the name or namespaces of page types

When you change the names or namespaces of your page types, the types already stored in the index no longer match the new page types. This can cause errors when querying, because the API cannot resolve the correct page type from the type information returned by the index. To solve this, reindex all pages by running the scheduled plugin, which adds the new page types to the index.


Last updated: Jun 10, 2014
