Indexing CMS 6 R2

Area: Optimizely Search & Navigation

ARCHIVED This content is retired and no longer maintained. See the latest version here.

How it works

When the Truffler.EPiServer.Cms assembly is referenced in an Episerver CMS project, published pages are indexed automatically. Pages are also reindexed, or deleted from the index, when they are saved, moved, or deleted. Each language version is indexed as a separate document.

Indexing module

The indexing module is an IInitializableModule that handles all indexing driven by DataFactory events. Whenever a page is saved, published, moved, or deleted, the module sends an index request to the PageIndexer.Instance object, which performs the actual indexing.

PageIndexer.Instance

The PageIndexer.Instance singleton, located in the Truffler.EPiServer.CMS namespace, adds support for indexing PageData and UnifiedFile objects. It can reindex the entire page tree, specific language branches, or individual pages and files. When a PageData object is indexed, all of its page files are indexed as well.

Invisible mode

One core feature of the PageIndexer is its ability to work in invisible mode when indexing objects passed to it by the indexing module. In invisible mode, all indexing is handled in a separate thread rather than in the DataFactory event thread, so indexing does not delay the event thread or the save/publish action that raised the event. This is the default behavior; to override it, set PageIndexer.Instance.Invisible to false.
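
As a minimal sketch of turning invisible mode off, the fragment below sets the Invisible property during application startup. It assumes the Find CMS assembly is referenced and the namespace containing PageIndexer is imported. Placing the call in Application_Start is an assumption based on where this page configures conventions; an initialization module should work equally well.

```csharp
// global.asax.cs (fragment; assumes the namespace containing PageIndexer is imported)
protected void Application_Start(object sender, EventArgs e)
{
    // Run indexing on the DataFactory event thread instead of a separate thread.
    // Per the description above, this means indexing can delay the save/publish action.
    PageIndexer.Instance.Invisible = false;
}
```

Leaving the property at its default is usually preferable for editors, since a slow index request then does not hold up saving or publishing.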

Conventions

PageIndexer.Instance supports a set of conventions for tweaking how indexing is executed. Examples of such conventions are controlling which pages are indexed (described below) and dependencies between pages.

Customizing pages to be indexed

You can control which pages are indexed by passing a verification expression to the ShouldIndex convention. By default, all published pages are indexed.

For example, to exclude the LoginPageType from the index, pass a verification expression that evaluates to false for LoginPageType to the ShouldIndex convention. Do this during application startup, for example in the Application_Start method in global.asax.

C#

//using EPiServer.Find.Cms.Conventions;

PageIndexer.Instance.Conventions
  .ForInstancesOf<LoginPageType>()
  .ShouldIndex(x => false);

To override the default setting, add a convention for PageData and add the appropriate verification expression.

C#

//using EPiServer.Find.Cms.Conventions;
PageIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ShouldIndex(x => true);
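
The verification expression receives the page instance, so it can also test page properties rather than return a constant. The sketch below indexes only pages that are shown in menus. The use of PageData.VisibleInMenu as the criterion is purely illustrative and does not come from this page, and how the expression interacts with the default published-pages rule is not covered here.

```csharp
//using EPiServer.Find.Cms.Conventions;

// Illustrative criterion (an assumption, not a recommendation):
// index a page only if it is set to be visible in menus.
PageIndexer.Instance.Conventions
  .ForInstancesOf<PageData>()
  .ShouldIndex(x => x.VisibleInMenu);
```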

VPP indexing

The Find integration for Episerver CMS does not index any files by default. However, indexing can be enabled both for page files and for files in other VPPs (virtual path providers). To enable indexing of page files, place the code below in an initialization module or in the Application_Start event handler in global.asax.cs.

C#

PageIndexer.Instance.Conventions.EnablePageFilesIndexing();

To enable automatic indexing of other virtual path providers, set the ShouldIndexVPPConvention property on the FileIndexer conventions to a convention that returns true for the VPPs that should be indexed. For instance, to index files in all VPPs that are visible in the file manager, place the code below in an initialization module or in the Application_Start event handler in global.asax.cs.

C#

FileIndexer.Instance.Conventions.ShouldIndexVPPConvention 
  = new VisibleInFilemanagerVPPIndexingConvention();

For known file types, such as Word documents and PDFs, the actual content is indexed. For unknown file types, the file's metadata (from UnifiedFile), such as name, summary, and path, is indexed. Page files are only indexed if the page itself is indexed according to the conventions.

Changing the name or namespaces of page types

When you change the name or namespace of a page type, the types already stored in the index no longer match your new page types. This can cause errors when querying, because the API cannot resolve the correct page type from the type information returned by the index. To solve this, reindex all pages using the scheduled plugin so that the new page types are reflected in the index.


Last updated: Nov 16, 2015
