Last updated: Oct 27 2016
Area: Episerver CMS Applies to versions: 10 and higher

About Episerver full-text search client

The Full-Text Search Client (FTS Client) is a Framework component that offers an API for adding, updating and removing searchable items in a full-text search index. The component acts as a REST client, communicating over HTTP using the Atom Syndication Format with the extensions described in this document. The Episerver FTS Client API ships together with the Episerver FTS Service, a WCF REST service built on top of an unmodified version of the open source search engine Lucene.NET. Episerver FTS is designed to index any kind of data, including CMS page data. Content in files is also indexed; the installed IFilters determine which files are included.

Assemblies and namespaces

The EPiServer.Search assembly contains the following namespaces:

  • EPiServer.Search contains core classes for the Episerver FTS Client API, most notably the IndexRequestItem, IndexResponseItem and SearchHandler classes.
  • EPiServer.Search.Configuration contains configuration classes for the Episerver FTS Client API.
  • EPiServer.Search.Filter contains classes to support providers for filtering search results.
  • EPiServer.Search.Queries contains classes for querying the index service. Built-in queries are found in the EPiServer.Search.Queries.Lucene namespace.
  • EPiServer.Search.Data contains classes for integration with the Dynamic Data Store which is used for the index request queue.

Updating the index

To modify the index:

  • Create an IndexRequestItem, specifying the item Id (unique within the index) and the IndexAction (add, update, remove).
  • Populate it with searchable data by assigning properties.
  • Call the SearchHandler.Instance.UpdateIndex method, passing the IndexRequestItem.

Episerver FTS Client requests are asynchronous: a request is enqueued immediately and dequeued later on a separate thread, either at a configurable time interval or when a queue flush is explicitly requested. The dequeued items are added to an Atom feed, which is sent to the FTS service. If the request fails, the batch is kept and sent again at the next time interval until it succeeds. Because items must be indexed in the same order in which they were added, updated or removed, a failing batch blocks all subsequent batches. The feed also carries a Version attribute extension that contains the name and current version of the Episerver FTS Client, so that the indexing service can handle different versions of the client.
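
The ordering rule above can be illustrated with a small, language-neutral model. The following Python sketch is not the Episerver implementation and does not call EPiServer.Search; the class IndexQueue, its batch_size parameter and the send callback are hypothetical names, used only to show why a failed batch holds back everything queued behind it.

```python
from collections import deque
from typing import Callable, Deque, List


class IndexQueue:
    """Toy model of an ordered index request queue (hypothetical, not the Episerver API)."""

    def __init__(self, send: Callable[[List[str]], bool], batch_size: int = 2) -> None:
        self._send = send                  # returns True when the service accepted the batch
        self._batch_size = batch_size
        self._pending: Deque[str] = deque()

    def enqueue(self, item: str) -> None:
        # Enqueueing returns immediately; nothing is sent yet.
        self._pending.append(item)

    def flush(self) -> int:
        """Send batches in order and stop at the first failure. Returns items sent."""
        sent = 0
        while self._pending:
            batch = list(self._pending)[: self._batch_size]
            if not self._send(batch):
                break                      # keep the batch; it blocks everything behind it
            for _ in batch:
                self._pending.popleft()
            sent += len(batch)
        return sent

    @property
    def pending(self) -> List[str]:
        return list(self._pending)


attempts: List[List[str]] = []


def flaky_send(batch: List[str]) -> bool:
    # Simulated service: the first request fails, later ones succeed.
    attempts.append(list(batch))
    return len(attempts) != 1


queue = IndexQueue(flaky_send)
for request in ["add:1", "update:1", "remove:2"]:
    queue.enqueue(request)

first = queue.flush()    # the first batch fails, so nothing is removed from the queue
second = queue.flush()   # the same batch is retried first, then the remaining item
print(first, second, attempts)
```

In this model the retried batch is identical to the failed one, so the service still sees add, update and remove in their original order.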

The SearchHandler.Instance.UpdateIndex method has the following overloads:

  • void UpdateIndex(IndexRequestItem item). Updates the default configured indexing service.
  • void UpdateIndex(IndexRequestItem item, string namedIndexingService). Updates the configured namedIndexingService. The Episerver FTS Client can thus be configured to use multiple services.

IndexItemBase

The IndexItemBase is an abstract class inherited by IndexRequestItem and IndexResponseItem with the following public members:

Property | Type | Comment
AccessControlList | Collection<string> | A list of groups and users that have access to this item. See AccessControlList.
Authors | Collection<string> | A list of authors to associate with this item.
BoostFactor | float | The boost factor explicitly sets a higher rank for this item when searching.
Categories | Collection<string> | A list of categories to associate with this item.
Created | DateTime | The date when this item was created.
Culture | string | The culture for this item.
DataUri | Uri | A Uri from which the indexing service should fetch external data that is not part of the feed, such as files.
DisplayText | string | The part of the searchable text that the indexing service is supposed to return with search results.
Id | string | The unique identifier for this item within this named index.
ItemStatus | ItemStatus | The status (Approved, Pending, Removed) for this item.
Metadata | string | Additional searchable data of “unlimited” size that is not supposed to be returned or displayed together with the search results.
ItemType | string | The type of this item. For example, EPiServer.Community.Blog, EPiServer.Core.PageData.
Modified | DateTime | The date when this item was last modified.
NamedIndex | string | Specifies the named index to update when multiple indexes are configured in the indexing service. See Named indexes and Multi search in the About Episerver full-text search service topic.
PublicationEnd | DateTime? | The expiration date for this item, after which the item is no longer returned in searches.
PublicationStart | DateTime? | The start date from which this item is included in searches.
ReferenceId | string | A reference to a “parent” item in the index. Used when searchable content is stored in sub items (for example, Comments) and a match in a sub item should produce a search hit for the parent item. See the ReferenceId section in the About Episerver full-text search service topic.
Title | string | The title of the item.
Uri | Uri | The Uri to use when linking the search result item to the content source.
VirtualPathNodes | Collection<string> | A list of nodes that build a path under which the item can be found. Used together with trailing wildcard queries. For example, if item1 has a path of A/B and item2 has a path of A, a query like A* returns two results (item1 and item2), while a query like A/B* returns one result (item1). See VirtualPath in the About Episerver full-text search service topic. Note: Any white space within a node is automatically removed.
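
The VirtualPathNodes example (item1 under A/B, item2 under A) can be checked with a short, language-neutral model. This Python sketch does not use EPiServer.Search; the helper names normalize and matches are hypothetical, and it assumes that a trailing wildcard query is compared node by node, which the source does not spell out.

```python
from typing import Dict, List


def normalize(nodes: List[str]) -> List[str]:
    # Mirrors the note that white space inside a node is removed.
    return ["".join(node.split()) for node in nodes]


def matches(item_nodes: List[str], query_nodes: List[str]) -> bool:
    """True when the item's path starts with the query path (trailing wildcard)."""
    item, query = normalize(item_nodes), normalize(query_nodes)
    return item[: len(query)] == query


items: Dict[str, List[str]] = {"item1": ["A", "B"], "item2": ["A"]}

hits_a = [name for name, nodes in items.items() if matches(nodes, ["A"])]
hits_ab = [name for name, nodes in items.items() if matches(nodes, ["A", "B"])]
print(hits_a)   # ['item1', 'item2']
print(hits_ab)  # ['item1']
```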

IndexRequestItem

The IndexRequestItem class inherits from IndexItemBase and holds the searchable content to be indexed.

Property | Type | Comment
IndexAction | IndexAction | Enum for Add, Update, Remove.
AutoUpdateVirtualPath | bool | Boolean value that indicates whether the request should also affect other items whose virtual path nodes begin with the same nodes as those provided in the request item.

IndexResponseItem

The IndexResponseItem class inherits from IndexItemBase and holds the content returned by the indexing service.

Property | Type | Comment
Score | float | The score set by the indexing service for this IndexResponseItem.

Querying the index

Querying the index is done by passing an object of type IQueryExpression to SearchHandler.Instance.GetSearchResults. The IQueryExpression interface defines the property string QueryExpression which returns the textual representation of the expression to be parsed by the indexing service.

The SearchHandler.Instance.GetSearchResults method has two overloads:

  • SearchResults GetSearchResults(IQueryExpression queryExpression, int page, int pageSize). Searches the default configured namedIndexingService and the default namedIndex in that service for matches for the passed queryExpression.
  • SearchResults GetSearchResults(IQueryExpression queryExpression, string namedIndexingService, List<string> namedIndexes, int page, int pageSize). Connects to the passed namedIndexingService and searches the list of passed namedIndexes by performing a multi search. See Named indexes and Multi search in the FTS Service topic.

The page and pageSize parameters, together with the TotalHits property of SearchResults, enable paging of search results; see Paging in the FTS Service topic.
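
The paging arithmetic behind page, pageSize and TotalHits can be sketched as follows. This is a language-neutral Python illustration rather than Episerver code. The function names are hypothetical, and the 1-based page numbering in offset_for is an assumption: the source only shows offset=0 and limit=20 in the sample search request.

```python
import math


def page_count(total_hits: int, page_size: int) -> int:
    """Number of pages needed to show total_hits items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_hits / page_size)


def offset_for(page: int, page_size: int) -> int:
    # Assumption: pages are numbered from 1; the source does not state this.
    if page < 1:
        raise ValueError("page must be 1 or greater")
    return (page - 1) * page_size


print(page_count(45, 20))   # 3
print(offset_for(3, 20))    # 40
```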

Query enums

The EPiServer.Search.Queries.Lucene namespace contains the following enums that are used by many of the query expressions:

Enum | Comment
Field | The Field enum holds values for the fields that are separately searchable in the indexing service by queries that take a Field in the constructor. For example, finding occurrences of hello world in Field.Title and/or sv in Field.Culture.
LuceneOperator | The LuceneOperator enum holds values for AND, OR and NOT.

Implemented queries

The EPiServer.Search.Queries.Lucene namespace contains the following query expression implementations compatible with the EPiServer FTS Service:

Class | Constructor | Comment
AccessControlQuery | LuceneOperator innerOperator | Empty constructor. The AccessControlList property returns a List<string> to which you add the users and groups that should have read access to the index document. Any group or user added to the query must match at least one group or user in IndexItemBase.AccessControlList. The default innerOperator is OR.
CategoryQuery | LuceneOperator innerOperator | Queries the index for occurrences where there is an exact match for the category literal in the query.
CreatedDateRangeQuery | DateTime start, DateTime end, bool inclusive | Inherits RangeQuery. Matches occurrences where the Created field lies within the start and end parameters.
FieldQuery | string queryExpression, Field field | Queries the index with the passed queryExpression (typically a word or a phrase) for the passed Field. If no Field is passed, Field.Default is used. Example: FieldQuery q = new FieldQuery("\"looking for something\""); returns occurrences where any searchable field contains the phrase looking for something.
FuzzyQuery | string word, Field field, float similarityFactor | Queries the index for occurrences of words similar to the passed word. The higher the similarityFactor, the more similar a word needs to be to create an index hit. For example, a field containing "you may reconsider" and a field containing "this is a reconsideration" both match for a certain similarityFactor.
GroupQuery | LuceneOperator innerOperator | A group query does not produce a query expression by itself; it groups other queries with an operator between them. The QueryExpressions property returns a List<IQueryExpression> to which you add query expressions; the innerOperator is placed between them. GroupQuery is itself an IQueryExpression, which makes it possible to combine groups and expressions recursively, each with its own inner operator.
ItemStatusQuery | ItemStatus status | Inherits GroupQuery. Matches occurrences where the ItemStatus value matches any of the passed ItemStatus flags. You can pass several ItemStatus values to the constructor by combining them with the bitwise OR operator, for example, new ItemStatusQuery(ItemStatus.Approved | ItemStatus.Pending). Occurrences that match any of the passed ItemStatus values are accepted as matches. Do not add further queries to the ItemStatusQuery QueryExpressions list; to combine an ItemStatusQuery with another query, use a GroupQuery.
ModifiedDateRangeQuery | DateTime start, DateTime end, bool inclusive | Inherits RangeQuery. Matches occurrences where the Modified field lies within the start and end parameters.
ProximityQuery | string phrase, Field field, int distance | Queries the index for occurrences where the words in the passed phrase occur within the passed distance of each other. For example, a document with the phrase Episerver for the win matches the ProximityQuery phrase Episerver win with a passed distance of 2.
RangeQuery | string start, string end, Field field, bool inclusive | Queries the index for occurrences within the passed literal start and end. For example, 20010202, 20030303, Field.Modified, true matches index documents whose Field.Modified value lies between start and end, inclusive.
TermBoostQuery | string phrase, Field field, float boostFactor | Queries the index with a boost factor for the field, giving occurrences of the phrase in that field a higher score and relevance.
VirtualPathQuery | () | Empty constructor. The VirtualPathNodes property returns a List<string> to which path nodes are added in the order they appear from the root. The resulting path matches any index item with a path that starts with the query path.
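
The recursive nature of GroupQuery can be pictured with a small, language-neutral model. The Python sketch below is not the EPiServer.Search.Queries.Lucene implementation: the classes only mimic the idea that a group is itself a query expression, and the field labels TITLE and CULTURE as well as the exact output format are hypothetical. The NAME:(expression) shape is borrowed from the q parameter in the sample search request later in this topic.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class QueryExpression(Protocol):
    def expression(self) -> str: ...


@dataclass
class FieldQuery:
    text: str
    field_name: str = "DEFAULT"   # hypothetical field label

    def expression(self) -> str:
        return f"{self.field_name}:({self.text})"


@dataclass
class GroupQuery:
    inner_operator: str           # "AND", "OR" or "NOT"
    expressions: List[QueryExpression] = field(default_factory=list)

    def expression(self) -> str:
        # The inner operator is placed between the grouped expressions.
        joined = f" {self.inner_operator} ".join(e.expression() for e in self.expressions)
        return f"({joined})"


# A group can contain another group, each with its own inner operator.
cultures = GroupQuery("OR", [FieldQuery("sv", "CULTURE"), FieldQuery("en", "CULTURE")])
query = GroupQuery("AND", [FieldQuery("hello world", "TITLE"), cultures])
print(query.expression())
# (TITLE:(hello world) AND (CULTURE:(sv) OR CULTURE:(en)))
```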

SearchResults

The SearchResults class holds data from an indexing service response and is assigned internally when getting IndexResponseItems back from the service:

Property | Type | Comment
IndexResponseItems | List<IndexResponseItem> | Gets a list of IndexResponseItems returned by the indexing service.
TotalHits | int | Gets the total number of matching items in the indexing service.
Version | string | Gets the current indexing service name and version.

Filtering search results

The Episerver FTS Client lets you plug in one or more filter providers that explicitly include or exclude specific items from the results. To create one, override the SearchResultFilterProvider.Filter(IndexResponseItem item) method and add the provider to the configuration. The method returns a SearchResultFilter, an enum with the values Include, Exclude and NotHandled. If a provider returns NotHandled for an IndexResponseItem, the item is forwarded to the next configured provider. If no configured provider handles the item, the configured defaultInclude behavior is used.
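
The provider chain described above can be modelled in a few lines. This is a language-neutral Python sketch, not a real provider: an actual provider is a .NET class registered in configuration. The provider functions, the dictionary-based items and the is_included helper are all hypothetical; only the Include, Exclude, NotHandled and defaultInclude semantics come from the text.

```python
from enum import Enum
from typing import Callable, Dict, Iterable


class SearchResultFilter(Enum):
    INCLUDE = "Include"
    EXCLUDE = "Exclude"
    NOT_HANDLED = "NotHandled"


Item = Dict[str, str]
Provider = Callable[[Item], SearchResultFilter]


def hide_pending(item: Item) -> SearchResultFilter:
    # Hypothetical provider: excludes pending items and ignores everything else.
    if item.get("status") == "Pending":
        return SearchResultFilter.EXCLUDE
    return SearchResultFilter.NOT_HANDLED


def always_show_news(item: Item) -> SearchResultFilter:
    # Hypothetical provider: includes news items and ignores everything else.
    if item.get("type") == "News":
        return SearchResultFilter.INCLUDE
    return SearchResultFilter.NOT_HANDLED


def is_included(item: Item, providers: Iterable[Provider], default_include: bool) -> bool:
    for provider in providers:
        verdict = provider(item)
        if verdict is not SearchResultFilter.NOT_HANDLED:
            return verdict is SearchResultFilter.INCLUDE   # first decisive provider wins
    return default_include                                  # nobody handled the item


providers = [hide_pending, always_show_news]
results = [
    {"id": "1", "type": "News", "status": "Approved"},
    {"id": "2", "type": "News", "status": "Pending"},
    {"id": "3", "type": "Blog", "status": "Approved"},
]
visible = [r["id"] for r in results if is_included(r, providers, default_include=False)]
print(visible)   # ['1']
```

Item 3 is not handled by either provider, so its visibility depends entirely on the default: with default_include=True it would be returned as well.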

Atom and extensions

The standard Atom format is extended with both attributes and elements with the default EPiServer.Search.IndexingService namespace to map the data in the IndexRequestItem and IndexResponseItem as shown in the following table:

Property | Name | Extension
IndexItemBase.AccessControlList | ACL | Element extension
IndexItemBase.Authors | Authors | No
IndexItemBase.BoostFactor | BoostFactor | Attribute extension
IndexItemBase.Categories | Categories | No
IndexItemBase.Created | PublishDate | No
IndexItemBase.Culture | Culture | Attribute extension
IndexItemBase.DataUri | DataUri | Attribute extension
IndexItemBase.DisplayText | Content | No
IndexItemBase.Id | Id | No
IndexItemBase.ItemType | Type | Attribute extension
IndexItemBase.Metadata | Metadata | Element extension
IndexItemBase.Modified | LastUpdatedTime | No
IndexItemBase.NamedIndex | NamedIndex | Attribute extension
IndexItemBase.ReferenceId | ReferenceId | Attribute extension
IndexItemBase.Title | Title | No
IndexItemBase.Uri | BaseUri | No
IndexItemBase.Version | Version | Attribute extension
IndexItemBase.VirtualPath | VirtualPath | Element extension
IndexRequestItem.IndexAction | IndexAction | Attribute extension
IndexRequestItem.AutoUpdateVirtualPath | AutoUpdateVirtualPath | Attribute extension
IndexResponseItem.Score | Score | Attribute extension

 

Example of a typical update index request

Request

POST http://localhost.:8072/IndexingService/update/?accesskey=accesskey1 HTTP/1.1
Content-Type: f
Host: localhost.:8072
Content-Length: 773
Expect: 100-continue

<?xml version="1.0" encoding="utf-8"?>
<feed p1:Version="EPiServer.Search v.6.1.28.0"
      xmlns:p1="EPiServer.Search.IndexingService" xmlns="http://www.w3.org/2005/Atom">
  <title></title>
  <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=12</id>
  <updated>2010-04-26T20:02:02Z</updated>
  <entry p1:IndexAction="add" p1:BoostFactor="1" p1:Type="" p1:Culture=""
         p1:NamedIndex="testindex2" p1:ReferenceId="">
    <id>1</id>
    <title></title>
    <published>2010-04-26T22:02:02+02:00</published>
    <updated>2010-04-26T22:02:02+02:00</updated>
    <content></content>
    <p1:Metadata xmlns:p1="EPiServer.Search.IndexingService"></p1:Metadata>
    <ACL xmlns="EPiServer.Search.IndexingService"></ACL>
    <VirtualPath xmlns="EPiServer.Search.IndexingService"></VirtualPath>
  </entry>
</feed>

Response

HTTP/1.1 200 OK
Content-Length: 0
Server: Microsoft-HTTPAPI/1.0
Date: Mon, 26 Apr 2010 20:02:02 GMT
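
To make the shape of the update feed concrete, the following language-neutral Python sketch builds a one-entry feed with the standard library. It is only an illustration of the wire format shown in the sample, not the FTS Client's own serializer. The function build_update_feed and its parameters are hypothetical, the Version value is copied from the sample, and the extension elements are written with the p1 prefix, which is equivalent XML to the sample's default-namespace form. The xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
EPI = "EPiServer.Search.IndexingService"
ET.register_namespace("", ATOM)      # Atom is the default namespace, as in the sample
ET.register_namespace("p1", EPI)     # same prefix as the sample request


def atom(tag: str) -> str:
    return f"{{{ATOM}}}{tag}"


def epi(name: str) -> str:
    return f"{{{EPI}}}{name}"


def build_update_feed(item_id: str, action: str, named_index: str,
                      title: str, content: str, timestamp: str) -> bytes:
    """Build a one-entry feed shaped like the sample request (illustrative only)."""
    feed = ET.Element(atom("feed"), {epi("Version"): "EPiServer.Search v.6.1.28.0"})
    ET.SubElement(feed, atom("updated")).text = timestamp
    entry = ET.SubElement(feed, atom("entry"), {
        epi("IndexAction"): action,          # attribute extensions on the entry
        epi("BoostFactor"): "1",
        epi("NamedIndex"): named_index,
    })
    ET.SubElement(entry, atom("id")).text = item_id
    ET.SubElement(entry, atom("title")).text = title
    ET.SubElement(entry, atom("published")).text = timestamp
    ET.SubElement(entry, atom("updated")).text = timestamp
    ET.SubElement(entry, atom("content")).text = content
    for extension in ("Metadata", "ACL", "VirtualPath"):
        ET.SubElement(entry, epi(extension))  # empty element extensions
    return ET.tostring(feed, encoding="utf-8", xml_declaration=True)


body = build_update_feed("1", "add", "testindex2",
                         "Header test", "Body test", "2010-04-26T20:02:02Z")
print(body.decode("utf-8"))
```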

 

Example of a typical get search results response

Request

GET http://localhost.:8072/IndexingService/search/?q=EPISERVER_SEARCH_ID%3a(1)&namedindexes=&offset=0&limit=20&format=xml&accesskey=accesskey1 HTTP/1.1
Content-Type: application/xml
Host: localhost.:8072

Response

HTTP/1.1 200 OK
Content-Length: 638
Content-Type: application/xml; charset=utf-8
Server: Microsoft-HTTPAPI/1.0
Date: Mon, 26 Apr 2010 20:02:02 GMT

<feed a:TotalHits="1" a:Version="EPiServer.Search v.1.0.517.236" xmlns="http://www.w3.org/2005/Atom" xmlns:a="EPiServer.Search.IndexingService">
  <title/>
  <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=11</id>
  <updated>2010-04-26T20:02:02Z</updated>
  <entry xml:base="http://www.google.com/" a:Culture="sv" a:Type="EPiServer.Search.IndexItem, EPiServer.Search" a:Score="0.3068528" a:DataUri="" a:BoostFactor="1" a:NamedIndex="default">
    <id>1</id>
    <title>Header test</title>
    <published>2010-04-26T22:02:00+02:00</published>
    <updated>2010-04-26T22:02:00+02:00</updated>
    <content>Body test</content>
  </entry>
</feed>
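
Reading such a response mostly comes down to handling the two XML namespaces. The language-neutral Python sketch below parses a copy of the sample response with the standard library; it is not how the FTS Client deserializes results, and parse_results and the dictionary keys are hypothetical names chosen for the illustration.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}
EPI = "{EPiServer.Search.IndexingService}"   # namespace of the attribute extensions

SAMPLE = """<feed a:TotalHits="1" a:Version="EPiServer.Search v.1.0.517.236"
      xmlns="http://www.w3.org/2005/Atom" xmlns:a="EPiServer.Search.IndexingService">
  <title/>
  <entry xml:base="http://www.google.com/" a:Culture="sv" a:Score="0.3068528"
         a:BoostFactor="1" a:NamedIndex="default">
    <id>1</id>
    <title>Header test</title>
    <content>Body test</content>
  </entry>
</feed>"""


def parse_results(xml_text: str):
    """Return (total_hits, items) from a search response feed."""
    root = ET.fromstring(xml_text)
    total_hits = int(root.get(EPI + "TotalHits"))
    items = []
    for entry in root.findall("atom:entry", NS):
        items.append({
            "id": entry.findtext("atom:id", namespaces=NS),
            "title": entry.findtext("atom:title", namespaces=NS),
            "display_text": entry.findtext("atom:content", namespaces=NS),
            "culture": entry.get(EPI + "Culture"),
            "score": float(entry.get(EPI + "Score")),
        })
    return total_hits, items


total_hits, items = parse_results(SAMPLE)
print(total_hits, items[0]["title"], items[0]["score"])
```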

 

Example of a typical reset index request

Request

POST http://localhost.:8072/IndexingService/reset/?namedindex=default&accesskey=accesskey1 HTTP/1.1
Content-Type: application/xml
Host: localhost.:8072
Content-Length: 0

Response

HTTP/1.1 200 OK
Content-Length: 0
Server: Microsoft-HTTPAPI/1.0
Date: Mon, 26 Apr 2010 20:02:00 GMT

 

Example of a typical get named indexes request

Request

GET http://localhost.:8072/IndexingService/namedindexes/?accesskey=accesskey1 HTTP/1.1
Content-Type: application/xml
Host: localhost.:8072
Connection: Keep-Alive

Response

HTTP/1.1 200 OK
Content-Length: 1048
Content-Type: application/xml; charset=utf-8
Server: Microsoft-HTTPAPI/1.0
Date: Mon, 26 Apr 2010 20:02:00 GMT

<feed xmlns="http://www.w3.org/2005/Atom">
  <title/>
  <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=1</id>
  <updated>2010-04-26T20:02:00Z</updated>
  <entry>
    <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=2</id>
    <title>default</title>
    <updated>2010-04-26T20:02:00Z</updated>
  </entry>
  <entry>
    <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=3</id>
    <title>testindex2</title>
    <updated>2010-04-26T20:02:00Z</updated>
  </entry>
  <entry>
    <id>uuid:6d85bc21-020e-4058-994c-090061f9d89c;id=4</id>
    <title>testindex3</title>
    <updated>2010-04-26T20:02:00Z</updated>
  </entry>
</feed>
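
The four sample requests share one URL shape: a base address, an operation segment (update, search, reset or namedindexes) and a query string that always carries accesskey. The language-neutral Python sketch below assembles such URLs with the standard library. The helper service_url is hypothetical, and the base address and access key are the sample values from this topic, not defaults of any kind. Note that urlencode also percent-encodes the parentheses, which decodes to the same q value as in the sample.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://localhost.:8072/IndexingService"   # address used in the samples


def service_url(operation: str, access_key: str, **params) -> str:
    """Compose a service URL; the access key is appended as the last parameter."""
    query = dict(params, accesskey=access_key)
    return f"{BASE}/{operation}/?{urlencode(query)}"


search_url = service_url("search", "accesskey1", q="EPISERVER_SEARCH_ID:(1)",
                         namedindexes="", offset=0, limit=20, format="xml")
reset_url = service_url("reset", "accesskey1", namedindex="default")

print(search_url)
print(reset_url)

# Decoding the query string gives back the original expression.
decoded = parse_qs(urlsplit(search_url).query, keep_blank_values=True)
print(decoded["q"])   # ['EPISERVER_SEARCH_ID:(1)']
```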

 
