Improving synonyms and overall search experience

Is your Search & Navigation (Find) implementation affected by the limitations of the current synonym functionality and/or would you like to improve upon the overall search experience? 

Check out https://github.com/episerver/EPiServer.Labs.Find.Toolbox
Setup and configuration are simple: install the NuGet package, then add a couple of lines of code to get it working.

Toolbox features

  • An improved synonym implementation
  • MinimumShouldMatch
  • MatchPhrase and MatchPrefixPhrase
  • FuzzyQuery and WildcardQuery

All of these can be used together or independently, and each depends on the .For() call for the original query.

Toolbox also comes with support for Elasticsearch's MinimumShouldMatch. With MinimumShouldMatch you can set one or more conditions for how many terms (as a percentage or an absolute number) must match. If you specify 2<60%, queries of up to 2 terms require all terms to match, and queries of more than 2 terms require 60% of the terms to match. If you specify 2, queries of up to 2 terms require all terms to match, and longer queries require at least 2 matching terms. This is preferred over using purely OR or AND, where you get either too many hits (OR) or no hits (AND). To take effect, MinimumShouldMatch() has to be called before UsingImprovedSynonyms().
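To make the two specifications above concrete, here is a minimal, self-contained sketch of how the required number of matching terms could be worked out. It is not the Toolbox or Elasticsearch implementation: the class and method names (MinimumShouldMatchSketch, RequiredMatches) are hypothetical, only the forms "N", "P%" and "N<P%" with non-negative numbers are handled, and it assumes that percentages are rounded down and that the result never exceeds the number of terms.

```csharp
using System;

// Hypothetical illustration only; not part of EPiServer.Labs.Find.Toolbox.
public static class MinimumShouldMatchSketch
{
    // Returns how many of the termCount query terms must match for a spec
    // such as "2", "60%" or "2<60%" (non-negative values only).
    public static int RequiredMatches(string spec, int termCount)
    {
        spec = spec.Trim();
        int lessThan = spec.IndexOf('<');
        if (lessThan >= 0)
        {
            // "N<rest": up to N terms, every term is required.
            int threshold = int.Parse(spec.Substring(0, lessThan));
            if (termCount <= threshold)
            {
                return termCount;
            }
            // More than N terms: the part after '<' applies.
            spec = spec.Substring(lessThan + 1);
        }

        int required;
        if (spec.EndsWith("%", StringComparison.Ordinal))
        {
            int percent = int.Parse(spec.Substring(0, spec.Length - 1));
            // Integer division rounds down (assumed rounding rule).
            required = termCount * percent / 100;
        }
        else
        {
            required = int.Parse(spec);
        }

        // Assumption: never require more terms than the query contains.
        return Math.Min(required, termCount);
    }

    public static void Main()
    {
        foreach (int terms in new[] { 1, 2, 3, 5 })
        {
            Console.WriteLine(
                "{0} term(s): \"2<60%\" requires {1}, \"2\" requires {2}",
                terms,
                RequiredMatches("2<60%", terms),
                RequiredMatches("2", terms));
        }
    }
}
```

Under these assumptions a 5-term query with 2<60% needs 3 matching terms, while a 3-term query needs only 1, because 60% of 3 is 1.8 and is rounded down. This is worth keeping in mind when choosing the percentage.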

To improve relevance and search experience even further, support for Elasticsearch's MatchPhrase, MatchPrefixPhrase, FuzzyQuery and WildcardQuery has been added.

PhraseBoost() and PhrasePrefixBoost() boost the relevance of exact phrase matches and of phrase matches at the beginning of fields.

FuzzyMatch() finds terms even if the wording is not quite right. WildcardMatch() finds terms even if they are incomplete or are part of another word. Both are applied only to terms longer than 2 characters. The wildcard is added only to the right of the term. Wildcard matches get a negative boost.
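The term-length rule and the right-hand wildcard can be illustrated with a small, self-contained sketch. It is not the Toolbox code: the helper name WildcardTerms is hypothetical, and splitting the query on spaces is an assumption made only for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not part of EPiServer.Labs.Find.Toolbox.
public static class WildcardSketch
{
    // Returns the wildcard patterns that would be added for a query:
    // only terms longer than 2 characters, wildcard on the right only.
    public static List<string> WildcardTerms(string query)
    {
        return query
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries) // assumed tokenization
            .Where(term => term.Length > 2)
            .Select(term => term + "*")
            .ToList();
    }

    public static void Main()
    {
        // "an" is too short, so only the two longer terms get a wildcard.
        Console.WriteLine(string.Join(", ", WildcardTerms("an epi find")));
        // prints: epi*, find*
    }
}
```

Because the wildcard is only appended, a pattern such as find* can match "finder" but not "refind".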

Feedback and input are welcome and don't hesitate to contribute if you'd like.

Please note that, like most EPiServer.Labs projects, this project is not officially supported by Episerver.
It should be considered stable and is currently used in production environments.

UPDATE 2020-09-08

Bug fixes, improvements and new version 1.0.9.

UPDATE 2020-09-03

Project renamed to Find Toolbox @ https://github.com/episerver/EPiServer.Labs.Find.Toolbox
Now with new features to improve search relevance and overall search experience even further. 

Jun 24, 2020

Tomas Hensrud Gulla
( By Tomas Hensrud Gulla, 6/25/2020 8:41:58 AM)

Can this be made available at https://nuget.episerver.com/ ?

Tomas Hensrud Gulla
( By Tomas Hensrud Gulla, 6/25/2020 8:52:05 AM)

I have tried (both including the source code and installing the NuGet package), but am experiencing a problem.

My code:

SearchClient.Instance.UnifiedSearch(language).For(model.Query).UsingSynonymsImproved();

Build error:
Error CS1061 'IQueriedSearch<ISearchContent, QueryStringQuery>' does not contain a definition for 'UsingSynonymsImproved' and no extension method 'UsingSynonymsImproved' accepting a first argument of type 'IQueriedSearch<ISearchContent, QueryStringQuery>' could be found (are you missing a using directive or an assembly reference?)

I have both the reference and the using directive.

dada
( By dada, 6/25/2020 11:27:50 AM)

Hi Tomas,

Please make sure you have:

using EPiServer.Find.Cms;

If I don't have it I get the same error message. I have updated the README to reflect this.

dada
( By dada, 6/25/2020 1:11:45 PM)

Tomas, I will discuss with dev team if we can make it available from nuget.episerver.com after the summer.

Tomas Hensrud Gulla
( By Tomas Hensrud Gulla, 6/26/2020 9:50:08 AM)

The problem was not the using statement, but the lack of support for .NET 4.6.1.

Thanks for fixing it so quickly! Excellent work!

Michael Clausing
( By Michael Clausing, 6/26/2020 6:22:56 PM)

This looks very similar to what we talked about doing for a client. Thanks!

Mari Jørgensen
( By Mari Jørgensen, 7/6/2020 5:15:11 PM)

@dada This should work for filtered search as well?

I have code similar to this (simplified a bit):

SearchClient.Instance.Search<EntryContentBase>(Language.Norwegian)
.For(searchQuery)
.InField(x)
.InField(y)
.InField(z)
.MinimumShouldMatch("2<60%")
.UsingSynonymsImproved()
.ApplyBestBets()

I cannot get any results for multi-word synonyms.

dada
( By dada, 7/8/2020 8:02:23 PM)

@Mari If you could share index details and the JSON for that search, I will look into it. Send it directly to my email: daniel.dahlin@episerver.com

Mari Jørgensen
( By Mari Jørgensen, 8/5/2020 8:18:20 AM)

@dada: Back from vacation. Email is coming in the next couple of minutes.

Eric Petersson
( By Eric Petersson, 8/6/2020 7:33:08 AM)

I have the same issue when trying to get hits on bidirectional synonyms via UsingImprovedSynonyms().

dada
( By dada, 8/6/2020 10:14:38 AM)

Hi @Mari @Eric

Judging by the JSON for the search that Mari shared, the match is never made, most likely because the locally cached synonym list lacks the additional synonym. Empty lists are also cached to avoid spamming the service with requests.

Multi-term synonyms and bidirectional synonyms both work when I test locally.

By default the list is cached for an hour. When testing, you could set it to something lower, for example .UsingImprovedSynonyms(TimeSpan.FromSeconds(10)).

Mari Jørgensen
( By Mari Jørgensen, 8/6/2020 11:26:09 AM)

I can confirm that caching is the issue! Went back to the branch and tested once more, and now it works! Thanks, Daniel!