
Dan Matthews
May 5, 2016

Vulcan fires up its language and commerce engines

It’s been over a week since I first launched the alpha of Vulcan, the lightweight Elasticsearch client for Episerver. Since then I’ve been working hard to test it, stretch it and improve it. Some of that has simply been bug fixes and simplifications, but most of the effort has gone into handling analysis of textual content. The issue is that we want to analyse ‘free text’ content as language, whereas things like product codes shouldn’t be analysed at all. Fortunately, Elasticsearch has a great feature called Multi-Fields. It lets us index a field as non-analyzed while also indexing an analyzed copy of the same field, so that we can run free-text queries against the copy. So what has changed generally in Vulcan, and how do you use the new language handling?
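As a rough illustration of the multi-field idea, the toy sketch below keeps one value in two forms: verbatim for exact matching, and tokenized for free-text matching. It does not call Elasticsearch or Vulcan. MultiFieldSketch, Analyze, MatchesRaw and MatchesAnalyzed are hypothetical names, and the tokenizer is a deliberate simplification of what a real analyzer does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Toy model of a multi-field: one value, matched either verbatim or as tokens.
public static class MultiFieldSketch
{
    // Hypothetical stand-in for an analyzer: lowercase the value, then split on
    // anything that is not a letter or digit. Real analyzers do much more
    // (stemming, stop words and so on).
    public static string[] Analyze(string value)
    {
        return Regex.Split(value.ToLowerInvariant(), "[^\\p{L}\\p{Nd}]+")
            .Where(token => token.Length > 0)
            .ToArray();
    }

    // Exact, case-sensitive comparison against the stored value, which is how
    // a non-analyzed field behaves.
    public static bool MatchesRaw(string stored, string term)
    {
        return string.Equals(stored, term, StringComparison.Ordinal);
    }

    // Free-text comparison against the analyzed copy: a hit if at least one
    // query token is among the stored tokens.
    public static bool MatchesAnalyzed(string stored, string query)
    {
        var storedTokens = new HashSet<string>(Analyze(stored));
        return Analyze(query).Any(storedTokens.Contains);
    }
}
```

With this sketch, a product code such as "SKU-123" only matches the raw form when the whole code is supplied, while a body text matches the analyzed form on any single word regardless of case.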

Before we start, just one important note. I recommend you use an Elasticsearch 2.x cluster. I found that the 1.x clusters I was testing on didn’t work so nicely with the latest version of NEST, which expects a 2.x cluster. I did get the Vulcan core running fine against a 1.x index, but I can’t guarantee that your queries will work as expected. You may get the occasional 400 Bad Request response as the NEST client creates 2.x compatible queries and tries to pass them to the 1.x index. For that reason, if you are testing then I suggest you use a 2.x cluster. I found a free one you can use in the cloud from Bonsai, or you can of course host your own.

Other than that, the most significant change is something that you won’t see at first glance. I’ve split the index across multiple language-based indexes. This is Elasticsearch recommended best practice, so it seemed the right thing to do. When you now run the Vulcan Index Content scheduled job, you’ll see one index created per language, each prefixed with the index name you set in the web.config. So, for example, if you set a Vulcan index name of ‘vulcan’, then you might see indexes called ‘vulcan_en’, ‘vulcan_de’, ‘vulcan_invariant’ and so forth. That last one, the invariant index, is particularly interesting as it’s where all the non-localizable content is stored. You can get a handle to it by getting a Vulcan client for CultureInfo.InvariantCulture:

var client = VulcanHandler.Service.GetClient(CultureInfo.InvariantCulture);
 
Note that passing in null is NOT the same as passing in the invariant culture. Passing in null is just a shortcut to whatever your current UI culture is. Note also that I’ve changed the Client property of the VulcanHandler to a GetClient() method so that you can specify what culture you want to handle with your calls. Most of the other calls you make with Vulcan (indexing, deleting etc.) are now also overloaded to take a CultureInfo parameter (or null for the current UI culture).
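To make the culture handling concrete, here is a minimal sketch of how a culture could be resolved to an index name following the naming described above. It is not Vulcan’s actual code: IndexNamingSketch, ResolveCulture and BuildIndexName are hypothetical names, and lowercasing the culture name is an assumption (Elasticsearch requires index names to be lowercase).

```csharp
using System;
using System.Globalization;

public static class IndexNamingSketch
{
    // Null is treated as a shortcut for the current UI culture.
    // It is NOT the same as the invariant culture.
    public static CultureInfo ResolveCulture(CultureInfo culture)
    {
        return culture ?? CultureInfo.CurrentUICulture;
    }

    // Builds "<prefix>_<culture>", e.g. "vulcan_en" or "vulcan_invariant".
    public static string BuildIndexName(string prefix, CultureInfo culture)
    {
        var resolved = ResolveCulture(culture);

        // The invariant culture has an empty Name, so it gets an explicit suffix.
        var suffix = resolved.Name.Length == 0
            ? "invariant"
            : resolved.Name.ToLowerInvariant(); // assumption: names are lowercased

        return prefix + "_" + suffix;
    }
}
```

Calling BuildIndexName("vulcan", CultureInfo.InvariantCulture) in this sketch gives "vulcan_invariant", whereas passing null gives whichever index belongs to the current UI culture.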
 
So once we have our client, how do we run a query? For non-free-text queries (such as term queries or any queries on non-string fields) you just query like you always would. If you want to do a free-text based query, you need to specify that you want the query to run against the analyzed version of the fields. In practice, that means adding one little call to our fluent query DSL. For example, the following query is from my version of the Alloy search page and looks for the query as free-text, along with some hit highlighting and aggregation:
 
model.ContentHits = VulcanHandler.Service.GetClient().SearchContent<IContent>(d => d
    .Query(query => query.SimpleQueryString(sq => sq.Fields(fields => fields.Field("*.analyzed")).Query(q)))
    .Highlight(h => h.Encoder("html").Fields(f => f.Field("*")))
    .Aggregations(agg => agg.Terms("types", t => t.Field("_type"))));

You’ll notice the *.analyzed instruction on the query that tells Elasticsearch to look at the analyzed version of the fields. You can specify exact fields if you prefer (such as mainBody.analyzed) but in most circumstances you are most likely to run a free-text query against all fields. So when would you use a non-free-text query on a string field? Usually that would be when you are doing filters and aggregations. For example, in a commerce environment you may well want to filter based on market. Let’s say that we want to aggregate the prices and then show them on the front end as a facet. We would want to filter the prices to the current market first.
 
Let’s look at this in two parts. Firstly, let’s get the price indexed. By default, a variation has no property that holds the price, so we’ll add one. In theory you could use any kind of object to hold that price, but for clarity I’m going to use a little construct:
 
public class PriceConstruct
{
    public string MarketId { get; set; }

    public Money Price { get; set; }
}

Now we can get a list of these by adding a property called Price to the variant type:
 
public IEnumerable<PriceConstruct> Price
{
    get
    {
        var prices = new List<PriceConstruct>();

        // Collect one default price per market
        foreach (var market in ServiceLocator.Current.GetInstance<IMarketService>().GetAllMarkets())
        {
            var variantPrices = this.GetPrices(market.MarketId, Mediachase.Commerce.Pricing.CustomerPricing.AllCustomers);

            if (variantPrices != null)
            {
                foreach (var price in variantPrices)
                {
                    if (price.MinQuantity == 0 && price.CustomerPricing == Mediachase.Commerce.Pricing.CustomerPricing.AllCustomers) // this is a default price
                    {
                        prices.Add(new PriceConstruct() { MarketId = market.MarketId.Value, Price = price.UnitPrice });

                        // Only the first default price for this market is kept
                        break;
                    }
                }
            }
        }

        return prices;
    }
}

All this does is loop through the prices and try to get the default price for each market. You could of course make this more robust, for example by checking the currency, but this is just a simple example. Now that we have this property, when we run our index job it will get persisted into Elasticsearch. We can now query it with Vulcan something like this (this is from my Quicksilver demo that I’ve updated to use Vulcan):
 
model.SearchResponse = VulcanHandler.Service.GetClient().SearchContent<EPiServer.Reference.Commerce.Site.Features.Product.Models.FashionVariant>(
    q => q.Aggregations(a => a
        .Filter("current_market", cm => cm
            .Filter(f => f
                .Term(p => p.Price.First().MarketId, CurrentMarket.Service.GetCurrentMarket().MarketId.Value))
            .Aggregations(agg => agg
                .Terms("prices", t => t
                    .Field(fld => fld.Price.First().Price.Amount))))));

In this particular case, we are using a filter aggregation to narrow down to the current market, and then using a child aggregation to get the prices. In reality, you probably wouldn’t use a Terms aggregation for prices. You’d probably use a Range aggregation.
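To show why a Range aggregation suits prices better, the sketch below does the bucketing in memory rather than through NEST. A Terms aggregation gives one bucket per distinct price, whereas range buckets group prices into a few bands. PriceRangeSketch, PriceRange and CountByRange are hypothetical names, the band boundaries are made up for illustration, and the only Elasticsearch behaviour assumed is that a range includes its ‘from’ value and excludes its ‘to’ value.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// One price band; a null bound means "open ended".
public sealed class PriceRange
{
    public decimal? From { get; set; }
    public decimal? To { get; set; }

    // Label used as the bucket key in this sketch, e.g. "*-50" or "50-100".
    public string Key
    {
        get
        {
            var from = From.HasValue ? From.Value.ToString(CultureInfo.InvariantCulture) : "*";
            var to = To.HasValue ? To.Value.ToString(CultureInfo.InvariantCulture) : "*";
            return from + "-" + to;
        }
    }

    // Half-open interval: From is inclusive, To is exclusive.
    public bool Contains(decimal price)
    {
        return (!From.HasValue || price >= From.Value)
            && (!To.HasValue || price < To.Value);
    }
}

public static class PriceRangeSketch
{
    // Counts how many prices fall into each band.
    public static Dictionary<string, int> CountByRange(
        IEnumerable<decimal> prices, IEnumerable<PriceRange> ranges)
    {
        var priceList = prices.ToList();
        return ranges.ToDictionary(r => r.Key, r => priceList.Count(r.Contains));
    }
}
```

Five distinct prices would produce five Terms buckets, but with bands of under 50, 50 to 100 and 100 and over they collapse into three facet values that are much easier to show on a front end.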
 
Lastly, just some housekeeping. Some hosted clusters require a username and password to access them, so I’ve added support for this to the web.config. For example, here is my configuration talking to Bonsai:
 
<add key="VulcanUrl" value="https://vulcancluster-452277433331.eu-west-1.bonsai.io/" />
<add key="VulcanUsername" value="jkda99asdk" />
<add key="VulcanPassword" value="r9088fsaff" />
<add key="VulcanIndex" value="vulcan_quicksilverdemo" />

I’m very open to ideas and suggestions on how to drive Vulcan forward, particularly on Episerver Commerce projects. I’m thinking of trying to somehow generalise the price management, maybe do market handling in a nice way too. If you have any feedback, do let me know on here or on my email at firstname.lastname@episerver.com.

DISCLAIMER: This project is in no way connected with or endorsed by Episerver. It is being created under the auspices of a South African company and is entirely separate to what I do as an Episerver employee.


Comments

Jonas Peterson Sep 22, 2016 10:37 AM

Must say that this was exactly what I was looking for! Keep up the good work!
