Distinct

Is there any way of searching for data using sql-like 'distinct' on a specific field?

Let's say I have data structured like this:

Product: Red shirt
Size: S
Code: ABC_001
Style: ABC

Product: Red shirt
Size: M
Code: ABC_001
Style: ABC

Product: Blue shirt
Size: S
Color: Blue
Style: ABC
Code: ABC_002

Product: Blue shirt
Size: L
Color: Blue
Style: ABC
Code: ABC_002

I want to retrieve products by "Code", but only get one per "Code". In other words, "Size" is not relevant.
How can I do that using a Find query?

#71054
May 08, 2013 9:41

Hi Mari,

I see two possibilities:

1: Use facets. Instead of working with the search results themselves, you retrieve a facet for "Product" (or ProductId).

2: Index objects that are already grouped.

Do you think either of those may solve your problem?
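
For anyone reading along, here is a minimal sketch of option 2 in plain Python. It is not the Find API; the field names and the `group_for_indexing` helper are made up for illustration. The idea is to collapse the variants into one document per Code before indexing, so that a normal search already returns one hit per Code.

```python
# Sample variants taken from the question above (field names are assumptions).
variants = [
    {"product": "Red shirt", "size": "S", "code": "ABC_001", "style": "ABC"},
    {"product": "Red shirt", "size": "M", "code": "ABC_001", "style": "ABC"},
    {"product": "Blue shirt", "size": "S", "code": "ABC_002", "style": "ABC"},
    {"product": "Blue shirt", "size": "L", "code": "ABC_002", "style": "ABC"},
]


def group_for_indexing(items):
    """Build one document per code, collecting the sizes into a list."""
    grouped = {}
    for item in items:
        # Create the grouped document the first time a code is seen.
        doc = grouped.setdefault(item["code"], {
            "product": item["product"],
            "code": item["code"],
            "style": item["style"],
            "sizes": [],
        })
        doc["sizes"].append(item["size"])
    return list(grouped.values())


# These grouped documents are what would be sent to the index.
documents = group_for_indexing(variants)
```

The sizes are kept as a list on the grouped document, so they could still be searched or filtered on if the index supports multi-valued fields.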

#71055
May 08, 2013 9:49

Some background for anyone interested can be found here. The gist of it is that distinct would require something called field collapsing, and while that is technically possible, it brings some problems with regard to performance and scaling. Therefore it's often better to either use facets or index the same original data multiple times, as the number of documents in the index has very little impact on performance.

#71056
May 08, 2013 9:55

I sort of have it working now, but only by grouping by Code after the search result is retrieved. It would be better to get only the relevant items from Find and not have to filter afterwards.
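
Roughly, the after-the-fact grouping looks like the sketch below. This is plain Python rather than the actual Find client code, and `distinct_by` and the hit fields are hypothetical names.

```python
def distinct_by(hits, key):
    """Keep the first hit for each value of `key`, preserving result order."""
    seen = set()
    unique = []
    for hit in hits:
        value = hit[key]
        if value not in seen:
            seen.add(value)
            unique.append(hit)
    return unique


# Stand-in for a page of search hits (one hit per size variant).
hits = [
    {"product": "Red shirt", "size": "S", "code": "ABC_001"},
    {"product": "Red shirt", "size": "M", "code": "ABC_001"},
    {"product": "Blue shirt", "size": "S", "code": "ABC_002"},
    {"product": "Blue shirt", "size": "L", "code": "ABC_002"},
]

unique_hits = distinct_by(hits, "code")
```

The drawback is that paging and total counts no longer line up, because the duplicates are removed only after a page of hits has been fetched.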

I think the facet approach is promising, but I worry that I need more data than the facet key. I'll give it a try and come back to you.

#71057
May 08, 2013 10:11

The facet approach will be a bananazillion times faster than any result set filtering. My guess is that you might also be able to construct the facet field value in a way that lets you parse it afterwards, like name:somevalue:someothervalue, as the some* values should be the same for every item in the group (otherwise they wouldn't be distinct).
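
A rough sketch of that composite-key idea, in plain Python rather than the Find API. The `Counter` only stands in for the terms facet the index would return, and the field names, `facet_key` and `parse_facet_key` are assumptions for illustration.

```python
from collections import Counter

SEPARATOR = ":"  # assumed never to occur inside the field values themselves


def facet_key(item):
    """Pack the fields shared by the whole group into one facet value."""
    return SEPARATOR.join([item["code"], item["product"], item["style"]])


def parse_facet_key(term):
    """Unpack a facet term back into its fields."""
    code, product, style = term.split(SEPARATOR)
    return {"code": code, "product": product, "style": style}


variants = [
    {"product": "Red shirt", "size": "S", "code": "ABC_001", "style": "ABC"},
    {"product": "Red shirt", "size": "M", "code": "ABC_001", "style": "ABC"},
    {"product": "Blue shirt", "size": "S", "code": "ABC_002", "style": "ABC"},
    {"product": "Blue shirt", "size": "L", "code": "ABC_002", "style": "ABC"},
]

# Simulated terms facet: one entry per distinct key, with its hit count.
facet = Counter(facet_key(v) for v in variants)

# One parsed entry per distinct code, without touching the hits themselves.
distinct_items = [parse_facet_key(term) for term in facet]
```

The key would be stored as an extra field on each indexed item, and it only works if the separator cannot appear in any of the packed values.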

#71065
May 08, 2013 11:33

Hi Mari,

Assuming that the result set above consists of variants and the catalog structure is like this:

Product 1
...Variant 1
...Variant 2
Product 2
...Variant 3
...Variant 4

You can add the MetaDataType 'Product' to your search criteria to get back results for products only; the result will then contain unique ids for the products.

Regards
Khurram

 

#71088
May 08, 2013 21:52

Hi Khurram, 

Is this approach still supported in Find 8.X? I want to index Products and Variants with Find, and then have Find search through all products and variants but return only the distinct matching parent product objects.

An example code snippet would be appreciated.

Thanks,

Syed

#117056
Feb 10, 2015 19:28

I am also interested to hear what is possible based on Syed's post. I have data on the variant level that I want to facet on, but I want the facet counts to be based on a distinct of the parent products that are returned.

Another possibility would be to run the search on the SKU level, but then generate the facet counts based on a distinct property of the results. So if I had 3 variants that belong to the same product, a search might match 2 of the 3 variants, but because the parent id is the same, it would return a count of only 1 for that set of SKUs.
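
To make the counting I'm after concrete, here is a small client-side sketch in plain Python. It is not something Find provides as far as I know; the SKU data, the `parent_id` field and `distinct_parent_counts` are made up for illustration.

```python
from collections import defaultdict


def distinct_parent_counts(matches, facet_field, parent_field="parent_id"):
    """Count distinct parents per facet value instead of counting SKUs."""
    parents_per_value = defaultdict(set)
    for sku in matches:
        parents_per_value[sku[facet_field]].add(sku[parent_field])
    return {value: len(parents) for value, parents in parents_per_value.items()}


# Hypothetical SKUs matched by a search: product A has three, product B has one.
matched_skus = [
    {"sku": "A-S-RED", "parent_id": "A", "color": "Red"},
    {"sku": "A-M-RED", "parent_id": "A", "color": "Red"},
    {"sku": "A-S-BLUE", "parent_id": "A", "color": "Blue"},
    {"sku": "B-S-RED", "parent_id": "B", "color": "Red"},
]

# Three SKUs are red, but they belong to only two distinct products.
counts = distinct_parent_counts(matched_skus, "color")
```

Doing this in the client means pulling back every matching SKU, which is exactly what I would like the index to do for me instead.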

#170299
Oct 26, 2016 1:04