Elasticsearch Product Characteristics – How to Use Elasticsearch for Product Characteristics

database-recommendation elasticsearch full-text-search

I have a small question about Elasticsearch. Let's say I have a database of products to sell, as on an e-commerce site.

I know that Elasticsearch can help me find the best product with, for instance, a full-text search on the description, sorted by score.

Let's say someone is looking for sport shoes, and I don't sell any, but I do sell shoes.
What would be the best way to show the user that I don't have any? The full-text search will surely return everyday shoes, since the word "shoes" is in the query, but I would prefer the user to know I don't sell any rather than see products that don't match their query.

Is it as simple as setting a score threshold and tuning it until the results are good, or are there specific settings for this?

I'm also interested in generic answers that are not specific to Elasticsearch.

Thanks a lot!

Best Answer

When every document is about shoes, the term "shoes" will likely do little for the relevance score, because a term that occurs in nearly every document carries almost no weight. That may be enough of a filter in many cases. Still, as you point out, if there are no sport shoes and a document contains multiple occurrences of "shoes", that document may be considered relevant enough to be included in the results, and without competition from actual sport shoes it will sit at the top of the set.
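To make the weighting argument concrete, here is a minimal sketch of the inverse document frequency term used by BM25-style scoring. The catalog size and the document counts are made-up numbers for illustration, and a real score also depends on term frequency and field length.

```python
import math

def bm25_idf(doc_count: int, docs_with_term: int) -> float:
    """Inverse document frequency in the BM25 form used by Lucene."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))

# Hypothetical catalog: 1000 products, all of them mention "shoes",
# only 10 of them mention "sport".
catalog_size = 1000
idf_shoes = bm25_idf(catalog_size, 1000)
idf_sport = bm25_idf(catalog_size, 10)

# "shoes" is close to zero weight, "sport" dominates the score.
print(f"idf(shoes) = {idf_shoes:.4f}, idf(sport) = {idf_sport:.4f}")
```

The weight of "shoes" is small but not zero, which is why a product that only matches "shoes" can still appear when nothing else competes with it.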

However, if every search is for shoes, then any generic term for shoes has little significance, and you may want to ignore that term completely, either in the index (can be done with a stop list) or in the searches (can be done with a token filter).
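In Elasticsearch, one way to do this is a custom analyzer with a `stop` token filter. The sketch below builds an index definition that drops the generic terms at search time only. The names `catalog_stop`, `catalog_search` and the `description` field are assumptions chosen for the example, not something from the question.

```python
import json

# Hypothetical index definition: filter, analyzer and field names are illustrative.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Stop list holding the generic terms to ignore.
                "catalog_stop": {"type": "stop", "stopwords": ["shoe", "shoes"]}
            },
            "analyzer": {
                "catalog_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "catalog_stop"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "description": {
                "type": "text",
                "analyzer": "standard",               # index everything as usual
                "search_analyzer": "catalog_search",  # drop generic terms from queries
            }
        }
    },
}

# JSON body that could be sent when creating the index.
index_json = json.dumps(index_body, indent=2)
```

To drop the terms from the index as well, the same analyzer could be set as the field's `analyzer`. Keeping the stop list on the search side has the advantage that it can be changed without reindexing.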

In both cases the occurrence of "shoes" will have no effect on the search results, so only the remaining terms, such as "sport", decide what matches. You then end up selling a pair of the all-round sneakers that "feel like a sport shoe", which happened to be the only ones in the result set.
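The effect can be imitated without any search engine. This is a toy sketch with a made-up two-product catalog and a naive whitespace tokenizer, so it only illustrates the matching behaviour, not real relevance scoring.

```python
STOPWORDS = {"shoe", "shoes"}  # generic terms ignored everywhere

def tokenize(text):
    """Lowercase, strip simple punctuation, and drop the stop words."""
    words = (w.strip('.,!?"').lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

def search(query, catalog):
    """Return the products whose description shares a remaining query term."""
    terms = set(tokenize(query))
    return [name for name, desc in catalog.items() if terms & set(tokenize(desc))]

# Hypothetical catalog, invented for this example.
catalog = {
    "everyday loafers": "Comfortable leather shoes for the office.",
    "all-round sneakers": "Casual shoes that feel like a sport shoe.",
}

sport_hits = search("sport shoes", catalog)      # only the sneakers mention "sport"
running_hits = search("running shoes", catalog)  # nothing mentions "running"
print(sport_hits, running_hits)
```

With the generic term gone, an empty result list becomes possible, which is the signal needed to tell the user that no matching product is sold.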

The same concept is used in language-based optimization, where we remove the significance of articles and prepositions such as "the" and "on".

This can be applied to just about any text search engine, including Elasticsearch and Oracle Text.