May 14, 2019

Query Understanding: An Efficient Way to Deal with Long-Tail Queries

Our data shows that when people search for a product, most of them use roughly 1.5 words. Unfortunately, such short queries make it hard for full-text search to offer relevant results. Filters can help, but there are often so many of them that they become confusing. One way to make searching more effective is the ‘learning to rank’ approach, which creates an optimal ranking of results. However, even this machine-learning method is not almighty, and that’s why we’ve come up with Query Understanding, a great companion to ‘learning to rank’.

Full-text search is great if you use longer queries, let’s say four or more words, or really unique terms, such as a product code. In these cases, it usually either gives you exactly what you are looking for in the first position or shows you a “No results” page. Both outcomes are better than a list of completely irrelevant results, which is what you often get from a shorter query with few or no unique words.

How Search Works in the Real World

Short queries are actually how most people search. Our data shows that, on average, people use 1.43 words per query (with a standard deviation of 0.58, computed across our 150+ most active clients). While we do have clients with many product code-only queries (this depends on the domain), the median share of product code-only queries is only 2.9%.
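For readers who want to run the same kind of measurement on their own search logs, here is a minimal sketch. It assumes one plausible way of aggregating (a per-client average first, then a summary across clients), which may differ from how the figures above were produced. The client names, the sample queries, and the PRODUCT_CODE pattern are all hypothetical.

```python
import re
from statistics import mean, median, pstdev

# Assumption: a "product code" is one token of 5+ letters, digits or hyphens
# that contains at least one digit. Real catalogues need their own pattern.
PRODUCT_CODE = re.compile(r"^(?=.*\d)[A-Za-z0-9-]{5,}$")


def words_per_query(queries):
    """Average number of whitespace-separated words in one client's queries."""
    return mean(len(q.split()) for q in queries)


def code_only_share(queries):
    """Fraction of one client's queries that consist of a product code only."""
    hits = sum(1 for q in queries if PRODUCT_CODE.match(q.strip()))
    return hits / len(queries)


# Hypothetical search logs, keyed by client.
logs = {
    "client_a": ["headphones", "open headphones", "SKU-12345"],
    "client_b": ["laptop", "phone case", "tv", "wireless mouse"],
}

per_client_words = [words_per_query(qs) for qs in logs.values()]
per_client_codes = [code_only_share(qs) for qs in logs.values()]

summary = {
    "mean_words_per_query": mean(per_client_words),
    # Population standard deviation across clients (an assumption).
    "std_words_per_query": pstdev(per_client_words),
    "median_code_only_share": median(per_client_codes),
}
print(summary)
```

Averaging per client first keeps one very large shop from dominating the overall number.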

Short, non-specific queries are at the root of most problems with full-text search. The main drawback of full-text search is the same as its main advantage: it can find anything that matches the query, literally anywhere. As a result, it usually finds tons of results and leaves the searcher to sort through them. A full-text score computed from a short, non-specific query is not good enough on its own to produce the most relevant ranking of results.

How Should eCommerce Businesses Respond?

That’s why ‘learning to rank’ exists. This machine learning-based method orders results by a numerical value representing their relevance; it combines human behaviour with full-text metrics to create an optimal ranking of results. At Luigi’s Box, we’ve already incorporated this mechanism into our search-related products. We have seen, however, that despite being very helpful, ‘learning to rank’ is not a silver bullet. It is challenging to learn the right ranking model for a query that is seen once a month, and there are many domains where a long tail of such queries exists and must be taken care of.

One solution would be to nudge users to rely more on filters, narrowing the search results by the parameters they care about. However, filters can present an overwhelming number of options. For instance, one of our clients has more than 2,000 different parameters (depending on the product category) and almost 19,000 different values for those parameters. It’s not possible to build an intuitive and simple interface for all of them.

Query Understanding Available Now

Our approach to this problem is to recognize users’ search intent and turn as many query terms as possible into key-value filters. For instance, if a user types in “open headphones,” we automatically recognize the “Acoustic system: open” filter, which is normally not available through a standard faceted interface, and inform the user that they can turn it off should they change their mind. We can cope with any of the thousands of parameters available, and when the system recognizes a need, it turns on filters that are otherwise not accessible to users.
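To make the idea concrete, here is a minimal sketch of turning query terms into key-value filters. It is not the production implementation: it assumes a simple exact-match lookup table, and the TERM_TO_FILTER contents, the ParsedQuery type, and the understand_query function are hypothetical names chosen for illustration. A real system would build the lookup from structured product data and handle synonyms, multi-word values, and ambiguity.

```python
from dataclasses import dataclass, field

# Hypothetical lookup from a query term to a (parameter, value) filter.
# In practice this would be derived from the shop's structured product data.
TERM_TO_FILTER = {
    "open": ("Acoustic system", "open"),
    "closed": ("Acoustic system", "closed"),
    "wireless": ("Connection", "wireless"),
}


@dataclass
class ParsedQuery:
    # Filters recognized in the query, keyed by parameter name.
    filters: dict = field(default_factory=dict)
    # Terms that were not recognized and still go to full-text search.
    remaining_terms: list = field(default_factory=list)


def understand_query(query: str) -> ParsedQuery:
    """Split a query into recognized filters and leftover full-text terms."""
    parsed = ParsedQuery()
    for term in query.lower().split():
        match = TERM_TO_FILTER.get(term)
        if match is None:
            parsed.remaining_terms.append(term)
        else:
            parameter, value = match
            parsed.filters[parameter] = value
    return parsed


parsed = understand_query("open headphones")
print(parsed.filters)          # recognized key-value filters
print(parsed.remaining_terms)  # terms left for full-text search
```

Keeping the recognized filters separate from the leftover terms is what lets the interface show each filter to the user and offer a way to switch it off.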

A special, but actually quite frequent, case is when people search for a category. In that case, we redirect them to the category page, which is often manually curated and carries the eCommerce site’s context-dependent content, such as banners, promotions, and so on. Now they’ve been shown a relevant result and your current deals, to boot!
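The category case can be sketched in a few lines. This assumes the simplest possible rule: redirect only when the whole normalized query equals a known category name. The CATEGORY_PAGES mapping, its URLs, and the category_redirect function are hypothetical.

```python
from typing import Optional

# Hypothetical mapping from a category name to its curated page.
CATEGORY_PAGES = {
    "headphones": "/category/headphones",
    "laptops": "/category/laptops",
}


def normalize(query: str) -> str:
    """Lowercase the query and collapse repeated whitespace."""
    return " ".join(query.lower().split())


def category_redirect(query: str) -> Optional[str]:
    """Return the category page URL if the whole query names a category."""
    return CATEGORY_PAGES.get(normalize(query))


print(category_redirect("  Headphones "))
print(category_redirect("open headphones"))
```

Under this rule, a query with extra terms, such as “open headphones,” is not redirected and falls through to the regular search with any recognized filters applied.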

Our new feature, Query Understanding, is currently able to recognize categories, brands and parameters of products, which helps customers get more relevant results for short queries. It is available for all of our customers – the only thing we need is structured information about your products. Why not get started?