How to Enhance the Search by SKU in Magento 2 with Long Tail Search
- Ivan Leontiev
- Extensions
- Jan 18, 2021
- 8 min read
The long tail search is one of the most useful features in our Magento search extensions. It eliminates the need to enter an SKU or other identifier character by character, greatly streamlining an online store's UX. The long tail search is included in our Elastic Search Ultimate, Sphinx Search Ultimate, and Advanced Sphinx Search Pro modules.
However, we've noticed that many of our customers aren’t sure how to use the long tail search to its full potential, especially when it comes to defining search query patterns. I’m going to elaborate further on how you can get the most out of this powerful feature in this article:
- How Magento 2 Long Tail Search Works
- How to Set Up Magento 2 Long Tail Search
- Examples of Improved Magento 2 SKU Search
- Summary
How Magento 2 Long Tail Search Works
Long tail search in our Magento site search modules works in many ways like a spell correction system. First you need to define a search pattern, then you manually alter characters within it. Once you set everything up, the search will regard the original pattern and the pattern with altered characters as if they were one and the same.
This feature is most striking if the visitors in your online store need to search by SKU, model number, ISBN, DOI, and similar product identifiers.
By default, Magento 2 search will only recognize these identifiers if the query is an exact match. Something as minuscule as a single space or a hyphen will invalidate the search. For example, if your SKU is SX500-123 and the visitor types SX500123, the search will return zero results.
This is a significant problem that can end up costing you sales. Visitors who search by such specific identifiers are often ready to make a purchase straight away. Being unable to find what they need is all the more frustrating for them. This makes them prone to leaving your store and shopping elsewhere - right at the end of the funnel!
Preventing this from happening in your online store is in your best interest. With the long tail search feature, you can ensure that the visitors will get to the product they need regardless of the format they use for their queries. This will work wonders for your store's UX and conversion rate.
You and your visitors will never again have to ask: "Why can I not search by SKU in Magento?"
How to Set Up Magento 2 Long Tail Search
The Basic Principles
Setting up the long tail product search isn't particularly hard. It doesn't matter if you use Elastic Search or Sphinx Search either - the feature works exactly the same in all of our modules. You only need to configure three fields: Match Expression, Replace Expression, and Replace Char:
- Match Expression field is where you define the search query patterns. They have to match the product identifiers either as Magento indexes them or as the visitors tend to write them.
- Replace Expression field is where you specify the characters inside the pattern the feature will replace. If the pattern matches what Magento indexes, you have to specify what the visitors add incorrectly. If it matches what the visitors often type incorrectly, you have to specify what they tend to forget.
- Replace Char field is where you set the characters that the feature will use as replacements. This field isn't mandatory. If you leave it empty, the long tail search will simply omit the characters you specified in the Replace Expression field instead. For example, SX500-123 will become SX500123.
Once you set everything up, searching by SKU in Magento 2 will treat the pattern you defined in the Match Expression field and the pattern with the altered characters exactly the same. For instance, if your SKU is SX500-123 and you replace a hyphen, typing both SX500-123 and SX500123 in the search bar will show the results for SX500-123.
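To make the interplay of the three fields more concrete, here is a minimal sketch in Python. It is only a simplified model of the idea, not the extension's actual code: the function name long_tail_variant and its parameters are hypothetical, and Python's re module stands in for PHP's Regex engine (so the patterns are written without the surrounding slashes).

```python
import re

def long_tail_variant(text, match_expr, replace_expr, replace_char=""):
    """Rewrite every fragment of `text` that fits `match_expr`.

    Inside each matched fragment, whatever fits `replace_expr` is swapped
    for `replace_char` (an empty string simply drops it).
    Hypothetical helper for illustration only.
    """
    def rewrite(found):
        return re.sub(replace_expr, lambda _: replace_char, found.group(0))
    return re.sub(match_expr, rewrite, text)

# Match Expression: SX500-123, Replace Expression: -, Replace Char: empty
indexed = long_tail_variant("SX500-123", r"SX500-123", r"-")
typed = long_tail_variant("SX500123", r"SX500-123", r"-")

print(indexed)           # SX500123
print(typed)             # SX500123 (no match, so the query is left as-is)
print(indexed == typed)  # True
```

In this model, both spellings end up as the same string, which is why either query can lead to the same product.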
That's great, but how do you define these patterns? Let's find out now:
How to Format the Patterns for the Long Tail Search
The long tail search uses the PHP flavor of Regex (short for Regular Expression), a standard pattern system for search queries. Using Regex may be daunting at first if you don't have any programming skills. However, it's not as hard as it seems. Just keep these basic principles in mind:
Each pattern has to be delimited by a set of slashes:
/pattern goes here/
The simplest way to use Regex is to define the exact values:
/SX500-123/
However, that's not very useful since only the exact match will work. If you'd like to define a more abstract pattern, you have to specify a character class. Regex uses square brackets [] for that purpose. Any values you list within them will be considered valid. For example, this pattern:
/[1at]/
is valid for 1, a or t in the search query.
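If you'd like to check a character class outside of Magento, a quick sketch like the one below may help. It uses Python's re module as a stand-in for the PHP flavor (an assumption made for illustration; for a class this simple the two behave the same), and the variable names are hypothetical.

```python
import re

one_of = re.compile(r"[1at]")  # the PHP form would be /[1at]/

# Test each character on its own against the class.
accepted = [ch for ch in "1atb2A" if one_of.fullmatch(ch)]
print(accepted)  # ['1', 'a', 't']
```

Note that the uppercase A is rejected, because the class only lists the lowercase a.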
You can specify character ranges instead of listing all the relevant characters manually. Let's assume your SKUs only use letters and numbers. Regex is case-sensitive by default, so we have to type a-z, A-Z, and 0-9 within the brackets:
/[a-zA-Z0-9]/
By default, a character class will only work for single-character queries. For instance, the earlier example will match 1, but not 12. If you need the pattern to match longer queries, you have to add one of the quantifiers right after the character class:
/[a-zA-Z0-9]*/
There are four commonly used quantifier types:
- Match queries zero or more characters long, marked by an asterisk: *
- Match queries one or more characters long, marked by a plus sign: +
- Match queries zero or one character long, marked by a question mark: ?
- Match queries a specific number of characters long, marked by a number within curly brackets: {3}
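The four quantifiers can be compared side by side with a short sketch. This is an illustration in Python's re module rather than in the extension itself, and the helper fits and the sample strings are made up for the example.

```python
import re

ALNUM = "[a-zA-Z0-9]"

def fits(quantifier, text):
    """True if the whole text consists of ALNUM repeated per the quantifier."""
    return re.fullmatch(ALNUM + quantifier, text) is not None

samples = ["", "1", "12", "123"]
for quantifier in ["*", "+", "?", "{3}"]:
    print(quantifier, [s for s in samples if fits(quantifier, s)])

# *   ['', '1', '12', '123']
# +   ['1', '12', '123']
# ?   ['', '1']
# {3} ['123']
```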
You can use backslashes \ to escape special characters, like forward slashes / and other backslashes. That way the long tail search will recognize them literally instead of considering them a part of Regex's syntax:
/\//
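Here is a small sketch of what escaping changes, again using Python's re module as a stand-in for the PHP flavor. One difference to keep in mind: Python patterns have no slash delimiters, so escaping the forward slash is optional there, while a PHP pattern delimited by slashes requires it. The dot is used as an extra example of a special character.

```python
import re

# An unescaped dot is a wildcard; an escaped dot is just a dot.
wildcard_hit = bool(re.fullmatch(r"SX.500", "SX-500"))
literal_miss = bool(re.fullmatch(r"SX\.500", "SX-500"))
literal_hit = bool(re.fullmatch(r"SX\.500", "SX.500"))
print(wildcard_hit, literal_miss, literal_hit)  # True False True

# An escaped slash and an escaped backslash match themselves.
slashes = re.findall(r"\/", "SX500/123")
backslashes = re.findall(r"\\", "SX500\\123")
print(slashes)      # ['/']
print(backslashes)  # ['\\']
```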
That's it for the basics. However, keep in mind that the patterns you create have to match the relevant product variables as closely as possible, or else the feature will just pick the first qualifying string in the Magento 2 search - even if it has nothing to do with what you intended to match.
Let's narrow the pattern down. Say, all of your SKUs have a hyphen in the middle. You can just add two identical character classes and separate them with a hyphen. You don't need to escape it:
/[a-zA-Z0-9]*-[a-zA-Z0-9]*/
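One thing worth checking with this pattern: since the asterisk also allows zero characters, a hyphen standing on its own qualifies too. The sketch below compares it with a stricter variant that uses plus signs. The stricter pattern is my own illustration of how you could tighten the match, not something the extension requires, and the sample text is made up. Python's re module is used as a stand-in for the PHP flavor.

```python
import re

loose = re.compile(r"[a-zA-Z0-9]*-[a-zA-Z0-9]*")   # the pattern shown above
strict = re.compile(r"[a-zA-Z0-9]+-[a-zA-Z0-9]+")  # hypothetical stricter variant

text = "Model SX500-123 - in stock"
loose_hits = loose.findall(text)
strict_hits = strict.findall(text)

print(loose_hits)   # ['SX500-123', '-']
print(strict_hits)  # ['SX500-123']
```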
What if some SKUs have different separators, like forward slashes or spaces? Just add a new character class and place all the relevant characters there. You have to escape the forward slash, and the hyphen has to go last (or be escaped as \-), because a hyphen between two characters inside a class defines a range. Spaces are always taken literally:
/[a-zA-Z0-9]*[ \/-][a-zA-Z0-9]*/
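A detail that is easy to miss with separator classes: a hyphen placed between two characters inside square brackets is read as a range. Written as a space, a hyphen and an escaped slash in that order, the class covers every character from the space to the slash, which also includes +, the comma and the dot. The sketch below compares that form with a class where the hyphen comes last. It uses Python's re module as a stand-in for the PHP flavor (both treat ranges the same way), and the variable names are hypothetical.

```python
import re

range_class = re.compile(r"[ -\/]")     # hyphen in the middle: space-to-slash range
explicit_class = re.compile(r"[ \/-]")  # hyphen last: literal hyphen

candidates = " -/+.,"
in_range = [ch for ch in candidates if range_class.fullmatch(ch)]
in_explicit = [ch for ch in candidates if explicit_class.fullmatch(ch)]

print(in_range)     # [' ', '-', '/', '+', '.', ',']
print(in_explicit)  # [' ', '-', '/']
```

With the range form, a query such as SX500.123 would also be treated as containing a separator, which may or may not be what you want.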
You should be able to define the majority of SKU patterns with character classes and quantifiers alone, but Regex has even more options (or tokens) you can utilize for more advanced customization.
Regex is a well-documented system, so you can easily look the tokens up online. I recommend using Regex101 in particular. This resource lets you check your pattern's syntax, how it'll work and whether it'll match the queries you had in mind in real time.
Examples of Improved Magento 2 SKU Search
Let's put your new-found knowledge into practice and set up your own enhancements for Magento 2 SKU search. I'll show you how to set up the long tail feature for its most common use cases: adding, replacing, and removing special characters during a search.
Adding Characters During Search
Let's assume your SKUs are separated by a hyphen, space or a forward slash, e.g. SX500/123, so you have to redirect the queries where the visitors type them without any separators. In this case, you'll need to use the Match Expression field to define the pattern for SKUs as Magento indexes them:
/[a-zA-Z0-9]*[ \/-][a-zA-Z0-9]*/
Add the relevant separators to the Replace Expression field:
/[ \/-]/
Leave the Replace Char field empty.
Once you set this expression up, the feature will treat the catalog searches for SX500123 the same as the searches for SX500/123.
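The effect of this setup can be pictured with a short sketch. It is a simplified model under my own assumptions, not the extension's implementation: the helper normalize, the dictionary index and the second SKU are hypothetical, Python's re module stands in for the PHP flavor, and the separator class is written with the hyphen last so that it is taken literally.

```python
import re

MATCH = r"[a-zA-Z0-9]*[ \/-][a-zA-Z0-9]*"  # Match Expression
REPLACE = r"[ \/-]"                         # Replace Expression

def normalize(text, replace_char=""):
    """Drop (or swap) the separators inside every fragment that fits MATCH."""
    def rewrite(found):
        return re.sub(REPLACE, lambda _: replace_char, found.group(0))
    return re.sub(MATCH, rewrite, text)

# Hypothetical index keyed by the normalized form of each SKU.
index = {normalize(sku): sku for sku in ["SX500/123", "AB12-900"]}

print(index.get(normalize("SX500123")))   # SX500/123
print(index.get(normalize("SX500/123")))  # SX500/123
print(index.get(normalize("AB12 900")))   # AB12-900
```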
Replacing Characters During Search
Let's assume your SKUs are only separated by a hyphen, e.g. SX500-123, so you have to redirect the queries where the visitors type them with a space or a forward slash. In this case, you'll also need to define the pattern for SKUs as Magento indexes them in the Match Expression field:
/[a-zA-Z0-9]*-[a-zA-Z0-9]*/
Add the separator to the Replace Expression field:
/-/
Add the characters which visitors often use by mistake to the Replace Char field:
/[ \/]/
That way, the feature will treat search queries with SX500 123 and SX500/123 the same as SX500-123.
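One way to picture this configuration is as a set of alternative spellings generated for each indexed SKU. The sketch below is only an assumption about the behavior described above, not the extension's internals: the names variants and lookup are hypothetical, the two replacement characters are listed explicitly instead of being parsed from the Replace Char field, and Python's re module stands in for the PHP flavor.

```python
import re

MATCH = r"[a-zA-Z0-9]*-[a-zA-Z0-9]*"  # Match Expression
REPLACE = r"-"                         # Replace Expression
REPLACE_CHARS = [" ", "/"]             # the characters listed in Replace Char

def variants(sku):
    """Return the original SKU plus one copy per replacement character."""
    found = {sku}
    for char in REPLACE_CHARS:
        swapped = re.sub(
            MATCH,
            lambda m, c=char: re.sub(REPLACE, lambda _: c, m.group(0)),
            sku,
        )
        found.add(swapped)
    return found

# Hypothetical lookup table: every spelling points at the indexed SKU.
lookup = {spelling: sku for sku in ["SX500-123"] for spelling in variants(sku)}

print(sorted(variants("SX500-123")))  # ['SX500 123', 'SX500-123', 'SX500/123']
print(lookup.get("SX500 123"))        # SX500-123
print(lookup.get("SX500/123"))        # SX500-123
```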
Removing Characters During Search
Let's assume your SKUs are written without any special characters, e.g. SX500123, so you have to redirect the queries where the visitors type them with a space, a hyphen or a forward slash. In this case, you'll need to use the Match Expression field to define the pattern for SKUs as visitors type them.
Add the SKU pattern with the separators to the Match Expression field:
/[a-zA-Z0-9]*[ \/-][a-zA-Z0-9]*/
Add the separators to the Replace Expression field:
/[ \/-]/
Leave the Replace Char field empty.
That way, the feature will treat searches for SX500-123 the same as searches for SX500123.
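Here the rewriting happens on the query side, which can be sketched as follows. As before, this is a simplified model and not the extension's code: strip_separators, catalog and the sample queries are hypothetical, Python's re module stands in for the PHP flavor, and the separator class is written with the hyphen last so that it is taken literally.

```python
import re

MATCH = re.compile(r"[a-zA-Z0-9]*[ \/-][a-zA-Z0-9]*")  # Match Expression
REPLACE = re.compile(r"[ \/-]")                         # Replace Expression

def strip_separators(query):
    """Remove separators from every fragment of the query that fits MATCH."""
    return MATCH.sub(lambda found: REPLACE.sub("", found.group(0)), query)

catalog = {"SX500123"}  # SKUs stored without separators
queries = ["SX500-123", "SX500 123", "SX500/123", "SX500123", "SX500_123"]

hits = [q for q in queries if strip_separators(q) in catalog]
print(hits)  # ['SX500-123', 'SX500 123', 'SX500/123', 'SX500123']
```

The underscore query finds nothing because the underscore is not listed among the separators.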
Long tail search can let you enhance your Magento site search even more. For instance, it's possible to set up a long tail search for arbitrarily divided SKUs, e.g. SX series 500 model 123. However, this requires a deeper knowledge of Regex. If you'd like to implement something similar, you should consult with your developer as to the best way to set it up.
Summary
Magento 2 long tail search is invaluable for ecommerce. This feature drastically improves searching by model number, SKU, ISBN, DOI, and similar product identifiers. In essence, it lets you redirect incorrectly formatted queries to their correct format automatically. This will make the search much more user-friendly and reduce churn.
Long tail search works with the PHP flavor of Regex, a standard syntax for describing text patterns. It's a very powerful tool. While I've covered the basics in this post, you should also check out Regex101 if you intend to define more advanced patterns.
The most common use cases for long tail search are adding, replacing or removing the special characters the visitors use (or forget to use) as separators from search queries.