Configure "Long-Tail" Search
This section describes the Long-Tail Search feature, which returns correct search results for words that contain dashes or other non-alphabetic symbols. It can also correct, on the fly, the most typical errors customers make in complex product names.
What is Long-Tail Search?
For example, suppose the store has a product named Canon PowerShot SX500 IS. A customer may search for Canon PowerShot SX-500IS, which the default search will not find because it differs from the actual product label.
This happens because, during reindex, Magento by default uses only the exact product labels from the database, so the index contains nothing else. As a result, products with complex names become "ineligible" for search.
This is where "Long-tail" search comes in. During both reindex and search, this feature recognizes keywords by pattern and replaces the matched characters either with an empty value or with other characters, "correcting" the customer's request in real time.
In the example above, SX500 IS can be converted to SX500IS, and during the search SX-500IS is also converted to SX500IS by replacing the '-' symbol with an empty character.
This way, the search can find a product by several different spellings of its name.
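The idea can be sketched in a few lines of Python. This is only an illustration, not the extension's actual code: the `normalize` helper and the `SEPARATORS` pattern are assumptions chosen to reproduce the Canon example above.

```python
import re

# Hypothetical separator pattern: a dash or a space between parts of a model name.
SEPARATORS = re.compile(r"[\- ]")

def normalize(term):
    """Remove separators so that different spellings collapse to one form."""
    return SEPARATORS.sub("", term)

# Applied to the product label fragment during reindex...
indexed = normalize("SX500 IS")
# ...and to the customer's request during search.
requested = normalize("SX-500IS")

print(indexed, requested, indexed == requested)
```

Because the same normalization runs on both sides, the index entry and the search phrase end up in the same form and can be matched.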
Also, please learn more about configuring Long-Tail Search for your store in our Blog article.
Configuring Long-Tail Search
Go to System / Search Management / Settings / Mirasvit Extensions / Search.
In the Search Settings section, go to the Long tail option.
There you can set up regular expressions to get the required search results.
- Match Expression - the regular expression(s) that select the words to be processed. Parsing is applied to the search index during indexing and to search phrases during a search. E.g.
  /([a-zA-Z0-9]*[\-\/][a-zA-Z0-9]*[\-\/]*[a-zA-Z0-9]*)/
- Replace Expression - the regular expression(s) that select the characters to be replaced. It is applied to the results of "Match Expression". E.g.
  /[\-\/]/
- Replace Char - the character that replaces the values found by "Replace Expression". E.g. an empty value.
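How the three settings work together can be sketched as a two-stage replacement. The `apply_rule` function below is a hypothetical helper written for illustration, not part of the extension; it assumes that "Replace Expression" is applied only inside the fragments found by "Match Expression", as described above. The patterns are the sample values from the list, written without the surrounding `/` delimiters because Python does not use them.

```python
import re

def apply_rule(text, match_expr, replace_expr, replace_char):
    """Apply one long-tail rule: replace characters only inside matched fragments."""
    def fix_fragment(match):
        # Stage 2: inside the matched fragment, swap the unwanted characters.
        return re.sub(replace_expr, replace_char, match.group(0))
    # Stage 1: find the fragments that the rule is allowed to touch.
    return re.sub(match_expr, fix_fragment, text)

MATCH_EXPRESSION = r"([a-zA-Z0-9]*[\-\/][a-zA-Z0-9]*[\-\/]*[a-zA-Z0-9]*)"
REPLACE_EXPRESSION = r"[\-\/]"
REPLACE_CHAR = ""  # empty value

print(apply_rule("Canon PowerShot SX-500IS",
                 MATCH_EXPRESSION, REPLACE_EXPRESSION, REPLACE_CHAR))
```

Only the fragment that contains a separator is rewritten; the rest of the phrase is left as it is.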
Long-Tail Search Rule Examples
Here are some of the most useful cases of long-tail search, implemented as corresponding rules.
- Automatically remove the '-' symbol from product names
  Create a rule with the following parameters:
  - Match Expression - /[a-zA-Z0-9]*-[a-zA-Z0-9]*/
    Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV 123, CNC-PWR1
  - Replace Expression - /-/
  - Replace Char - empty
    Result text: SX500123, GLX11A, GLZXVXV, GLZ/123, GLZV 123, CNCPWR1
  GLZ/123 and GLZV 123 contain no '-', so this rule leaves them unchanged.
- Automatically remove '-', '/' and space symbols from product names
  Create a rule with the following parameters:
  - Match Expression - /[a-zA-Z0-9]*[ \-\/][a-zA-Z0-9]*/
    Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV 123, CNC-PWR1
  - Replace Expression - /[ \-\/]/
  - Replace Char - empty
    Result text: SX500123, GLX11A, GLZXVXV, GLZ123, GLZV123, CNCPWR1
- Automatically join product names that contain separators into a single word
  Create a rule with the following parameters:
  - Match Expression - /[a-zA-Z0-9]*[-\/][a-zA-Z0-9]*([-\/][a-zA-Z0-9]*)?/
    Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV-123-123, CNC-PWR1
  - Replace Expression - /[-\/]/
  - Replace Char - empty
    Result text: SX500123, GLX11A, GLZXVXV, GLZ123, GLZV123123, CNCPWR1
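- Automatically fix a misspelled product name
  Create a rule with the following parameters:
  - Match Expression - /([a-zA-Z0-9]*[\- ][a-zA-Z0-9]*[\-][a-zA-Z0-9]*)/
    Matched text: VHC68B-80, VHC-68B-80, VHC68B80
  - Replace Expression - /[\- ]/
  - Replace Char - empty
    Result text: VHC68B80
  Note that the expression as written requires two separators, so of the spellings above it rewrites only VHC-68B-80. VHC68B80 already has the target form, and a spelling with a single separator, such as VHC68B-80, needs a looser expression.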
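Before saving a rule, it can help to try it on a few sample names. The sketch below does this in Python for the rules listed above. It is an approximation under stated assumptions: `apply_rule`, `RULES` and `SAMPLES` are hypothetical names, the two-stage logic is assumed from the setting descriptions, and Python's `re` module stands in for the regex engine used by the extension, so edge cases may differ.

```python
import re

def apply_rule(text, match_expr, replace_expr, replace_char=""):
    # Replace characters only inside the fragments selected by match_expr.
    return re.sub(match_expr,
                  lambda m: re.sub(replace_expr, replace_char, m.group(0)),
                  text)

# (Match Expression, Replace Expression) pairs from the examples above,
# written without the surrounding '/' delimiters.
RULES = {
    "remove dash": (r"[a-zA-Z0-9]*-[a-zA-Z0-9]*", r"-"),
    "remove dash, slash, space": (r"[a-zA-Z0-9]*[ \-\/][a-zA-Z0-9]*", r"[ \-\/]"),
    "join with separators": (r"[a-zA-Z0-9]*[-\/][a-zA-Z0-9]*([-\/][a-zA-Z0-9]*)?", r"[-\/]"),
    "fix misspelled name": (r"([a-zA-Z0-9]*[\- ][a-zA-Z0-9]*[\-][a-zA-Z0-9]*)", r"[\- ]"),
}
SAMPLES = ["SX500-123", "GLZ/123", "GLZV 123", "GLZV-123-123", "VHC-68B-80"]

for name, (match_expr, replace_expr) in RULES.items():
    results = [apply_rule(sample, match_expr, replace_expr) for sample in SAMPLES]
    print(f"{name}: {results}")
```

Comparing the printed results with the names stored in the catalog shows quickly whether a rule is too narrow (a spelling is left untouched) or too broad (unrelated words are glued together).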
Moving Long-Tail Expressions from M1 to M2
The Long-Tail expressions used in Search Sphinx for M1 and for M2 differ slightly.
In M1 Search Sphinx, you can enter one or more expressions to match, separated by the '|' character. In M2, you cannot.
Consider the following expression for Search Sphinx for M1:
Example
Match Expression: /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/|/[a-zA-Z]{1,3}[0-9]{1,3}/
Replace Expression: /[ -/]/|/([a-zA-Z]{1,3})([0-9]{1,3})/
Replace Char: $1 $2
It actually contains two separate regular expressions to match, /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/ and /[a-zA-Z]{1,3}[0-9]{1,3}/, each with its own expression for replacement.
You need either to rewrite it as a single expression that covers both cases, or to split this rule into a set of two:
- First rule
  This rule implements the first part of the original M1 expression.
  - Match Expression: /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/
  - Replace Expression: /[ -/]/
  - Replace Char: $1 $2
- Second rule
  This rule implements the second part of the original M1 expression.
  - Match Expression: /[a-zA-Z]{1,3}[0-9]{1,3}/
  - Replace Expression: /([a-zA-Z]{1,3})([0-9]{1,3})/
  - Replace Char: $1 $2
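If there are many M1 rules to migrate, the split can be done mechanically. The Python sketch below is a hypothetical helper, not a tool shipped with the extension. It assumes that every M1 sub-expression is wrapped in '/' delimiters, so a '|' that sits directly between two '/' characters separates two expressions; an expression that contains the sequence '/|/' inside its own body would be split incorrectly.

```python
import re

def split_m1_expressions(expression):
    """Split an M1-style '/expr1/|/expr2/' string into separate expressions."""
    # Split only on a '|' that has '/' directly before and directly after it.
    return re.split(r"(?<=/)\|(?=/)", expression)

m1_match = "/[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/|/[a-zA-Z]{1,3}[0-9]{1,3}/"
m1_replace = "/[ -/]/|/([a-zA-Z]{1,3})([0-9]{1,3})/"

# Pair each match expression with its replace expression: one pair per M2 rule.
rules = list(zip(split_m1_expressions(m1_match), split_m1_expressions(m1_replace)))

for number, (match_expr, replace_expr) in enumerate(rules, start=1):
    print(f"Rule {number}")
    print(f"  Match Expression:   {match_expr}")
    print(f"  Replace Expression: {replace_expr}")
```

Each resulting pair is then entered as its own rule in M2, together with the Replace Char value from the original M1 rule.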