Configure "Long-Tail" Search

This section describes the Long-Tail Search feature, which returns correct search results for words that contain dashes or other non-alphanumeric symbols. It can also correct, on the fly, the most typical errors customers make when typing complex product names.

For example, suppose the catalog contains the product Canon PowerShot SX500 IS. A customer may search for Canon PowerShot SX-500IS, which the default search will not find because it differs from the actual product label.

This happens because, by default, Magento indexes only the exact product labels stored in the database. The index therefore contains only those spellings, so products with complex names cannot be found by any other spelling.

This is where Long-Tail Search comes in. During both reindexing and searching, the feature recognizes keywords by pattern and replaces the matched characters with an empty string or with other characters, correcting the customer's request on the fly.

In the example above, SX500 IS can be converted to SX500IS during indexing, and during the search SX-500IS is also converted to SX500IS by replacing the '-' symbol with an empty string.

This way, the search can find a product by several different spellings of its name.
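
The idea of normalizing both the indexed label and the customer's query can be sketched in a few lines of Python. This is only an illustration of the principle, not the extension's code: the `long_tail` helper and the `MATCH` / `REPLACE` patterns are hypothetical and were chosen to fit this one product name.

```python
import re

def long_tail(text, match_expr, replace_expr, replace_char=""):
    """Inside every fragment found by match_expr, replace whatever
    replace_expr finds with replace_char; leave the rest untouched."""
    def fix(m):
        return re.sub(replace_expr, replace_char, m.group(0))
    return re.sub(match_expr, fix, text)

# Hypothetical rule for model numbers such as "SX500 IS" or "SX-500IS":
# letters, optional separator, digits, optional separator, optional letters.
MATCH = r"[A-Za-z]+[ -]?[0-9]+[ -]?[A-Za-z]*"
REPLACE = r"[ -]"

# The same rule is applied to the indexed label and to the search phrase.
indexed = long_tail("Canon PowerShot SX500 IS", MATCH, REPLACE)
query = long_tail("Canon PowerShot SX-500IS", MATCH, REPLACE)

print(indexed)  # Canon PowerShot SX500IS
print(query)    # Canon PowerShot SX500IS
```

Because both sides pass through the same rule, they end up with the same spelling and the query can match the indexed label.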

Go to System / Search Management / Settings / Mirasvit Extensions / Search.
In the Search Settings section, go to the Long tail option.
There you can set up regular expressions to obtain the required search results.

  • Match Expression - the regular expression(s) that select the words to be processed.

    The expression is applied to the search index during indexing and to search phrases during a search. E.g. /([a-zA-Z0-9]*[\-\/][a-zA-Z0-9]*[\-\/]*[a-zA-Z0-9]*)/

  • Replace Expression - the regular expression(s) that select the characters to be replaced. It is applied only to the text found by "Match Expression". E.g. /[\-\/]/
  • Replace Char - the character(s) substituted for whatever "Replace Expression" finds. E.g. an empty value.
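
How the three options interact can be traced with a short Python sketch. It uses the example expressions given above with the surrounding "/" delimiters removed. The `trace` helper is a hypothetical name, and Python's `re` module stands in for the extension's own regular expression engine, so treat this as an approximation.

```python
import re

# Example expressions from the option descriptions, without "/" delimiters.
MATCH_EXPRESSION = r"([a-zA-Z0-9]*[\-\/][a-zA-Z0-9]*[\-\/]*[a-zA-Z0-9]*)"
REPLACE_EXPRESSION = r"[\-\/]"
REPLACE_CHAR = ""

def trace(text):
    """Return (matched fragment, rewritten fragment) pairs for the text."""
    pairs = []
    for m in re.finditer(MATCH_EXPRESSION, text):
        fragment = m.group(0)
        # The replace expression only ever sees the matched fragment.
        rewritten = re.sub(REPLACE_EXPRESSION, REPLACE_CHAR, fragment)
        pairs.append((fragment, rewritten))
    return pairs

steps = trace("lens GLZ/123 cap for GLZV-123-123")
for fragment, rewritten in steps:
    print(fragment, "->", rewritten)
# GLZ/123 -> GLZ123
# GLZV-123-123 -> GLZV123123
```

Words that "Match Expression" does not select ("lens", "cap", "for") are never passed to "Replace Expression", which is why they stay untouched.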

Here are some of the most useful cases of long-tail search, implemented as corresponding rules.

  • Automatically remove the '-' symbol from product names

    Create a rule with the following parameters:

    • Match Expression - /[a-zA-Z0-9]*-[a-zA-Z0-9]*/
      Sample text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV 123, CNC-PWR1
    • Replace Expression - /-/
    • Replace Char - empty
      Result text: SX500123, GLX11A, GLZXVXV, GLZ/123, GLZV 123, CNCPWR1 (GLZ/123 and GLZV 123 contain no dash, so they stay unchanged)
  • Automatically remove '-', '/' and space characters from product names

    Create a rule with the following parameters:

    • Match Expression - /[a-zA-Z0-9]*[ \-\/][a-zA-Z0-9]*/
      Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV 123, CNC-PWR1
    • Replace Expression - /[ \-\/]/
    • Replace Char - empty
      Result text: SX500123, GLX11A, GLZXVXV, GLZ123, GLZV123, CNCPWR1
  • Automatically join product names that contain separators into a single word

    Create a rule with the following parameters:

    • Match Expression - /[a-zA-Z0-9]*[-\/][a-zA-Z0-9]*([-\/][a-zA-Z0-9]*)?/
      Matched text: SX500-123, GLX-11A, GLZX-VXV, GLZ/123, GLZV-123-123, CNC-PWR1
    • Replace Expression - /[-\/]/
    • Replace Char - empty
      Result text: SX500123, GLX11A, GLZXVXV, GLZ123, GLZV123123, CNCPWR1
  • Automatically fix a misspelled product name

    Create a rule with the following parameters:

    • Match Expression - /([a-zA-Z0-9]*[\- ][a-zA-Z0-9]*[\-][a-zA-Z0-9]*)/
      Matched text: VHC68B-80, VHC-68B-80, VHC68B80
    • Replace Expression - /[\- ]/
    • Replace Char - empty
      Result text: VHC68B80
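
The first three rules can be checked outside Magento with a small Python sketch. It assumes that each sample is processed on its own and that Python's `re` module behaves like the extension's engine for these patterns. The `long_tail` helper and the case labels are hypothetical names used only for this illustration.

```python
import re

def long_tail(text, match_expr, replace_expr, replace_char=""):
    """Rewrite only the fragments of text that match_expr finds."""
    return re.sub(
        match_expr,
        lambda m: re.sub(replace_expr, replace_char, m.group(0)),
        text,
    )

COMMON = ["SX500-123", "GLX-11A", "GLZX-VXV", "GLZ/123", "GLZV 123", "CNC-PWR1"]
THREE_PART = ["SX500-123", "GLX-11A", "GLZX-VXV", "GLZ/123", "GLZV-123-123", "CNC-PWR1"]

# (label, match expression, replace expression, sample texts)
CASES = [
    ("dash only", r"[a-zA-Z0-9]*-[a-zA-Z0-9]*", r"-", COMMON),
    ("dash, slash, space", r"[a-zA-Z0-9]*[ \-\/][a-zA-Z0-9]*", r"[ \-\/]", COMMON),
    ("up to three parts", r"[a-zA-Z0-9]*[-\/][a-zA-Z0-9]*([-\/][a-zA-Z0-9]*)?", r"[-\/]", THREE_PART),
]

results = {
    label: [long_tail(s, match, replace) for s in samples]
    for label, match, replace, samples in CASES
}

for label, out in results.items():
    print(label, "->", ", ".join(out))
```

Under these assumptions the "dash only" rule leaves GLZ/123 and GLZV 123 as they are, because neither contains a dash, while the other two rules collapse every sample into a single word.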

Moving Long-Tail Expressions from M1 to M2

Long-Tail expressions used in Search Sphinx for M1 and for M2 differ slightly.

In Search Sphinx for M1, you can enter one or more expressions to match, separated by the '|' character. In M2, you cannot.

Consider the following expression for Search Sphinx for M1:

Example

Match Expression: /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/|/[a-zA-Z]{1,3}[0-9]{1,3}/
Replace Expression: /[ -/]/|/([a-zA-Z]{1,3})([0-9]{1,3})/
Replace Char: $1 $2

It actually contains two separate regular expressions to match, /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/ and /[a-zA-Z]{1,3}[0-9]{1,3}/, each with its own replace expression.

You need either to rework the expression so that it matches everything as a single expression, or to rewrite the rule as a set of two:

  • First rule

    This rule implements the first part of the original M1 expression.

    • Match Expression: /[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/
    • Replace Expression: /[ -/]/
    • Replace Char: $1 $2
  • Second rule

    This rule implements the second part of the original M1 expression.

    • Match Expression: /[a-zA-Z]{1,3}[0-9]{1,3}/
    • Replace Expression: /([a-zA-Z]{1,3})([0-9]{1,3})/
    • Replace Char: $1 $2
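
If you have many M1 rules to migrate, the split can be scripted. The Python sketch below is only an illustration: `split_m1` and `apply_rule` are hypothetical helpers, and the split assumes that alternatives are joined exactly as "/.../|/.../" and that the sequence "/|/" never occurs inside a single pattern.

```python
import re

def split_m1(match_expr, replace_expr, replace_char):
    """Turn '|'-joined M1 expressions into one (match, replace, char) rule each."""
    def parts(expr):
        # Drop the outer "/" delimiters, split on the "/|/" joint, re-wrap.
        return ["/" + p + "/" for p in expr[1:-1].split("/|/")]
    matches, replaces = parts(match_expr), parts(replace_expr)
    if len(matches) != len(replaces):
        raise ValueError("match and replace expressions must pair up")
    return [(m, r, replace_char) for m, r in zip(matches, replaces)]

def apply_rule(text, rule):
    """Approximate one rule with Python's re module."""
    match_expr, replace_expr, replace_char = rule
    # Python writes group references as \1 rather than $1.
    py_repl = re.sub(r"\$(\d+)", r"\\\1", replace_char)
    return re.sub(
        match_expr[1:-1],
        lambda m: re.sub(replace_expr[1:-1], py_repl, m.group(0)),
        text,
    )

rules = split_m1(
    r"/[a-zA-Z0-9][ -/][a-zA-Z0-9]([ -/][a-zA-Z0-9]*)?/|/[a-zA-Z]{1,3}[0-9]{1,3}/",
    r"/[ -/]/|/([a-zA-Z]{1,3})([0-9]{1,3})/",
    "$1 $2",
)
for rule in rules:
    print(rule)

print(apply_rule("SX500", rules[1]))  # SX 500
```

Only the second rule is exercised here. Python raises an error when a replacement refers to groups that the pattern does not define, which is the case for "$1 $2" combined with the first rule's replace expression.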
