Use Regex in Keyword Boosting to Handle Query Inflexion



  • Keyword Tuning lets you change the rank of a document for a single query or for a set of related queries.

    Boost for Query (Precise)

    You can be precise, so that document X is boosted only for the query add checkbox.

    In this case, document X may or may not turn up for related queries, such as adding checkbox or add checkboxes.

    Boost for Query Forms (Comprehensive)

    Alternatively, you can choose to be comprehensive and use a regular expression to boost document X for all the forms of add checkbox, so that someone looking for adding checkbox or add checkboxes also gets document X on top.
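    As a rough illustration of the comprehensive approach, the sketch below uses Python's re module as a stand-in for the boosting engine. The pattern and the is_boosted helper are hypothetical, and the assumption that a pattern must match the whole query (hence fullmatch) may not hold in SearchUnify.

```python
import re

# Hypothetical pattern covering common forms of "add checkbox".
# Assumption: the pattern has to match the entire query.
PATTERN = re.compile(r"add(s|ed|ing)? checkbox(es)?")


def is_boosted(query: str) -> bool:
    """Return True if the whole query matches the boosting pattern."""
    return PATTERN.fullmatch(query.lower()) is not None


queries = ["add checkbox", "adding checkbox", "add checkboxes", "remove checkbox"]
for q in queries:
    print(q, "->", is_boosted(q))
```

    A precise boost corresponds to the literal string add checkbox alone; the optional groups are what extend it to the other forms.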

    Regex patterns are extremely useful in languages such as Polish and Russian, where noun declensions abound. They are also useful for Spanish, French, and other languages with complex verb conjugations. You can handle these and other inflexions with regular expressions.
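    To make the conjugation case concrete, here is a minimal sketch for a handful of forms of the Spanish verb instalar. The verb, the list of endings, and the matches_instalar helper are chosen for illustration only; Python's re module stands in for the boosting engine, and whole-query matching is an assumption.

```python
import re

# Hypothetical pattern: the stem "instal" plus a few common endings
# (infinitive, some present-tense forms, gerund, participle).
INSTALAR = re.compile(r"instal(ar|o|as|a|amos|an|ando|ado)")


def matches_instalar(word: str) -> bool:
    """Return True if the word is one of the verb forms listed in the pattern."""
    return INSTALAR.fullmatch(word.lower()) is not None


for word in ["instalar", "instalamos", "instalando", "instalado", "desinstalar"]:
    print(word, "->", matches_instalar(word))
```

    The same stem-plus-endings shape applies to noun declensions: list the endings you want to cover inside one group.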

    SearchUnify Compatible Regular Expression Syntax

    Operator  Definition                                             Example
    .         Match any single character.                            reali.e = realise, realize (and any other character in that position, such as realiae)
    ?         Match the preceding character 0 or 1 times.            seas? = sea, seas
    +         Match the preceding character 1 or more times.         123+ = 123, 1233, 12333, 123333...
    *         Match the preceding character 0 or more times.         123* = 12, 123, 1233, 12333...
    {}        Repetition range for the preceding character.          great{1,5} = great, greatt, greattt, greatttt, greattttt
    |         OR operator. Use it inside a group.                    reali(s|z)e = realise, realize
    ()        Form a group.                                          extension (0175|0172) = extension 0175, extension 0172
    []        Match any one character listed between the brackets.   ID[0-9] = ID0, ID1, ID2... ID9
    -         Range. To be used inside square brackets.              [a-zA-Z]{2,4} = all words of 2, 3, or 4 letters
    ^         Negation. To be used inside square brackets.           [^0-9]{10} = any 10 characters that are not digits, so 9876323472 does not match
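    Before saving a pattern, it can help to try it against a few sample queries. The sketch below checks one pattern per operator with Python's re module. This is only an approximation: SearchUnify's regex engine may differ from Python's, the sample strings are made up, and whole-string matching (fullmatch) is an assumption.

```python
import re

# (pattern, strings expected to match, strings expected not to match)
CASES = [
    (r"reali.e", ["realise", "realize", "realiae"], ["realie"]),
    (r"seas?", ["sea", "seas"], ["seass"]),
    (r"123+", ["123", "12333"], ["12"]),
    (r"123*", ["12", "123", "1233"], ["13"]),
    (r"great{1,5}", ["great", "greattttt"], ["grea", "greatttttt"]),
    (r"reali(s|z)e", ["realise", "realize"], ["realixe"]),
    (r"extension (0175|0172)", ["extension 0175", "extension 0172"], ["extension 0173"]),
    (r"ID[0-9]", ["ID0", "ID9"], ["IDA", "ID10"]),
    (r"[a-zA-Z]{2,4}", ["ab", "Word"], ["a", "words", "ab1"]),
    (r"[^0-9]{10}", ["abcdefghij"], ["9876323472"]),
]


def failures():
    """Return every (pattern, string, problem) that behaves unexpectedly."""
    bad = []
    for pattern, should_match, should_not_match in CASES:
        rx = re.compile(pattern)
        for s in should_match:
            if rx.fullmatch(s) is None:
                bad.append((pattern, s, "expected a match"))
        for s in should_not_match:
            if rx.fullmatch(s) is not None:
                bad.append((pattern, s, "expected no match"))
    return bad


print(failures() or "all patterns behave as expected")
```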

    Instructions

    1. Log in to the instance, find Search Tuning, and click Keyword Tuning.
    2. Select your search client.
    3. Find the document to be boosted and click Add Keywords.
    4. Enter a query to boost the document for a specific keyword or phrase, or enter a regular expression to boost it for a query and its inflected forms (noun forms, verb tenses, and other variations).
    5. Click Save.

    Official Documentation: Boost Documents for Specific Keywords



  • @kasey-r You're right. reali.e will match realiae as well. The dot (.) is a placeholder for any character, not just "s" or "z". The examples in the article are textbook examples. In the real world, an admin has to be extremely careful. A better way to boost a document for both "realize" and "realise" is to use reali[sz]e. The square brackets limit the choice to two characters.

    As for the word order, I'm sure there must be some way. People have written really fat tomes on regular expressions, and I don't count myself among those who have read them. A regex expert should be able to tell. From what I know, a (rather clumsy and wordy) way to handle word order for add checkbox is to boost the pattern (add(ed|ing|s)? checkbox(es)?)|(checkbox(es)? add(ed|ing|s)?). Note the round brackets around the endings: square brackets would match a single character rather than a whole ending.
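    A quick way to sanity-check a word-order pattern of this kind is to build it from two smaller pieces and try it on sample queries. The sketch below does that in Python; the VERB and NOUN names and the any_order helper are illustrative, Python's re module stands in for the real engine, and whole-query matching is an assumption.

```python
import re

# Building blocks, written with round-bracket groups so that each
# alternative is a whole ending rather than a single character.
VERB = r"add(s|ed|ing)?"
NOUN = r"checkbox(es)?"

# Accept either "verb noun" or "noun verb".
EITHER_ORDER = re.compile(rf"({VERB} {NOUN})|({NOUN} {VERB})")


def any_order(query: str) -> bool:
    """Return True if the query is a verb/noun pair in either order."""
    return EITHER_ORDER.fullmatch(query.lower()) is not None


for q in ["add checkbox", "checkbox adding", "checkboxes added", "checkbox checkbox"]:
    print(q, "->", any_order(q))
```

    The pattern grows quickly with the number of words, since every ordering has to be spelled out as its own alternative.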



  • I'm confused by examples for the . operator and the | operator. Wouldn't reali.ze be read as realiaze, realibze, realicze, realidze...? Also wouldn't realize[s|z]e be read as realizese, realizeze?

    Another question: Is it possible to use regex to account for the order of the words in the search? To use your example, would there be a way to also capture "checkbox adding"?



  • This is great to know about. It would be great to use regular expressions in search as well; see this idea: https://community.searchunify.com/topic/306/regular-expressions-supported


