Synonym Phrase Help



  • We have this entry in our synonyms file:
    odm: oracle\ data\ manager

    We EXPECT the phrase to be searched: "oracle data manager"

    But we are seeing each word being searched: oracle, data, manager

    Is exact phrase matching possible for synonyms?



  • Hi @ChristieJoy @jshenricks, you are correct about the current behavior for synonym phrases. I checked a similar case, "sep: symantec endpoint protection", and found that it behaves as an AND across the keywords, in any order. Let me have it checked thoroughly to find out why the backslash (\) isn't working the way it is supposed to, and how we can achieve the exact phrase match you've described. I'll keep you posted here. Thanks!



  • To expand upon what Christie is saying, based on documentation, it appears that SearchUnify can perform match_phrase queries for synonyms when using the backslash. Is this correct, or is this not supported with synonyms?

    Here's how this is explained on https://docs.searchunify.com/Content/NLP-Manager/Add-Synonyms.htm.

    "Enter a keyword and all its synonymous phrases and click . Each synonym is separated from others by a comma. If there a space or a hyphen in a phrase, then replace the spaces and hyphens with a backward slash and a space. e-mail becomes e\ email and electronic mail becomes electronic\ mail."

    Based on this, we expect to find documents matching the following criteria:

    • All the terms must appear
    • All the terms must appear in the same order

    For example, this is the expected behavior in something like Elasticsearch, if you index the following documents (using standard analyzer for the field foo):

    { "foo":"I just said hello world" }
    { "foo":"Hello world" }
    { "foo":"World Hello" }

    This match_phrase query would return only the first and second documents:

    {
      "query": {
        "match_phrase": {
          "foo": "Hello World"
        }
      }
    }
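
    To make the two criteria above concrete, here is a minimal Python sketch of the matching rule we have in mind. It is not how Elasticsearch or SearchUnify implement it; tokenize and matches_phrase are hypothetical helpers, and the tokenizer is only a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters).

```python
import re


def tokenize(text):
    # Rough stand-in for the standard analyzer (an assumption):
    # lowercase the text and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_phrase(doc, phrase):
    # True only if every phrase term appears in the document,
    # adjacent to each other and in the same order.
    doc_tokens = tokenize(doc)
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


docs = ["I just said hello world", "Hello world", "World Hello"]
hits = [d for d in docs if matches_phrase(d, "Hello World")]
print(hits)
```

    With the three sample documents, only the first two end up in hits; "World Hello" contains both terms but in the wrong order.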

    So when applying this to a real world example with our synonyms, this is what I would expect:

    Synonym - sep: symantec endpoint protection

    { "title":"Installing Symantec Endpoint Protection" }
    { "title":"SEP best practices" }
    { "title":"Protection error in Symantec endpoint clients" }

    This match_phrase query should only return the first and second documents:

    {
      "query": {
        "match_phrase": {
          "title": "sep"
        }
      }
    }

    Instead, when a synonym for "sep: symantec endpoint protection" is applied, we are seeing searches for the synonym "sep" act like this:

    symantec AND endpoint AND protection (in any order or position) OR sep
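
    The gap between what we see and what we expect can be sketched in Python as follows. This only models the behavior described above; observed and expected are hypothetical functions, not SearchUnify internals, and the tokenizer is an assumed simplification.

```python
import re

# The synonym entry under discussion: sep -> symantec endpoint protection
KEYWORD = "sep"
EXPANSION = "symantec endpoint protection"


def tokenize(text):
    # Simplified analyzer (an assumption): lowercase, split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(tokens, phrase_tokens):
    # True if phrase_tokens occurs as a contiguous, ordered run inside tokens.
    n = len(phrase_tokens)
    return any(
        tokens[i:i + n] == phrase_tokens
        for i in range(len(tokens) - n + 1)
    )


def observed(title):
    # What we are seeing: sep OR (symantec AND endpoint AND protection),
    # with the three terms allowed in any order or position.
    tokens = tokenize(title)
    return KEYWORD in tokens or all(t in tokens for t in tokenize(EXPANSION))


def expected(title):
    # What we expect: sep OR the exact phrase "symantec endpoint protection".
    tokens = tokenize(title)
    return KEYWORD in tokens or contains_phrase(tokens, tokenize(EXPANSION))


titles = [
    "Installing Symantec Endpoint Protection",
    "SEP best practices",
    "Protection error in Symantec endpoint clients",
]
observed_hits = [t for t in titles if observed(t)]
expected_hits = [t for t in titles if expected(t)]
print(observed_hits)
print(expected_hits)
```

    Under these assumptions, observed_hits contains all three titles, while expected_hits contains only the first two, which is the result we are trying to achieve.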

