Search string for specific word
AnsweredI'm trying to search a field for a specific word. The word is "son". This field contains a lot of names and I've found it difficult to isolate the records that only the exact word, son. I've been using the CONTAINS function with " son" and " son " but I pick up out unusual names like sonny, in the first case, or I lose records that end with son. (no spaces after son)
Side question: Is it possible to have multiple conditions CONTAINS. For example (syntax incorrect):
CONTAINS_IC(#family;"son" OR "daughter")
-
Official comment
Hi Philip, thank you for the question.
To be more precise with your matching in this example, I'd recommend using a function that supports Regular Expressions such as the COUNTMATCHES function. This function counts the occurrences of the Regular Expression in a string.
Here is an example syntax to use for matching the whole word of either "son" or "Son":
- \b(?i)son\b
The \b wrappers indicate that the match should be a whole word (to exclude examples like "sonny"). The (?i) section indicates that matching should be case-insensitive. The "son" indicates that these characters should match (in this exact order).
To convert this to a boolean value like CONTAINS or CONTAINS_IC, simply verify that COUNTMATCHES is greater than zero, like this:
- COUNTMATCHES(#A;"\\b(?i)son\\b")>0
You may notice that the original Regular Expression contained only a single slash character before the "b" character and that in the formula representation there are now two slashes. This is intended and required.
The first format is used when using the Formula Builder (the pop-up wizard). The second format, also known as Java escaped Regular Expression, is used and displayed in the Function Bar.
Lastly, if you want to check for matches of two different words you'll need to nest two boolean functions together using an or statement:
- OR(COUNTMATCHES(#A;"\\b(?i)son\\b")>0;COUNTMATCHES(#A;"\\b(?i)daughter\\b")>0)
I hope this helps!
Comment actions -
Thank you Joel, that worked like a charm!
Can you explain, in the most rudimentary terms, what you meant when you said that COUNTMATCHES supports, "Regular Expressions" and how it's different from the CONTAINS function?
Is there a resource that explain the syntax structure and lists expressions and their functions?
-
The supported syntax of each function is included in these documentation links: COUNTMATCHES and CONTAINS
You'll notice that the COUNTMATCHES function accepts the data type of "<string with regex>" as the second argument. The CONTAINS function however only accepts "<string>" types.
In the left pane of the documentation, you'll find links to all supported functions as well to describe the detailed syntax of each function.
-
HI Guys,
Could you please help me to how to match exact word
UNIT 1 should match with UNIT 1 itself not like UNIT 11/12
when i tried with contains its not giving me exact match.
and when i tried with equal of both string its giving only match one but i need to extract data wherever column1 string match with column2 in between the string
please help me ASAP
-
Hello Mallinath.
For example, you have the following input data and would like to check if the string contains UNIT 1 pattern.
1 UNIT 1
2 NOTUNIT 1
3 NOT UNIT 1
4 UNIT 1/2
5 UNIT 12
6 UNIT 1 NOT 2
7 UNIT 1
8 UNIT ONEOur of these records number 1,3,6 and 7 should match the criteria.
You could use the formula MATCHES(#A;"^.*\b(UNIT 1)\b.*$") that returns true or false. It covers cases 1,6 and 7 but thinks that the value UNIT 1/2 also matches the criteria, which is not correct.
To work around this, you could try one more condition and first check whether the string contains UNIT 1/ and replace it with something else. The complete formula will be MATCHES(IF(MATCHES(#A;"^.*\b(UNIT 1\\/)\b.*$");"NOT MATCH";#A);"^.*\b(UNIT 1)\b.*$")
If you will find a regex what will recognize a "/" character as the not matching pattern, the workaround is not required.
I hope this helps you to move further with your use-case.
Please sign in to leave a comment.
Comments
5 comments