Creating categories from common words or group by common words in a text field

Comments

1 comment

  • Alan

    Hi Mary,

    First, do your sanitation steps, removing stop words, converting to lower, etc.

    Then follow these steps:
    1. On a new sheet in column one simply add your sanitized column.
    2. On the same sheet in the second column TOKENIZE(#Sheet1!A) to break out all the words for the record into single row entries.
    3. Create a second new sheet.
    4. In the first column on the second sheet, so we know which record our tokenized counts will represent, GROUPBY(#Sheet1!A)
    5. In column two on the second sheet, GROUPBY(#Sheet1!Tokenized_A)
    6. In column three on the second sheet, GROUPCOUNT()
    7. Create a third new sheet.
    8. In the first column on the second sheet, GROUPBY(#Sheet2!A)
    9. In the second column, GROUPTOPN(#Sheet2!Count;1)
    10. Lasty, in the third column, #Sheet2!Tokenized_A

    This will give you the most prevalent word as well as a count of the number of instances for that word.

    0
    Comment actions Permalink

Please sign in to leave a comment.