How do I convert a list of tokens back into a single string (e.g., with a space as a delimiter)?
I have converted a set of documents into a set of lists of strings so I could apply some of the text and text mining functions to normalize the documents (e.g., correct common spelling errors). Now I want to derive character-based ngrams, for which I need string (not list) input. How do I convert the lists back to strings? I couldn't find a function for that.
-
The T function converts most types to the String type. When it is applied to a List, however, the resulting String includes the list's punctuation.
Here is an example:
T([apple, orange]) returns a string with the value: ["apple", "orange"]
In your later analysis, the following characters may be undesirable: [ ] , "
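You can remove these by using the REPLACE, REPLACEALL, or SUBSTR functions. In the example above, removing them leaves the tokens separated by single spaces: apple orange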
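To make the cleanup step concrete, here is a minimal sketch of the same idea in Python. It is not written in the product's expression language, and the helper name `stringified_list_to_text` is hypothetical. It assumes the stringified list looks like the example above: square brackets, double-quoted tokens, and comma separators.

```python
import re

def stringified_list_to_text(s: str, delimiter: str = " ") -> str:
    """Turn a string such as '["apple", "orange"]' into 'apple orange'.

    Hypothetical helper for illustration only; it assumes tokens
    themselves contain no commas, brackets, or double quotes.
    """
    # Remove the square brackets and the double quotes.
    cleaned = re.sub(r'[\[\]"]', "", s)
    # Split on the commas and trim the whitespace around each token.
    tokens = [t.strip() for t in cleaned.split(",")]
    # Rejoin the non-empty tokens with the chosen delimiter.
    return delimiter.join(t for t in tokens if t)

result = stringified_list_to_text('["apple", "orange"]')
print(result)
```

Splitting on the commas before rejoining makes it possible to choose a delimiter other than a space. A token that itself contains a comma would be split in two, so this approach only suits simple word tokens.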