Tag List
I'd like to try the Tag List feature. Let's say I have a flat text file.
What should I use for the file type when I upload it into Datameer?
-
Hey Amin,
If your flat text file has delimiters, you can use CSV/TSV. If that doesn't work well, have a look at the Documentation page for Types of Data Supported to figure out what kind of file type might fit best.
For a tag list widget, you need a string column as the Word entry. The Value used to measure each word entry must be a number.
To work with the raw values, you can either choose the appropriate column data types when configuring the schema or convert them in a workbook as necessary.
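As a rough illustration of the shape the tag list expects (one string column, one numeric column), here is a small Python sketch that runs outside Datameer. The function name `word_counts` and the sample sentence are made up for this example; counting occurrences is just one possible choice for the numeric Value.

```python
from collections import Counter

def word_counts(text):
    """Return (word, count) pairs: a string column plus a numeric column."""
    words = text.lower().split()          # naive whitespace split
    return Counter(words).most_common()   # highest counts first

rows = word_counts("big data big insights data big")
for word, count in rows:
    # Tab-separated output could be uploaded as a TSV file
    print(f"{word}\t{count}")
```

Each printed line has a word and its count, which maps onto the Word and Value entries described above.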
Thanks,
Jason
-
If it's completely unstructured text (no real delimiters), then I would recommend using either the "HTML File Type" format or the regex format.
The HTML format will import your text as one large string in a single field. You can then use workbook functions such as TOKENIZELIST() and TOKENIZE() to split the text on words, spaces, etc.
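To show the general idea of splitting one large string into a row per word, here is a minimal Python sketch. It is only an analogy and does not reproduce how the Datameer functions behave; `tokenize_words` and the sample text are hypothetical.

```python
import re

def tokenize_words(text):
    """Split one large string into a list of word tokens."""
    # Keep runs of letters, digits, and apostrophes; drop punctuation and spaces
    return re.findall(r"[A-Za-z0-9']+", text)

document = "Hello world, hello Datameer!"
tokens = tokenize_words(document)
for token in tokens:
    print(token)  # one output row per word
```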
The regex format would allow you to define a regular expression to parse the file during ingestion.
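It can help to test a pattern on a few sample lines before setting up the import. The sketch below does that in Python, assuming a made-up line layout of date, level, and message. The pattern, `parse_line`, and the sample lines are all hypothetical, and the regex syntax accepted by Datameer may differ from Python's.

```python
import re

# Hypothetical line layout: "<date> <level> <message>"
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(\w+)\s+(.*)$")

def parse_line(line):
    """Return (date, level, message), or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return match.groups()

sample_lines = [
    "2014-03-01 INFO Job started",
    "not a structured line",
]
records = [parse_line(line) for line in sample_lines]
print(records)
```

Each capture group corresponds to one column after parsing; lines that do not match come back as `None`, so you can see what the pattern would miss.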
-
No worries! One more note: if the file is very large (e.g. gigabytes in size), then I recommend using the TOKENIZE() function in a different sheet that references the field with the text. The reason is that it splits the text and creates a new row for every word, paragraph, etc. If you do it in the same sheet, the gigabyte-sized row will be repeated for each new row, creating a file too large to render and write. Hope this makes sense.
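A back-of-the-envelope Python sketch of why this matters: if every token row also carries the full source text, the output grows with (number of tokens) x (size of the text). This is a simplified model with a hypothetical `exploded_size` helper, not a measurement of how Datameer stores sheets.

```python
def exploded_size(text, keep_source_column):
    """Estimate total characters written after one-row-per-token splitting."""
    tokens = text.split()
    if keep_source_column:
        # Same-sheet case: every token row repeats the whole source text
        return sum(len(text) + len(token) for token in tokens)
    # Separate-sheet case: only the tokens themselves are written
    return sum(len(token) for token in tokens)

text = "one two three four"
same_sheet = exploded_size(text, keep_source_column=True)
separate_sheet = exploded_size(text, keep_source_column=False)
print(same_sheet, separate_sheet)
```

With a multi-gigabyte field and millions of tokens, the first number becomes unmanageable while the second stays close to the original size.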