Problem
Valid CSV files could not be loaded correctly: columns that contained only integers were loaded as strings. Sometimes conversion errors like the following appeared:
Can't convert 3521 to Integer
Cause
The file is encoded in UTF-16LE, but the default character encoding used to read it is UTF-8.
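The following Python sketch illustrates one plausible way this mismatch produces the error above. It assumes a cell holding the digits 3521 stored as UTF-16LE; the helper to_int is hypothetical and only stands in for whatever integer conversion the loader performs.

```python
# Assumption: the CSV cell contains the digits 3521, stored as UTF-16LE.
raw = "3521".encode("utf-16-le")  # two bytes per digit; the second one is a NUL byte

# Decoding as UTF-8 does not raise an error, because NUL bytes are valid UTF-8.
misread = raw.decode("utf-8")


def to_int(text):
    """Hypothetical helper: return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


print(repr(misread))                    # repr makes the hidden NUL characters visible
print(to_int(misread))                  # conversion fails on the misread text
print(to_int(raw.decode("utf-16-le")))  # conversion works with the matching encoding
```

NUL characters are usually invisible when printed, which may explain why the value in the error message looks like a perfectly valid number.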
Solution
Find out which character set is in use with this command:
file -i sample.csv
sample.csv: text/plain; charset=utf-16le
Some editors, such as TextWrangler, display the charset at the bottom of the window when you open the file.
Then switch your default character encoding to the charset that was reported (UTF-16LE in this example).
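If changing the default encoding is not an option, an alternative is to convert the file to UTF-8 before loading it. The Python sketch below is only an illustration under stated assumptions: the function name convert_to_utf8 and the file names are hypothetical, and the source charset is assumed to be the one reported for the file.

```python
from pathlib import Path


def convert_to_utf8(src, dst, src_encoding="utf-16-le"):
    """Re-encode the file at src as UTF-8 and write the result to dst."""
    # Work on bytes so that line endings are preserved exactly as they are.
    text = Path(src).read_bytes().decode(src_encoding)
    # The "utf-16-le" codec keeps a leading byte order mark as a character; drop it.
    if text.startswith("\ufeff"):
        text = text[1:]
    Path(dst).write_bytes(text.encode("utf-8"))
```

A call might look like convert_to_utf8("sample.csv", "sample_utf8.csv"), after which the converted file can be loaded with the UTF-8 default.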
Further Information
The solution above might not always work. In a big data context, character encoding detection may be run on only a sample of the data, which can lead to a wrong suggestion.
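The Python sketch below shows how a sample can mislead. It uses a deliberately naive, hypothetical detector (guess_encoding, which tries candidate encodings in order) and made-up data; real detection tools work differently, but the sampling problem is the same.

```python
def guess_encoding(data, candidates=("utf-8", "latin-1")):
    """Hypothetical detector: return the first candidate that decodes data without error."""
    for encoding in candidates:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


# Made-up data: many pure-ASCII rows, then one row with a Latin-1 byte (0xFC) at the end.
data = b"id,name\n" + b"1,Smith\n" * 1000 + b"2,M\xfcller\n"
sample = data[:1024]  # only the beginning of the file is inspected

print(guess_encoding(sample))  # the sample is pure ASCII, so UTF-8 is suggested
print(guess_encoding(data))    # the whole file is not valid UTF-8
```

The sample decodes cleanly as UTF-8, so the suggestion is wrong for the file as a whole.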