Importing Excel Issue if some rows have null
We have an excel file that we import. One particular column sometimes does not have any values in it (so its null).
If the first x number of rows being imported are null, then the import "skips" this column and the imported data is then misaligned - even though there are legitimate values in that column.
The bandaid fix that we have in place right now, is that before the file is uploaded, someone will go to that column and replace the null values for spaces in the first 10 rows (its arbitrary) and then imports the spreadsheet and it solves the issue.
Any ideas to resolve this? Has this happened to anyone before?
-
Hi Elwyn,
You can work around this within the product by expanding the Advanced section of the Data Details step while creating the Import job. There, you can increase the number of records read for schema detection. Simply increase the value to something that you're certain will catch a non-null record for the column.
Let us know if you have any further questions! -
On the Configure step, could you try raising the number of sample records to 10,000 to see if increasing the sample will help catch a column with data?
If this doesn't work, I'll need to do some testing with additional properties, I'm pretty sure there's a way to make this happen, but it might end up being a global change - which could have adverse effects on other jobs.
-
Hello Elwyn.
Could you please let us know what Datameer version you use and the MS Office version the source Excel file has been created?
I've run a test in Datameer-11.1.9 and the Excel file (MS Excel 16.29) with 1130 records with the following schema.
- Col1 - 1130 values
- Col2 - no values
- Col3 - 4 values
- Col4 - 4 values
- Col5 - 4 values
Col2 has been included in the dataset schema, and there was no issue with how the data is displayed in the Workbook.
-
Hi Alan - we have done what you suggested, it runs on a schedule at 11.30 so will see if it is successful.
Konsta - we have Excel 2013, although the file is generated from a Aspect system I believe so I dont know if that will have an effect. We are on DataMeer version 11.1.3.1.
For your test, could you add in column headings as well, just so everything is the same. -
OK so I have diagnosed it further and determined it is not a Datameer issue, but an issue with the file being created. The file is being generated automatically from a system called Aspect. The file looks like excel, smells like excel, everything appears to be excel. But I have a strong theory that there is some file encoding going on under the hood that is creating this issue. Because if I simply open the file, save it and close the file - without making any changes at all, it will import just fine.
Apologies for taking up your time on this... like I said, if I sent you the excel file there is nothing to indicate that it would cause a problem. But I suspect when I open it, the act of saving it corrects the encoding and thus it works.
Please sign in to leave a comment.
Comments
8 comments