Understand the consequences of changing the schema of a dataset I use in Datameer.
For example: What will happen if a column is dropped in the source table?
Changing the schema of a datasource will impact all artifacts that reference the data.
- When a column is dropped from the source table, ImportJobs and DataLinks pointing to this table will start to fail with the following exception:
java.lang.RuntimeException: Unknown column '<dropped column name>' in 'field list'
This happens even if the removed column is not included in the import, because Datameer reads the whole table schema first and then excludes the undesired columns. To fix the error and let the ImportJobs and DataLinks run again, reconfigure each ImportJob and DataLink artifact and re-scan the data schema so that it picks up the changes in the source data.
- Workbooks based on these ImportJobs and DataLinks will fail if any of the dropped columns is referenced on a Worksheet, even if the column is only referenced and not used in any formula:
java.lang.IllegalArgumentException: Sheet with name '<Sheet Name>' has no column '<dropped column name>' (columns: <existing column 1>,<existing column 2>,...)
To fix this, remove the references to the nonexistent columns from the Worksheets. If a Workbook has no references to dropped columns, it will run fine without any changes.
- ExportJobs that use Worksheets containing references to dropped columns will fail with the following exception:
java.lang.IllegalArgumentException: Workbook has no column on index: <index of nonexistent column>
To fix this, reconfigure the ExportJob's Mapping step to reflect the changes in the source Worksheet.
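The rules above can be sketched as a small pre-check that compares the old and new column lists before a schema change is rolled out. This is only an illustration in plain Python and does not use any Datameer API: the function names, the `worksheet_refs` mapping, and the sample column and Worksheet names are all hypothetical, and the column lists are assumed to be collected by hand or from the database's own metadata.

```python
def dropped_columns(old_schema, new_schema):
    """Return columns present in old_schema but missing from new_schema."""
    new_set = set(new_schema)
    return [c for c in old_schema if c not in new_set]


def affected_artifacts(old_schema, new_schema, worksheet_refs):
    """Predict which artifacts need attention after a schema change.

    worksheet_refs is a hypothetical mapping of Worksheet name to the
    source columns that Worksheet references.
    """
    dropped = dropped_columns(old_schema, new_schema)
    report = {
        "dropped": dropped,
        # Any dropped column breaks the ImportJob/DataLink, even a column
        # that was excluded from the import, so a re-scan is needed.
        "import_needs_rescan": bool(dropped),
        "worksheets_to_fix": {},
    }
    for sheet, refs in worksheet_refs.items():
        # A Worksheet is affected only if it references a dropped column.
        missing = [c for c in refs if c in dropped]
        if missing:
            report["worksheets_to_fix"][sheet] = missing
    return report


# Sample data (made up for illustration).
old = ["id", "name", "email", "legacy_flag"]
new = ["id", "name", "email"]
refs = {"Customers": ["id", "name"], "Flags": ["id", "legacy_flag"]}

report = affected_artifacts(old, new, refs)
print(report)
```

In this sample, only the Worksheet that references the dropped column is flagged. Any ExportJob built on a flagged Worksheet would also need its Mapping step reconfigured.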