Adding Column Names automatically (without manually renaming the column names)
How can I insert column names automatically (without manually renaming each column) into an existing Datameer workbook that has no headers? There are more than 200 columns in the workbook.
-
Thanks Joel for your response.
The file is sent from an upstream system without headers and has 50+ columns. Some cleaning and parsing of the data in certain columns is also required.
Example
Column A = cola=ABC Systems Column B = colb=XYZ Column C = colc=$100 Column D = cold=9000
After parsing the text to remove the col.. text I get
Column A= ABC Systems Column B = XYZ Column C = $100 Column D = 9000
The files have a defined schema, but the column names are not present. Also, while uploading the file into Datameer we cannot apply the 'custom schema' option, because the file needs some cleaning before it has the format Column A= ABC Systems Column B = XYZ Column C = $100 Column D = 9000.
Please note that the file could contain data from multiple sources, and the schema can change depending on the source. This is another reason the custom schema option cannot be applied during upload.
Once the file is changed to the format
Column A= ABC Systems Column B = XYZ Column C = $100 Column D = 9000.
The column names then need to be changed to the schema column names, so that the final workbook has Company Name = ABC Systems, Company Address = XYZ, etc. Since there are multiple files like this and each file has 50+ columns, renaming the columns manually is not practical.
Would appreciate your insight.
-
Thanks for the additional details. In this circumstance, I think the most scalable approach is to follow these steps:
- Upload the original file as is -- use the default column names from Datameer during this step.
- Perform the transformations as required in a Workbook and run the workbook.
- Use the REST API to download a copy of the workbook definition which includes the column names.
- Find and replace the default column names with the desired column names in the resulting JSON file.
- Use the REST API again to update the workbook definitions on the Datameer server.
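The find-and-replace step above could be scripted instead of done by hand. The following is only a minimal sketch in Python: the layout of the sample JSON, the default names "A" and "B", and the helper name rename_columns are all assumptions made for illustration, not the actual Datameer workbook format. It renames only string values that match a default name exactly, so references to columns inside formulas would still need separate handling.

```python
import json


def rename_columns(node, mapping):
    """Return a copy of parsed JSON with exact-match string values renamed.

    Dictionary keys are left untouched; only values are compared
    against the mapping.
    """
    if isinstance(node, dict):
        return {key: rename_columns(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_columns(item, mapping) for item in node]
    if isinstance(node, str):
        return mapping.get(node, node)
    return node


# Hypothetical mapping from default column names to schema column names.
mapping = {"A": "Company Name", "B": "Company Address"}

# Hypothetical, simplified stand-in for a downloaded workbook definition.
# A real definition will be structured differently.
definition = json.loads(
    '{"sheets": [{"columns": [{"name": "A"}, {"name": "B"}, {"name": "C"}]}]}'
)

renamed = rename_columns(definition, mapping)
print(json.dumps(renamed, indent=2))
```

Because each source can have its own schema, one mapping per source could be kept in a small file and applied to the matching workbook definition before it is sent back to the server.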
If you would rather not use the REST API and edit the workbook definition directly, there is an alternative method. It requires downloading and re-uploading the data set, which may be tedious if the data set is large:
- Upload the original file as is -- use the default column names from Datameer during this step.
- Perform the transformations as required in a Workbook and run the workbook.
- Download the Workbook Results
- Re-upload the results but use the Column Headers feature to upload a separate file with the titles of the columns in the first line of the file.
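For the last step, the separate header file could be generated from the known schema rather than typed out for each of the 50+ columns. This is a rough sketch in Python under several assumptions: the source label "source_a", the column names beyond Company Name and Company Address, the comma delimiter, and the helper names are all hypothetical, and the delimiter should match whatever the re-uploaded results file uses.

```python
import csv
import io

# Hypothetical schemas, one list of column titles per upstream source.
SCHEMAS = {
    "source_a": ["Company Name", "Company Address", "Amount", "Reference"],
}


def header_line(source, schemas=SCHEMAS, delimiter=","):
    """Build the single header line for the given source's schema."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(schemas[source])
    return buffer.getvalue()


def write_header_file(path, source, schemas=SCHEMAS, delimiter=","):
    """Write a file whose first (and only) line holds the column titles."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(header_line(source, schemas, delimiter))


print(header_line("source_a"), end="")
```

Using the csv module instead of a plain join means a title that itself contains the delimiter is quoted correctly.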