Workbook Column Manipulations
I have a question about workbook summarization in Datameer.
I created a new worksheet in my Datameer workbook and entered the formula "GROUPBY" into an empty column, referencing a string column in my main data worksheet. In the next column, I performed a "GROUPSUM()" on an integer column from the main data worksheet. I also entered the function "GROUPCOUNT()" in the column after that.
This initially returned results for only the first 5,000 sample records. I clicked "Process Workbook" on the top toolbar, and after a few minutes I had the results for my full ~99 million record table for each column.
When I went to sort the sheet by the count column, the data reverted to the numbers obtained from the initial sample records. Is this an expected result? Is it necessary to process the workbook again after each change made to the results?
And if so, do you recommend designing all functions and the result workbook first, before selecting "Process Workbook"?
Thank you
Official comment
Hi Ross, yes that is expected. If any changes are made to the worksheet (including sorting), the sheet will revert to working with sample data instead of the full data set.
If you want a quick glance at some statistics about the full data set, I recommend using the Flip Side: https://documentation.datameer.com/documentation/current/Flip+Side
The Flip Side will show results for the full data set for saved sheets (assuming no changes have been made).
Yes, the preview data is used in the design stages, and the full results are only visible after processing. Sometimes a few iterations of running the workbook against the full data are required to build complex analytics. This helps reveal corner cases that were not visible in the sample data but may be critical to consider.
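To illustrate why sample results and full results can differ, here is a rough Python sketch of a group-by with a sum and a count, run once on a small sample and once on the full data. This is not Datameer code: the `summarize` function, the data, and the "first N rows" sampling are all assumptions made up for the example, and Datameer's actual sampling may work differently.

```python
from collections import defaultdict


def summarize(rows):
    """Group (key, value) rows by key and return {key: (sum, count)}."""
    totals = defaultdict(lambda: [0, 0])
    for key, value in rows:
        totals[key][0] += value  # running sum per group
        totals[key][1] += 1      # running count per group
    return {key: (s, c) for key, (s, c) in totals.items()}


# Hypothetical data: the "rare" group only appears outside the sample.
full_rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("rare", 100)]

# Assumption for illustration: the sample is simply the first N rows.
SAMPLE_SIZE = 3
sample_rows = full_rows[:SAMPLE_SIZE]

sample_summary = summarize(sample_rows)
full_summary = summarize(full_rows)

print("sample:", sample_summary)
print("full:  ", full_summary)
```

In this sketch the "rare" group is missing from the sample summary entirely, and the sum and count for "b" are lower than in the full summary. That is the kind of corner case that only shows up after the workbook has been processed against the full data.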