Possible SDK bug in handling a custom plugin's intermediate result aggregator
I have written a custom plugin that provides a specific aggregation function. My outer class extends GenericBaseFunctionType, implements IntermediateResultAggregationFunctionType, and expects several parameters: two columns/column names and multiple constant values (float, string). An inner class extends IntermediateResultAggregator and, in addition to the intermediate methods, implements the aggregate() method.
If a worksheet contains a GROUPBY column, a column with a built-in aggregation function (e.g. GROUPCONCAT), and a column with my aggregation function, the result looks correct. But if I remove the column with the built-in aggregation function, my aggregate() method appears never to be called; only aggregateIntermediateResult(null) and computeAggregationResult() are called. As far as I understand the API, aggregate() is the only aggregator method that receives direct input from the columns in the referenced sheet, so it should be called even for an IntermediateResultAggregator, shouldn't it?
Am I still missing something or is this behavior caused by a bug in the SDK?
-
After further communication via our service desk and a detailed analysis of the custom plugin you provided, I wanted to post the following summary here:
The behavior was not caused by a bug. The rules for implementing an IntermediateResultAggregator are:
- newGroup() is guaranteed to be called before each group on the sheet, so use it to initialize necessary state.
- Neither aggregate() nor aggregateIntermediate() will be called until newGroup() has been called at least once.
- aggregate() and aggregateIntermediate() may both be called within the same "grouping".
- computeAggregationResult() should only be called once per group.
- computeIntermediateResult() may be called more than once per group. We do not copy the returned instance; we just hold on to the reference. The aggregator should therefore not cache the intermediate result instance.
- After dispose() is called, none of the other methods will be called.
- If your aggregation function is associative, you can extend AssociativeAggregator to get the performance of IntermediateResultAggregator without having to implement the intermediate methods. However, this causes computeAggregationResult() to be called multiple times per group in some cases.
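The rules above can be illustrated with a minimal, self-contained sketch. It deliberately does not extend the real SDK base class: SumAggregatorSketch and AggregatorLifecycleDemo are hypothetical names, the method names follow this thread, and the parameter and return types (long, long[]) are assumptions chosen only to keep the example runnable on its own. The point is the shape of the state handling: reset in newGroup(), tolerate a null intermediate value, accept both kinds of input in one group, and hand out a fresh intermediate instance on every call.

```java
// Hypothetical stand-in for an IntermediateResultAggregator implementation.
// Method names follow this thread; the signatures are assumptions.
final class SumAggregatorSketch {
    private long sum;          // per-group state
    private boolean disposed;  // set once dispose() has been called

    // Called before each group, so all per-group state is reset here.
    void newGroup() {
        checkNotDisposed();
        sum = 0;
    }

    // Receives a raw value from a column of the referenced sheet.
    void aggregate(long value) {
        checkNotDisposed();
        sum += value;
    }

    // Receives a partial result; may be mixed with aggregate() calls in the
    // same group. A null value (as observed in the question) is ignored.
    void aggregateIntermediate(long[] partial) {
        checkNotDisposed();
        if (partial != null) {
            sum += partial[0];
        }
    }

    // May be called more than once per group. The caller keeps the returned
    // reference, so a new array is created each time instead of a cached one.
    long[] computeIntermediateResult() {
        checkNotDisposed();
        return new long[] { sum };
    }

    // Final result for the current group.
    long computeAggregationResult() {
        checkNotDisposed();
        return sum;
    }

    // After this, no other method is expected to be called.
    void dispose() {
        disposed = true;
    }

    private void checkNotDisposed() {
        if (disposed) {
            throw new IllegalStateException("aggregator already disposed");
        }
    }
}

final class AggregatorLifecycleDemo {
    public static void main(String[] args) {
        // First instance sees raw values only and emits a partial result.
        SumAggregatorSketch first = new SumAggregatorSketch();
        first.newGroup();
        first.aggregate(2);
        first.aggregate(3);
        long[] partial = first.computeIntermediateResult();

        // Second instance sees a partial result, a null, and a raw value
        // within the same group.
        SumAggregatorSketch second = new SumAggregatorSketch();
        second.newGroup();
        second.aggregateIntermediate(partial);
        second.aggregateIntermediate(null);
        second.aggregate(4);
        System.out.println(second.computeAggregationResult()); // prints 9

        first.dispose();
        second.dispose();
    }
}
```

If the final result in a real plugin depends on values seen in aggregate(), the same state also has to be reachable through the intermediate path, because for a given group the framework may deliver only intermediate results to the instance that computes the final value.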