Implement ROW_NUMBER/RANK within GROUP
We frequently have use cases that involve ranking within groupings according to certain metrics, for example, showing a most watched TV programme per channel, ordering models by their probability per customer etc, but I cannot figure out how to do this within Datameer. Any help would be much appreciated.
What I am looking for looks a bit like this in SQL-like language:
row_number() over (partition by X order by N1 asc, N2 desc)
-
Hi James, it sounds like you're simply trying to sort the results within a sheet. This is described in detail in our documentation: Sorting Data
Please note that sorting a column or using functions like GROUP_SORT_ASC will have a performance impact for larger data sets. This is expected and it is recommended to only sort when absolutely required (i.e. only on the final sheet). Allowing intermediate results to remain unsorted where possible can reduce the performance overhead required to add sorting.
-
"it sounds like you're simply trying to sort the results within a sheet"
Ranking is used for more than just sorting. It is also for filtering and for grouping by decile/percentile/etc. It is a very useful function. Is there a way to mimic the behavior of the rank() or row_number() functions from SQL in Datameer?
Thanks
-
Hi Rock,
We've got a KB article on getting row numbers here: How to Add Row Numbers to a Sheet
You can also use GROUPROWNUMBER after a GROUPBY on your column that you want to rank to get similar functionality to RANK.
Please sign in to leave a comment.
Comments
3 comments