Worksheet records with latest date
i’m trying to produce a worksheet that can dynamically show records associated with the latest date. I know I can partition the date, however in this instance the underlying dataset is a combo of a Hadoop table and unioned uploaded data, so I’m finding that I can’t use the dynamic date partition. Is there a way that I can retrieve only records with the latest date using GROUPING functions (eg. inner join the worksheet with a dynamic max date)?
-
Thanks for the clarification. Here's one approach that would work:
Create a sheet to calculate the maximum date (date2)
- GROUPBY(1)
- GROUPMAX(#DataSheet!date2)
Create a Join Sheet to inner-join #DataSheet!date2 with #MaximumDate!date2
This essentially acts as a dynamic filter to calculate the maximum date and then use the inner-join to only include records that match the maximum date.
-
Peter, I'm not certain that I understand the whole scope of the workbook and the desired outcome. From my current understanding, it sounds like this is a case of Sorting Data.
Specifically, analysts can select the Top N rows of data after it is sorted. In this case, the Top 1 record could be selected to determine the "latest date".
This assumes that only 1 total record is to be output. If the goal is to select the latest date within a specific grouping of data, perhaps the GROUPMAX function is more in line with your needs.
If neither of these interpretations is correct, would you be able to share an example of a data set and the desired output from the sample?
-
Hi Joel,
I have records that have associated dates. For example, 10k records with 50 attributes, include date field. Of the 10K records, 5k records have date1 value in date field. 5k records have date2 value in date field. Date1 comes before date 2. I want to retrieve only records with date2 values and use that as the baseline for an infographic.
Peter
Please sign in to leave a comment.
Comments
4 comments