Unique Record count
I have a DataLink that has multiple records for each account number. How can I get the unique record count as the result in a Sheet?
-
Hello Ganesh.
There are several ways to get the unique record count for a dataset accessed via a DataLink. Because a DataLink is only a pointer to the data, you first need to materialize the data in a saved Workbook Sheet. Here are the methods you could use.
1.
- Create a new Workbook based on the DataLink.
- Use the Deduplicating Data tool to create a new Sheet without duplicate records (based on all columns or only on particular ones).
- Execute the Workbook with the deduplicated Sheet kept, then check the number of unique records via Inspector -> Column -> Data Profile.
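To make the difference between deduplicating on all columns and on particular columns concrete, here is a minimal sketch in plain Python. It is not Datameer code: the function name `dedup_records`, the sample records, and the column names are all hypothetical and only illustrate the logic.

```python
def dedup_records(records, key_columns=None):
    """Keep the first record for each distinct key.

    key_columns=None compares all columns; otherwise only the listed ones.
    (Hypothetical helper for illustration, not a Datameer function.)
    """
    seen = set()
    unique = []
    for rec in records:
        cols = key_columns if key_columns is not None else sorted(rec)
        # Build a hashable key from the chosen column names and values.
        key = tuple((c, rec[c]) for c in cols)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up sample data: several records per account number.
records = [
    {"account": "A-100", "amount": 10},
    {"account": "A-100", "amount": 10},
    {"account": "A-100", "amount": 25},
    {"account": "B-200", "amount": 7},
]

print(len(dedup_records(records)))               # all columns: 3
print(len(dedup_records(records, ["account"])))  # account only: 2
```

If the goal is the number of distinct account numbers, deduplicating on the account column alone is the variant that matches.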
2.
- Create a new Workbook based on the DataLink.
- Create a new Sheet.
- Add a GROUPBY function on the desired column(s) to collapse duplicate records.
- Execute the Workbook and check the number of unique records via Inspector -> Column -> Data Profile.
3.
- Create a new Workbook based on the DataLink.
- Create a new Sheet.
- Add a GROUPBY function on the desired column(s) to collapse duplicate records.
- In the next column, named Index, enter a constant number, e.g. 1. This creates a column with the same value for every record.
- Create a new Sheet.
- Add a GROUPBY function on the Index column, followed by GROUPCOUNT. Because every record has the same Index value, the result is a single row.
- Execute the Workbook to get the unique record count for the whole dataset.
Note that the last method might be resource-consuming for the cluster.
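The logic of the third method can be sketched in plain Python as follows. This is only an analogue of the GROUPBY, constant Index, and GROUPCOUNT steps, not Datameer formula syntax; the function name `count_unique_via_index` and the sample account numbers are hypothetical.

```python
from collections import Counter


def count_unique_via_index(values):
    """Count distinct values by grouping, tagging, and counting one group.

    (Hypothetical helper for illustration, not a Datameer function.)
    """
    # Step 1 (GROUPBY analogue): one row per distinct value, order preserved.
    grouped = list(dict.fromkeys(values))
    # Step 2: add an Index column holding the constant 1 for every row.
    indexed = [(value, 1) for value in grouped]
    # Step 3 (GROUPBY Index + GROUPCOUNT analogue): all rows share the same
    # Index, so they fall into one group whose size is the unique count.
    counts = Counter(index for _, index in indexed)
    return counts[1]


# Made-up sample data: repeated account numbers.
accounts = ["A-100", "A-100", "B-200", "C-300", "B-200"]
print(count_unique_via_index(accounts))  # 3
```

The final step funnels every grouped row into a single group, which may be part of why this approach can be heavier on a cluster than reading the count from the Data Profile.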