Generating Unique string for each record

Comments

8 comments

  • Konsta Danyliuk

    Hi Kavitha,

    To generate unique value for each record:

    1. Apply formula GROUPBY(1) to a new column (let's call it Prep) - this will generate 1 for each existing record.
    2. In the column next to Prep, Apply formula GROUPROWNUMBER() - this gives you unique number for every row.

    0
    Comment actions Permalink
  • Jee

    Hi Konsta,

     

    Thanks for the reply. Is it same as generating unique UUID string for each record. Becaue, I need to generate unique UUID String for each record.

    0
    Comment actions Permalink
  • Konsta Danyliuk

    Do you have any requirements to format of this UUID value or just a random sting?

    0
    Comment actions Permalink
  • Jee

    Just a random string, but it should be unique.

    0
    Comment actions Permalink
  • Alan

    Hi Kavitha,

    The formula GROUPROWNUMBER() will give you a unique entry for every record, since it will be the row number for that record.  You will never have any overlap because each row number is unique!

    Alan

    1
    Comment actions Permalink
  • Jee

    Hi Alan,

     

    Thanks for the reply. But, I have been asked to add UUID for each record to identify uniquely.

    And, one more thing, I need to add some extra columns like warehouse-id, audit_id. I know how to generate values for these columns but not sure how to add these columns to an existing worksheet.

    For example,

    The workbook contains some columns(from import file - source file ) like:

     ACTIVITY_TYPE
     ACTIVITY_PURPOSE
     DESCRIPTION
     ACTIVITY_ID

    But, I need to add some columns like

    WAREHOUSE_ID
    AUDIT_ID

    So the output columns form the worksheet should be

    WAREHOUSE_ID
    AUDIT_ID
    ACTIVITY_TYPE
    ACTIVITY_PURPOSE
    DESCRIPTION
    ACTIVITY_ID

    Regards,

    Kavitha

          

    0
    Comment actions Permalink
  • Konsta Danyliuk

    Hi Kavitha,

    Query number 1. I have been asked to add UUID for each record to identify uniquely.

    Two steps we already mentioned will add unique identificator for every row you have in your dataset.
    For example:

    Name   Age   GroupBy(1)   UniqueID

    Ann      25       1                    1
    Tom     35        1                    2

    In case, according to your requirements Unique ID 1,2,3,4.... are not acceptable, you could add SHA function to this column and get hash values of unique row number, which will look like 345jbjbkjjdlnr34r30i04jc0m3445c.

    Full formula for this column would be SHA_1(GROUPROWNUMBER())

    Name   Age GroupBy(1)   SHA_1(GROUPROWNUMBER())

    Ann      25       1                    345jbjbkjjdlnr34r30i04jc0m3445c
    Tom     35        1                    hg6jbjhl[kwekre0i04jc0pjo93hk3


    Query number 2. 
     I need to add some extra columns.

    Just to clarify - Datameer doesn't work as Excel and you can't just create a new column in a Workbook and add there data you want. If you imported a dataset into Datameer and it contains 4 columns with data, in a Workbook you will work with these information only. You would be able to transform and analyse this data using powerful instruments we provide, but when you need to add some more data which can't be generated from the initial one via math functions, you should import it.

    Back to your example.

    You have a Workbook in Datameer which has 4 columns on the Sheet.

     ACTIVITY_TYPE
     ACTIVITY_PURPOSE
     DESCRIPTION
     ACTIVITY_ID

    You could add two more columns like

    WAREHOUSE_ID
    AUDIT_ID

    Simply apply formula GROUPBY(1) to the next column and rename this column to whatever you need. But keep in mind that information in this column wouldn't be meaningful.

    For example, you have below data in a Workbook (2 columns).

     ACTIVITY_TYPE   ACTIVITY_PURPOSE  
      test                          audit
      release                    upload

    You could add 3rd column, by the approach we've mentioned earlier. But all values in this column will be the same, as they are auto-generated by a certain formula.

     ACTIVITY_TYPE   ACTIVITY_PURPOSE   AUDIT_ID  
      test                          audit                                1
      release                    upload                             1


    If you need to add some "real" ID for different ACTIVITY_TYPE, this should be arranged before data is ingested to Datameer.

    In case I misunderstood your request, could you please provide 2 screenshots, please:

    • 1st - how your Workbook looks right now.
    • 2nd - how you need it to look like (you could create a sample in Excel, just to let us understand final goal)



    0
    Comment actions Permalink
  • Gido

    Another approach might be to create a Simple Custom Function by using our Plug-in SDK and which leverages Java capabilities to generate an UUID.

    0
    Comment actions Permalink

Please sign in to leave a comment.