Random Sample

Comments

1 comment

  • Alan Mark

    Hi Betty,

    First, add a column containing RAND() so that each row gets its own random value.

    Since RAND() generates a number between 0 and 1, you can compare this column against 0.20 to decide whether each row falls inside or outside the 20% sample.

    You can then use it in an IF statement like this:

    IF(#RANDCOLUMN <= 0.20; #PopulationColumn; '')

    Basically, IF the randomly generated number is less than or equal to 0.20 (20%), THEN return the value from your PopulationColumn, ELSE return a null (two single quotes).

    From there, put a filter on the new column generated by the IF statement to remove the null values. What remains is a random sample of roughly 20% of your population; the exact count varies from run to run because each row is kept or dropped independently.
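
For anyone who wants to try the same idea outside the tool, here is a minimal Python sketch of the steps above. It is only an illustration, not the tool's own syntax: the function name `sample_population` and the `fraction` and `seed` parameters are made up for this example, and it assumes the random values are spread evenly between 0 and 1.

```python
import random


def sample_population(population, fraction=0.20, seed=None):
    """Keep each value with probability `fraction` (hypothetical helper)."""
    rng = random.Random(seed)

    # Step 1: one random value per row, like the RAND() column.
    rand_column = [rng.random() for _ in population]

    # Step 2: the IF statement. Keep the value when the random number is
    # at or below the threshold, otherwise use an empty string ('').
    flagged = [
        value if r <= fraction else ""
        for value, r in zip(population, rand_column)
    ]

    # Step 3: the filter. Drop the empty entries to get the sample.
    return [value for value in flagged if value != ""]


if __name__ == "__main__":
    population = list(range(1, 1001))
    sample = sample_population(population, fraction=0.20, seed=42)
    print(len(sample), "of", len(population), "rows sampled")
```

As with the IF approach, the sample size lands near 20% rather than exactly on it. If you need an exact count, you would have to pick a fixed number of rows instead of testing each row separately.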

    Alan
