Goal
I have partitioned data stored in HDFS, where the partition column is of type string, for example a Hive table partitioned by country name. I would like to select only certain partitions for ingestion.
Learn
To achieve this, specify the file path with wildcards in the File Or Folder field of the ImportJob/DataLink configuration wizard. Regular expressions are not supported for folder names, but wildcards are allowed.
For example, consider a two-character country code where the path is as follows:
/warehouse/../country={<country 1>,<country 2>,...}/
To select only the US and Japan partitions:
/warehouse/../country={us,jp}/
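For broader pattern matching, use * to match any sequence of characters. The following patterns select all countries, countries whose code starts with a, and countries whose code ends with s: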
/warehouse/../country=*/
/warehouse/../country=a*/
/warehouse/../country=*s/
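To preview which partition folders a pattern would pick up before configuring the job, the matching rules can be approximated locally. The following is a minimal sketch in Python, not the product's own implementation: the helper names expand_braces and select_partitions and the sample folder list are hypothetical, and only the last path component (the partition folder name) is matched.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of plain wildcard patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that any remaining brace groups are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def select_partitions(partition_dirs, pattern):
    """Return the folder names that match a brace/wildcard pattern."""
    plain_patterns = expand_braces(pattern)
    return [
        name for name in partition_dirs
        if any(fnmatch.fnmatchcase(name, p) for p in plain_patterns)
    ]


# Hypothetical partition folders under the table directory.
partitions = ["country=us", "country=jp", "country=au", "country=es", "country=at"]

print(select_partitions(partitions, "country={us,jp}"))  # ['country=us', 'country=jp']
print(select_partitions(partitions, "country=*"))        # all five folders
print(select_partitions(partitions, "country=a*"))       # ['country=au', 'country=at']
print(select_partitions(partitions, "country=*s"))       # ['country=us', 'country=es']
```

Braces are expanded first because Python's fnmatch module does not understand {a,b} alternation on its own; each expanded pattern is then matched case-sensitively.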
For more information see our documentation: File Path and File Name Patterns.