Goal
Are you more of a command line aficionado than a user interface promoter? Wish you could navigate Datameer artifacts using the command line? With this guide, you will be able to do a variety of things with Datameer artifacts without the Datameer user interface (UI).
Sample data
This guide gives you step-by-step instructions for using part of Datameer's REST API. To follow along, go to the Datameer App Market and download the Flight Delays App.
Learn
Find our REST call (GET)
In this example, you will start with the GET command. This command will allow you to retrieve the configurations for a particular Datameer artifact. In order to run a REST API command via command line, you will need your username and password for Datameer, as well as the URL for Datameer.
Open up whichever command line application you prefer. Here is the command you will be using:
curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>'
Make sure to fill in the actual username and password for your Datameer instance, along with the Datameer URL and the job configuration ID. You can find the job configuration ID in the Datameer UI: go to your Datameer instance, select the Airports import job, and open the Information Browser to the right of the artifacts. The ID is listed there.
A completed REST call with GET will look something like this:
curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23'
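If you run this call often, it can help to build the URL from its parts so that no stray spaces slip into it. The following is a minimal sketch, not part of Datameer: the function names import_job_url and get_import_job are hypothetical, and the host, port, and ID are the example values used above. It relies on curl's behavior of prompting for the password when -u is given only a user name.

```shell
# Hypothetical helper: assembles the import-job URL from host, port, and job ID.
import_job_url() {
  local host="$1" port="$2" job_id="$3"
  printf 'http://%s:%s/rest/import-job/%s\n' "$host" "$port" "$job_id"
}

# Hypothetical wrapper around the GET call from this guide.
# Passing only the user name to -u makes curl prompt for the password,
# so the password does not end up in your shell history.
get_import_job() {
  local user="$1" host="$2" port="$3" job_id="$4"
  curl -u "$user" -X GET "$(import_job_url "$host" "$port" "$job_id")"
}

# Example, using the same values as the completed call:
#   get_import_job admin localhost 8080 23
```

Quoting the URL, as the function does, also keeps the shell from interpreting any special characters in it.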
Understanding the return for the GET command
Once you run the command, your return should look like this:
{ "version": "4.0.2", "className": "datameer.dap.common.entity.DataSourceConfigurationImpl", "file": { "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594", "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp", "description": "", "name": "Airports" }, "pullType": "MANUALLY", "minKeepCount": 1, "properties": { "TextFileFormat": [ "TEXT" ], "fileNameTimeRange_mode": [ "OFF" ], "fileNameTimeRange_startDate": [ "" ], "filter.minAge": [ "" ], "filter.maxAge": [ "" ], "characterEncoding": [ "UTF-8" ], "recordSampleSize": [ "1000" ], "escapeCharacter": [ "" ], "detectColumnDefinition": [ "SELECT_PARSE_AUTO" ], "collectAdditionalFields": [ "false" ], "quoteCharacter": [ "\"" ], "delimiter": [ "," ], "csv.max-lines-per-record": [ "1" ], "external.store": [ "false" ], "filter.page.does.split.creation": [ "false" ], "fileType": [ "CSV" ], "GenericConfigurationImpl.temp-file-store": [ "1dad24d5-96c2-4af1-8460-b206f8df3cd2" ], "incrementalMode": [ "false" ], "histogram.generation": [ "false" ], "file": [ "flightdelays/ICAOAirports.csv.zip" ], "strictQuotes": [ "false" ] }, "hadoopProperties": "", "dataStore": { "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst", "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb" }, "errorHandlingMode": "DROP_RECORD", "maxLogErrors": 1000, "maxPreviewRecords": 5000, "notificationAddresses": "", "notificationSuccessAddresses": "", "fields": [ { "id": 347, "pattern": "", "acceptEmpty": true, "name": "id", "origin": "0", "valueType": "{\"type\":\"INTEGER\"}", "include": true, "version": 3 }, { "id": 348, "pattern": "", "acceptEmpty": true, "name": "ident", "origin": "1", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 349, "pattern": "", "acceptEmpty": true, "name": "type", "origin": "2", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 350, "pattern": "", "acceptEmpty": true, "name": "name", "origin": "3", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 351, "pattern": "", "acceptEmpty": true, "name": "latitude_deg", "origin": "4", "valueType": "{\"type\":\"FLOAT\"}", "include": true, "version": 3 }, { "id": 352, "pattern": "", "acceptEmpty": true, "name": "longitude_deg", "origin": "5", "valueType": "{\"type\":\"FLOAT\"}", "include": true, "version": 3 }, { "id": 353, "pattern": "", "acceptEmpty": true, "name": "elevation_ft", "origin": "6", "valueType": "{\"type\":\"INTEGER\"}", "include": true, "version": 3 }, { "id": 354, "pattern": "", "acceptEmpty": true, "name": "continent", "origin": "7", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 355, "pattern": "", "acceptEmpty": true, "name": "iso_country", "origin": "8", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 356, "pattern": "", "acceptEmpty": true, "name": "iso_region", "origin": "9", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 357, "pattern": "", "acceptEmpty": true, "name": "municipality", "origin": "10", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 358, "pattern": "", "acceptEmpty": true, "name": "scheduled_service", "origin": "11", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 359, "pattern": "", "acceptEmpty": true, "name": "gps_code", "origin": "12", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 360, "pattern": "", "acceptEmpty": true, "name": "iata_code", "origin": "13", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 361, "pattern": "", "acceptEmpty": true, "name": "local_code", "origin": "14", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 362, "pattern": "", "acceptEmpty": true, "name": "home_link", "origin": "15", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 363, "pattern": "", "acceptEmpty": true, "name": "wikipedia_link", "origin": "16", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 364, "pattern": "", "acceptEmpty": true, "name": "keywords", "origin": "17", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 365, "pattern": "", "acceptEmpty": false, "name": "dasFileName", "origin": "fileInfo.fileName", "valueType": "{\"type\":\"STRING\"}", "include": false, "version": 3 }, { "id": 366, "pattern": "", "acceptEmpty": false, "name": "dasFilePath", "origin": "fileInfo.filePath", "valueType": "{\"type\":\"STRING\"}", "include": false, "version": 3 }, { "id": 367, "pattern": "", "acceptEmpty": false, "name": "dasLastModified", "origin": "fileInfo.lastModified", "valueType": "{\"type\":\"DATE\"}", "include": false, "version": 3 } ] }
This is a return of the configurations that have been set up to import the data into Datameer. If you look closely, you will recognize some things that you see in the import wizard within the Datameer UI:
"version": "4.0.2", "className": "datameer.dap.common.entity.DataSourceConfigurationImpl", "file": { "uuid": "df4fe482-d07f-4f53-aa04-b58aac43f594", "path": "/Users/admin/Applications/Flight Delays/Resources/Airports.imp", "description": "", "name": "Airports"
The first part gives you general information about your Datameer instance, the path of the artifact in Datameer, any description and the name of the artifact.
The next portion (below) gives you more details on the import configuration, such as the file format, whether a histogram is generated for this job, CSV settings, record sample size, partitioning, custom Hadoop properties, and email notification settings. In the Datameer UI, you would see the same kinds of settings by right-clicking the artifact and selecting “Configure”.
"pullType": "MANUALLY", "minKeepCount": 1, "properties": { "TextFileFormat": [ "TEXT" ], "fileNameTimeRange_mode": [ "OFF" ], "fileNameTimeRange_startDate": [ "" ], "filter.minAge": [ "" ], "filter.maxAge": [ "" ], "characterEncoding": [ "UTF-8" ], "recordSampleSize": [ "1000" ], "escapeCharacter": [ "" ], "detectColumnDefinition": [ "SELECT_PARSE_AUTO" ], "collectAdditionalFields": [ "false" ], "quoteCharacter": [ "\"" ], "delimiter": [ "," ], "csv.max-lines-per-record": [ "1" ], "external.store": [ "false" ], "filter.page.does.split.creation": [ "false" ], "fileType": [ "CSV" ], "GenericConfigurationImpl.temp-file-store": [ "1dad24d5-96c2-4af1-8460-b206f8df3cd2" ], "incrementalMode": [ "false" ], "histogram.generation": [ "false" ], "file": [ "flightdelays/ICAOAirports.csv.zip" ], "strictQuotes": [ "false" ] }, "hadoopProperties": "", "dataStore": { "path": "/Users/admin/Applications/Flight Delays/Resources/Examples in S3.dst", "uuid": "a61e955c-576d-47b5-b50a-8554403eddbb" }, "errorHandlingMode": "DROP_RECORD", "maxLogErrors": 1000, "maxPreviewRecords": 5000, "notificationAddresses": "", "notificationSuccessAddresses": "",
The final part of the return shows the columns and column configurations for the import job:
"fields": [ { "id": 347, "pattern": "", "acceptEmpty": true, "name": "id", "origin": "0", "valueType": "{\"type\":\"INTEGER\"}", "include": true, "version": 3 }, { "id": 348, "pattern": "", "acceptEmpty": true, "name": "ident", "origin": "1", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, { "id": 349, "pattern": "", "acceptEmpty": true, "name": "type", "origin": "2", "valueType": "{\"type\":\"STRING\"}", "include": true, "version": 3 }, ....................................... { "id": 367, "pattern": "", "acceptEmpty": false, "name": "dasLastModified", "origin": "fileInfo.lastModified", "valueType": "{\"type\":\"DATE\"}", "include": false, "version": 3 } ] }
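The column list is easier to scan as a short table than as raw JSON. The sketch below is one possible way to do that, assuming the jq command line tool is installed; jq is a separate utility, not part of Datameer or curl, and list_fields is a hypothetical helper name. It reads the return from standard input and prints one line per column with its name, type, and whether it is included. Because "valueType" holds JSON encoded as a string, the sketch decodes it with jq's fromjson before reading the type.

```shell
# Hypothetical helper (assumes jq is installed): reads an import-job return
# on standard input and prints "name<TAB>type<TAB>include" for every column.
list_fields() {
  # "valueType" is a JSON document stored inside a string, so fromjson
  # decodes it before .type is read.
  jq -r '.fields[] | "\(.name)\t\(.valueType | fromjson | .type)\t\(.include)"'
}

# Example, piping the GET call from this guide into the helper:
#   curl -u admin:admin -X GET 'http://localhost:8080/rest/import-job/23' | list_fields
```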
Downloading the return
Now that you understand what the return contains, you can download it with a slightly different GET command so that you can make changes to it.
The command looks very similar to the first GET command, except that you redirect the output to a file:
curl -u <username>:<password> -X GET 'http://<Datameer-serverIP>:<port-number>/rest/import-job/<job-configuration-id>' > Airports.json
When you run this command, the file is saved in your command line's current working directory unless you specify a different path. To save it somewhere else, navigate to that directory first. For example, the following command takes you to your downloads folder:
cd /Users/<username>/Downloads
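As an alternative to changing directories first, curl can write straight to a path of your choice with its -o option. The following is a minimal sketch under that assumption; target_file and save_import_job are hypothetical helper names, and the directory and file name are whatever you pass in. Adding -f makes curl exit with an error on an HTTP failure (for example, a wrong ID or password) instead of saving the error page as if it were the configuration.

```shell
# Hypothetical helper: builds the path of the JSON file to write.
target_file() {
  local dir="$1" name="$2"
  printf '%s/%s.json\n' "$dir" "$name"
}

# Hypothetical wrapper: saves the return of a GET call to <dir>/<name>.json.
save_import_job() {
  local user="$1" url="$2" dir="$3" name="$4"
  mkdir -p "$dir" || return 1
  # -f  exit with an error on HTTP failures instead of saving the error body
  # -sS hide the progress meter but still print error messages
  # -o  write the response body to the given file
  curl -f -sS -u "$user" -X GET "$url" -o "$(target_file "$dir" "$name")"
}

# Example, using the values from this guide (curl prompts for the password):
#   save_import_job admin 'http://localhost:8080/rest/import-job/23' "$HOME/Downloads" Airports
```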