CommandLine API Tool

Introduction

The command line tool allows data to be imported into the repository from the command line, so that imports can be automated.

The tool works by making RESTful web calls to a running instance of the repository. It is configured through a set of options passed on the command line.

These options must include one or more file or directory names. The tool iterates over this list, recursing into any subdirectories. Any file found with the '.xml' extension is deemed to be a record, and any other file is deemed to be an attachment.

The following rules then apply to each directory:

  • In a directory containing more than one record, only the records are processed; any attachments in the directory are ignored.

  • In a directory containing exactly one record, the record and all attachments in the directory are processed.

  • In a directory containing only attachments, the attachments are ignored.
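The directory rules above can be sketched in a few lines. This is an illustration only, not the tool's actual code (the tool itself is written in Java); the function name plan_directory is hypothetical, and the sketch assumes the '.xml' match is case-sensitive and that subdirectories are handled separately.

```python
from pathlib import Path


def plan_directory(directory):
    """Return (records, attachments) that would be processed for one directory.

    Hypothetical helper: it only looks at files directly inside `directory`.
    """
    files = sorted(p for p in Path(directory).iterdir() if p.is_file())
    # Assumption: only a lower-case '.xml' suffix marks a record.
    records = [p for p in files if p.suffix == ".xml"]
    others = [p for p in files if p.suffix != ".xml"]
    if len(records) == 1:
        # Exactly one record: the record and all attachments are processed.
        return records, others
    # More than one record, or no record at all: attachments are ignored.
    return records, []
```

Calling plan_directory on a directory holding record1.xml, img1.jpg and img2.jpg would return the record together with both images, while adding a second record to that directory would cause the images to be dropped.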

The XML files are expected to conform to the schema supported by the server being targeted. This schema can be found at the URL path /schema/current on that server.
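As a small illustration, the schema address for a given server could be built as follows. The function name schema_url is hypothetical; the sketch assumes the protocol follows the same rule as the http option described under the command line options (https unless http is requested).

```python
def schema_url(host="alpha.tdar.org", use_http=False):
    """Build the address of the current schema on the targeted server."""
    # Assumption: https is used unless plain http is explicitly requested.
    scheme = "http" if use_http else "https"
    return f"{scheme}://{host}/schema/current"
```

For a local test server this might be called as schema_url("localhost:8080", use_http=True).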

In the interests of safety, the command line tool will assume, unless instructed otherwise, that it is being run against a default test instance of the repository, found at http://alpha.tdar.org.

Because the command line tool uses RESTful HTTP calls, it does not need to be run on the server that is being targeted.

For the more technically minded, the tool is provided by the Java file org.tdar.utils.CommandLineAPITool.java found in the tDAR code base. Reading its code will show developers how to create equivalent tools in their language of choice, should they wish to write to the RESTful HTTP interface directly.

The command line options

These are the current command line options. Some of them are required, and the tool will not run without them.

  • help – list the command line options described below.

  • http – run using the http protocol (the default is https).

  • log – log messages at the INFO level, and copy them to the console.

  • username <argument> – The name of the user account that the command line tool is to be run with. This must be an already registered user within the running instance of tDAR/FAIMS being targeted. Required, unless running against the default alpha system.

  • password <argument> – The password associated with the user name. Required, unless running against the default alpha system.

  • host <argument> – The hostname of the repository that the command line import is targeting. If the host cannot be resolved through DNS, this must be an IP address. In the interests of safety this defaults to alpha.tdar.org.

  • file <argument> – The name of a file or directory that is to be ingested. Can occur 1 to many times.

  • config <argument> – An optional file in which any of the options that take an argument can be specified, rather than on the command line. Anything entered on the command line will override the values read from the file.

  • projectid <argument> – The id of the project in the repository to which the uploaded artefacts are to be added. This project id will replace any project referenced in the xml resource files.

  • accountid <argument> – The id of the billing account associated with the user that is to be used for this transaction.

  • sleep <argument> – The time to wait, in milliseconds, between the import of each record. This value currently defaults to zero.

  • logFile <argument> – The file which tracks the import of each record. Essentially just a list of the records that were successfully processed. This file will be appended to by the application on each run. Note that this is separate from the log file in which the application logs its activities.

  • fileAccessRestriction <choice> – The access restriction that is to be applied to all the attachment files ingested during this run of the tool. The choice can currently be one of [PUBLIC | EMBARGOED | CONFIDENTIAL]. This value currently defaults to PUBLIC.
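When the tool is driven from a script, the options above can be assembled into an argument list. The following is a rough sketch only: build_args and VALUE_OPTIONS are hypothetical names, and the option names are simply those listed above.

```python
# Options from the list above that take an argument ('file' is handled separately).
VALUE_OPTIONS = ("username", "password", "host", "config", "projectid",
                 "accountid", "sleep", "logFile", "fileAccessRestriction")


def build_args(files, use_http=False, **options):
    """Assemble the tool's argument list; 'file' may occur one to many times."""
    args = ["-http"] if use_http else []
    for name, value in options.items():
        if name not in VALUE_OPTIONS:
            raise ValueError(f"unknown option: {name}")
        if value is not None:
            args += [f"-{name}", str(value)]
    for path in files:
        # One -file flag per file or directory to be ingested.
        args += ["-file", str(path)]
    return args
```

The resulting list would be appended to the java invocation shown under "Running the tool", for example build_args(["/tmp/one", "/tmp/two"], use_http=True, username="benben", host="localhost:8080").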

Return codes

The command line tool has the following return codes:

  • -1 : there was a problem encountered with the parsing of the arguments

  • 0 : the run proceeded without issue

  • any number > 0 : the number of files that the tool was not able to import successfully.
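A calling script might interpret these return codes roughly as follows. The function name describe_exit_status is hypothetical. One caveat is worth labelling: on POSIX systems the exit status seen by the caller is truncated to eight bits, so a return code of -1 is normally observed as 255; the sketch therefore treats 255 as an argument problem, although it cannot be told apart from 255 failed files.

```python
def describe_exit_status(status):
    """Translate the tool's exit status into a short description."""
    if status == 0:
        return "the run proceeded without issue"
    if status in (-1, 255):
        # -1 is reported as 255 by POSIX shells (ambiguous with 255 failures).
        return "there was a problem parsing the arguments"
    if status > 0:
        return f"{status} file(s) could not be imported"
    return f"unexpected exit status {status}"
```

With Python's subprocess module, the value to pass in would be the returncode attribute of the completed process.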

Running the tool

There are currently two ways to run the command line tool:

  • From within the deployed Java web application directory

  • From within the project source code

From within the deployed Java web application directory:

java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool

From within the project source code:

mvn exec:java -P apitool

If no arguments are provided, the application simply lists the help and exits:

martin:classes admin$ java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool

args are: []

usage: TDAR/FAIMS cli api tool

-accountid <accountid> TDAR/FAIMS the users billing account id to use

-config <config> optional configuration file

-file <file> the file(s) or directories to process

-fileAccessRestriction <fileAccessRestriction> the access restriction to be applied - one of [PUBLIC | EMBARGOED | CONFIDENTIAL]

-help print this message

-host <host> override default hostname of alpha.tdar.org

-logFile <logFile> TDAR/FAIMS logFile

-password <password> TDAR/FAIMS password

-projectid <projectid> TDAR/FAIMS project id. to associate w/ resource

-sleep <sleep> TDAR/FAIMS timeToSleep

-username <username> TDAR/FAIMS username

-------------------------------------------------------------------------------

Visit https://dev.tdar.org/confluence/display/TDAR/CommandLine+API+Tool for documentation on how to use the TDAR/FAIMS commandline API Tool

-------------------------------------------------------------------------------

The config file

The config file is simply a standard Java properties file. Any of the command line arguments that actually take an argument can be placed in this file. E.g.:

username=benben

password=ben

file=/Users/admin/Documents/CLTest/exampleOne/

host=localhost:8080

There is one caveat to be aware of: the file property can have multiple values, but they must be comma separated. E.g.:

file=/Users/admin/Documents/CLTest/exampleOne/,/Users/admin/Documents/CLTest/exampleTwo/
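The interplay between the config file, the comma-separated file property and command line overrides can be sketched as below. This is only an approximation: the parser handles plain key=value lines, a small subset of the Java properties format, and the names load_config, file_list and effective_options are hypothetical.

```python
def load_config(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


def file_list(config):
    """Split the comma-separated 'file' property into individual paths."""
    return [p.strip() for p in config.get("file", "").split(",") if p.strip()]


def effective_options(config, command_line):
    """Values given on the command line override those read from the file."""
    merged = dict(config)
    merged.update(command_line)
    return merged
```

For instance, effective_options(load_config(text), {"host": "115.146.92.150:8080"}) keeps the username, password and file values from the file but replaces the host.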

Logging

The tool makes use of the log4j library to record what it is doing.

If launched with the -log command line option, any existing configuration will be ignored, the logger will be set to the INFO level, and all output will be directed to the console.

More on the log4j library and configuring it can be found here: http://logging.apache.org/log4j/1.2/manual.html

Restful web calls

If RESTful web calls are made directly, rather than through the command line tool, the repository returns an XML fragment of the form:

<apiResult>

<status>created</status>

<recordId>989</recordId>

<message>created:989</message>

</apiResult>

In this fragment all of the elements are optional.

The recordId element is the database id of the record affected.

The message is simply extra text: it's useful in debugging problems.

The status will be one of:

  • success

  • created

  • gone

  • updated

  • notfound

  • unauthorized

  • badrequest

  • unknownerror

  • notallowed
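A client calling the RESTful interface directly might read such a fragment along these lines. This is a sketch under stated assumptions: the function name parse_api_result is hypothetical, the elements are assumed to carry no XML namespace (as in the fragment shown), and every element is treated as optional.

```python
import xml.etree.ElementTree as ET

# The status values listed above.
KNOWN_STATUSES = {"success", "created", "gone", "updated", "notfound",
                  "unauthorized", "badrequest", "unknownerror", "notallowed"}


def parse_api_result(xml_text):
    """Extract status, record id and message; missing elements become None."""
    root = ET.fromstring(xml_text)

    def text_of(tag):
        element = root.find(tag)
        return element.text if element is not None else None

    record_id = text_of("recordId")
    return {
        "status": text_of("status"),
        "record_id": int(record_id) if record_id is not None else None,
        "message": text_of("message"),
    }
```

A caller could then check result["status"] in KNOWN_STATUSES before deciding how to react to the response.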

Examples

Example 1: Create a project via the command line

Title & description are mandatory fields.

project.xml (contained in the directory /Users/admin/Documents/CLTest/exampleOne/):

<?xml version="1.0" encoding="utf-8"?>

<tdar:project xmlns:tdar="http://www.tdar.org/namespace"

 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

 xsi:schemaLocation="http://core.tdar.org/schema/current">

 <tdar:description>This is a test project imported from the command line.</tdar:description>

 <tdar:title>Command line test project</tdar:title>

</tdar:project>

Command line:

java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool -http -username benben -password ben -file /Users/admin/Documents/CLTest/exampleOne/ -host localhost:8080

Example 2: Create a project using a config file

project.xml (contained in the directory /Users/admin/Documents/CLTest/exampleOne/):

<?xml version="1.0" encoding="utf-8"?>

<tdar:project xmlns:tdar="http://www.tdar.org/namespace"

 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

 xsi:schemaLocation="http://core.tdar.org/schema/current">

 <tdar:description>This is a test project imported from the command line.</tdar:description>

 <tdar:title>Command line test project</tdar:title>

</tdar:project>

input.properties (contained in the same directory as the command line tool):

username=benben

password=ben

file=/Users/admin/Documents/CLTest/exampleOne/

host=localhost:8080

Command line:

java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool -http -config input.properties

Example 3: Overriding a value in a config file

project.xml (contained in the directory /Users/admin/Documents/CLTest/exampleOne/):

<?xml version="1.0" encoding="utf-8"?>

<tdar:project xmlns:tdar="http://www.tdar.org/namespace"

 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

 xsi:schemaLocation="http://core.tdar.org/schema/current">

 <tdar:description>This is a test project imported from the command line.</tdar:description>

 <tdar:title>Command line test project</tdar:title>

</tdar:project>

input.properties (contained in the same directory as the command line tool):

username=benben

password=ben

file=/Users/admin/Documents/CLTest/exampleOne/

host=localhost:8080

Command line:

java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool -http -config input.properties -host=115.146.92.150:8080

In this case the host is an entirely different machine.

Example 4: Uploading multiple files from the command line

input.properties (contained in the same directory as the command line tool):

username=benben

password=ben

file=/Users/admin/Documents/CLTest/exampleFour/One,/Users/admin/Documents/CLTest/exampleFour/Two

host=localhost:8080

projectid=6838

Note the multiple file entries separated by a comma.

Were this to be run from the command line, it would be:

java -cp ../ROOT/WEB-INF/lib/*:. org.tdar.utils.CommandLineAPITool -http -username benben -password ben -file /Users/admin/Documents/CLTest/exampleFour/One -file /Users/admin/Documents/CLTest/exampleFour/Two -host localhost:8080 -projectid 6838

In the first directory there is an image “melbourne.jpg”, and the file melbourne.xml.

In the second directory there is an image “sorrento.jpg”, and the file sorrento.xml.

The contents of melbourne.xml are:

<?xml version="1.0" encoding="utf-8"?>

<tdar:image xmlns:tdar="http://www.tdar.org/namespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://localhost:8180/schema/current schema.xsd">

 <tdar:description>A transport nightmare</tdar:description>

 <tdar:resourceType>IMAGE</tdar:resourceType>

 <tdar:title>Melbourne</tdar:title>

 <tdar:date>2012</tdar:date>

 <tdar:dateNormalized>2012</tdar:dateNormalized>

 <tdar:externalReference>false</tdar:externalReference>

 <tdar:inheritingCollectionInformation>false</tdar:inheritingCollectionInformation>

 <tdar:inheritingCulturalInformation>false</tdar:inheritingCulturalInformation>

 <tdar:inheritingIdentifierInformation>false</tdar:inheritingIdentifierInformation>

 <tdar:inheritingInvestigationInformation>false</tdar:inheritingInvestigationInformation>

 <tdar:inheritingMaterialInformation>false</tdar:inheritingMaterialInformation>

 <tdar:inheritingNoteInformation>false</tdar:inheritingNoteInformation>

 <tdar:inheritingOtherInformation>false</tdar:inheritingOtherInformation>

 <tdar:inheritingSiteInformation>false</tdar:inheritingSiteInformation>

 <tdar:inheritingSpatialInformation>false</tdar:inheritingSpatialInformation>

 <tdar:inheritingTemporalInformation>false</tdar:inheritingTemporalInformation>

 <tdar:relatedDatasetData/>

</tdar:image>


The contents of the file sorrento.xml are:

<?xml version="1.0" encoding="utf-8"?>

<tdar:image xmlns:tdar="http://www.tdar.org/namespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://localhost:8180/schema/current schema.xsd">

 <tdar:description>A dream to visit</tdar:description>

 <tdar:resourceType>IMAGE</tdar:resourceType>

 <tdar:title>Sorrento</tdar:title>

 <tdar:date>2012</tdar:date>

 <tdar:dateNormalized>2012</tdar:dateNormalized>

 <tdar:externalReference>false</tdar:externalReference>

 <tdar:inheritingCollectionInformation>false</tdar:inheritingCollectionInformation>

 <tdar:inheritingCulturalInformation>false</tdar:inheritingCulturalInformation>

 <tdar:inheritingIdentifierInformation>false</tdar:inheritingIdentifierInformation>

 <tdar:inheritingInvestigationInformation>false</tdar:inheritingInvestigationInformation>

 <tdar:inheritingMaterialInformation>false</tdar:inheritingMaterialInformation>

 <tdar:inheritingNoteInformation>false</tdar:inheritingNoteInformation>

 <tdar:inheritingOtherInformation>false</tdar:inheritingOtherInformation>

 <tdar:inheritingSiteInformation>false</tdar:inheritingSiteInformation>

 <tdar:inheritingSpatialInformation>false</tdar:inheritingSpatialInformation>

 <tdar:inheritingTemporalInformation>false</tdar:inheritingTemporalInformation>

 <tdar:relatedDatasetData/>

</tdar:image>

Setting up the directory structure

Some more notes on setting up the directory structure:

Structure 1, a directory with multiple record files (no attachments)

/tmp/separateRecords/record1.xml

/tmp/separateRecords/record2.xml

What happens – each file is uploaded separately as a record into tDAR. If there are other files that are not XML records, they are ignored.

Structure 2, a directory with a single record file and one or more attachments

/tmp/singleRecord/record1.xml

/tmp/singleRecord/img1.jpg

/tmp/singleRecord/img2.jpg

What happens – record1 is uploaded, and img1 and img2 are uploaded and associated with it.

Structure 3, a degenerate case

/tmp/bad/record1.xml

/tmp/bad/img1.jpg

/tmp/bad/record2.xml

What happens – record1 and record2 are uploaded; the image is ignored, as the API Tool doesn't know what to do with it.