Collect

This action retrieves documents from a repository and sends the documents to a specified location. You can save the documents to disk or add them to the ingest queue. You must specify the documents that you want to collect by their identifiers.

Type: Asynchronous

Parameter Name Description Required
CollectActions

A list of actions to perform on documents before they are transferred to their destination. The list is processed from left to right. Specify the actions in the form action:parameters. The available actions are:

  • META. Add a custom field to the document.
    META:Fieldname=FieldValue
  • ZIP. Add the document to a zip file.
    ZIP:Filename[:Password]
  • LUA. Run a Lua script on the document.
    LUA:Luascript

For example, to add a field CATEGORY=FILESYSTEM to every document, zip all documents with a password, and add a field COLLECTTIME=1234567890 to the zip, specify the collect action as:

CollectActions=META:CATEGORY=FILESYSTEM,ZIP:Output.zip:password,
META:COLLECTTIME=1234567890

Escape any commas in the action parameters with a backslash (\).

No
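The action:parameters syntax and the comma-escaping rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the connector: the helper names escape_commas and build_collect_actions are hypothetical, and the sketch assumes that only commas need escaping, as the text states.

```python
def escape_commas(value):
    """Escape commas with a backslash, as required for action parameters."""
    return value.replace(",", "\\,")


def build_collect_actions(actions):
    """Join (action, parameters) pairs into a CollectActions value.

    The pairs are emitted in list order, which corresponds to the
    left-to-right processing order described above.
    """
    return ",".join(f"{name}:{escape_commas(params)}" for name, params in actions)


# Reproduces the example from the text: tag each document, zip with a
# password, then add a field to the zip.
collect_actions = build_collect_actions([
    ("META", "CATEGORY=FILESYSTEM"),
    ("ZIP", "Output.zip:password"),
    ("META", "COLLECTTIME=1234567890"),
])
print(collect_actions)
```

When the value is sent as part of a request URL, it would normally also need URL-encoding.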
Config

A base-64 encoded configuration. The configuration parameters that are set override the same parameters in the connector's configuration file.

No
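As a rough sketch of preparing a Config value, the following Python encodes a configuration fragment as base-64 and then URL-encodes the result. The section and parameter names are hypothetical placeholders, and the sketch assumes UTF-8 text and the standard base-64 alphabet, neither of which is confirmed by the text above.

```python
import base64
from urllib.parse import quote


def encode_config(config_text):
    """Base-64 encode a configuration fragment for the Config parameter.

    Assumes UTF-8 text and the standard base-64 alphabet. The result is
    URL-encoded because base-64 output can contain '+', '/' and '='.
    """
    encoded = base64.b64encode(config_text.encode("utf-8")).decode("ascii")
    return quote(encoded, safe="")


# Hypothetical section and parameter names, for illustration only.
config_text = "[MySection]\nMyParameter=Value\n"
config_param = encode_config(config_text)
print(config_param)
```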
Destination

The output destination as a UNC Path. If you don't set a destination, the documents are added to the ingest queue.

The parameter can use fields from the document or identifier to construct the resulting destination for each document.

To add a document field value as part of the destination, use the tag <DOC:FIELDNAME> within the string. To add an identifier field value as part of the destination, use the tag <ID:FIELDNAME> within the string. For example:

destination=\\server\share\<ID:SOURCE>\<DOC:OWNER>

If a field has multiple values, or its value is a comma-separated list, one destination is created for each value and each destination receives a copy of the document. To indicate that a field's value is a separated list, place the list separator character immediately before the colon in the tag. For example: <ID,:SOURCE>.

No
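To make the tag substitution concrete, here is an illustrative Python model of how a Destination template might expand. It is not the connector's implementation: the function name, the dictionary-based field lookup, and the choice to produce no destination when a field is missing are all assumptions made for this sketch.

```python
import itertools
import re

# Matches <ID:FIELD>, <DOC:FIELD>, and the separator form such as <ID,:FIELD>.
TAG = re.compile(r"<(ID|DOC)([^:>]?):([^>]+)>")


def expand_destination(template, id_fields, doc_fields):
    """Return every destination that the template yields for one document.

    id_fields and doc_fields map a field name to a list of values.
    A tag with a separator splits each value on that separator.
    """
    matches = list(TAG.finditer(template))
    value_lists = []
    for match in matches:
        scope, separator, name = match.groups()
        source = id_fields if scope == "ID" else doc_fields
        values = source.get(name, [])  # assumption: missing field gives no destination
        if separator:
            values = [part for value in values for part in value.split(separator)]
        value_lists.append(values)

    destinations = []
    # One destination per combination of field values.
    for combination in itertools.product(*value_lists):
        pieces, position = [], 0
        for match, value in zip(matches, combination):
            pieces.append(template[position:match.start()])
            pieces.append(value)
            position = match.end()
        pieces.append(template[position:])
        destinations.append("".join(pieces))
    return destinations


destinations = expand_destination(
    r"\\server\share\<ID:SOURCE>\<DOC:OWNER>",
    id_fields={"SOURCE": ["DIR1"]},
    doc_fields={"OWNER": ["alice", "bob"]},  # hypothetical field values
)
print(destinations)
```

In this model, a document whose OWNER field has two values is copied to two destinations, one per owner.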
failedDirectory

The directory in which the action reports failures.

No
Identifiers

A comma-separated list of identifiers to specify the documents to collect.

Required unless IdentifiersXML is set.
IdentifiersXML

This parameter can be specified instead of the Identifiers parameter. It specifies identifiers of the documents to collect, along with a set of custom metadata to associate with each collected document.

The data must be provided in XML format as below:

<identifiersXML>
  <identifier value="[AUTN_IDENTIFIER1]">
     <metadata name="[CustomField1]"  value="[CustomFieldValue1_1]"/>
     <metadata name="[CustomField1]"  value="[CustomFieldValue1_2]"/>
     <!-- ... -->
  </identifier>
  <identifier value="[AUTN_IDENTIFIER2]">
     <metadata name="[CustomField1]"  value="[CustomFieldValue2_1]"/>
     <!-- ... -->
  </identifier>
  <!-- ... -->
</identifiersXML>

Required unless Identifiers is set.
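The XML structure above can be generated rather than written by hand. The following Python sketch uses the standard library's xml.etree.ElementTree, which escapes special characters in attribute values. The function name and the sample identifiers and field names are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_identifiers_xml(documents):
    """Build an IdentifiersXML value.

    documents is a list of (identifier, metadata) pairs, where metadata is
    a list of (name, value) pairs. A name may repeat to give a field
    several values, as in the structure shown above.
    """
    root = ET.Element("identifiersXML")
    for identifier, metadata in documents:
        node = ET.SubElement(root, "identifier", {"value": identifier})
        for name, value in metadata:
            ET.SubElement(node, "metadata", {"name": name, "value": value})
    return ET.tostring(root, encoding="unicode")


# Placeholder identifiers and custom fields, for illustration only.
identifiers_xml = build_identifiers_xml([
    ("IDENTIFIER_1", [("CustomField1", "Value1_1"), ("CustomField1", "Value1_2")]),
    ("IDENTIFIER_2", [("CustomField1", "Value2_1")]),
])
print(identifiers_xml)
```

The resulting string would presumably need URL-encoding before it is placed in a request.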
Override_Config_Parameters

Any other action parameters that you set override settings in the connector's configuration file. For example:

/action=fetch&fetchaction=...
&[Section]Parameter=Value

where [Section] (optional) is the name of a configuration file section, Parameter is the name of a configuration parameter, and Value is the parameter value.

No

Example

http://localhost:1234/action=Fetch&FetchAction=Collect
                                  &Identifiers=...
                                  &Destination=C:\collected
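A request like the one above can be assembled programmatically. The sketch below is one possible approach in Python: it follows the URL shape of the example (parameters appended after /action=Fetch with ampersands) and percent-encodes each value. The function name and the identifier values are hypothetical, and the host and port are simply those of the example.

```python
from urllib.parse import quote, urlencode


def build_collect_url(base_url, identifiers, destination=None, collect_actions=None):
    """Assemble a Collect request URL in the style of the example above."""
    params = [
        ("FetchAction", "Collect"),
        ("Identifiers", ",".join(identifiers)),
    ]
    # Optional parameters are included only when they are supplied.
    if destination is not None:
        params.append(("Destination", destination))
    if collect_actions is not None:
        params.append(("CollectActions", collect_actions))
    # quote_via=quote percent-encodes spaces as %20 rather than '+'.
    return f"{base_url}/action=Fetch&" + urlencode(params, quote_via=quote)


url = build_collect_url(
    "http://localhost:1234",
    ["ID1", "ID2"],  # placeholder identifiers
    destination="C:\\collected",
)
print(url)
```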

Response

As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below.

In this example, the identifiers for both documents appear between <success> tags showing that they were collected successfully. The documents were output to C:\collected along with stub files containing their metadata.

<action>
  <documentcounts>
    <documentcount
       added="0"          collected="2"       deleted="0"
       errors="0"         holds="0"           ingestadded="0"
       ingestdeleted="0"  ingestfailed="0"    ingestupdated="0"
       inserted="0"       releasedholds="0"   seen="0"
       task="DIR1"        unchanged="0"       updated="0"/>
  </documentcounts>
  <fetchaction>COLLECT</fetchaction>
  <tasks>
    <success>
      PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4=
    </success>
    <success>
      PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQiLz4=
    </success>
  </tasks>
  <token>MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTI2NTE0MTI5NA==</token>
  <status>Finished</status>
  <queued_time>2009-Oct-15 16:02:53</queued_time>
  <time_in_queue>0</time_in_queue>
  <process_start_time>2009-Oct-15 16:02:53</process_start_time>
  <time_processing>0</time_processing>
  <process_end_time>2009-Oct-15 16:02:53</process_end_time>
</action>

If a document cannot be collected successfully, the document identifier appears between <failed> tags and a message explains the reason for the failure:

<tasks>
  <failed message="Error message">
    PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQiLz4=
  </failed>
</tasks>
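A client can separate collected documents from failures by reading the <success> and <failed> elements. The Python sketch below parses a shortened response assembled from the two samples above. The function names are hypothetical. The identifiers in these samples happen to decode as base-64 text, which the decode helper relies on; that is an observation about the samples rather than documented behaviour.

```python
import base64
import xml.etree.ElementTree as ET


def summarize_collect_response(response_xml):
    """Split a Collect response into collected and failed identifiers."""
    root = ET.fromstring(response_xml)
    collected = [node.text.strip() for node in root.iter("success")]
    failed = [
        (node.text.strip(), node.get("message", ""))
        for node in root.iter("failed")
    ]
    return {
        "status": root.findtext(".//status"),
        "collected": collected,
        "failed": failed,
    }


def decode_identifier(identifier):
    """Decode an identifier, assuming it is base-64 text as in the samples."""
    return base64.b64decode(identifier).decode("utf-8")


# Shortened response combining one success and one failure from the samples.
sample = """<action>
  <fetchaction>COLLECT</fetchaction>
  <tasks>
    <success>
      PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4=
    </success>
    <failed message="Error message">
      PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQiLz4=
    </failed>
  </tasks>
  <status>Finished</status>
</action>"""

summary = summarize_collect_response(sample)
for identifier in summary["collected"]:
    print("collected:", decode_identifier(identifier))
for identifier, message in summary["failed"]:
    print("failed:", decode_identifier(identifier), "-", message)
```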
