TextToDocs Import Task Parameters
The parameters in this section are used to customise the TextToDocs Import Task.
FilenameMatchesRegex<N>
The FilenameMatchesRegex<N> parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions. If the file name does not match at least one of the regular expressions, the file is not included.
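The inclusion rule can be pictured with a short Python sketch. It only illustrates the "match at least one expression" logic. The function name filename_is_included, the sample patterns, and the use of Python's re.search in place of the server's own regular expression engine are all assumptions.

```python
import re

def filename_is_included(filename, patterns):
    """Return True when the file name matches at least one pattern.

    Assumption: an empty pattern list means no restriction is applied.
    """
    if not patterns:
        return True
    return any(re.search(p, filename) for p in patterns)

# Hypothetical values for FilenameMatchesRegex0 and FilenameMatchesRegex1.
patterns = [r".*\.txt$", r".*\.log$"]
print(filename_is_included("report.txt", patterns))  # True
print(filename_is_included("image.png", patterns))   # False
```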
 
ReferenceMatchesRegex<N>
The ReferenceMatchesRegex<N> parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions. If the document reference does not match at least one of the regular expressions, the file is not included.
 
FieldMatchesName<N>
The FieldMatchesName<N> and FieldMatchesRegex<N> parameters are used to restrict the files processed by a TextToDocs import task.
If the content of the field specified by the FieldMatchesName<N> parameter does not match the regular expression in the FieldMatchesRegex<N> parameter, the file is not included.
If more than one pair of FieldMatchesName<N> and FieldMatchesRegex<N> parameters is defined, every named field must match its corresponding regular expression, or the file is not included.
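As a rough illustration of how several name and expression pairs combine, the following Python sketch accepts a file only when every pair matches. The function name fields_allow_file, the sample field names, and the treatment of a missing field as a failed match are assumptions, not documented behavior.

```python
import re

def fields_allow_file(fields, pairs):
    """Return True only if every (field name, pattern) pair matches.

    Assumption: a field that is missing from the document counts as a
    failed match.
    """
    for name, pattern in pairs:
        value = fields.get(name)
        if value is None or not re.search(pattern, value):
            return False
    return True

# Hypothetical pairs, standing in for FieldMatchesName0/FieldMatchesRegex0
# and FieldMatchesName1/FieldMatchesRegex1.
pairs = [("AUTHOR", r"^Smith"), ("LANGUAGE", r"^en")]

print(fields_allow_file({"AUTHOR": "Smith, J.", "LANGUAGE": "en-GB"}, pairs))  # True
print(fields_allow_file({"AUTHOR": "Smith, J.", "LANGUAGE": "fr"}, pairs))     # False
```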
 
FieldMatchesRegex<N>
The FieldMatchesName<N> and FieldMatchesRegex<N> parameters are used to restrict the files processed by a TextToDocs import task.
If the content of the field specified by the FieldMatchesName<N> parameter does not match the regular expression in the FieldMatchesRegex<N> parameter, the file is not included.
If more than one pair of FieldMatchesName<N> and FieldMatchesRegex<N> parameters is defined, every named field must match its corresponding regular expression, or the file is not included.
 
ContentContainsRegex<N>
The ContentContainsRegex<N> parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions.
If the document content does not match all of the regular expressions that are defined, the file is not included.
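Unlike the file name and reference filters, this filter requires every expression to match. A minimal Python sketch of that "all of" rule follows. The function name content_is_included and the sample patterns are hypothetical, and Python's re.search stands in for the server's regular expression engine.

```python
import re

def content_is_included(content, patterns):
    """Return True only when every pattern is found in the content."""
    return all(re.search(p, content) for p in patterns)

# Hypothetical values for ContentContainsRegex0 and ContentContainsRegex1.
patterns = [r"invoice", r"\d{4}-\d{2}-\d{2}"]
print(content_is_included("invoice dated 2024-01-31", patterns))  # True
print(content_is_included("invoice with no date", patterns))      # False
```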
 
MainRangeRegex<N>
The MainRangeRegex<N> parameter is used to define the main part of a document. The main part of the document includes the content and all of the fields that are extracted to the main document. By default, the main part is the entire document.
This parameter accepts one or more regular expressions. The regular expressions can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated.
For example, to define the main part of the document as all content that is enclosed by <html> </html> tags, set the parameter to:
MainRangeRegex0=<html>(.*)</html>
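To see what the example expression selects, here is a small Python sketch that applies it to a sample string. The function name extract_range and the sample text are hypothetical. The sketch assumes that the sub-match, when present, is what gets kept, and that multiple matches are joined with no separator. Neither detail is stated above.

```python
import re

def extract_range(text, patterns):
    """Concatenate the sub-match (or whole match) of every pattern match."""
    parts = []
    for pattern in patterns:
        for m in re.finditer(pattern, text):
            # Use the first sub-match when there is one, else the whole match.
            parts.append(m.group(1) if m.groups() else m.group(0))
    return "".join(parts)

text = "header junk<html><p>Hello</p></html>footer junk"
print(extract_range(text, [r"<html>(.*)</html>"]))  # <p>Hello</p>
```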
 
MainContentRegex<N>
The MainContentRegex<N> parameter is used to define the content that is extracted as the main document content. The content must be located within the range defined by the MainRangeRegex parameter.
This parameter accepts one or more regular expressions. The regular expression can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated (separated by new line characters).
For example, to define the main document content as all content that is enclosed by <p> </p> tags, set the parameter to:
MainContentRegex0=<p>(.*)</p>
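The newline-separated concatenation can be sketched in Python as follows. The function name extract_content and the sample input are hypothetical. The example places each paragraph on its own line because, in Python, "." does not match a line break by default. The server's regular expression engine may behave differently.

```python
import re

def extract_content(main_range, patterns):
    """Join the first sub-match of every match with newline characters."""
    parts = []
    for pattern in patterns:
        parts.extend(m.group(1) for m in re.finditer(pattern, main_range))
    return "\n".join(parts)

main_range = "<p>First paragraph</p>\n<p>Second paragraph</p>"
print(extract_content(main_range, [r"<p>(.*)</p>"]))
# First paragraph
# Second paragraph
```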
 
MainFieldName<N>
The MainFieldName<N> and MainFieldRegex<N> parameters are used to name and populate a document field within the main document.
The document field named in the MainFieldName<N> parameter is populated by the content identified by the MainFieldRegex<N> parameter. Each pair of parameters produces a single document field.
 
MainFieldRegex<N>
The MainFieldName<N> and MainFieldRegex<N> parameters are used to name and populate a document field within the main document.
The document field named in the MainFieldName<N> parameter is populated by the content identified by the MainFieldRegex<N> parameter. Each pair of parameters produces a single document field.
The data used to populate the field must be within the range defined by the MainRangeRegex parameter.
The regular expressions used in the MainFieldRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated (separated by spaces).
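The space-separated joining of sub-matches might look like this in Python. The function name extract_field, the AUTHOR field, and the sample expression are hypothetical and only illustrate the rule described above.

```python
import re

def extract_field(main_range, field_name, pattern):
    """Return (field_name, value); value joins all sub-matches with spaces."""
    m = re.search(pattern, main_range)
    if m is None:
        return None
    return field_name, " ".join(g for g in m.groups() if g is not None)

main_range = "Author: Jane Smith\nTitle: Quarterly report"
# Hypothetical pair: MainFieldName0=AUTHOR, MainFieldRegex0=Author: (\w+) (\w+)
print(extract_field(main_range, "AUTHOR", r"Author: (\w+) (\w+)"))
# ('AUTHOR', 'Jane Smith')
```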
 
ChildrenRangeRegex<N>
The ChildrenRangeRegex<N> parameter is used to define part of a document that is split into one or more child documents. The part of the document identified should include the content and all of the fields to be extracted into the child documents. By default, this part is the entire document.
This parameter accepts one or more regular expressions. The regular expression can contain sub-matches (enclosed in parentheses). If there are multiple matches, the content is concatenated.
For example, to define the children range as all content that is enclosed by <html> </html> tags, set the parameter to:
ChildrenRangeRegex0=<html>(.*)</html>
 
ChildRangeRegex<N>
The ChildRangeRegex<N> parameter is used to define part of a document that is split into a single child document. The part of the document identified should include the content and all of the fields to be extracted into a single child document. The content must be in the range identified by the ChildrenRangeRegex parameter.
A child document is produced for every match to a regular expression.
The regular expressions used in the ChildRangeRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated.
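The "one child document per match" rule can be sketched in Python. The function name split_children, the <item> markup, and the sample expression are hypothetical. The non-greedy (.*?) is a choice made for this example so that each element is matched separately.

```python
import re

def split_children(children_range, child_pattern):
    """Return one string per match; each would become a child document."""
    return [m.group(1) for m in re.finditer(child_pattern, children_range)]

children_range = "<item>alpha</item><item>beta</item><item>gamma</item>"
# Non-greedy (.*?) so that each <item> element is matched separately.
print(split_children(children_range, r"<item>(.*?)</item>"))
# ['alpha', 'beta', 'gamma']
```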
 
ChildContentRegex<N>
The ChildContentRegex<N> parameter is used to define the content of a single child document.
This parameter accepts one or more regular expressions. The content identified must be within the range defined by the ChildRangeRegex parameter.
The regular expressions used in the ChildContentRegex parameter can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated (separated by new line characters).
For example, to define the child document content as all content that is enclosed by <p> </p> tags, set the parameter to:
ChildContentRegex0=<p>(.*)</p>
 
ChildFieldName<N>
The ChildFieldName<N> and ChildFieldRegex<N> parameters are used to name and populate a document field within a child document.
The document field named in the ChildFieldName<N> parameter is populated by the content identified by the ChildFieldRegex<N> parameter. Each pair of parameters produces a single document field.
 
ChildFieldRegex<N>
The ChildFieldName<N> and ChildFieldRegex<N> parameters are used to name and populate a document field within a child document.
The document field named in the ChildFieldName<N> parameter is populated by the content identified by the ChildFieldRegex<N> parameter. Each pair of parameters produces a single document field.
The data used to populate the field must be within the range defined by the ChildRangeRegex parameter.
The regular expressions used in the ChildFieldRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, they are concatenated (separated by spaces).
 
ChildInheritFields
The ChildInheritFields parameter is used to specify a comma-separated list of field names that are inherited by the child documents from the original (not the main) document.
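A minimal Python sketch of field inheritance follows, with documents modeled as plain dictionaries. The function name inherit_fields and the sample field names are hypothetical. The sketch assumes that a value already present on the child is not overwritten, which is not stated above.

```python
def inherit_fields(original_fields, child_fields, inherit_list):
    """Copy the listed fields from the original document into a child.

    Assumption: a value already set on the child is not overwritten.
    """
    names = [n.strip() for n in inherit_list.split(",") if n.strip()]
    merged = dict(child_fields)
    for name in names:
        if name in original_fields and name not in merged:
            merged[name] = original_fields[name]
    return merged

original = {"AUTHOR": "Jane Smith", "SOURCE": "archive", "SIZE": "1024"}
child = {"TITLE": "Section 1"}
# Hypothetical setting: ChildInheritFields=AUTHOR,SOURCE
print(inherit_fields(original, child, "AUTHOR,SOURCE"))
# {'TITLE': 'Section 1', 'AUTHOR': 'Jane Smith', 'SOURCE': 'archive'}
```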
 
ContentReplaceRegex<N>
The ContentReplaceRegex<N> and ContentReplaceFormat<N> parameters are used to find and replace data in a document.
The data identified by the ContentReplaceRegex<N> parameter is replaced by the string specified in the ContentReplaceFormat<N> parameter. The replacement affects the DRECONTENT of the main and child documents.
 
ContentReplaceFormat<N>
The ContentReplaceRegex<N> and ContentReplaceFormat<N> parameters are used to find and replace data in a document.
The data identified by the ContentReplaceRegex<N> parameter is replaced by the string specified in the ContentReplaceFormat<N> parameter. The replacement affects the DRECONTENT of the main and child documents.
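The find-and-replace step can be approximated in Python with re.sub. The function name replace_content and the sample values are hypothetical. The sketch treats the replacement as a literal string, because the syntax that ContentReplaceFormat<N> accepts is not described here.

```python
import re

def replace_content(drecontent, pattern, replacement):
    """Replace every match of pattern in the content with a fixed string."""
    # A function replacement keeps the string literal, so backslashes in it
    # are not interpreted as Python group references.
    return re.sub(pattern, lambda m: replacement, drecontent)

# Hypothetical pair: ContentReplaceRegex0=\s+ with ContentReplaceFormat0
# set to a single space.
print(replace_content("too    many\t\tspaces", r"\s+", " "))  # too many spaces
```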
 
FieldReplaceName<N>
The FieldReplaceName<N>, FieldReplaceRegex<N>, and FieldReplaceFormat<N> parameters are used to identify and replace data within a document field.
The FieldReplaceName<N> parameter identifies the document field to be searched. This parameter must be followed by FieldReplaceRegex<N> and FieldReplaceFormat<N> parameters.
 
FieldReplaceRegex<N>
The FieldReplaceName<N>, FieldReplaceRegex<N>, and FieldReplaceFormat<N> parameters are used to identify and replace data in a document field.
The FieldReplaceRegex<N> parameter defines the part of the field that should be replaced, using a regular expression.
 
FieldReplaceFormat<N>
The FieldReplaceName<N>, FieldReplaceRegex<N>, and FieldReplaceFormat<N> parameters are used to identify and replace data in a document field.
The FieldReplaceFormat<N> parameter specifies a string value that should replace the data specified in the FieldReplaceRegex<N> parameter.
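How the three parameters work together can be sketched in Python, with the document fields modeled as a dictionary. The function name replace_in_field and the sample fields are hypothetical, and the replacement is treated as a literal string because the format syntax is not described here.

```python
import re

def replace_in_field(fields, field_name, pattern, replacement):
    """Return a copy of fields with matches replaced in one named field only."""
    updated = dict(fields)
    if field_name in updated:
        updated[field_name] = re.sub(
            pattern, lambda m: replacement, updated[field_name]
        )
    return updated

fields = {"PHONE": "555-0100", "NOTE": "call 555-0100"}
# Hypothetical triple: FieldReplaceName0=PHONE, FieldReplaceRegex0=-,
# and FieldReplaceFormat0 set to a single space.
print(replace_in_field(fields, "PHONE", r"-", " "))
# {'PHONE': '555 0100', 'NOTE': 'call 555-0100'}
```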
 
DateFieldName<N>
The DateFieldName<N> and DateFieldFormat<N> parameters are used to identify a date in a document field and replace it with the date in a standard date format.
The DateFieldName<N> parameter is used to identify the field.
 
DateFieldFormat<N>
The DateFieldName<N> and DateFieldFormat<N> parameters are used to identify a date in a document field and replace it with the date in a standard date format.
The DateFieldFormat<N> parameter is used to define the formatting of the date.
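The idea of parsing a field with a known format and rewriting it in a standard form can be sketched in Python. This is only an analogy. The function name normalise_date_field is hypothetical, the %d/%m/%Y codes are Python's strptime syntax and not the syntax that DateFieldFormat<N> accepts, and ISO 8601 is assumed as the "standard" output purely for illustration.

```python
from datetime import datetime

def normalise_date_field(fields, field_name, input_format):
    """Return a copy of fields with one date field rewritten as YYYY-MM-DD."""
    updated = dict(fields)
    if field_name in updated:
        parsed = datetime.strptime(updated[field_name], input_format)
        # Assumption: ISO 8601 stands in for the standard date format.
        updated[field_name] = parsed.strftime("%Y-%m-%d")
    return updated

print(normalise_date_field({"CREATED": "31/01/2024"}, "CREATED", "%d/%m/%Y"))
# {'CREATED': '2024-01-31'}
```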