The parse_document_csv function parses a CSV file into documents and calls a function on each document.

This function can handle CSV files with or without a header row. If a header row is not present, set use_header_row to false and provide the field names in csv_field_names.


parse_document_csv( filename, handler [, params ] )


Argument Description
filename (string) The path and file name of the CSV file to parse into documents.
handler (document_handler_function) The function to call on each document that is parsed from the CSV file.
params (table) Optional. A table of named parameters to configure parsing. The table maps parameter names (string) to parameter values. For information about the parameters that you can set, see the following table.

Named Parameters

Named Parameter Description
content_field (string, default DRECONTENT) The name of the field, in the CSV file, to use as the document content.
csv_field_names (string list) A list of names for the fields that exist in the CSV file. This overrides any header row, if one is present.
reference_field (string, default DREREFERENCE) The name of the field, in the CSV file, to use as the document reference.
use_header_row (boolean, default true) Specifies whether the CSV file includes a header row (that is, whether the first row is a list of field names and not values). If this parameter is true and you do not set csv_field_names, the field names in the header row are used as the names of the document fields.
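
The interaction between csv_field_names and use_header_row described above can be modeled in a few lines of plain Lua. This is only an illustrative sketch, not the connector's implementation: resolve_field_names is a hypothetical helper, and what the parser does when neither source of names is available is not stated in this section, so the sketch simply returns nil in that case.

```lua
-- Illustrative model only: this is NOT the connector's implementation.
-- resolve_field_names is a hypothetical helper that mirrors the documented
-- precedence between csv_field_names and use_header_row.
function resolve_field_names(first_row, params)
  params = params or {}

  -- use_header_row defaults to true when it is not set
  local use_header_row = params.use_header_row
  if use_header_row == nil then
    use_header_row = true
  end

  if params.csv_field_names then
    -- explicit names override any header row
    return params.csv_field_names
  elseif use_header_row then
    -- no explicit names: take them from the first row
    return first_row
  end

  -- no names available; the behavior in this case is an assumption
  return nil
end

local header = {"item_id", "title", "body"}

-- Header row used as-is
print(table.concat(resolve_field_names(header, {}), ","))

-- Explicit names take priority over the header row
print(table.concat(resolve_field_names(header, {
  csv_field_names = {"DREREFERENCE", "title", "DRECONTENT"}
}), ","))
```

The first call prints the header names unchanged; the second prints the names supplied in csv_field_names.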


The following example parses a CSV file named data.csv, and calls the function documentHandler on each document. The values in the field item_id become document references and the values in the field body become document content.

function documentHandler(document)
  -- do something with the document
end

parse_document_csv("./data.csv", documentHandler, {
        reference_field="item_id",
        content_field="body"
})

The following example shows how to provide field names when there is no header row in the CSV file:

parse_document_csv("./data_no_header.csv", documentHandler, {
        use_header_row=false,
        csv_field_names={"DREREFERENCE", "title", "modified", "DRECONTENT"}
})
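
Because the handler is an ordinary Lua function, it may be convenient to build it as a closure so that it can keep state between documents. The following is a minimal sketch under that assumption: makeCountingHandler is a hypothetical helper, not part of the connector API, and the loop at the end only simulates the parser by calling the handler with placeholder tables.

```lua
-- makeCountingHandler is a hypothetical helper, not part of the connector API.
-- It returns a handler function plus a getter for the running document count.
function makeCountingHandler()
  local count = 0

  local function handler(document)
    -- 'document' is whatever object the parser passes in; it is not inspected here
    count = count + 1
  end

  local function getCount()
    return count
  end

  return handler, getCount
end

local countingHandler, getCount = makeCountingHandler()

-- Inside the connector, the handler would be passed in the usual way:
-- parse_document_csv("./data.csv", countingHandler, {
--         reference_field="item_id",
--         content_field="body"
-- })

-- Outside the connector, simulate three parsed documents with placeholder tables.
for i = 1, 3 do
  countingHandler({})
end

print(getCount())  -- 3
```

Each call to makeCountingHandler produces an independent counter, so separate CSV files can be tallied separately.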