Create Coding Files

The coding files are simple text files that describe the property and entity codes in your Fact Bank system. They also define any aliases for the entities and properties, and map each alias to the same code as the name that it stands for.

A Fact Bank system requires four coding files:

code_to_property.txt
property_to_code.txt
code_to_entity.txt
entity_to_code.txt

The following sections use a simple example to show how to create the coding files from your data.

Example Data

The example data starts with facts, organized in a table. This version uses CSV format:

product_name, color, buy_price, sell_price, sold_last_year
alpha, red, 10, 12, 3500000
beta, blue, 11, 13, 2000000
gamma, green, 9, 10, 1000000

For this example, you might want to be able to answer questions such as:

Generate the Property Code Files

The properties in your data are the values that you want to find in the Fact Bank. For a table like the one in this example, the properties are the columns in the table.

The code_to_property.txt coding file assigns a unique code to each property. This coding file also defines the canonical human-readable name for the property, and sets its type (string or time). Values of time properties must be in the ISO format YYYY-MM-DDTHH:NN:SS.

NOTE:

If your data values contain punctuation characters, such as commas (,) and equals signs (=), you must percent-encode the value in the coding files. For example, use %3D for an equals sign.
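As an illustration of the encoding described in this note, the following Python sketch percent-encodes a value before you write it to a coding file. The helper name encode_value and the exact set of escaped characters (comma, equals sign, and the percent sign itself) are assumptions made for this example; the note names only commas and equals signs as examples of characters to encode.

```python
# Characters that this sketch escapes. Comma and equals sign come from the
# note above; the percent sign is an assumption, added so that encoded
# values stay unambiguous.
RESERVED = ",=%"


def encode_value(value: str) -> str:
    """Percent-encode only the characters listed in RESERVED."""
    return "".join(
        f"%{ord(ch):02X}" if ch in RESERVED else ch
        for ch in value
    )


print(encode_value("a=b"))        # a%3Db
print(encode_value("red,dark"))   # red%2Cdark
```

Spaces are left as they are in this sketch, because the sample coding files in this section contain names such as buying price with an unencoded space.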

By convention, when you use an IDOL Content component as your Fact Store, simple properties have a code that begins with S. This prefix is not required if you use a SQL backend. For information about what to do when a property is also an entity, see More Complicated Data.

For example, the following sample is the code_to_property.txt file for the example data in the previous section.

SPRODUCT_NAME=product,string
SCOLOR=color,string
SBUY_PRICE=buying price,string
SSELL_PRICE=selling price,string
SSOLD_LAST_YEAR=sold last year,string
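If you have many columns, you might prefer to generate these lines from the CSV header rather than write them by hand. The following Python sketch shows one way to do that. The function code_to_property_lines, the CANONICAL_NAMES dictionary, and the rule of building each code from the upper-cased column name are assumptions for this example, not part of the product.

```python
import csv
import io

CSV_TEXT = """product_name, color, buy_price, sell_price, sold_last_year
alpha, red, 10, 12, 3500000
"""

# Hypothetical canonical names. Columns that are not listed fall back to
# the column name with underscores replaced by spaces.
CANONICAL_NAMES = {
    "product_name": "product",
    "buy_price": "buying price",
    "sell_price": "selling price",
}


def code_to_property_lines(csv_text, names, prefix="S", prop_type="string"):
    """Build one CODE=name,type line for each column in the CSV header."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = [column.strip() for column in next(reader)]
    lines = []
    for column in header:
        name = names.get(column, column.replace("_", " "))
        lines.append(f"{prefix}{column.upper()}={name},{prop_type}")
    return lines


print("\n".join(code_to_property_lines(CSV_TEXT, CANONICAL_NAMES)))
```

With the example data, this prints the same five lines as the sample file above. Every property is given the string type here; a time property would need its type set separately.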

The property_to_code.txt coding file contains the inverse mapping of the code_to_property.txt file, without the type information. You can also include aliases for a value, on a separate line.

For example, the following sample is the property_to_code.txt file for the example data. It includes the alias purchase price for the SBUY_PRICE code, and the alias sale price for the SSELL_PRICE code.

product=SPRODUCT_NAME
color=SCOLOR
buying price=SBUY_PRICE
purchase price=SBUY_PRICE
selling price=SSELL_PRICE
sale price=SSELL_PRICE
sold last year=SSOLD_LAST_YEAR
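Because this file is the inverse of code_to_property.txt plus aliases, you can derive it mechanically. The following Python sketch is one possible approach; the function property_to_code_lines and the ALIASES dictionary are hypothetical names used only for this example.

```python
CODE_TO_PROPERTY = [
    "SPRODUCT_NAME=product,string",
    "SCOLOR=color,string",
    "SBUY_PRICE=buying price,string",
    "SSELL_PRICE=selling price,string",
    "SSOLD_LAST_YEAR=sold last year,string",
]

# Hypothetical alias table, keyed by property code.
ALIASES = {
    "SBUY_PRICE": ["purchase price"],
    "SSELL_PRICE": ["sale price"],
}


def property_to_code_lines(code_to_property, aliases):
    """Invert CODE=name,type lines to name=CODE and append any aliases."""
    lines = []
    for line in code_to_property:
        code, rest = line.split("=", 1)
        name = rest.rsplit(",", 1)[0]  # drop the trailing type
        lines.append(f"{name}={code}")
        for alias in aliases.get(code, []):
            lines.append(f"{alias}={code}")  # each alias on its own line
    return lines


print("\n".join(property_to_code_lines(CODE_TO_PROPERTY, ALIASES)))
```

With the example data, the output matches the sample property_to_code.txt file above, with each alias written directly after the canonical name for its code.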

Generate the Entity Code Files

The entities are the things that you want to find the property values for. For the example table, the obvious choice is the product_name.

The code_to_entity.txt coding file assigns a unique code to each entity. When you use an IDOL Content component as your Fact Store, all your entity codes must start with a Q to match the Fact Store configuration. This prefix is not required if you use a SQL backend.

NOTE:

If your data values contain punctuation characters, such as commas (,) and equals signs (=), you must percent-encode the value in the coding files. For example, use %3D for an equals sign.

For example, the following sample is the code_to_entity.txt file for the example data.

QALPHA=alpha
QBETA=beta
QGAMMA=gamma
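The entity codes can also be generated from the data. The following Python sketch reads the entity column from the example CSV and builds one line for each distinct entity. The function code_to_entity_lines and the rule for building a code (Q followed by the upper-cased name, with spaces replaced by underscores) are assumptions for this example; any scheme that yields unique codes would serve.

```python
import csv
import io

CSV_TEXT = """product_name, color, buy_price, sell_price, sold_last_year
alpha, red, 10, 12, 3500000
beta, blue, 11, 13, 2000000
gamma, green, 9, 10, 1000000
"""


def code_to_entity_lines(csv_text, entity_column="product_name", prefix="Q"):
    """Build one CODE=name line for each distinct value in the entity column."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    lines = []
    seen = set()
    for row in reader:
        name = row[entity_column].strip()
        code = prefix + name.upper().replace(" ", "_")
        if code not in seen:  # write each entity only once
            seen.add(code)
            lines.append(f"{code}={name}")
    return lines


print("\n".join(code_to_entity_lines(CSV_TEXT)))
```

With the example data, this prints the same three lines as the sample file above.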

The entity_to_code.txt coding file contains the inverse mapping of the code_to_entity.txt file. You can also include aliases for the entity names, each on a separate line.

For example, the following sample is the entity_to_code.txt file for the example data. It includes the alias alpha one for the QALPHA code.

alpha=QALPHA
beta=QBETA
gamma=QGAMMA
alpha one=QALPHA
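To check that an alias and its canonical name resolve to the same code, you can load the file into a lookup table. The following Python sketch is a sanity check only, and does not describe how the Fact Bank itself reads the file. The function load_name_to_code is a hypothetical helper.

```python
ENTITY_TO_CODE = """alpha=QALPHA
beta=QBETA
gamma=QGAMMA
alpha one=QALPHA
"""


def load_name_to_code(text):
    """Parse name=CODE lines and reject a name that maps to two codes."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, code = line.rsplit("=", 1)
        if mapping.get(name, code) != code:
            raise ValueError(
                f"{name!r} maps to both {mapping[name]} and {code}"
            )
        mapping[name] = code
    return mapping


codes = load_name_to_code(ENTITY_TO_CODE)
print(codes["alpha"], codes["alpha one"])  # QALPHA QALPHA
```

The same check applies to property_to_code.txt, because both files use the name=CODE layout.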

More Complicated Data

The examples in this section are for very simple data. In some cases, however, you might have more complex data where you have multiple entities in a row, or an entity that is also a property.

For example, if you have a database of company information, including board members and executives, you might have a table of the following form:

company_name,CEO,share_price
MyCompany,Jane Smith,12.00

In this case, you can use company_name as your entity, with CEO and share_price as the properties of that company. This allows you to answer questions such as Who is the CEO of MyCompany?

However, CEO is also an entity. You might want to ask questions such as Which company is Jane Smith CEO of?

In this case, you define CEO as a property in the code_to_property.txt and property_to_code.txt files, but use a code value that starts with a Q.

For example, in the code_to_property.txt file:

SSHARE_PRICE=share price,string
QCEO=CEO,string

You then use the QCEO code in the Fact Store content as the property code. For more information about how to generate and index the content, see Generate the Fact Store Data.

If you are using a SQL database as your Fact Store, the Q prefix is not required, and you define the property as normal.
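The prefix rule for an IDOL Content component Fact Store can be sketched in Python as follows. The function property_lines and its parameters are hypothetical, and the sketch covers only the Content case: a property that is also an entity gets a code that starts with Q, and every other property gets a code that starts with S.

```python
def property_lines(names, entity_properties, prop_type="string"):
    """Build code_to_property.txt lines, using Q for entity-valued properties."""
    lines = []
    for column, name in names.items():
        prefix = "Q" if column in entity_properties else "S"
        lines.append(f"{prefix}{column.upper()}={name},{prop_type}")
    return lines


# Column name -> canonical name, for the company table above.
NAMES = {"share_price": "share price", "CEO": "CEO"}

print("\n".join(property_lines(NAMES, entity_properties={"CEO"})))
```

With the company example, this prints SSHARE_PRICE=share price,string followed by QCEO=CEO,string, which matches the sample above.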

