Create and Edit Grammar Files

An Eduction grammar defines patterns for matching text in a document. A pattern is a combination of characters and operators. An operator is a sequence of special characters that match text by following the rules associated with the operator.
 
For example:

Smith|John      Match either Smith or John.
[0-9]{3}        Match a sequence of three characters in the range 0 through 9.

In the second example, the square bracket operator [] matches any one of the characters 0 through 9, and the curly brace operator {} repeats the preceding pattern three times.
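The behavior of these three operators (alternation, character range, and repetition) can be tried out quickly outside Eduction. The sketch below uses Python's re module as a stand-in. This is an analogy only: it assumes that these particular operators behave the same way in both syntaxes, and it says nothing about the rest of the Eduction pattern language. The sample text is invented for illustration.

```python
import re

# Python's re module stands in for Eduction here (an assumption made for
# illustration; Eduction has its own pattern engine and syntax).
name = re.compile(r"Smith|John")    # | matches either alternative
digits = re.compile(r"[0-9]{3}")    # [] matches one character in the range, {3} repeats it three times

text = "John Smith, extension 042"  # hypothetical sample text

name_matches = name.findall(text)     # non-overlapping matches of either name
digit_matches = digits.findall(text)  # non-overlapping runs of three consecutive digits

print(name_matches)
print(digit_matches)
```

A string with fewer than three consecutive digits, such as "12", produces no match for the second pattern.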
Grammars are written in XML. The file edk.dtd contains the document type definition (DTD) that specifies the XML that Eduction understands. When you write grammars for Eduction, HP Autonomy recommends that you reference edk.dtd at the start of the XML grammar file with a DOCTYPE declaration, and that you use a DTD-aware XML authoring tool to catch syntax errors and save time. Here is an example of a simple Eduction grammar:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
  <grammar name="mygrammar">
    <entity name="name" type="public">
      <pattern>Smith|John</pattern>
    </entity>
    <entity name="digits" type="public">
      <pattern>[0-9]{3}</pattern>
    </entity>
  </grammar>
</grammars>
 
This grammar defines two entities: mygrammar/name and mygrammar/digits.
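The grammar/entity naming can be checked mechanically. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module, not any Eduction tool, to parse the example grammar and list each entity under its qualified name. The function name qualified_entities is hypothetical. The parser confirms only that the file is well-formed XML; it does not load edk.dtd or validate against it.

```python
import xml.etree.ElementTree as ET

# The example grammar from this section, held as bytes so that the
# encoding declaration in the XML prolog is honored by the parser.
GRAMMAR_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
  <grammar name="mygrammar">
    <entity name="name" type="public">
      <pattern>Smith|John</pattern>
    </entity>
    <entity name="digits" type="public">
      <pattern>[0-9]{3}</pattern>
    </entity>
  </grammar>
</grammars>
"""


def qualified_entities(xml_bytes):
    """Map each 'grammar/entity' name to the text of its first pattern.

    Hypothetical helper for illustration; it is not part of Eduction.
    """
    root = ET.fromstring(xml_bytes)  # raises ParseError if the XML is not well-formed
    result = {}
    for grammar in root.iter("grammar"):
        for entity in grammar.iter("entity"):
            key = f'{grammar.get("name")}/{entity.get("name")}'
            result[key] = entity.findtext("pattern")
    return result


entities = qualified_entities(GRAMMAR_XML)
for qualified_name, pattern in entities.items():
    print(qualified_name, pattern)
```

A check of this kind catches unbalanced tags or a misspelled attribute quote before the grammar is handed to Eduction, but validating against edk.dtd still requires a DTD-aware tool.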