Tasks Configuration File Sections

The IDOL Speech Server tasks configuration file (speechserver-tasks.cfg) contains the following sections.

[TaskTypes]
[MyTask]
[ModuleName]
[Resources]
[MyLanguage]
[MyFPDB]

For details of these sections and the parameters for each section, see the IDOL Speech Server Reference. The following sections describe the general configuration sections.

[TaskTypes] Section

The [TaskTypes] section lists the tasks that are configured in the IDOL Speech Server. You must create a [MyTask] configuration section for each task type listed in the [TaskTypes] section.

[TaskTypes]

// Speech to text
0=speechToText
1=speechToTextFilter
2=speechToTextTelephony
3=punctuateCtm

// Speaker cluster processing
8=ClusterSpeech
9=ClusterSpeechTel
10=ClusterSpeechToTextTel

// Transcript analysis
11=TranscriptAlign
12=TranscriptCheck
13=Scorer

// Language model building
14=LanguageModelBuild
15=TextNorm

// Speaker identification
16=ivSpkId
17=ivSpkIdFeature
18=ivSpkIdTrain
19=ivSpkIdTrainAudio
20=ivSpkIdDevel
21=ivSpkIdDevelAudio
22=ivSpkIdDevelFinal
23=ivSpkIdSetAdd
24=ivSpkIdSetDelete
25=ivSpkIdEditThresh
26=ivSpkIdInfo

[MyTask] Sections

The [MyTask] sections define configuration options for each IDOL Speech Server audio processing task. You must create a [MyTask] section for each task you have listed in the [TaskTypes] section.

Each section contains details of the schema you use as well as any other parameters required for the task.

[speechToText]
0 = a,ts <- audio(MONO, input)
1 = f  <- frontend(_, a)
2 = nf <- normalizer(_, f)
3 = w1  <- stt(_, nf)
4 = w2  <- postproc(_, w1)
5 = output <- wout(_, w2, ts)
defaultResults = out

[TranscriptAlign]
0 = w <- ctm2(READ, input)
1 = w2 <- align2(ALIGN, w)
2 = output <- wout2(_, w2)
DefaultResults=Out

[ModuleName] Sections

The [ModuleName] configuration sections contain settings for the modules. Create a configuration section for each module that you use in the [MyTask] configuration sections. Each configuration section must have the same name as the module referenced in the task schemas. If you use more than one configuration of a module, create a section for each configuration, including any numerical suffixes.

You can set configuration parameters in the individual module configuration sections to variable values. You can then use these variables to create action parameters, which allow you to specify the value of the configuration parameter when you create a task. You can also point several similar configuration parameters at a single configuration parameter, where you set a standard value once. For details, see Configure Variable Parameters.

[audio]
inputType = $params.inputType
file = $params.file
streamMode = $stt.mode
sampleFrequency = $stt.lang.sampleFrequency
sugdInputFrequency = $params.sugdInputFrequency
sugdInputChannels = $params.sugdInputChannels
startTime = $params.startTime
endTime = $params.endTime
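
As a sketch of the naming rule, the [TranscriptAlign] schema shown earlier references the modules ctm2, align2, and wout2, so the tasks configuration file would need one section with each of those names, numerical suffix included. The section names below come from that schema. The comment lines are placeholders only, because the parameters that belong in each section are not covered here; see the IDOL Speech Server Reference.

[ctm2]
// Settings for the ctm2 module referenced in step 0 of [TranscriptAlign]

[align2]
// Settings for the align2 module referenced in step 1 of [TranscriptAlign]

[wout2]
// Settings for the wout2 module referenced in step 2 of [TranscriptAlign]

Note that the [speechToText] schema references wout while [TranscriptAlign] references wout2. Assuming that these are two configurations of the same module, each one still needs its own section, [wout] and [wout2].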

[Resources] Section

The [Resources] section lists the resources that IDOL Speech Server requires, including language packs and Audio Fingerprint (AFP) databases. You must create a [MyLanguage] configuration section for each language pack, and a [MyFPDB] configuration section for each Audio Fingerprint database listed in the [Resources] section.

[Resources]
0=ENUK
1=ENUS
2=fpdb:AFP
3=fpdb:ADVERTS
4=FRFR
5=DEDE
6=ARMSA
7=fpdb:AFP

[MyLanguage] Sections

The [MyLanguage] sections contain settings for the language packs that you have defined in the [Resources] section. You must create a [MyLanguage] section for each language that you have listed in the [Resources] section.

[ENUK]
PackDir=ENUK
Pack=ENUK-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DNNFile = $params.DNNFile
ClassWordFile = $params.ClassWordFile
PronFile = $params.PronFile

[ENUS]
PackDir=ENUS
Pack=ENUS-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DNNFile = $params.DNNFile

[FRFR]
PackDir=FRFR
Pack=FRFR-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DNNFile = $params.DNNFile

[MyFPDB] Sections

The Audio Fingerprint database (fpdb) configuration sections contain settings for the databases used in IDOL Speech Server.

You can set configuration parameters in these sections to variable values. You can use these values to create action parameters that allow you to specify the value of the configuration parameter when you create a task. For example, the following database configuration allows you to specify the database on the command line, by using the PackDir parameter (the directory that contains the database) and the Pack parameter (the base file name of the database):

[AFPDatabase]
PackDir = $params.packdir
Pack = $params.pack
FxxCacheSize=2
TtxCacheSize=200

Alternatively, you can explicitly set these values in the configuration file, and specify a particular database:

[ADVERTS]
PackDir = C:\databases
Pack = adverts
FxxCacheSize=2
TtxCacheSize=200

You must list all Audio Fingerprint database resources in the [Resources] section before you use them. In this list, prefix the resource name with fpdb:.

[MySidBase] Section

The speaker identification base pack (sidbase) configuration section contains details of the base pack that you want to use for speaker identification. This resource contains details of all the speaker identification base files. If you configure a base pack and set the SpkIdBasePack configuration parameter in the speaker identification modules, IDOL Speech Server can automatically find the base files for the speaker identification tasks, and you do not have to specify the base files explicitly.

You must configure the directory and version number for the base pack. For example:

[SIDBASE]
PackDir = SpeakerIdPack
Pack = gen-1.8

In this case, the PackDir is relative to the SpeakerID global directory, which is configured in the SpeakerIDDir configuration parameter. If you have not configured a SpeakerID global directory, the directory is relative to the main server install directory.

You must list the speaker identification base pack resource in the [Resources] section before you use it. In this list, you must prefix the resource name with sidbase:.
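
The following sketch shows how the [Resources] entry and the base pack section might fit together. It assumes that the name after the sidbase: prefix matches the section name, in the same way that fpdb:ADVERTS corresponds to the [ADVERTS] section. The index number is illustrative; in practice the entry would continue the numbering of your existing [Resources] list.

[Resources]
// Illustrative index; use the next free number in your own list
0=sidbase:SIDBASE

[SIDBASE]
PackDir = SpeakerIdPack
Pack = gen-1.8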
