Wildcard Searches in Japanese, Chinese, Korean, and Thai

Many Asian languages are written without spaces or other explicit markers of word boundaries. For this reason, the IDOL Content component applies sentence breaking to Asian text when it processes it, to split the text into individual words or terms.

You can carry out Wildcard searches in Japanese, Chinese, Korean, and Thai, provided that you send the query to the IDOL Content component as one or more space-delimited terms, rather than as a single unbroken string.

The question mark (?) Wildcard might not behave as expected, because it matches a single byte, and each character in these languages is encoded as multiple bytes (usually two or three, depending on the encoding). If you want to use the ? single-character Wildcard in a multibyte language query, you must use one ? character for each byte of the character you want to match (for example, ??? for a single Japanese character that occupies three bytes, as it does in UTF-8).
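
To work out how many ? characters a given character needs, you can count the bytes in its encoded form. The following Python sketch is illustrative only: the helper name byte_wildcard is hypothetical and is not part of IDOL, and it assumes that you know which encoding your query text uses.

```python
def byte_wildcard(text: str, encoding: str = "utf-8") -> str:
    """Return one '?' for each byte that `text` occupies in `encoding`.

    Hypothetical helper for illustration; it is not an IDOL function.
    """
    return "?" * len(text.encode(encoding))


# A Japanese character occupies three bytes in UTF-8.
print(byte_wildcard("日"))               # ???

# The same character occupies two bytes in Shift JIS.
print(byte_wildcard("日", "shift_jis"))  # ??

# A Thai character also occupies three bytes in UTF-8.
print(byte_wildcard("ก"))               # ???
```

The same character can need a different number of ? characters depending on the encoding, so check the encoding of your query text before you build the pattern.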