Coded Character Sets

This section lists the coded character sets and indicates which of them you can specify as the target character set. The coded character sets are enumerated in kvtypes.h and defined in the Filter class.

Coded Character Sets

Coded Character Set Description Can be set as target charset?
KVCS_UNKNOWN Unknown character set N
KVCS_SJIS Japanese (uses multibyte encoding), cp932 Y
KVCS_GB Simplified Chinese (China, Singapore, Malaysia), cp936 Y
KVCS_BIG5 Traditional Chinese (Taiwan, Hong Kong, Macau), cp950 Y
KVCS_KSC Korean, cp949 Y
KVCS_1250 Windows Latin 2 (Central Europe) Y
KVCS_1251 Windows Cyrillic (Slavic) Y
KVCS_1252 Windows Latin 1 (ANSI) Y
KVCS_1253 Windows Greek Y
KVCS_1254 Windows Latin 5 (Turkish) Y
KVCS_1255 Windows Hebrew Y
KVCS_1256 Windows Arabic Y
KVCS_1257 Windows Baltic Rim Y
KVCS_1258 Windows Vietnamese Y
KVCS_8859_1 ISO 8859-1 Latin 1 (Western Europe, Latin America) Y
KVCS_8859_2 ISO 8859-2 Latin 2 (Central Eastern Europe) Y
KVCS_8859_3 ISO 8859-3 Latin 3 (S.E. Europe) Y
KVCS_8859_4 ISO 8859-4 Latin 4 (Scandinavia/Baltic) Y
KVCS_8859_5 ISO 8859-5 Latin/Cyrillic Y
KVCS_8859_6 ISO 8859-6 Latin/Arabic Y
KVCS_8859_7 ISO 8859-7 Latin/Greek Y
KVCS_8859_8 ISO 8859-8 Latin/Hebrew Y
KVCS_8859_9 ISO 8859-9 Latin/Turkish Y
KVCS_8859_14 ISO 8859-14 Y
KVCS_8859_15 ISO 8859-15 Y
KVCS_437 DOS Latin US Y
KVCS_737 DOS Greek Y
KVCS_775 DOS Baltic Rim Y
KVCS_850 DOS Latin 1 Y
KVCS_851 DOS Greek Y
KVCS_852 DOS Latin 2 Y
KVCS_855 DOS Cyrillic Y
KVCS_857 DOS Turkish Y
KVCS_860 DOS Portuguese Y
KVCS_861 DOS Icelandic Y
KVCS_862 DOS Hebrew Y
KVCS_863 DOS Canadian French Y
KVCS_864 DOS Arabic Y
KVCS_865 DOS Nordic Y
KVCS_866 DOS Cyrillic Russian Y
KVCS_869 DOS Greek 2 Y
KVCS_874 Thai Y
KVCS_STDENC Adobe Standard Encoding N
KVCS_PDFDOC Adobe standard PDF character set N
KVCS_037 EBCDIC code page 037 Y
KVCS_1026 EBCDIC code page 1026 Y
KVCS_500 EBCDIC code page 500 Y
KVCS_875 EBCDIC code page 875 Y
KVCS_LMBCS Lotus multibyte character set Group 1 and Group 2 N
KVCS_UTF16 16-bit Unicode transformation format Y
KVCS_UTF8 8-bit Unicode transformation format Y
KVCS_UTF7 7-bit Unicode transformation format Y
KVCS_2022_JP ISO 2022-JP, Japanese mail and news safe encoding (JIS-7) N
KVCS_2022_CN ISO 2022-CN, Chinese mail and news safe encoding N
KVCS_2022_KR ISO 2022-KR, Korean mail and news safe encoding N
KVCS_WP6X WordPerfect 6.x and higher character mapping N
KVCS_10000 Western European (Macintosh) Y
KVCS_KSC5601 Unified Hangul Y
KVCS_GB2312 Simplified Chinese (China, Singapore, Hong Kong) Y
KVCS_GB12345 Traditional Chinese (China) - analogue of GB2312 Y
KVCS_CNS11643 Traditional Chinese - Taiwan. Supplement to Big5 Y
KVCS_JIS0201 Japanese - contains ASCII character set (JIS-Roman) N
KVCS_JIS0212 Japanese. Supplement to JIS0208. Y
KVCS_EUC_JP Japanese Extended UNIX Code Y
KVCS_EUC_GB Simplified Chinese Extended UNIX Code Y
KVCS_EUC_BIG5 Traditional Chinese Extended UNIX Code N
KVCS_EUC_KSC Korean Extended UNIX Code N
KVCS_424 EBCDIC Hebrew N
KVCS_856 PC Hebrew (old) N
KVCS_1006 IBM AIX Pakistan (Urdu) N
KVCS_KOI8R Cyrillic (Russian) Y
KVCS_PDF_JAPAN1 Adobe-Japan1-2 character collection N
KVCS_PDF_KOREA1 Adobe-Korea1-0 character collection N
KVCS_PDF_GB1 Adobe-GB1-3 character collection N
KVCS_PDF_CNS1 Adobe-CNS1-2 character collection N
KVCS_2022_JP_8 ISO 2022-JP, Japanese mail and news safe encoding (JIS-8) N
KVCS_720 Arabic DOS-720 Y
KVCS_8859_10 ISO 8859-10 (Latin 6 Nordic) Y (1)
KVCS_8859_13 ISO 8859-13 (Latin 7 Baltic) Y (1)
KVCS_57002 ISCII Devanagari (x-iscii-de) Y (1)
KVCS_57003 ISCII Bengali (x-iscii-be) Y (1)
KVCS_57004 ISCII Tamil (x-iscii-ta) Y (1)
KVCS_57005 ISCII Telugu (x-iscii-te) Y (1)
KVCS_57006 ISCII Assamese (x-iscii-as) Y (1)
KVCS_57007 ISCII Oriya (x-iscii-or) Y (1)
KVCS_57008 ISCII Kannada (x-iscii-ka) Y (1)
KVCS_57009 ISCII Malayalam (x-iscii-ma) Y (1)
KVCS_57010 ISCII Gujarathi (x-iscii-gu) Y (1)
KVCS_57011 ISCII Panjabi (x-iscii-pa) Y (1)
KVCS_GB18030b2 Reserved for internal use n/a
KVCS_GB18030 GB18030 (Chinese 4-byte character set) Y
KVCS_8859_11 ISO 8859-11 (Thai) Y
KVCS_8859_16 ISO 8859-16 (Latin-10 South-Eastern Europe) Y
KVCS_ARABICMAC Arabic Mac (x-mac-arabic) Y
KVCS_KOI8U Cyrillic (KOI8U Ukrainian) Y
KVCS_HZGB2312 The 7-bit representation of GB 2312 / RFC 1842 n/a

(1) The character set cannot be forced as output in Export SDK and Viewing SDK because the character set is not supported by the major browsers.
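
As a rough illustration of how the table might be used, the following Python sketch checks a requested target character set against a handful of rows copied from the table and falls back to another value when the request is not marked Y. The constant names come from the table. The dictionary, the pick_target_charset helper, and the choice of KVCS_UTF8 as the fallback are assumptions made for this example and are not part of kvtypes.h or the Filter class.

```python
# Hypothetical lookup built from a few rows of the table above.
# True means "Y", False means "N", None means "n/a".
TARGET_SUPPORT = {
    "KVCS_UNKNOWN": False,
    "KVCS_SJIS": True,
    "KVCS_1252": True,
    "KVCS_UTF16": True,
    "KVCS_UTF8": True,
    "KVCS_LMBCS": False,
    "KVCS_2022_JP": False,
    "KVCS_8859_10": True,    # Y, but see the table footnote for Export SDK and Viewing SDK
    "KVCS_GB18030b2": None,  # reserved for internal use
    "KVCS_HZGB2312": None,
}


def pick_target_charset(requested, fallback="KVCS_UTF8"):
    """Return `requested` if the table marks it as a valid target, else `fallback`.

    Names that are missing from the lookup are treated like "N".
    """
    if TARGET_SUPPORT.get(requested) is True:
        return requested
    return fallback


if __name__ == "__main__":
    for name in ("KVCS_SJIS", "KVCS_2022_JP", "KVCS_HZGB2312"):
        print(name, "->", pick_target_charset(name))
```

A real application would populate the lookup from the full table and pass the corresponding enumerated value, rather than a string, to the SDK.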