iconv_ibmkanji man page on DigitalUNIX

iconv_ibmkanji(5)					     iconv_ibmkanji(5)

NAME
       iconv_ibmkanji  -  Specification for controlling conversion between IBM
       Kanji and Tru64 UNIX Japanese codesets

DESCRIPTION
       The iconv utility supports the ability to convert the encoding of char‐
       acters  between	IBM Kanji System Characters (IBM Kanji) and one of the
       following Tru64 UNIX codesets: DEC Kanji,  Super	 DEC  Kanji,  Japanese
       EUC,  or Shift JIS. You choose the type of conversion by specifying the
       appropriate values for the utility's from-code and to-code  parameters,
       as follows:

       ─────────────────────────────────────────────────────
       Type of Code Conversion	      from-code	  to-code
       ─────────────────────────────────────────────────────
       IBM Kanji to DEC Kanji	      ibmkanji	  deckanji
       IBM Kanji to Super DEC Kanji   ibmkanji	  sdeckanji
       IBM Kanji to Japanese EUC      ibmkanji	  eucJP
       IBM Kanji to Shift JIS	      ibmkanji	  SJIS
       DEC Kanji to IBM Kanji	      deckanji	  ibmkanji
       Super DEC Kanji to IBM Kanji   sdeckanji	  ibmkanji
       Japanese EUC to IBM Kanji      eucJP	  ibmkanji
       Shift JIS to IBM Kanji	      SJIS	  ibmkanji
       ─────────────────────────────────────────────────────
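       The table above can be turned into a small check before iconv is
       invoked. The following is a minimal sketch, not part of the utility:
       the helper name iconv_argv and the input file name are hypothetical,
       and the sketch only builds an argument list using the standard -f and
       -t options of iconv(1) without running the command.

```python
# Codeset names taken from the from-code/to-code table above.
TRU64_CODESETS = ("deckanji", "sdeckanji", "eucJP", "SJIS")


def iconv_argv(from_code, to_code, path):
    """Build an iconv argument list; exactly one side must be ibmkanji."""
    pair = {from_code, to_code}
    if "ibmkanji" not in pair or len(pair) != 2:
        raise ValueError("exactly one side must be ibmkanji")
    other = to_code if from_code == "ibmkanji" else from_code
    if other not in TRU64_CODESETS:
        raise ValueError("unsupported codeset: " + other)
    return ["iconv", "-f", from_code, "-t", to_code, path]


# Hypothetical usage: convert a Japanese EUC file to IBM Kanji.
argv = iconv_argv("eucJP", "ibmkanji", "input.txt")
```

       Note that the codeset names are case-sensitive as written in the
       table (for example, eucJP and SJIS).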

       Conversion behavior for the following items is affected by the
       definition of environment variables or profile entries in the user's
       environment. For more information, see the “Environment Variables”
       and “Profile File” sections.

       The UDC (User-Defined Character) mapping table that is used for UDC
       conversion

              This table must be an ASCII text file that contains UDC mapping
              information. The table affects conversion of user-defined
              characters between the codesets.

       The EBCDIC to/from ISO code (ASCII, JIS Roman characters) mapping
       table that is used for conversion

              This table must be an ASCII text file that contains information
              on how to map characters between EBCDIC and ISO code.

       The K-shift code

              This is a one- or two-byte hexadecimal code that marks the
              beginning of Kanji mode.

       The A-shift code

              This is a one- or two-byte hexadecimal code that marks the
              beginning of EBCDIC mode.

       The status of the initial mode (Kanji or EBCDIC) at the time the iconv
       command starts, or the first time the iconv() function is called after
       the iconv_open() function initializes the converter in a program

              The status keywords are either kanji_mode or ebcdic_mode.

       How to treat undefined characters when these are detected in Kanji
       mode

              Specify this action with a keyword that selects one of the
              following behaviors:

              Stop codeset conversion.

              Output the undefined characters without any processing and
              continue codeset conversion.

              Output padding characters instead of the undefined characters
              and continue codeset conversion.

              Ignore the undefined characters and continue codeset
              conversion.

       The two-byte padding character used in Kanji mode

              This value is meaningful when replace is chosen for the
              processing of undefined characters in Kanji mode. Specify the
              padding character by its hexadecimal value.

       How to treat undefined characters when these are detected in EBCDIC
       mode

              Specify this action with a keyword that selects one of the
              following behaviors:

              Stop codeset conversion.

              Output the undefined characters without any processing and
              continue codeset conversion.

              Output padding characters instead of the undefined characters
              and continue codeset conversion.

              Ignore the undefined characters and continue codeset
              conversion.

       The one-byte padding character used in EBCDIC mode

              This value is meaningful when replace is chosen for the
              processing of undefined characters in EBCDIC mode. Specify the
              padding character by its hexadecimal value.
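       To illustrate how the K-shift and A-shift codes delimit the two modes,
       the following sketch splits a shift-coded byte string into Kanji-mode
       and EBCDIC-mode segments. It is an illustration only and makes several
       assumptions that this page does not guarantee: both shift codes are
       one byte long (the page also allows two-byte codes), every Kanji-mode
       character occupies exactly two bytes, and a shift code is recognized
       only at a character boundary. The function name split_by_mode is
       hypothetical.

```python
def split_by_mode(data, k_shift=0x0E, a_shift=0x0F, initial="ebcdic_mode"):
    """Split shift-coded bytes into a list of (mode, payload) segments."""
    mode = initial
    current = bytearray()
    segments = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in (k_shift, a_shift):
            # A shift code closes the current segment and selects the mode.
            if current:
                segments.append((mode, bytes(current)))
                current = bytearray()
            mode = "kanji_mode" if byte == k_shift else "ebcdic_mode"
            i += 1
        elif mode == "kanji_mode":
            current += data[i:i + 2]  # assumed two-byte character
            i += 2
        else:
            current.append(byte)      # one-byte character
            i += 1
    if current:
        segments.append((mode, bytes(current)))
    return segments


# One EBCDIC byte, one two-byte Kanji-mode character, one EBCDIC byte.
stream = bytes([0xC1, 0x0E, 0x44, 0xE9, 0x0F, 0xC2])
segments = split_by_mode(stream)
```

       With the default shift codes and the default initial state, the
       sample stream yields three segments in the order EBCDIC, Kanji,
       EBCDIC.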

       When the to-code parameter for the conversion is ibmkanji, you can also
       specify the following items for conversion behavior:

       Whether the initial shift code is output at the start of conversion if
       the status of the initial mode (Kanji or EBCDIC) is different from the
       mode of the first input character

              The start of conversion is the time the iconv utility starts
              processing, or when the iconv() function is called just after
              opening the converter with iconv_open(). Keyword values for this
              item are yes or no.

       Whether or not the utility outputs the last shift code when iconv() is
       called with a zero-length input string and the current mode (Kanji or
       EBCDIC) is different from the mode specified by the last shift state

              Keyword values for this item are yes or no.

       The last status (Kanji mode or EBCDIC mode)

              Specify kanji_mode or ebcdic_mode for this value. It is
              meaningful only when yes is the setting for whether the utility
              outputs the last shift code.
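       The interaction of the initial shift code, the trailer shift code, and
       the last status can be pictured with a short sketch. This is one
       reading of the description above, not the converter's actual
       implementation: the function emit_ibmkanji and its parameter names are
       hypothetical, the shift codes are the one-byte defaults, and the input
       is assumed to be already converted and grouped into (mode, bytes)
       segments.

```python
K_SHIFT = b"\x0e"  # default K shift code (marks the start of Kanji mode)
A_SHIFT = b"\x0f"  # default A shift code (marks the start of EBCDIC mode)


def emit_ibmkanji(segments, initial_state="ebcdic_mode",
                  initial_shift="yes", trailer_shift="yes",
                  last_state="ebcdic_mode"):
    """Join (mode, payload) segments, inserting shift codes between modes."""
    shift_to = {"kanji_mode": K_SHIFT, "ebcdic_mode": A_SHIFT}
    out = bytearray()
    mode = initial_state
    first = True
    for seg_mode, payload in segments:
        if seg_mode != mode:
            # Only the very first mode change depends on initial_shift.
            if not first or initial_shift == "yes":
                out += shift_to[seg_mode]
            mode = seg_mode
        out += payload
        first = False
    # At the end, return to last_state if the trailer shift is enabled.
    if trailer_shift == "yes" and mode != last_state:
        out += shift_to[last_state]
    return bytes(out)


kanji_only = [("kanji_mode", b"\x44\xe9")]
with_shifts = emit_ibmkanji(kanji_only)
without_shifts = emit_ibmkanji(kanji_only, initial_shift="no",
                               trailer_shift="no")
```

       In this sketch the trailer shift code is appended when the segments
       run out; in a program, the corresponding step is the call to iconv()
       with a zero-length input string.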

       If the items that control conversion behavior  are  specified  by  both
       environment  variables  and the profile file, values set by environment
       variables override values set by comparable  entries  in	 the  profile.
       Note  that  values for all conversion control items are case-sensitive,
       whether they are set by environment variables or in  the	 profile.  The
       following table contains the default values for each conversion control
       item:

       ────────────────────────────────────────────────────
       Conversion Control Item		     Default Value
       ────────────────────────────────────────────────────
       UDC mapping table		     None
       K shift code			     0x0e
       A shift code			     0x0f
       Initial state			     ebcdic_mode
       Processing for undefined characters
       in Kanji mode			     abort
       Processing for undefined characters
       in EBCDIC mode			     pass
       ────────────────────────────────────────────────────
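       The precedence rule (environment variable, then profile entry, then
       default) can be sketched as a simple lookup. The function name
       effective_value is hypothetical, and the item labels used as
       dictionary keys are illustrative; the default values are copied from
       the table above.

```python
# Defaults from the table above; the UDC mapping table has no default.
DEFAULTS = {
    "k_shift_code": "0x0e",
    "a_shift_code": "0x0f",
    "initial_state": "ebcdic_mode",
    "kanji_except_proc": "abort",
    "ebcdic_except_proc": "pass",
}


def effective_value(item, env_values, profile_values, defaults=DEFAULTS):
    """Return the first value found: environment, profile, then default."""
    for source in (env_values, profile_values, defaults):
        if item in source:
            return source[item]
    return None


# The environment setting wins over the profile entry.
chosen = effective_value("k_shift_code",
                         {"k_shift_code": "0x1e"},
                         {"k_shift_code": "0x2e"})
```

       Because values are case-sensitive, a sketch like this should compare
       and store them exactly as written, without folding case.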

       The default padding characters are white spaces, whose code values  for
       each  destination  codeset are noted in the following table. These pad‐
       ding characters are output when you specify replace for	processing  of
       undefined  characters and do not explicitly specify the padding charac‐
       ter.

       ───────────────────────────────────────────────────
       Mode          Default Value   Destination Codeset
       ───────────────────────────────────────────────────
       Kanji mode    0x44e9          ibmkanji
                     0xa1a1          deckanji, sdeckanji,
                                     or eucJP
                     0x8140          SJIS
       EBCDIC mode   0x40            ibmkanji
                     0x20            deckanji, sdeckanji,
                                     eucJP, or SJIS
       ───────────────────────────────────────────────────
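       The default padding values can be expressed as a lookup keyed by mode
       and destination codeset. The sketch below only restates the table;
       the name default_padding is hypothetical.

```python
# Default padding characters (white space) per mode and destination codeset.
DEFAULT_PADDING = {
    "kanji_mode": {
        "ibmkanji": 0x44E9,
        "deckanji": 0xA1A1,
        "sdeckanji": 0xA1A1,
        "eucJP": 0xA1A1,
        "SJIS": 0x8140,
    },
    "ebcdic_mode": {
        "ibmkanji": 0x40,
        "deckanji": 0x20,
        "sdeckanji": 0x20,
        "eucJP": 0x20,
        "SJIS": 0x20,
    },
}


def default_padding(mode, to_code):
    """Return the default padding character as big-endian bytes."""
    width = 2 if mode == "kanji_mode" else 1
    return DEFAULT_PADDING[mode][to_code].to_bytes(width, "big")


sjis_pad = default_padding("kanji_mode", "SJIS")
```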

       The default EBCDIC-ISO mapping tables are as follows:

       For conversion from IBM Kanji to other codesets:
              /usr/lib/nls/loc/iconv/data/ebcdic_kana.tbl

       For conversion from other codesets to IBM Kanji:
              /usr/lib/nls/loc/iconv/data/kana_ebcdic.tbl

       These mapping tables map between EBCDIC and ISO code, which includes
       JIS Roman characters. The kana_ebcdic.tbl mapping table also maps ISO
       lowercase characters to EBCDIC uppercase characters.

       The  following default values for conversion control items are meaning‐
       ful when the iconv utility's to-code conversion parameter is ibmkanji:

       ─────────────────────────────────────────────
       Conversion Control Item		Default
       ─────────────────────────────────────────────
       Output the initial shift code?	yes
       Output the last shift code?	yes
       Output the last status?		ebcdic_mode
       ─────────────────────────────────────────────

   Environment Variables
       This section discusses the environment variables that you  can  set  to
       control	conversion  behavior.  The names for these variables adhere to
       the following format:

       fromcode_tocode_controlitem

       The name segments for fromcode or tocode can be one of the following
       keywords:

       ────────────────────────────
       For Codeset:	 Use:
       ────────────────────────────
       IBM Kanji	 IBMKANJI
       DEC Kanji	 DECKANJI
       Super DEC Kanji	 SDECKANJI
       Japanese EUC	 EUCJP
       Shift JIS	 SJIS
       ────────────────────────────

       The name segments for controlitem can be one of the following keywords:

       ────────────────────────────────────────────────────────
       For Control Item:		    Use:
       ────────────────────────────────────────────────────────
       UDC mapping table		    UDC_TABLE
       EBCDIC-ISO mapping table		    EBCDIC_TABLE
       K shift code			    K_SHIFT_CODE
       A shift code			    A_SHIFT_CODE
       Initial state			    INITIAL_STATE
       Processing of undefined characters
       in Kanji mode			    KANJI_EXCEPT_PROC
       Processing of undefined characters
       in EBCDIC mode			    EBCDIC_EXCEPT_PROC
       Padding characters
       in Kanji mode			    PADDING_2BYTE_CHAR
       Padding characters
       in EBCDIC mode			    PADDING_1BYTE_CHAR
       Output initial
       shift code			    INITIAL_SHIFT_CODE
       Output last
       shift code			    TRAILER_SHIFT_CODE
       Last status			    LAST_STATE
       File path of the profile		    PROFILE
       ────────────────────────────────────────────────────────
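       The naming rule can be checked mechanically. The following sketch
       composes a variable name from the two tables above; the function name
       env_var_name is hypothetical, and the lookup keys are the from-code
       and to-code values accepted by the iconv utility.

```python
# Name segments from the codeset table above.
CODESET_SEGMENT = {
    "ibmkanji": "IBMKANJI",
    "deckanji": "DECKANJI",
    "sdeckanji": "SDECKANJI",
    "eucJP": "EUCJP",
    "SJIS": "SJIS",
}

# Name segments from the control item table above.
CONTROL_ITEMS = {
    "UDC_TABLE", "EBCDIC_TABLE", "K_SHIFT_CODE", "A_SHIFT_CODE",
    "INITIAL_STATE", "KANJI_EXCEPT_PROC", "EBCDIC_EXCEPT_PROC",
    "PADDING_2BYTE_CHAR", "PADDING_1BYTE_CHAR", "INITIAL_SHIFT_CODE",
    "TRAILER_SHIFT_CODE", "LAST_STATE", "PROFILE",
}


def env_var_name(from_code, to_code, control_item):
    """Compose a fromcode_tocode_controlitem environment variable name."""
    if control_item not in CONTROL_ITEMS:
        raise ValueError("unknown control item: " + control_item)
    return "_".join((CODESET_SEGMENT[from_code],
                     CODESET_SEGMENT[to_code],
                     control_item))


name = env_var_name("eucJP", "ibmkanji", "UDC_TABLE")
```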

       Following  are  examples	 of using the setenv C shell command to define
       environment variables to control conversion behavior.  In  these	 exam‐
       ples,  the  fromcode name segment indicates Japanese EUC and the tocode
       name segment indicates IBM Kanji:

       setenv EUCJP_IBMKANJI_UDC_TABLE            eucjp_ibmkanji_udc.tbl
       setenv EUCJP_IBMKANJI_EBCDIC_TABLE         kana_ebcdic.tbl
       setenv EUCJP_IBMKANJI_K_SHIFT_CODE         0x0e
       setenv EUCJP_IBMKANJI_A_SHIFT_CODE         0x0f
       setenv EUCJP_IBMKANJI_INITIAL_STATE        ebcdic_mode
       setenv EUCJP_IBMKANJI_KANJI_EXCEPT_PROC    replace
       setenv EUCJP_IBMKANJI_EBCDIC_EXCEPT_PROC   replace
       setenv EUCJP_IBMKANJI_PADDING_2BYTE_CHAR   0x44e9
       setenv EUCJP_IBMKANJI_PADDING_1BYTE_CHAR   0x40
       setenv EUCJP_IBMKANJI_INITIAL_SHIFT_CODE   yes
       setenv EUCJP_IBMKANJI_TRAILER_SHIFT_CODE   yes
       setenv EUCJP_IBMKANJI_LAST_STATE           ebcdic_mode
       setenv EUCJP_IBMKANJI_PROFILE              .eucjp_ibmkanji_profile

   Directory Search Path
       When you specify a file name without a directory, the iconv utility
       searches the following directories and uses the first file found:

              Current directory
              Home directory
              The iconv/data subdirectory of the directory specified by the
                environment variable LOCPATH
              /usr/lib/nls/loc/iconv/data
              /usr/i18n/lib/nls/loc/iconv/data

       If  you	specify	 a  relative  directory	 path  for a file, the utility
       searches these same directories in the same order and  uses  the	 first
       file found.
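       The search order can be sketched as follows. This is an illustration
       of the order described above, not the utility's own code: the
       function names are hypothetical, LOCPATH is treated as a single
       directory as the text states, and the LOCPATH entry is skipped when
       the variable is unset (an assumption).

```python
import os
import posixpath

SYSTEM_DIRS = (
    "/usr/lib/nls/loc/iconv/data",
    "/usr/i18n/lib/nls/loc/iconv/data",
)


def candidate_paths(name, cwd, home, locpath=None):
    """List the paths to try for name, in the documented search order."""
    dirs = [cwd, home]
    if locpath:
        dirs.append(posixpath.join(locpath, "iconv", "data"))
    dirs.extend(SYSTEM_DIRS)
    return [posixpath.join(d, name) for d in dirs]


def find_first(name, cwd, home, locpath=None):
    """Return the first candidate that exists as a file, or None."""
    for path in candidate_paths(name, cwd, home, locpath):
        if os.path.isfile(path):
            return path
    return None


# Hypothetical directories, used only to show the order.
paths = candidate_paths("kana_ebcdic.tbl", "/work", "/home/user", "/opt/loc")
```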

   Profile File
       Entry lines in the profile file adhere to the following format:

       entry_name	 string_value

       The entry_name and string_value fields are separated by spaces or tabs.
       Do not append a colon (:) after entry_name. The file can	 also  include
       blank lines and comment entries, which begin with the # character.

       Following  are  the  entry_name values for different conversion control
       items:

       ────────────────────────────────────────────────────────────
       Conversion Control Item		 entry_name
       ────────────────────────────────────────────────────────────
       UDC mapping table		 udc_mapping_table
       EBCDIC-ISO mapping table		 ebcdic_mapping_table
       K shift code			 k_shift_code
       A shift code			 a_shift_code
       Initial state			 initial_state
       Processing undefined characters
       in Kanji mode			 kanji_except_proc
       Processing undefined characters
       in EBCDIC mode			 ebcdic_except_proc
       Padding character
       in Kanji mode			 padding_2byte_char
       Padding character
       in EBCDIC mode			 padding_1byte_char
       Output initial
       shift code			 output_initial_shift_code
       Output last
       shift code			 output_trailer_shift_code
       Last state			 last_state
       ────────────────────────────────────────────────────────────

       Following is a sample profile for converting from Japanese EUC  to  IBM
       Kanji.

       #
       #   sample profile for eucJP_ibmkanji
       #
       udc_mapping_table           eucjp_ibmkanji_udc.tbl
       ebcdic_mapping_table        kana_ebcdic.tbl
       k_shift_code                0x0e         # ebcdic -> kanji
       a_shift_code                0x0f         # kanji -> ebcdic
       initial_state               ebcdic_mode
       kanji_except_proc           replace
       ebcdic_except_proc          replace
       padding_2byte_char          0x44e9       # kanji mode
       padding_1byte_char          0x40         # ebcdic mode
       output_initial_shift_code   yes
       output_trailer_shift_code   yes
       last_state                  ebcdic_mode
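       A profile in this format is straightforward to read. The sketch below
       is a hypothetical reader, not the parser used by iconv. It assumes
       that a # character starts a comment anywhere on a line, which matches
       the trailing comments in the sample profile but would misread a
       value that itself contains #.

```python
def parse_profile(text):
    """Parse 'entry_name string_value' lines into a dictionary."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (assumption)
        if not line:
            continue                          # blank or comment-only line
        fields = line.split(None, 1)          # split on spaces or tabs
        if len(fields) != 2:
            raise ValueError("expected 'entry_name string_value': " + raw)
        name, value = fields
        entries[name] = value
    return entries


SAMPLE = """\
#
#   sample profile for eucJP_ibmkanji
#
udc_mapping_table     eucjp_ibmkanji_udc.tbl
k_shift_code          0x0e      # ebcdic -> kanji
last_state            ebcdic_mode
"""

profile = parse_profile(SAMPLE)
```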

       The default file names for the profile are as follows:

       ───────────────────────────────────────────────────────────
       Code Conversion		      Default Profile Name
       ───────────────────────────────────────────────────────────

       IBM Kanji to DEC Kanji	      .ibmkanji_deckanji_profile
       IBM Kanji to Super DEC Kanji   .ibmkanji_sdeckanji_profile
       IBM Kanji to Shift JIS	      .ibmkanji_sjis_profile
       IBM Kanji to Japanese EUC      .ibmkanji_eucjp_profile

       DEC Kanji to IBM Kanji	      .deckanji_ibmkanji_profile
       Super DEC Kanji to IBM Kanji   .sdeckanji_ibmkanji_profile
       Shift JIS to IBM Kanji	      .sjis_ibmkanji_profile
       Japanese EUC to IBM Kanji      .eucjp_ibmkanji_profile
       ───────────────────────────────────────────────────────────

       By default, the iconv utility checks the	 directory  search  path  men‐
       tioned  in  the "Directory Search Path" section and uses the first pro‐
       file it finds. However, you can also specify an arbitrary file path for
       your  profile  instead  of  the default names by defining the following
       environment variables:

       ─────────────────────────────────────────────────────────────────
       Code Conversion		      Profile Path Environment Variable
       ─────────────────────────────────────────────────────────────────
       IBM Kanji to DEC Kanji	      IBMKANJI_DECKANJI_PROFILE
       IBM Kanji to Super DEC Kanji   IBMKANJI_SDECKANJI_PROFILE
       IBM Kanji to Shift JIS	      IBMKANJI_SJIS_PROFILE
       IBM Kanji to Japanese EUC      IBMKANJI_EUCJP_PROFILE

       DEC Kanji to IBM Kanji	      DECKANJI_IBMKANJI_PROFILE
       Super DEC Kanji to IBM Kanji   SDECKANJI_IBMKANJI_PROFILE
       Shift JIS to IBM Kanji	      SJIS_IBMKANJI_PROFILE
       Japanese EUC to IBM Kanji      EUCJP_IBMKANJI_PROFILE
       ─────────────────────────────────────────────────────────────────
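       Both tables follow a regular pattern, so the profile that applies to a
       conversion can be derived from the from-code and to-code values. The
       sketch below assumes that pattern holds for every pair listed; the
       function names are hypothetical, and the returned name still has to
       be located through the directory search path.

```python
def default_profile_name(from_code, to_code):
    """Default profile file name, for example .eucjp_ibmkanji_profile."""
    return ".{}_{}_profile".format(from_code.lower(), to_code.lower())


def profile_variable(from_code, to_code):
    """Environment variable that can override the profile path."""
    return "{}_{}_PROFILE".format(from_code.upper(), to_code.upper())


def profile_file(from_code, to_code, env):
    """Use the environment variable if set, else the default name."""
    return env.get(profile_variable(from_code, to_code),
                   default_profile_name(from_code, to_code))


# Hypothetical override path, used only for illustration.
chosen_profile = profile_file("eucJP", "ibmkanji",
                              {"EUCJP_IBMKANJI_PROFILE": "/tmp/my_profile"})
```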

   UDC Mapping Table
       Entries in a UDC mapping table adhere to the following format:

       fromcode	     tocode

       Each of these values is a two-byte hexadecimal number. In the  case  of
       Super  DEC  Kanji  and Japanese EUC, three-byte hexadecimal values that
       begin with SS3 (0x8f), such as 0x8fxxxx, are also valid.

       You can specify ranges of UDC from and to values in the same file entry
       by using a hyphen to separate the codes that start and end each range:

       start_fromcode-end_fromcode   start_tocode-end_tocode

       When  specifying	 entries  that include ranges of values, the number of
       codes in the from range must always equal the number of codes in the to
       range.  A  UDC  mapping	table can also include blank lines and comment
       lines, which begin with the # character. Following is an example	 of  a
       UDC mapping table:

       # ibmkanji          eucJP

       0x6941-0x72fe       0xf5a1-0xfefe         # udc
       0x7341-0x7cfe       0x8ff5a1-0X8ffefe     # udc
       0x7d41-0x7ffe       0x8feea1-0X8ff0fe     # udc

       The first entry in this file specifies a range of IBM Kanji values from
       0x6941 to 0x72fe that are mapped to Japanese EUC	 code  values  in  the
       range  0xf5a1 to 0xfefe. You can find additional sample UDC mapping ta‐
       ble files in the /usr/i18n/examples/iconv/data directory.
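       The entry syntax can be validated with a short reader. The sketch
       below is hypothetical and checks only the syntax described above:
       two-byte hexadecimal codes, optional three-byte codes that begin with
       0x8f on a side declared as Super DEC Kanji or Japanese EUC, and
       optional ranges. It does not verify that both ranges contain the same
       number of codes, because this page does not define how the codes
       inside a range are enumerated.

```python
def parse_code(token, allow_ss3):
    """Parse one UDC code; return its integer value."""
    if not token.lower().startswith("0x"):
        raise ValueError("missing 0x prefix: " + token)
    digits = token[2:]
    value = int(digits, 16)
    if len(digits) == 4:
        return value                      # two-byte code
    if len(digits) == 6 and allow_ss3 and digits.lower().startswith("8f"):
        return value                      # three-byte code with SS3 (0x8f)
    raise ValueError("bad UDC code: " + token)


def parse_udc_table(text, from_ss3=False, to_ss3=False):
    """Return a list of (from_codes, to_codes) tuples, one per entry."""
    entries = []
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()   # drop comments
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError("expected 'fromcode tocode': " + raw)
        sides = []
        for field, ss3 in zip(fields, (from_ss3, to_ss3)):
            codes = tuple(parse_code(t, ss3) for t in field.split("-"))
            if len(codes) > 2:
                raise ValueError("bad range: " + field)
            sides.append(codes)
        entries.append(tuple(sides))
    return entries


TABLE = """\
# ibmkanji          eucJP
0x6941-0x72fe       0xf5a1-0xfefe         # udc
0x7341-0x7cfe       0x8ff5a1-0X8ffefe     # udc
"""

udc_entries = parse_udc_table(TABLE, to_ss3=True)
```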

   EBCDIC-ISO Mapping Table
       Entries in an EBCDIC-ISO mapping table adhere to the following format:

       fromcode	      tocode

       Each code is a one-byte hexadecimal number. You can specify a range  of
       character codes as follows:

       start_fromcode-end_fromcode     start_tocode-end_tocode

       When using the range format, the number of hex values in the from range
       must be the same as the number of hex values in the to range.

       The EBCDIC-ISO mapping table can also include blank lines and comment
       entries, which begin with the # character.

       Following is an example of an EBCDIC-ISO code mapping table:

       # EBCDIC               Kana

       0x40                   0x20            # space
       0x4f                   0x21            # '!'
       0x7f                   0x22            # '"'
         .                      .
         .                      .
         .                      .
       0xc1-0xc9              0x41-0x49       # 'A' - 'I'
       0xd1-0xd9              0x4a-0x52       # 'J' - 'R'
       0xe2-0xe9              0x53-0x5a       # 'S' - 'Z'
         .                      .
         .                      .
         .                      .

       In this example, the first column contains from codes and the second
       column contains to codes. The first three value entry lines specify
       mappings for single characters, whereas the last three value entry
       lines specify mappings for ranges of characters. You can find
       additional sample EBCDIC-ISO mapping tables in the
       /usr/i18n/lib/nls/loc/iconv/data directory.
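       Because every code in this table is one byte, a range can be expanded
       value by value. The following sketch loads such a table into a
       dictionary and enforces the rule that both ranges must contain the
       same number of values. The function names are hypothetical, and the
       sketch assumes that a range maps its values pairwise in ascending
       order, which is consistent with the example entries but is not
       stated explicitly on this page.

```python
def parse_byte_range(field):
    """Turn '0xc1' or '0xc1-0xc9' into a range of one-byte values."""
    parts = [int(p, 16) for p in field.split("-")]
    if len(parts) == 1:
        parts = parts * 2                 # a single code is a 1-value range
    start, end = parts
    if not 0 <= start <= end <= 0xFF:
        raise ValueError("bad one-byte range: " + field)
    return range(start, end + 1)


def load_byte_map(text):
    """Build a {from_byte: to_byte} dictionary from table text."""
    table = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()   # drop comments
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError("expected 'fromcode tocode': " + raw)
        src = parse_byte_range(fields[0])
        dst = parse_byte_range(fields[1])
        if len(src) != len(dst):
            raise ValueError("range sizes differ: " + raw)
        table.update(zip(src, dst))
    return table


EXAMPLE = """\
# EBCDIC      Kana
0x40          0x20         # space
0x4f          0x21         # '!'
0xc1-0xc9     0x41-0x49    # 'A' - 'I'
0xd1-0xd9     0x4a-0x52    # 'J' - 'R'
0xe2-0xe9     0x53-0x5a    # 'S' - 'Z'
"""

byte_map = load_byte_map(EXAMPLE)
```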

NOTES
       This  reference page contains code conversion specifications that apply
       only to conversion between IBM Kanji  System  characters	 and  the  DEC
       Kanji,  Super DEC Kanji, Japanese EUC, and Shift JIS codesets. Refer to
       iconv_JEF(5) for code conversion	 specifications	 between  Fujitsu  JEF
       characters  and the DEC Kanji, Super DEC Kanji, Japanese EUC, and Shift
       JIS codesets. Refer to iconv_KEIS(5) for code conversion specifications
       between Hitachi KEIS characters and the DEC Kanji, Super DEC Kanji, Ja‐
       panese EUC, and Shift JIS codesets. Refer to iconv_intro(5) for	infor‐
       mation  about  conversion  between DEC Kanji, Super DEC Kanji, Japanese
       EUC, Shift JIS, and other Tru64 UNIX codesets.

SEE ALSO
       Commands: iconv(1)

       Functions: iconv(3), iconv_close(3), iconv_open(3)

       Others:	 deckanji(5),	 eucJP(5),    iconv_intro(5),	 iconv_JEF(5),
       iconv_KEIS(5), Japanese(5), sdeckanji(5), SJIS(5)

							     iconv_ibmkanji(5)
Copyright (c) for man pages and the logo by the respective OS vendor.
