code_page man page on Tru64

code_page(5)							  code_page(5)

NAME
       code_page, cp437, cp737, cp775, cp850, cp852, cp855, cp857, cp860,
       cp861, cp862, cp863, cp865, cp866, cp869, cp874, cp932, cp936, cp949,
       cp950, cp1250, cp1251, cp1252, cp1253, cp1254, cp1255, cp1256, cp1257,
       cp1258, dingbats, symbol - Coded character sets that are used on
       Microsoft Windows and NT systems

DESCRIPTION
       Code pages are coded character sets that are used on Microsoft Windows,
       Windows 95, and NT systems. Just as there are different UNIX  codesets,
       there  are different PC code pages, each supporting a particular set of
       character encodings.

       A Tru64 UNIX system supplies one locale, en_US.cp850, that directly
       supports a PC code-page format (MS-DOS Latin 1). For all other locales,
       data in code-page format is supported only through codeset converters.
       These converters can be run directly by users or by software or
       applications that exchange data between PC and Tru64 UNIX systems.
       Fonts and other kinds of character support are available only for the
       native UNIX codeset to which a code page can be converted. See the
       i18n_intro(5) reference page for introductory information on locales
       and codesets. See the iconv_intro(5) reference page for an
       introduction to codeset conversion and the name format and location of
       codeset converters.
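
       As a rough illustration of what a codeset converter does, the
       following sketch transcodes MS-DOS Latin 1 (cp850) bytes to UTF-8.
       It uses Python's built-in codecs as a stand-in for the Tru64 UNIX
       converters. The function name transcode and the codec names are
       assumptions that belong to Python, not to the Tru64 UNIX converter
       naming scheme.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from one codeset and re-encode them in another.

    Raises UnicodeError if a byte sequence is invalid in `source` or a
    character has no representation in `target`.
    """
    text = data.decode(source)   # code-page bytes -> Unicode text
    return text.encode(target)   # Unicode text -> target codeset bytes


# "cafe" with an acute accent on the e, as stored under code page 850
# (0x82 is e-acute in cp850).
pc_bytes = b"caf\x82"
unix_bytes = transcode(pc_bytes, "cp850", "utf-8")
```

       On a Tru64 UNIX system the same operation would go through iconv(1)
       or the iconv_open(3) family, with converter names formed as
       described in iconv_intro(5).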

       The following table lists and describes the code pages that have
       conversion support on a Tru64 UNIX system. An asterisk (*) follows the
       names of code pages that include support for the Euro currency sign
       (€).

       ───────────────────────────────────────────────────────────────
       Code Page	    Description
       ───────────────────────────────────────────────────────────────
       cp437                MS-DOS United States
       cp737                Greek
       cp775                Baltic languages (1)
       cp850                MS-DOS Multilingual (Latin-1)
       cp852                MS-DOS Slavic (Latin-2) (2)
       cp855                IBM Cyrillic (3)
       cp857                IBM Turkish
       cp860                MS-DOS Portuguese
       cp861                MS-DOS Icelandic
       cp862                Hebrew
       cp863                MS-DOS Canadian French
       cp865                MS-DOS Nordic languages
       cp866                MS-DOS Russian
       cp869                IBM Modern Greek
       cp874 *              MS-DOS Thai
       cp932                Japanese
       cp936                Chinese (People's Republic of China)
       cp949                Korean
       cp950                Chinese (Hong Kong)
       cp1250 *             Windows Latin-2 (2)
       cp1251 *             Windows Cyrillic (3)
       cp1252 *             Windows Latin-1
       cp1253 *             Windows Greek
       cp1254 *             Windows Turkish
       cp1255 *             Windows Hebrew
       cp1256 *             Windows Arabic
       cp1257 *             Windows Baltic (1)
       cp1258 *             Windows Vietnamese
       dingbats             Microsoft dingbat characters
       symbol               Microsoft miscellaneous symbol characters
       ───────────────────────────────────────────────────────────────

       (1) Baltic languages include Estonian, Latvian, and Lithuanian.

       (2) Latin-2 languages include Albanian, Croatian, Czech, Faeroese,
       Hungarian, Polish, Romanian, Latin Serbian, Slovak, and Slovenian.

       (3) Cyrillic languages include Byelorussian, Bulgarian, and Russian.
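
       The Euro column of the table can be spot-checked with a short
       sketch. It assumes that Python's built-in codec tables are a
       reasonable proxy for the code pages listed above; they are not the
       Tru64 UNIX converter tables and may differ from them. The helper
       name supports_euro is hypothetical.

```python
EURO = "\u20ac"  # EURO SIGN


def supports_euro(codec_name: str) -> bool:
    """Return True if Python's codec of that name can encode the Euro sign."""
    try:
        EURO.encode(codec_name)
        return True
    except UnicodeEncodeError:
        return False


# Print one line per code page: its name and whether the Euro sign encodes.
for name in ("cp437", "cp850", "cp1251", "cp1252"):
    print(name, supports_euro(name))
```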

       In all cases, a code page can be converted to and from the UCS-2,
       UCS-4, and UTF-8 codesets. In addition, some code pages can be
       converted directly to ISO codesets (or, for cp874, to the TACTIS
       codeset) as shown in the following table, although some data loss
       may occur.

       ──────────────────────────────────────────
       Code Page   Can Be Converted Directly to:
       ──────────────────────────────────────────
       cp437	   ISO8859-1
       cp737	   ISO8859-7
       cp775	   ISO8859-4
       cp850	   ISO8859-1
       cp852	   ISO8859-2
       cp855	   ISO8859-5
       cp857	   ISO8859-9
       cp860	   ISO8859-1
       cp861	   ISO8859-1
       cp862	   ISO8859-8
       cp863	   ISO8859-1
       cp865	   ISO8859-1
       cp866	   ISO8859-5
       cp869	   ISO8859-7
       cp874	   TACTIS
       cp1252	   ISO8859-1, ISO8859-15
       ──────────────────────────────────────────

       See Unicode(5) for information about UCS-2, UCS-4, and UTF-8. Reference
       pages for UNIX implementations of the ISO codesets have the name format
       iso8859-number(5).
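
       The data loss mentioned for direct conversions can be illustrated
       with a sketch that converts cp437 text to ISO 8859-1 and counts the
       characters that have no equivalent. It uses Python's built-in
       codecs rather than the Tru64 UNIX converters; the function name
       convert_lossy and the choice of '?' as the substitute character are
       assumptions made for this example.

```python
from typing import Tuple


def convert_lossy(data: bytes, source: str, target: str) -> Tuple[bytes, int]:
    """Convert between codesets, replacing unmappable characters with '?'.

    Returns the converted bytes and the number of characters replaced.
    """
    text = data.decode(source)
    out = bytearray()
    lost = 0
    for ch in text:
        try:
            out += ch.encode(target)
        except UnicodeEncodeError:
            out += b"?"      # character does not exist in the target codeset
            lost += 1
    return bytes(out), lost


# In cp437, 0xC4 is a box-drawing character with no ISO 8859-1 equivalent,
# while 0x82 (e-acute) maps cleanly.
converted, lost = convert_lossy(b"\xc4\xc4 caf\x82", "cp437", "iso8859-1")
```

       Going through UCS-2, UCS-4, or UTF-8 instead avoids the loss,
       because those codesets can represent every character in the code
       page.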

       For Traditional Chinese and Japanese, there are no codeset converters
       whose names include the name of a code page because identical
       character encoding is provided in existing UNIX codesets. For
       Traditional Chinese, character encoding in PC code-page format (cp950)
       is identical to that in the Big-5 (big5) codeset. For Japanese,
       character encoding in PC code-page format (cp932) is identical to that
       in the Shift JIS (SJIS) codeset. Therefore, the codeset converters
       whose names include big5 and SJIS can be used to convert data in and
       out of PC code-page format for the supported languages.
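
       The equivalence described above can be checked on sample text with
       the following sketch. It compares Python's cp932 and shift_jis
       codecs, and its cp950 and big5 codecs, which are assumptions
       standing in for the Tru64 UNIX codesets. Python's tables may differ
       from the Tru64 UNIX converters for vendor-specific characters, so
       the check says something only about the strings it is given.

```python
def same_encoding(text: str, codec_a: str, codec_b: str) -> bool:
    """Return True if both codecs encode `text` to identical bytes."""
    return text.encode(codec_a) == text.encode(codec_b)


# Common Japanese and Traditional Chinese sample strings.
japanese_ok = same_encoding("日本語", "cp932", "shift_jis")
chinese_ok = same_encoding("中文", "cp950", "big5")
```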

	       Caution for Conversion of Korean and Simplified Chinese

       Converting text from code-page format (cp949) to the DEC Korean
       (deckorean) codeset may result in loss of data. All of the Tru64 UNIX
       codeset equivalents for cp949 support the Hanja and miscellaneous
       characters that the code page supports. However, only the UCS-2,
       UCS-4, and UTF-8 codesets support the complete set of Hangul
       characters supported by the cp949 code page. The deckorean codeset
       supports only a subset of these Hangul characters. Therefore, if data
       is converted from cp949 format to UCS-2, UCS-4, or UTF-8, no data is
       lost. However, if the data is then converted from UCS-2, UCS-4, or
       UTF-8 to deckorean, the unsupported Hangul characters are lost.
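
       One way to find Hangul characters at risk before converting is
       sketched below. It rests on an assumption that this page does not
       state: that the subset supported by deckorean corresponds to the
       94x94 KS C 5601 plane, in which both bytes of a character lie in
       the range 0xA1 to 0xFE. The sketch uses Python's cp949 codec, and
       the function name outside_ksc5601 is hypothetical.

```python
def outside_ksc5601(text: str) -> list:
    """Return the characters of `text` whose cp949 encoding falls outside
    the 94x94 KS C 5601 plane (both bytes in 0xA1..0xFE).

    Assumption: these are the characters a KS C 5601-based codeset
    cannot represent.
    """
    flagged = []
    for ch in text:
        encoded = ch.encode("cp949")
        if len(encoded) == 2 and not all(0xA1 <= b <= 0xFE for b in encoded):
            flagged.append(ch)
    return flagged


# The first syllable of this string lies in the cp949 extension area.
at_risk = outside_ksc5601("똠방각하")
```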

       The DEC Hanzi (dechanzi) codeset uses the same encoding format as the
       PC code page used for Simplified Chinese (cp936) but does not support
       all the characters supported by the code page. Therefore, you can use
       converters with dechanzi in the converter name to convert text to and
       from cp936 format, but the operation may result in some loss of data.
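
       The kind of loss described for Simplified Chinese can be sketched
       as follows. The example assumes that Python's gb2312 codec
       approximates the smaller character repertoire of dechanzi. That
       correspondence is an assumption made for illustration and is not
       stated on this page. The function name to_gb2312_lossy is
       hypothetical.

```python
def to_gb2312_lossy(cp936_data: bytes) -> str:
    """Decode cp936 bytes, narrow them to GB 2312, and return the text
    that survives; unsupported characters come back as '?'."""
    text = cp936_data.decode("cp936")
    narrowed = text.encode("gb2312", errors="replace")
    return narrowed.decode("gb2312")


# The middle character exists in cp936 but not in GB 2312.
original = "朱镕基"
survived = to_gb2312_lossy(original.encode("cp936"))
```

       Comparing survived with original shows which characters a
       conversion of this kind would drop.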

SEE ALSO
       Commands: iconv(1)

       Functions: iconv(3), iconv_close(3), iconv_open(3)

       Others: i18n_intro(5), iconv_intro(5), iso8859-1(5), iso8859-2(5),
       iso8859-4(5), iso8859-5(5), iso8859-7(5), iso8859-8(5), iso8859-15(5),
       Unicode(5)

								  code_page(5)