charmap man page on SmartOS

Man page or keyword search:  
man Server   16655 pages
apropos Keyword Search (all sections)
Output format
SmartOS logo
[printable version]

CHARMAP(5)							    CHARMAP(5)

NAME
       charmap - character set description file

DESCRIPTION
       A character set description file or charmap defines characteristics for
       a coded character set. Other information about the coded character  set
       may  also  be  in  the  file.  Coded character set character values are
       defined using symbolic character names followed by  character  encoding
       values.

       The character set description file provides:

	   o	  The capability to describe character set attributes (such as
		  collation order or character classes) independent of charac‐
		  ter  set encoding, and using only the characters in the por‐
		  table character  set.	 This  makes  it  possible  to	create
		  generic  localedef(1)	 source	 files	for  all codesets that
		  share the portable character set.

	   o	  Standardized symbolic names for all characters in the porta‐
		  ble  character  set, making it possible to refer to any such
		  character regardless of encoding.

   Symbolic Names
       Each symbolic name  is included in the file and is mapped to  a	unique
       encoding	 value	(except	 for  those symbolic names that are shown with
       identical glyphs).  If the control characters commonly associated  with
       the  symbolic  names in the following table are supported by the imple‐
       mentation, the symbolic names and their corresponding  encoding	values
       are  included  in  the  file. Some of the encodings associated with the
       symbolic names in this table may be the same as characters in the  por‐
       table character set table.

       ┌──────────────────────────────────────────────┐
       │<ACK>	<DC2>	<ENQ>	<FS>	<IS4>	<SOH> │
       │<BEL>	<DC3>	<EOT>	<GS>	<LF>	<STX> │
       │<BS>	<DC4>	<ESC>	<HT>	<NAK>	<SUB> │
       │<CAN>	<DEL>	<ETB>	<IS1>	<RS>	<SYN> │
       │<CR>	<DLE>	<ETX>	<IS2>	<SI>	<US>  │
       │<DC1>	<EM>	<FF>	<IS3>	<SO>	<VT>  │
       └──────────────────────────────────────────────┘

   Declarations
       The  following declarations can precede the character definitions. Each
       must consist of the symbol shown in the	following  list,  starting  in
       column  1,  including the surrounding brackets, followed by one or more
       blank characters, followed by the value to be assigned to the symbol.

       <code_set_name>
			  The name of the coded character set  for  which  the
			  character set description file is defined.

       <mb_cur_max>
			  The  maximum number of bytes in a multi-byte charac‐
			  ter. This defaults to 1.

       <mb_cur_min>
			  An unsigned positive integer value that defines  the
			  minimum  number  of  bytes  in  a  character for the
			  encoded character set.

       <escape_char>
			  The escape character used to indicate that the char‐
			  acters  following  will  be interpreted in a special
			  way, as defined later in this section. This defaults
			  to  backslash	 ('\'),	 which	is the character glyph
			  used in all the following text and examples,	unless
			  otherwise noted.

       <comment_char>
			  The  character  that	when  placed  in column 1 of a
			  charmap line, is used to indicate that the  line  is
			  to  be  ignored. The default character is the number
			  sign (#).

   Format
       The character set mapping definitions will be all the lines immediately
       following  an identifier line containing the string CHARMAP starting in
       column 1, and preceding	a  trailer  line  containing  the  string  END
       CHARMAP	starting in column 1. Empty lines and lines containing a <com‐
       ment_char> in the first column will be ignored. Each  non-comment  line
       of  the	character set mapping definition, that is, between the CHARMAP
       and END CHARMAP lines of the file), must be in either of two forms:

	 "%s %s %s\n",<symbolic-name>,<encoding>,<comments>

       or

	 "%s...%s %s %s\n",<symbolic-name>,<symbolic-name>, <encoding>,\
		  <comments>

       In the first format, the line in the character set  mapping  definition
       defines	a single symbolic name and a corresponding encoding. A charac‐
       ter following an escape character is interpreted as itself;  for	 exam‐
       ple,  the  sequence "<\\\>>" represents the symbolic name "\>" enclosed
       between angle brackets.

       In the second format, the line in the character set mapping  definition
       defines	a  range of one or more symbolic names. In this form, the sym‐
       bolic names must consist of zero or more non-numeric characters,	  fol‐
       lowed  by  an integer formed by one or more decimal digits. The charac‐
       ters preceding the integer must be identical in the two symbolic names,
       and  the	 integer formed by the digits in the second symbolic name must
       be equal to or greater than the integer formed by  the  digits  in  the
       first  name.  This  is interpreted as a series of symbolic names formed
       from the common part and each of the integers between the first and the
       second  integer,	 inclusive. As an example, <j0101>...<j0104> is inter‐
       preted as the symbolic names <j0101>, <j0102>, <j0103>, and <j0104>, in
       that order.

       A  character  set  mapping  definition line must exist for all symbolic
       names and must define the coded character value that corresponds to the
       character  glyph	 indicated  in the table, or the coded character value
       that corresponds with the control character symbolic name. If the  con‐
       trol  characters	 commonly associated with the symbolic names  are sup‐
       ported by the implementation, the symbolic name and  the	 corresponding
       encoding value must be included in the file. Additional unique symbolic
       names may be included. A coded character value can  be  represented  by
       more than one symbolic name.

       The  encoding  part is expressed as one (for single-byte character val‐
       ues) or more concatenated decimal, octal or  hexadecimal	 constants  in
       the following formats:

	 "%cd%d",<escape_char>,<decimal byte value>

	 "%cx%x",<escape_char>,<hexadecimal byte value>

	 "%c%o",<escape_char>,<octal byte value>

   Decimal Constants
       Decimal	constants  must be represented by two or three decimal digits,
       preceded by the escape character and the lower-case letter d; for exam‐
       ple, \d05, \d97, or \d143. Hexadecimal constants must be represented by
       two hexadecimal digits, preceded by the escape character and the lower-
       case  letter  x; for example, \x05, \x61, or \x8f. Octal constants must
       be represented by two or three octal digits,  preceded  by  the	escape
       character; for example, \05, \141, or \217. In a portable charmap file,
       each constant must represent an 8-bit byte. Implementations  supporting
       other  byte  sizes  may allow constants to represent values larger than
       those that can be represented in 8-bit bytes, and to  allow  additional
       digits  in  constants.  When  constants are concatenated for multi-byte
       character values, they must be of the same  type,  and  interpreted  in
       byte  order  from  first to last with the least significant byte of the
       multi-byte character specified by the last constant.

   Ranges of Symbolic Names
       In lines defining ranges of symbolic names, the encoded	value  is  the
       value  for the first symbolic name in the range (the symbolic name pre‐
       ceding the ellipsis). Subsequent symbolic names defined	by  the	 range
       will  have  encoding  values  in increasing order. Bytes are treated as
       unsigned octets and carry is propagated between the bytes as  necessary
       to represent the range. However, because this causes a null byte in the
       second or subsequent bytes of a character, such	a  declaration	should
       not be specified. For example, the line

	 <j0101>...<j0104>     \d129\d254

       is interpreted as:

	 <j0101>		\d129\d254
	 <j0102>		\d129\d255
	 <j0103>		\d130\d00
	 <j0104>		\d130\d01

       The  expanded declaration of the symbol <j0103> in the above example is
       an invalid specification, because it contains a null byte in the second
       byte of a character.

       The comment is optional.

   Width Specification
       The following declarations can follow the character set mapping defini‐
       tions (after the "END CHARMAP" statement). Each consists of the keyword
       shown  in  the  following  list,	 starting in column 1, followed by the
       value(s) to be associated to the keyword, as defined below.

       WIDTH
			A non-negative integer value defining the column width
			for the printable character in the coded character set
			mapping definitions.  Coded  character	set  character
			values are defined using symbolic character names fol‐
			lowed by column width  values.	Defining  a  character
			with  more  than one WIDTH produces undefined results.
			The END WIDTH keyword is used to terminate  the	 WIDTH
			definitions.  Specifying  the width of a non-printable
			character in a WIDTH  declaration  produces  undefined
			results.

       WIDTH_DEFAULT
			A non-negative integer value defining the default col‐
			umn width for any printable character  not  listed  by
			one of the WIDTH keywords. If no WIDTH_DEFAULT keyword
			is included in	the  charmap,  the  default  character
			width is 1.

       Example:

       After  the  "END	 CHARMAP"  statement,  a syntax for a width definition
       would be:

	 WIDTH
	 <A>		 1
	 <B>		 1
	 <C>...<Z>	 1
	 ...
	 <fool>...<foon> 2
	 ...
	 END WIDTH

       In this example, the numerical code point  values  represented  by  the
       symbols	<A> and <B> are assigned a width of 1. The code point values <
       C> to <Z> inclusive, that is, <C>,  <D>,	 <E>,  and  so	on,  are  also
       assigned	 a  width  of  1.  Using <A>. . .<Z> would have required fewer
       lines, but the alternative was shown to	demonstrate  flexibility.  The
       keyword WIDTH_DEFAULT could have been added as appropriate.

SEE ALSO
       locale(1), localedef(1), nl_langinfo(3C), extensions(5), locale(5)

				  Dec 1, 2003			    CHARMAP(5)
[top]

List of man pages available for SmartOS

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net