DICTFMT(1)							    DICTFMT(1)

NAME
       dictfmt - formats a DICT protocol dictionary database

SYNOPSIS
       dictfmt	-c5|-t|-e|-f|-h|-j|-p [options]	 basename
       dictfmt	-i|-I [options]

DESCRIPTION
       dictfmt reads a file, FILE, on stdin and creates a dictionary database
       named basename.dict that conforms to the DICT protocol.  It also
       creates an index file named basename.index.  By default, the index is
       sorted according to the C locale, and only alphanumeric characters and
       spaces are used in sorting; this may be changed with the --locale and
       --allchars options.  (basename is commonly chosen to correspond to the
       basename of FILE, but this is not mandatory.)

       Unless  the  database is extremely small, it is highly recommended that
       basename.dict be	 compressed  with  /usr/bin/dictzip  to	 create	 base‐
       name.dict.dz.  (dictzip is included in the dictd source package.)

       FILE  may  be  in  any  of  the several formats described by the format
       options -c5, -t, -e, -f, -h, -j, -p, -i or -I.  Exactly	one  of	 these
       options must be given.

       dictfmt prepends several headers to the .dict file.  The
       00-database-url header gives the value of the -u option as the URL of
       the site from which the original database was obtained.  The
       00-database-short header gives the value of the -s option as the short
       name of the dictionary.  (This "short name" is the identifying name
       shown by the "dict -D" option.)  If the -u and/or -s options are
       omitted, these values will be shown as "unknown", which is undesirable
       for a publicly distributed database.

       The date of conversion (formatting) is given  in	 the  00-database-info
       header.	 All  text  in	the input file prior to the first headword (as
       defined by the appropriate  formatting  option)	is  appended  to  this
       header.	 All  text  in	the input file following a headword, up to the
       next headword, is copied unchanged to the .dict file.
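
       As an illustration of the synopsis, the following Python sketch
       assembles a dictfmt command line and feeds FILE to it on stdin.  The
       helper names build_dictfmt_argv and run_dictfmt are hypothetical;
       only the flags are taken from this page.  -i and -I are left out
       because they take no basename.

```python
import subprocess

# Format options from this manual page that take a basename
# (exactly one of them must be given).
FORMAT_OPTIONS = {"-c5", "-t", "-e", "-f", "-h", "-j", "-p"}


def build_dictfmt_argv(fmt, basename, url=None, short_name=None):
    """Return an argv list: dictfmt FORMAT [-u URL] [-s NAME] basename."""
    if fmt not in FORMAT_OPTIONS:
        raise ValueError(f"unknown format option: {fmt}")
    argv = ["dictfmt", fmt]
    if url is not None:
        argv += ["-u", url]
    if short_name is not None:
        # Passed as one list element, so spaces need no shell quoting.
        argv += ["-s", short_name]
    argv.append(basename)
    return argv


def run_dictfmt(input_path, argv):
    """Feed the input file to dictfmt on stdin (dictfmt must be installed)."""
    with open(input_path, "rb") as source:
        subprocess.run(argv, stdin=source, check=True)
```

       Supplying both -u and -s avoids the "unknown" header values described
       above.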

FORMATTING OPTIONS
       -c5    FILE is formatted with headwords preceded by 5  or  more	under‐
	      score  characters (_) and a blank line.  All text until the next
	      headword is considered the definition.  Any leading `@'  charac‐
	      ters are stripped out, but the file is otherwise unchanged. This
	      option was written to format the CIA WORLD FACTBOOK 1995.

       -t     Implies the -c5, --without-info and --without-headword options.
	      Use this option if the input database comes from the
	      dictunformat utility.

       -e     FILE is in html  format,	with  the  headword  tagged  as	 bold.
	      (<B>headword - </B>)
	      This  option  was	 written to format EASTON'S 1897 BIBLE DICTIO‐
	      NARY.  A typical entry from Easton is:

	      <A NAME="T0000005">
	      <B>Abagtha - </B>
	      one of the seven eunuchs	in  Ahasuerus's	 court	(Esther	 1:10;
	      2:21).

	      This is converted to:
	      Abagtha
		 one  of  the seven eunuchs in Ahasuerus's court (Esther 1:10;
	      2:21).

	      The heading <A NAME="T0000005"> is omitted, and the headword
	      `Abagtha' is indexed.

	      NOTE:  This option should be used with caution.  It removes sev‐
	      eral html tags (enough to format Easton properly), but not  all.
	      The  Makefile  that was originally written to format dict-easton
	      uses sed scripts to modify certain cross reference tags.	It may
	      be  necessary  to	 pipe  the input file through a sed script, or
	      hack the source of dictfmt in order  to  properly	 format	 other
	      html databases.

       -f     FILE  is formatted with the headwords starting in column 0, with
	      the definition indented at least one space (or tab character) on
	      subsequent lines.  The third line starting in column 0 is taken
	      as the first headword, and the first two lines starting in
	      column 0 are treated as part of the 00-database-info header.
	      This option was written to format the F.O.L.D.O.C.

       -h     FILE is formatted with the headwords starting in column 0,  fol‐
	      lowed  by	 a  comma,  with the definition continuing on the same
	      line.  All text  before  the  first  single  character  line  is
	      included	in  00-database-info  header,  and lines with only one
	      character are omitted from the .dict file.  The  first  headword
	      is  on  the line following the first single character line.  The
	      headword is indexed; the text of the file is not changed.	  This
	      option was written to format HITCHCOCK'S BIBLE NAMES DICTIONARY.

       -j     FILE is formatted with headwords starting in column 0, enclosed in
	      colons, followed by the definition.  The colons surrounding  the
	      headword are removed, and the headword is indexed.  Lines begin‐
	      ning with '*', '=', or '-' are also removed.   All  text	before
	      the  first headword is included in the headers.  This option was
	      written to format the JARGON FILE.
	      NOTE: Some recent versions of the JARGON FILE had	 three	blanks
	      inserted before the first colon at each headword.	 These must be
	      removed before processing with dictfmt.  (sed scripts have  been
	      used  for this purpose. ed, awk, or perl scripts are also possi‐
	      ble.)
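
	      The -j rules above can be sketched in Python as follows.  This
	      is an illustration of the described input format, not dictfmt's
	      own code; the names HEADWORD, strip_leading_blanks and
	      parse_jargon are hypothetical, and the regular expression
	      assumes that a headword contains no colon.

```python
import re

# A headword line: colon in column 0, headword, closing colon, then text.
HEADWORD = re.compile(r"^:(.+?):\s*(.*)$")


def strip_leading_blanks(line):
    """Remove three blanks inserted before a headword's first colon."""
    return re.sub(r"^ {3}(?=:)", "", line)


def parse_jargon(lines):
    """Split -j style input into header lines and (headword, lines) pairs."""
    header, entries = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if line[:1] in ("*", "=", "-"):
            continue  # lines beginning with '*', '=' or '-' are removed
        match = HEADWORD.match(line)
        if match:
            headword, rest = match.groups()
            entries.append((headword, [rest] if rest else []))
        elif entries:
            entries[-1][1].append(line)  # continuation of the definition
        else:
            header.append(line)  # text before the first headword
    return header, entries
```

	      Applying strip_leading_blanks to every line before parsing plays
	      the role of the sed script mentioned in the NOTE.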

       -p     FILE is formatted with `%h' in column 0, followed by a blank,
	      followed by the headword, optionally followed by a line
	      containing `%d' in column 0.  The definition starts on the
	      following line.  The first line beginning `%h' and any lines
	      beginning `%d' are stripped from the .dict file, and `%h ' is
	      stripped from in front of the headword.  All text before the
	      first headword is included in the headers.  The second line
	      beginning `%h' is taken as the first headword.  This option was
	      written to format Jay Kominek's elements database.

       -i -I  These two options differ from all other formatting options.
	      They re-sort an .index file given on stdin according to dictd's
	      requirements.  No .dict file is generated; only the index is
	      re-sorted.  The input is expected to be in three- or
	      four-column .index format.  -i expects the offset and length in
	      decimal, while -I expects them in base64 format.

OPTIONS
       -u url Specifies the URL of the site from which the raw database was
	      obtained.  If this option is specified, any 00-database-url
	      headword and its definition in the input are ignored.

       -s name
	      Specifies the name and, optionally, the version and date, of the
	      database.  (If this contains spaces, it must be quoted.)  If
	      this option is specified, any 00-database-short headword and
	      its definition in the input are ignored.

       -L     display license and copyright information

       -V     display version information

       -D     output debugging information

       --help display a help message

       --locale locale
	      Specifies the locale used for sorting.  If no locale is
	      specified, the "C" locale is used.  To use UTF-8 mode, the
	      --utf8 option is also needed.

       --8bit generates the database in 8-bit mode; see also the --locale
	      option.  Note: This option is deprecated.  Use it only for
	      creating 8-bit (non-UTF-8) dictionaries.  To create a UTF-8
	      dictionary, use the --utf8 option instead.

       --utf8 If specified, a UTF-8 database is created.

       --allchars
	      Specifies that all characters should be used for the search.  By
	      default, only alphabetic and numeric characters and spaces are
	      written to the .index file and are therefore used in searches.
	      Creates the special entry 00-database-allchars.

       --case-sensitive
	      makes  the  search  case	sensitive.   Creates the special entry
	      00-database-case-sensitive.

       --headword-separator sep
	      sets the headword separator, which allows several words to have
	      the same definition.  For example, if `--headword-separator %%%'
	      is given, and the input file contains `autumn%%%fall', both
	      `autumn' and `fall' will be indexed as headwords, with the same
	      definition.
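
	      The effect on the index can be sketched in Python.  This is an
	      illustration only, with hypothetical function names: each
	      headword obtained by splitting on the separator is mapped to
	      the same definition.

```python
def split_headwords(headword_line, separator):
    """Split one headword line into the individual headwords."""
    return [word for word in headword_line.split(separator) if word]


def index_entries(headword_line, definition, separator):
    """Map every headword on the line to the one shared definition."""
    return {word: definition for word in split_headwords(headword_line, separator)}
```

	      For `autumn%%%fall' with the separator `%%%', split_headwords
	      yields `autumn' and `fall'.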

       --index-data-separator sep
	      sets the index/data separator, which allows the first and
	      fourth columns of the .index file to be set independently.  That
	      is, the first column can be treated as an index column (where
	      the MATCH command searches) and the fourth column as a result
	      column (what the MATCH command returns), and the two columns are
	      completely independent of each other.  The default value for
	      this separator is the ASCII character "\034".

       --break-headwords
	      multiple headwords will be written on separate lines in the
	      .dict file.  For use with `--headword-separator'.

       --index-keep-orig
	      When --utf8 is specified, headwords are lowercased and
	      non-alphanumeric characters are removed from them before they
	      are saved to the .index file, in order to simplify the search.
	      When the --index-keep-orig option is used, a fourth column is
	      created (if necessary) in the .index file; it contains the
	      original headword, which is returned by the MATCH command.  This
	      option may be useful to prevent converting "AT&T" to "ATT" or
	      to keep proper nouns with an uppercased first letter.
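
	      The normalization described above can be sketched in Python.
	      This is an approximation, not dictfmt's code: it assumes spaces
	      are kept (as stated in the DESCRIPTION), writes the offset and
	      length in decimal for readability, and uses hypothetical
	      function names.

```python
def normalize_headword(headword):
    """Lowercase a headword and drop everything but letters, digits, spaces."""
    kept = [ch for ch in headword.lower() if ch.isalnum() or ch == " "]
    return "".join(kept)


def index_columns(headword, offset, length, keep_orig=False):
    """Build the columns of one index line for a headword.

    The fourth column is added only when keep_orig is set and the
    normalized key differs from the original headword.
    """
    key = normalize_headword(headword)
    columns = [key, str(offset), str(length)]
    if keep_orig and key != headword:
        columns.append(headword)
    return columns
```

	      With keep_orig set, the search key for "AT&T" loses the
	      ampersand and its capitals, while the fourth column preserves
	      the original spelling.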

       --without-headword
	      headwords will not be included in .dict file

       --without-header
	      header will not be copied to DB info entry

       --without-url
	      URL will not be copied to DB info entry

       --without-time
	      time of creation will not be copied to DB info entry

       --without-ver
	      By default dictfmt creates a special entry
	      00-database-dictfmt-X.Y.Z that contains (in the .dict file) the
	      dictfmt version in the form dictfmt-X.Y.Z.  This option
	      suppresses that entry.

       --without-info
	      DB info entry will not  be  created.   This  may	be  useful  if
	      00-database-info	headword  is expected from stdin (dictunformat
	      outputs it).

       --columns columns
	      By default dictfmt wraps strings read from stdin to 72 columns.
	      This option changes this default.  If it is set to zero or a
	      negative value, wrapping is turned off.

       --default-strategy strategy
	      Sets the default search strategy for the database.  It will be
	      used instead of strategy `.'.  The special entry
	      00-database-default-strategy is created for this purpose.  This
	      option may be useful, for example, for dictionaries containing
	      mainly phrases rather than single words.  Use this option only
	      if you are absolutely sure of what you are doing.

       --mime-header mime_header
	      When a client sends the OPTION MIME command to dictd,
	      definitions found in this database are prepended with the
	      specified MIME header.  Creates the special entry
	      00-database-mime-header.

CREDITS
       dictfmt	was  written  by  Rik  Faith (faith@cs.unc.edu) as part of the
       dict-misc package.  dictfmt is distributed under the terms of  the  GNU
       General	Public	License.  If you need to distribute under other terms,
       write to the author.

AUTHOR
       This manual page was written by Robert D. Hilliard
       <hilliard@debian.org>.

SEE ALSO
       dict(1),	 dictd(8),  dictzip(1),	 dictunformat(1), http://www.dict.org,
       RFC 2229

			       25 December 2000			    DICTFMT(1)