hspell(3)			     Ivrix			     hspell(3)

NAME
       hspell - Hebrew spellchecker (C API)

SYNOPSIS
       #include <hspell.h>

       int hspell_init(struct dict_radix **dictp, int flags);

       void hspell_uninit(struct dict_radix *dictp);

       int  hspell_check_word(struct  dict_radix  *dict, const char *word, int
       *preflen);

       void  hspell_trycorrect(struct  dict_radix  *dict,  const  char	*word,
       struct corlist *cl);

       int corlist_init(struct corlist *cl);

       int corlist_free(struct corlist *cl);

       int corlist_n(struct corlist *cl);

       char *corlist_str(struct corlist *cl, int i);

       int hspell_is_canonic_gimatria(const char *word);

       typedef	int  hspell_word_split_callback_func(const  char  *word, const
       char *baseword, int preflen, int prefspec);

       int  hspell_enum_splits(struct  dict_radix  *dict,  const  char	*word,
       hspell_word_split_callback_func *enumf);

       void hspell_set_dictionary_path(const char *path);

       const char *hspell_get_dictionary_path(void);

DESCRIPTION
       This  manual  describes	the  C	API of the Hspell Hebrew spellchecker.
       Please refer to hspell(1)  for  a  fuller  description  of  the	Hspell
       project, its spelling standard, and how it works.

       The hspell_init() function must be called first to initialize the
       Hspell library. It sets up some global structures (see the CAVEATS
       section) and then reads the necessary dictionary files (whose default
       locations are fixed when the library is built). The 'dictp' parameter
       is a pointer to a struct dict_radix* object, which is modified to
       point to a newly allocated dictionary. A typical hspell_init() call
       therefore looks like

	  struct dict_radix *dict;
	  hspell_init(&dict, flags);

       Note that the (struct dict_radix*) type is  an  opaque  pointer	-  the
       library user has no access to the separate fields in this structure.

       The 'flags' parameter can contain a bitwise OR of several flags that
       modify Hspell's default behavior. Turning on HSPELL_OPT_HE_SHEELA
       allows Hspell to recognize the interrogative He prefix (he
       ha-she'ela). HSPELL_OPT_DEFAULT is a synonym for turning on no
       special flag, i.e., it evaluates to 0.

       hspell_init() returns 0 on success, or negative numbers on errors. Cur‐
       rently, the only error is -1, meaning the dictionary files could not be
       read.

       The hspell_uninit() function undoes the effects of hspell_init(), free‐
       ing any memory that was allocated during initialization.

       The hspell_check_word() function checks whether a given word is a
       correct Hebrew word (possibly with prefix particles attached in a
       syntactically correct manner). It returns 1 if the word is correct,
       or 0 if it is incorrect.

       The 'word' parameter should be a single Hebrew word in the iso8859-8
       encoding, possibly containing the ASCII quote or double-quote
       characters (signifying the geresh and gershayim used in Hebrew for
       abbreviations, acronyms, and a few foreign sounds). If the calling
       program works with other encodings, it must convert the word to
       iso8859-8 first. In particular, the cp1255 (MS-Windows Hebrew
       encoding) extensions to iso8859-8, such as niqqud characters and the
       dedicated geresh or gershayim characters, are currently not
       recognized and must be removed from the word before calling
       hspell_check_word().
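
       As an illustration of the conversion described above, the following
       is a minimal sketch of a UTF-8 to iso8859-8 converter, assuming the
       caller holds its text in UTF-8. The function name
       utf8_to_iso8859_8() is hypothetical and not part of Hspell. Mapping
       the Unicode geresh and gershayim to the ASCII quote characters, and
       silently dropping niqqud, are choices made for this sketch only.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Hspell): convert a NUL-terminated
 * UTF-8 word to iso8859-8.  Returns the number of bytes written, not
 * counting the terminating NUL, or -1 if the input contains a
 * character this sketch does not handle or 'out' is too small.
 */
int utf8_to_iso8859_8(const char *in, char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    if (outsize == 0)
        return -1;

    while (*p != '\0') {
        unsigned char c;

        if (p[0] < 0x80) {
            /* ASCII, including the quote and double-quote characters */
            c = p[0];
            p += 1;
        } else if (p[0] == 0xD7 && p[1] >= 0x90 && p[1] <= 0xAA) {
            /* U+05D0..U+05EA (alef..tav) map to 0xE0..0xFA */
            c = (unsigned char)(0xE0 + (p[1] - 0x90));
            p += 2;
        } else if (p[0] == 0xD7 && p[1] == 0xB3) {
            c = '\'';               /* U+05F3 geresh (assumed mapping) */
            p += 2;
        } else if (p[0] == 0xD7 && p[1] == 0xB4) {
            c = '"';                /* U+05F4 gershayim (assumed mapping) */
            p += 2;
        } else if ((p[0] == 0xD6 && p[1] >= 0xB0 && p[1] <= 0xBD) ||
                   (p[0] == 0xD7 && (p[1] == 0x81 || p[1] == 0x82))) {
            /* U+05B0..U+05BD points and U+05C1/U+05C2 dots: dropped */
            p += 2;
            continue;
        } else {
            return -1;              /* anything else is unsupported here */
        }

        if (n + 1 >= outsize)       /* keep room for the terminating NUL */
            return -1;
        out[n++] = (char)c;
    }

    out[n] = '\0';
    return (int)n;
}
```

       The converted string can then be passed to hspell_check_word().
       Because iso8859-8 uses one byte per letter, any prefix length
       reported by Hspell counts letters of the converted string, not bytes
       of the original UTF-8 text.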

       The function writes into the 'preflen' parameter the number of
       characters it recognized as a prefix particle; the rest of the 'word'
       is a stand-alone word. Because Hebrew words typically can be read in
       several different ways, this feature (of getting just one prefix from
       one possible reading) is usually not very useful, and it is likely to
       be removed in a future version.

       The hspell_enum_splits() function provides a way to get all possible
       splits of the given 'word' into an optional prefix particle and a
       stand-alone word. For each legal split (some words cannot accept
       certain prefixes), a user-defined callback function is called. This
       callback is given the whole word ('word'), the stand-alone word
       ('baseword'), the length of the prefix ('preflen'), and a bitfield
       ('prefspec') which describes what types of words this prefix can
       attach to. Note that in some cases, a word beginning with the letter
       waw gets this waw doubled when a prefix is attached, so sometimes
       strlen(word)!=strlen(baseword)+preflen.
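
       Since the callback receives no user-supplied context argument, a
       caller that wants to keep the splits has to store them somewhere the
       callback can reach. The following is a minimal sketch, assuming a
       file-scope array is acceptable; the names record_split(), struct
       split, splits and nsplits are hypothetical. The meaning of the
       callback's return value is not documented in this manual, so
       returning 0 here is an assumption.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SPLITS 16

/* One recorded split of a word into prefix and stand-alone word. */
struct split {
    char baseword[64];   /* the stand-alone word */
    int  preflen;        /* prefix length, as reported by Hspell */
    int  prefspec;       /* bitfield, as reported by Hspell */
    int  length_differs; /* 1 if strlen(word) != strlen(baseword)+preflen */
};

/* File-scope storage, because the callback gets no context pointer. */
static struct split splits[MAX_SPLITS];
static int nsplits;

/*
 * Hypothetical callback with the same parameter list as
 * hspell_word_split_callback_func in the SYNOPSIS.
 */
int record_split(const char *word, const char *baseword,
                 int preflen, int prefspec)
{
    struct split *s;

    if (nsplits >= MAX_SPLITS)
        return 0;                       /* table full: ignore the rest */

    s = &splits[nsplits++];
    /* snprintf truncates safely if baseword is unexpectedly long */
    snprintf(s->baseword, sizeof s->baseword, "%s", baseword);
    s->preflen = preflen;
    s->prefspec = prefspec;
    s->length_differs =
        strlen(word) != strlen(baseword) + (size_t)preflen;

    return 0;   /* assumption: the return value's meaning is undocumented */
}
```

       With an initialized dictionary, a caller would reset nsplits to 0
       and then call hspell_enum_splits(dict, word, record_split), after
       which splits[0] to splits[nsplits-1] hold the recorded readings.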

       The hspell_trycorrect() function tries to find a list of possible
       corrections for an incorrect word. Because in Hebrew the word density
       is high (a random string of letters, especially if short, has a high
       probability of being a correct word), this function bases its
       corrections on the assumption of a spelling error (replacement of
       letters that sound alike, missing or spurious immot qri'a), not a
       typo (slipped finger on the keyboard, etc.) - see also CAVEATS.

       hspell_trycorrect() stores the correction list in a structure of type
       struct corlist. This structure must first be initialized with a call
       to corlist_init() and subsequently freed with corlist_free(). The
       corlist_n() macro returns the number of words held in an initialized
       corlist, and corlist_str() returns the i'th word. Accordingly, here is
       an example usage of hspell_trycorrect():

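	  /* 'dict' is the initialized dictionary, 'w' the misspelled word */
	  struct corlist cl;
	  int i;
	  printf ("Found misspelled word %s. Possible corrections:\n", w);
	  corlist_init (&cl);
	  hspell_trycorrect (dict, w, &cl);
	  for (i=0; i<corlist_n(&cl); i++) {
	      printf ("%s\n", corlist_str(&cl, i));
	  }
	  corlist_free (&cl);   /* release the list when done */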

       The hspell_is_canonic_gimatria() function checks whether the given
       word is a canonic gimatria - i.e., the proper way to write in
       gimatria the number it represents. The caller might want to accept
       canonic gimatria as proper Hebrew words, even if hspell_check_word()
       previously reported such a word to be non-existent.
       hspell_is_canonic_gimatria() returns the number represented as
       gimatria in 'word' if it is indeed proper gimatria (in canonic form),
       or 0 otherwise.

       hspell_init() normally reads the dictionary files from a path compiled
       into the library. This makes sense when the library's code and the
       dictionaries are distributed together, but in some scenarios the
       library user might want to use Hspell dictionaries that are already
       present on the system in an arbitrary path. The function
       hspell_set_dictionary_path() can be used to set this path, and should
       be called before hspell_init(). The given path is that of the word
       list; the other input files have that path with an appended suffix.
       hspell_get_dictionary_path() can be used to find the current path. On
       many installations, this defaults to
       "/usr/local/share/hspell/hebrew.wgz".

LINKING
       On most systems, the Hspell library is compiled to use the Zlib library
       for reading the compressed dictionaries. Therefore, a  program  linking
       with the Hspell library must also be linked with the Zlib library (usu‐
       ally, by adding "-lz" to the compilation line).

       Programs that use autoconf to search for the Hspell library should
       remember to tell AC_CHECK_LIB to also link with the -lz library when
       checking for -lhspell.


CAVEATS
       Until Hspell reaches maturity, the API described here is to be
       considered unstable, and may change considerably even between minor
       releases. Users are encouraged to compare the values of the integer
       macros HSPELL_VERSION_MAJOR and HSPELL_VERSION_MINOR to those expected
       by the writer of the program. A third macro, HSPELL_VERSION_EXTRA,
       contains a string which can describe subrelease modifications (e.g.,
       beta versions).

       The current Hspell C API is very low-level, in the sense that it leaves
       the user to implement many features that some users take for granted
       in a spell-checker. For example, it doesn't provide any facilities for
       a user-defined personal dictionary. It also has separate functions for
       checking valid Hebrew words and valid gimatria, and no function to do
       both. It is assumed that the caller (a bigger spell-checking library
       or a word processor, for example) will already have these facilities.
       If not, you may wish to look at the sources of hspell(1) for an
       example implementation.
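
       As a rough illustration of the personal dictionary mentioned above,
       the following is a minimal sketch of one that a caller could consult
       alongside Hspell. All names here (struct personal_dict and the
       personal_dict_*() functions) are hypothetical and are not part of
       Hspell; the linear search is chosen for brevity, not speed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical caller-side personal dictionary (not part of Hspell). */
struct personal_dict {
    char **words;      /* heap-allocated copies of the accepted words */
    size_t count;      /* number of words stored */
    size_t capacity;   /* number of slots allocated in 'words' */
};

void personal_dict_init(struct personal_dict *pd)
{
    pd->words = NULL;
    pd->count = 0;
    pd->capacity = 0;
}

/* Adds a copy of 'word'. Returns 0 on success, -1 if out of memory. */
int personal_dict_add(struct personal_dict *pd, const char *word)
{
    char *copy;

    if (pd->count == pd->capacity) {
        size_t newcap = pd->capacity ? pd->capacity * 2 : 16;
        char **grown = realloc(pd->words, newcap * sizeof *grown);
        if (grown == NULL)
            return -1;
        pd->words = grown;
        pd->capacity = newcap;
    }

    copy = malloc(strlen(word) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, word);
    pd->words[pd->count++] = copy;
    return 0;
}

/* Returns 1 if 'word' was added earlier, 0 otherwise (linear search). */
int personal_dict_contains(const struct personal_dict *pd, const char *word)
{
    size_t i;

    for (i = 0; i < pd->count; i++)
        if (strcmp(pd->words[i], word) == 0)
            return 1;
    return 0;
}

/* Frees all stored words and resets the dictionary to empty. */
void personal_dict_free(struct personal_dict *pd)
{
    size_t i;

    for (i = 0; i < pd->count; i++)
        free(pd->words[i]);
    free(pd->words);
    personal_dict_init(pd);
}
```

       A caller could then accept a word if personal_dict_contains() returns
       1, if hspell_check_word() returns 1, or if
       hspell_is_canonic_gimatria() returns a non-zero value, which also
       covers the missing combined word-and-gimatria check.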

       Currently there is no concept of separate Hspell "contexts" in an
       application. Some of the context is global for the entire application:
       a single list of legal prefix-particles is kept, and the dictionary
       read by hspell_init() always comes from the single global dictionary
       path. This may be solved in a later version, e.g., by switching to an
       API like:

	  context = hspell_new_context();
	  hspell_set_dictionary_path(context, "/some/path/hebrew.wgz");
	  hspell_init(context, flags);
	  ...
	  hspell_check_word(context, word, preflenp);

       Note that despite the global context mentioned above, after
       initialization all functions described here are thread-safe, because
       they only read the dictionary data and never write to it.

       hspell_trycorrect() is not as powerful as it could be: typos and
       certain kinds of spelling mistakes do not yield useful correction
       suggestions. Along with more types of corrections,
       hspell_trycorrect() needs a better way to order the corrections by
       likelihood, as an unordered list of 100 corrections would be just as
       useful as none.

       In some cases of errors during hspell_init(), warning messages are
       printed to the standard error stream. This is a bad thing for a
       library to do.

       There are too many CAVEATS in this manual.

VERSION
       The version of hspell described by this manual page is 1.0 (May 16,
       2006).

COPYRIGHT
       Copyright (C) 2000-2006, Nadav Har'El <nyh@math.technion.ac.il> and Dan
       Kenigsberg <danken@cs.technion.ac.il>.

       Hspell is free software, released under the GNU General Public  License
       (GPL).	Note  that not only the programs in the distribution, but also
       the dictionary files and the generated word lists, are  licensed	 under
       the GPL.	 There is no warranty of any kind.

       See the LICENSE file for more information and the exact license terms.

       The latest version of this software can be found at
       http://www.ivrix.org.il/projects/spell-checker

SEE ALSO
       hspell(1)

Hspell 1.0			  16 May 2006			     hspell(3)