agrep man page on IRIX

     AGREP(l)	       UNIX System V (Jan 17, 1992)	      AGREP(l)

     NAME
	  agrep - search a file for a string or regular expression,
	  with approximate matching capabilities

     SYNOPSIS
	  agrep [ -#cdehiklnpstvwxyBDGIS ] pattern [ -f patternfile ] [
	  filename... ]

     DESCRIPTION
	  agrep searches the input filenames (standard input is the
	  default, but see a warning under LIMITATIONS) for records
	  containing strings which either exactly or approximately
	  match a pattern. A record is by default a line, but it can
	  be defined differently using the -d option (see below).
	  Normally, each record found is copied to the standard
	  output.  Approximate matching allows finding records that
	  contain the pattern with several errors including
	  substitutions, insertions, and deletions.  For example,
	  Massechusets matches Massachusetts with two errors (one
	  substitution and one insertion).  Running agrep -2
	  Massechusets foo outputs all lines in foo containing any
	  string with at most 2 errors from Massechusets.
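The error count above can be checked with a small sketch. This is an illustration of unit-cost edit distance, not agrep's own algorithm (which is described in the Wu and Manber reports cited below); the function names edit_distance and contains_within are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance: each insertion, deletion, or
    substitution counts as one error."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def contains_within(pattern: str, line: str, max_errors: int) -> bool:
    """True if some substring of line is within max_errors of pattern."""
    # Row 0 is all zeros: a match may start at any position in the line.
    prev = [0] * (len(line) + 1)
    for i, cp in enumerate(pattern, start=1):
        curr = [i]
        for j, cl in enumerate(line, start=1):
            curr.append(min(
                prev[j] + 1,
                curr[j - 1] + 1,
                prev[j - 1] + (cp != cl),
            ))
        prev = curr
    # The match may also end at any position in the line.
    return min(prev) <= max_errors


print(edit_distance("Massechusets", "Massachusetts"))
print(contains_within("Massechusets", "Boston, Massachusetts, USA", 2))
```

The second function mirrors the idea behind agrep -2 Massechusets foo: a line qualifies if any substring of it is close enough to the pattern.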

	  agrep supports many kinds of queries including arbitrary
	  wild cards, sets of patterns, and in general, regular
	  expressions.	See PATTERNS below.  It supports most of the
	  options supported by the grep family plus several more (but
	  it is not 100% compatible with grep).	 For more information
	  on the algorithms used by agrep see Wu and Manber, "Fast
	  Text Searching With Errors," Technical report #91-11,
	  Department of Computer Science, University of Arizona, June
	  1991 (available by anonymous ftp from cs.arizona.edu in
	  agrep/agrep.ps.1), and Wu and Manber, "Agrep -- A Fast
	  Approximate Pattern Searching Tool", To appear in USENIX
	  Conference 1992 January (available by anonymous ftp from
	  cs.arizona.edu in agrep/agrep.ps.2).

	  As with the rest of the grep family, the characters `$',
	  `^', `*', `[', `]', `|', `(', `)', `!', and `\' can
	  cause unexpected results when included in the pattern, as
	  these characters are also meaningful to the shell.  To avoid
	  these problems, one should always enclose the entire pattern
	  argument in single quotes, i.e., 'pattern'.  Do not use
	  double quotes (").

	  When agrep is applied to more than one input file, the name
	  of the file is displayed preceding each line which matches
	  the pattern.	The filename is not displayed when processing
	  a single file, so if you actually want the filename to
	  appear, use /dev/null as a second file in the list.

     OPTIONS

     Page 1					     (printed 11/3/95)

	  -#   # is a non-negative integer (at most 8) specifying the
	       maximum number of errors permitted in finding the
	       approximate matches (defaults to zero).	Generally,
	       each insertion, deletion, or substitution counts as one
	       error.  It is possible to adjust the relative cost of
	       insertions, deletions, and substitutions (see the -I,
	       -D, and -S options).

	  -c   Display only the count of matching records.

	  -d 'delim'
	       Define delim to be the separator between two records.
	       The default value is '$', namely a record is by default
	       a line.	delim can be a string of size at most 8 (with
	       possible use of ^ and $), but not a regular expression.
	       Text between two delim's, before the first delim, and
	       after the last delim is considered as one record.  For
	       example, -d '$$' defines paragraphs as records and -d
	       '^From ' defines mail messages as records.  agrep
	       matches each record separately.	This option does not
	       currently work with regular expressions.
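As a rough illustration of records defined by a delimiter, the sketch below treats blank lines as the separator, which is approximately what -d '$$' describes. It only models exact substring search; the function names are hypothetical and agrep's handling of the delimiter text itself may differ.

```python
def paragraph_records(text: str) -> list:
    """Split text into records at blank lines (two consecutive newlines)."""
    return [record for record in text.split("\n\n") if record]


def matching_records(text: str, word: str) -> list:
    """Return every whole record that contains word."""
    return [record for record in paragraph_records(text) if word in record]


sample = "alpha beta\ngamma\n\ndelta\nepsilon\n\nzeta gamma"
for record in matching_records(sample, "gamma"):
    print(record)
    print("--")
```

Each match is reported as the full record, so a hit on one line of a paragraph returns the whole paragraph.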

	  -e pattern
	       Same as a simple pattern argument, but useful when the
	       pattern begins with a `-'.

	  -f patternfile
	       patternfile contains a set of (simple) patterns.	 The
	       output is all lines that match at least one of the
	       patterns in patternfile. Currently, the -f option works
	       only for exact match and for simple patterns (any meta
	       symbol is interpreted as a regular character); it is
	       compatible only with -c, -h, -i, -l, -s, -v, -w, and -x
	       options.  See BUGS/LIMITATIONS for size bounds.

	  -h   Do not display filenames.

	  -i   Case-insensitive search - e.g., "A" and "a" are
	       considered equivalent.

	  -k   No symbol in the pattern is treated as a meta
	       character. For example, agrep -k 'a(b|c)*d' foo will
	       find the occurrences of a(b|c)*d in foo whereas agrep
	       'a(b|c)*d' foo will find substrings in foo that match
	       the regular expression 'a(b|c)*d'.

	  -l   List only the files that contain a match.  This option
	       is useful for looking for files containing a certain
	       pattern.  For example, agrep -l 'wonderful' * will
	       list the names of those files in the current directory
	       that contain the word 'wonderful'.

	  -n   Each line that is printed is prefixed by its record
	       number in the file.

	  -p   Find records in the text that contain a supersequence
	       of the pattern.  For example, agrep -p DCS foo will
	       match "Department of Computer Science."
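A record contains a supersequence of the pattern when the pattern's characters appear in it in order, not necessarily adjacent. A minimal sketch of that test follows; the name is_subsequence is hypothetical, and the sketch ignores the error count that -p can be combined with.

```python
def is_subsequence(pattern: str, record: str) -> bool:
    """True if the characters of pattern occur in record in order."""
    remaining = iter(record)
    # Each "in" consumes the iterator up to and including the found
    # character, so later pattern characters must appear further right.
    return all(ch in remaining for ch in pattern)


print(is_subsequence("DCS", "Department of Computer Science."))
```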

	  -s   Work silently, that is, display nothing except error
	       messages.  This is useful for checking the error
	       status.

	  -t   Output the record starting from the end of delim to
	       (and including) the next delim. This is useful for
	       cases where delim should come at the end of the record.

	  -v   Inverse mode - display only those records that do not
	       contain the pattern.

	  -w   Search for the pattern as a word - i.e., surrounded by
	       non-alphanumeric characters.  The non-alphanumeric must
	       surround the match;  they cannot be counted as errors.
	       For example, agrep -w -1 car will match cars, but not
	       characters.

	  -x   The pattern must match the whole line.

	  -y   Used with -B option. When -y is on, agrep will always
	       output the best matches without giving a prompt.

	  -B   Best match mode.	 When -B is specified and no exact
	       matches are found, agrep will continue to search until
	       the closest matches (i.e., the ones with minimum number
	       of errors) are found, at which point the following
	       message will be shown:  "the best match contains x
	       errors, there are y matches, output them? (y/n)" The
	       best match mode is not supported for standard input,
	       e.g., pipeline input.  When the -#, -c, or -l options
	       are specified, the -B option is ignored.	 In general,
	       -B may be slower than -#, but not by very much.

	  -Dk  Set the cost of a deletion to k (k is a positive
	       integer).  This option does not currently work with
	       regular expressions.

	  -G   Output the files that contain a match.

	  -Ik  Set the cost of an insertion to k (k is a positive
	       integer).  This option does not currently work with
	       regular expressions.

	  -Sk  Set the cost of a substitution to k (k is a positive
	       integer).  This option does not currently work with
	       regular expressions.
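The effect of the -D, -I, and -S costs can be pictured as a weighted edit distance. The sketch below is an assumption-laden illustration, not agrep's implementation: it assumes a "deletion" drops a pattern character and an "insertion" accepts an extra text character, and the function name and parameters are hypothetical.

```python
def weighted_distance(pattern: str, text: str,
                      ins_cost: int = 1,
                      del_cost: int = 1,
                      sub_cost: int = 1) -> int:
    """Edit distance from pattern to text with per-operation costs."""
    prev = [j * ins_cost for j in range(len(text) + 1)]
    for i, cp in enumerate(pattern, start=1):
        curr = [i * del_cost]
        for j, ct in enumerate(text, start=1):
            curr.append(min(
                prev[j] + del_cost,        # assumed: drop cp from the pattern
                curr[j - 1] + ins_cost,    # assumed: extra text character ct
                prev[j - 1] + (0 if cp == ct else sub_cost),
            ))
        prev = curr
    return prev[-1]


# With deletions and substitutions costing 2, a budget of 1 only
# leaves room for a single insertion.
print(weighted_distance("abc", "abxc", ins_cost=1, del_cost=2, sub_cost=2))
print(weighted_distance("abc", "axc", ins_cost=1, del_cost=2, sub_cost=2))
```

Raising a cost above the error budget given with -# effectively forbids that kind of error.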

     PATTERNS
	  agrep supports a large variety of patterns, including simple
	  strings, strings with classes of characters, sets of
	  strings, wild cards, and regular expressions.

	  Strings
	       any sequence of characters, including the special
	       symbols `^' for beginning of line and `$' for end of
	       line.  The special characters listed above ( `$', `^',
	       `*', `[', `]', `|', `(', `)', `!', and `\' ) should be
	       preceded by `\' if they are to be matched as regular
	       characters.  For example, \^abc\\ corresponds to the
	       string ^abc\, whereas ^abc corresponds to the string
	       abc at the beginning of a line.

	  Classes of characters
	       a list of characters inside [] (in order) corresponds
	       to any character from the list.	For example, [a-ho-z]
	       is any character between a and h or between o and z.
	       The symbol `^' inside [] complements the list.  For
	       example, [^i-n] denotes any character in the character
	       set except the characters 'i' through 'n'.  The symbol
	       `^' thus has two meanings, but this is consistent with
	       egrep.
	       The symbol `.' (don't care) stands for any symbol
	       (except for the newline symbol).

	  Boolean operations
	       agrep supports an `and' operation `;' and an `or'
	       operation `,', but not a combination of both.  For
	       example, 'fast;network' searches for all records
	       containing both words.
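The `and' and `or' operations can be sketched for the exact-match case as follows. The function name matches_boolean is hypothetical, and the sketch treats each operand as a plain substring rather than a full agrep pattern.

```python
def matches_boolean(pattern: str, record: str) -> bool:
    """Evaluate a ';' (and) or ',' (or) pattern against one record."""
    if ";" in pattern and "," in pattern:
        # The text above says the two operations cannot be combined.
        raise ValueError("cannot combine ';' and ',' in one pattern")
    if ";" in pattern:
        return all(word in record for word in pattern.split(";"))
    return any(word in record for word in pattern.split(","))


print(matches_boolean("fast;network", "a fast local network"))
print(matches_boolean("fast,network", "a fast car"))
```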

	  Wild cards
	       The symbol '#' is used to denote a wild card.  #
	       matches zero or any number of arbitrary characters.
	       For example, ex#e matches example.  The symbol # is
	       equivalent to .* in egrep.  In fact, .* will work too,
	       because it is a valid regular expression (see below),
	       but unless this is part of an actual regular
	       expression, # will work faster.
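The equivalence between # and .* can be illustrated by translating a wild-card pattern into a regular expression for Python's re module. This is only a sketch: the helper name wildcard_to_regex is hypothetical, it treats every other character literally, and it does not handle an escaped \#.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Replace each '#' with '.*' and escape everything else."""
    return ".*".join(re.escape(part) for part in pattern.split("#"))


regex = wildcard_to_regex("ex#e")
print(regex)
print(re.search(regex, "example") is not None)
```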

	  Combination of exact and approximate matching
	       any pattern inside angle brackets <> must match the
	       text exactly even if the match is with errors.  For
	       example, <mathemat>ics matches mathematical with one
	       error (replacing the last s with an a), but
	       mathe<matics> does not match mathematical no matter how
	       many errors we allow.

	  Regular expressions
	       The syntax of regular expressions in agrep is in
	       general the same as that for egrep.  The union
	       operation `|', Kleene closure `*', and parentheses ()
	       are all supported.  Currently '+' is not supported.
	       Regular expressions are currently limited to
	       approximately 30 characters (generally excluding meta
	       characters).  Some options (-d, -w, -f, -t, -x, -D, -I,
	       -S) do not currently work with regular expressions.
	       The maximal number of errors for regular expressions
	       that use '*' or '|' is 4.

     EXAMPLES
	  agrep -2 -c ABCDEFG foo
	       gives the number of lines in file foo that contain
	       ABCDEFG within two errors.

	  agrep -1 -D2 -S2 'ABCD#YZ' foo
	       outputs the lines containing ABCD followed, within
	       arbitrary distance, by YZ, with up to one additional
	       insertion (-D2 and -S2 make deletions and substitutions
	       too "expensive").

	  agrep -5 -p abcdefghij /usr/dict/words
	       outputs the list of all words containing at least 5 of
	       the first 10 letters of the alphabet in order.  (Try
	       it:  any list starting with academia and ending with
	       sacrilegious must mean something!)

	  agrep -1 'abc[0-9](de|fg)*[x-z]' foo
	       outputs the lines containing, within up to one error,
	       the string that starts with abc followed by one digit,
	       followed by zero or more repetitions of either de or
	       fg, followed by either x, y, or z.

	  agrep -d '^From ' 'breakdown;internet' mbox
	       outputs all mail messages (the pattern '^From '
	       separates mail messages in a mail file) that contain
	       keywords 'breakdown' and 'internet'.

	  agrep -d '$$' -1 '<word1> <word2>' foo
	       finds all paragraphs that contain word1 followed by
	       word2 with one error in place of the blank. In
	       particular, if word1 is the last word in a line and
	       word2 is the first word in the next line, then the
	       space will be substituted by a newline symbol and it
	       will match.  Thus, this is a way to overcome separation
	       by a newline.  Note that -d '$$' (or another delim
	       which spans more than one line) is necessary, because
	       otherwise agrep searches only one line at a time.

	  agrep '^agrep' <this manual>
	       outputs all the examples of the use of agrep in this
	       man page.

     SEE ALSO
	  ed(1), ex(1), grep(1V), sh(1), csh(1).

     BUGS/LIMITATIONS
	  Any bug reports or comments will be appreciated!  Please mail
	  them to sw@cs.arizona.edu or udi@cs.arizona.edu.

	  Regular expressions do not support the '+' operator (match 1
	  or more instances of the preceding token).  These can be
	  searched for by using this syntax in the pattern:

	       'pattern(pattern)*'

	  (search for strings containing one instance of the pattern,
	  followed by 0 or more instances of the pattern).
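The rewrite can be checked against a regex engine that does support '+'. The sketch below uses Python's re module purely as a reference; the helper name expand_plus is hypothetical and it assumes the token needs no extra grouping of its own.

```python
import re


def expand_plus(token: str) -> str:
    """Rewrite token+ as token(token)*, the workaround shown above."""
    return f"{token}({token})*"


with_plus = re.compile(r"(ab)+")
rewritten = re.compile(expand_plus("ab"))

# Both forms should accept and reject exactly the same strings.
for sample in ["", "ab", "abab", "aba", "ababab", "b"]:
    same = bool(with_plus.fullmatch(sample)) == bool(rewritten.fullmatch(sample))
    print(repr(sample), same)
```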

	  The following can cause an infinite loop:  agrep pattern * >
	  output_file.  If output_file is itself one of the files
	  matched by * and the number of matches is high, matches may
	  be written to output_file before agrep has finished reading
	  it, leading to more matches of the pattern within
	  output_file.  It's not clear whether this is a "bug" (grep
	  will do the same), but be warned.

	  The maximum size of the patternfile is limited to 250Kb,
	  and the maximum number of patterns is limited to 30,000.

	  Standard input is the default if no input file is given.
	  However, if standard input is keyed in directly (as opposed
	  to through a pipe, for example) agrep may not work for some
	  non-simple patterns.

	  There is no size limit for simple patterns.  More
	  complicated patterns are currently limited to approximately
	  30 characters.  Lines are limited to 1024 characters.
	  Records are limited to 48K, and may be truncated if they are
	  larger than that.  The limit of record length can be changed
	  by modifying the parameter Max_record in agrep.h.

     DIAGNOSTICS
	  Exit status is 0 if any matches are found, 1 if none, 2 for
	  syntax errors or inaccessible files.
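A script that calls agrep can branch on these exit codes. The following is a minimal sketch, assuming an agrep binary is on the PATH; the names EXIT_MEANINGS, interpret_exit_status, and run_agrep are hypothetical. Passing the arguments as a list bypasses the shell, so the pattern needs no quoting.

```python
import subprocess

# Meanings taken from the DIAGNOSTICS text above.
EXIT_MEANINGS = {
    0: "match found",
    1: "no match",
    2: "syntax error or inaccessible file",
}


def interpret_exit_status(code: int) -> str:
    """Map an agrep exit status to a short description."""
    return EXIT_MEANINGS.get(code, "unknown status")


def run_agrep(args: list) -> tuple:
    """Run agrep (assumed to be installed) and describe the outcome."""
    result = subprocess.run(["agrep", *args], capture_output=True, text=True)
    return interpret_exit_status(result.returncode), result.stdout


# Example call, left commented out because it needs agrep and a file foo:
# status, output = run_agrep(["-2", "Massechusets", "foo"])
```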

     AUTHORS
	  Sun Wu and Udi Manber, Department of Computer Science,
	  University of Arizona, Tucson, AZ 85721.
	  {sw|udi}@cs.arizona.edu.
