regcmp(3)
NAME
regcmp, regex - Compile and execute regular expression
SYNOPSIS
#include <libgen.h>
char *regcmp(
const char *string1,
... /*,
(char *)0 */ );
char *regex(
const char *re,
const char *subject,
... );
LIBRARY
Standard C Library (libc)
STANDARDS
Interfaces documented on this reference page conform to industry stan‐
dards as follows:
regcmp(), regex(): XPG4-UNIX
Refer to the standards(5) reference page for more information about
industry standards and associated tags.
PARAMETERS
string1   Points to the string that is to be matched or converted.
re        Points to a compiled regular expression string.
subject   Points to the string that is to be matched against re.
DESCRIPTION
The regcmp() function compiles a regular expression consisting of the
concatenated arguments and returns a pointer to the compiled form. The
end of arguments is indicated by a null pointer. The malloc() function
is used to create space for the compiled form. It is the responsibility
of the process to free unneeded space so allocated. A null pointer
returned from regcmp() indicates an invalid argument.
The regex() function executes a compiled pattern against the subject
string. Additional arguments of type char * must be passed to receive
matched subexpressions back. A global character pointer, __loc1, points
to the first matched character in the subject string.
The regcmp() and regex() functions support the simple regular
expressions defined in the grep(1) reference page, but the syntax and
semantics are slightly different. The following are the valid symbols
and their associated meanings:
[ ] * . ^
The left and right bracket, asterisk, period, and circumflex symbols
retain their meanings as defined in the grep(1) reference page.
$
A dollar sign matches the end of the string; \n matches a new line.
-
Used within brackets, the hyphen signifies an ASCII character range.
For example, [a-z] is equivalent to [abcd...xyz]. The - (hyphen) can
represent itself only if used as the first or last character. For
example, the character class expression []-] matches the characters
] (right bracket) and - (hyphen).
+
A regular expression followed by a + (plus sign) means one or more
times. For example, [0-9]+ is equivalent to [0-9][0-9]*.
{m} {m,} {m,u}
Integer values enclosed in {} braces indicate the number of times the
preceding regular expression can be applied. The value m is the
minimum number and u is a number, less than 256, which is the maximum.
The syntax {m} indicates the exact number of times the regular
expression can be applied. The syntax {m,} is analogous to
{m,infinity}. The + (plus sign) and * (asterisk) operations are
equivalent to {1,} and {0,}, respectively.
(...)$n
The value of the enclosed regular expression is returned. The value is
stored in the (n+1)th argument following the subject argument. A
maximum of ten enclosed regular expressions are allowed. The regex()
function makes its assignments unconditionally.
(...)
Parentheses are used for grouping. An operator, such as *, +, or {},
can work on a single character or a regular expression enclosed in
parentheses. For example, (a*(cb+)*)$0.
Since all of the symbols defined above are special characters, they
must be escaped to be used as themselves.
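The operator equivalences described above can be checked with a short
sketch. It uses the POSIX regcomp() and regexec() functions with
REG_EXTENDED rather than regcmp(), on the assumption that POSIX extended
regular expressions give +, *, {m,}, and bracket expressions the same
meaning. The helpers ere_matches() and same_verdicts() and the sample
strings are hypothetical names introduced for illustration only.

```c
#include <regex.h>
#include <stddef.h>

/*
 * ere_matches() is a hypothetical helper, not a library function.
 * It returns 1 if pattern (a POSIX extended regular expression)
 * matches somewhere in subject, 0 if it does not, and -1 if the
 * pattern cannot be compiled.
 */
int ere_matches(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;                      /* invalid pattern */

    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);                       /* release the compiled form */
    return rc == 0;
}

/*
 * same_verdicts() is also hypothetical. It returns 1 if the two
 * patterns agree (match or no match) on every sample string below,
 * and 0 as soon as they disagree.
 */
int same_verdicts(const char *p1, const char *p2)
{
    static const char *const samples[] = { "", "7", "2024", "12a", "abc" };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (ere_matches(p1, samples[i]) != ere_matches(p2, samples[i]))
            return 0;
    }
    return 1;
}
```

The patterns passed to same_verdicts() should be anchored with ^ and $
so that the whole sample string is tested, for example "^[0-9]+$"
against "^[0-9]{1,}$". Agreement on a handful of samples illustrates the
equivalence; it does not prove it.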
NOTES
The regcmp() and regex() interfaces are scheduled to be withdrawn from
a future version of the X/Open CAE Specification.
These interfaces are obsolete; they are guaranteed to function properly
only in the C/POSIX locale and so should be avoided. Use the POSIX reg‐
comp() interface instead of regcmp() and regex().
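As a minimal sketch of the recommended replacement, the following
helper uses the POSIX regcomp(), regexec(), and regfree() functions to
report where a match starts and ends. The start offset plays the role
of __loc1 and the end offset plays the role of the regex() return
value. The name match_span() and its return convention are assumptions
made for this example; the pattern is compiled as a POSIX extended
regular expression, whose syntax is not identical to that of regcmp().

```c
#include <regex.h>
#include <stddef.h>

/*
 * match_span() is a hypothetical helper, not part of any library.
 * It compiles pattern as a POSIX extended regular expression and
 * searches subject for the first match.
 *
 * Returns  0 on a match, with *start set to the offset of the first
 *            matched character and *end set to the offset of the
 *            first character after the match;
 *          1 if there is no match;
 *         -1 if the pattern cannot be compiled.
 */
int match_span(const char *pattern, const char *subject,
               size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;                      /* invalid pattern */

    rc = regexec(&re, subject, 1, m, 0);
    regfree(&re);                       /* release the compiled form */

    if (rc != 0)
        return 1;                       /* no match found */

    *start = (size_t)m[0].rm_so;
    *end = (size_t)m[0].rm_eo;
    return 0;
}
```

Unlike regcmp(), regcomp() fills in a caller-supplied regex_t, and the
caller releases it with regfree() instead of free(). Parenthesized
subexpressions can be retrieved in the same call by passing a larger
regmatch_t array to regexec().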
RETURN VALUES
Upon successful completion, the regcmp() function returns a pointer to
the compiled regular expression. Otherwise, a null pointer is returned
and errno may be set to indicate the error.
Upon successful completion, the regex() function returns a pointer to
the next unmatched character in the subject string. Otherwise, a null
pointer is returned.
SEE ALSO
Commands: grep(1)
Functions: malloc(3), regcomp(3)
Standards: standards(5)