Vim documentation: pattern

main help file

*pattern.txt*   For Vim version 5.5.  Last change: 1999 Sep 17


		  VIM REFERENCE MANUAL    by Bram Moolenaar



Patterns and search commands				*pattern-searches*

1. Search commands		|search-commands|
2. The definition of a pattern	|search-pattern|

==============================================================================

1. Search commands					*search-commands*


							*/*
/{pattern}[/]<CR>	Search forward for the [count]'th occurrence of
			{pattern} (exclusive).

/{pattern}/{offset}<CR>	Search forward for the [count]'th occurrence of
			{pattern} and go |{offset}| lines up or down.
			(linewise).


							*/<CR>*
/<CR>			Search forward for the [count]'th latest used
			pattern |last-pattern| with latest used |{offset}|.

//{offset}<CR>		Search forward for the [count]'th latest used
			pattern |last-pattern| with new |{offset}|.  If
			{offset} is empty no offset is used.


							*?*
?{pattern}[?]<CR>	Search backward for the [count]'th previous
			occurrence of {pattern} (exclusive).

?{pattern}?{offset}<CR>	Search backward for the [count]'th previous
			occurrence of {pattern} and go |{offset}| lines up or
			down (linewise).


							*?<CR>*
?<CR>			Search backward for the [count]'th latest used
			pattern |last-pattern| with latest used |{offset}|.

??{offset}<CR>		Search backward for the [count]'th latest used
			pattern |last-pattern| with new |{offset}|.  If
			{offset} is empty no offset is used.


							*n*
n			Repeat the latest "/" or "?" [count] times.
			|last-pattern| {Vi: no count}


							*N*
N			Repeat the latest "/" or "?" [count] times in
			opposite direction. |last-pattern| {Vi: no count}


							*star*
*			Search forward for the [count]'th occurrence of the
			word nearest to the cursor.  The word used for the
			search is the first of:
				1. the keyword under the cursor |'iskeyword'|
				2. the first keyword after the cursor, in the
				   current line
				3. the non-blank word under the cursor
				4. the first non-blank word after the cursor,
				   in the current line
			Only whole keywords are searched for, like with the
			command "/\<keyword\>".  (exclusive)  {not in Vi}


							*#*
#			Same as "*", but search backward.  The pound sign
			(character 163) also works.  If the "#" key works as
			backspace, try using "stty erase <BS>" before starting
			Vim (<BS> is CTRL-H or a real backspace).  {not in Vi}


							*gstar*
g*			Like "*", but don't put "\<" and "\>" around the word.
			This makes the search also find matches that are not a
			whole word.  {not in Vi}


							*g#*
g#			Like "#", but don't put "\<" and "\>" around the word.
			This makes the search also find matches that are not a
			whole word.  {not in Vi}


							*gd*
gd			Goto local Declaration.  When the cursor is on a local
			variable, this command will jump to its declaration.
			First Vim searches for the start of the current
			function, just like "[[".  If it is not found the
			search stops in line 1.  If it is found, Vim goes back
			until a blank line is found.  From this position Vim
			searches for the keyword under the cursor, like with
			"*", but lines that look like a comment are ignored
			(see 'comments' option).
			Note that this is not guaranteed to work, Vim does not
			really check the syntax, it only searches for a match
			with the keyword.  If included files also need to be
			searched use the commands listed in |include-search|.
			{not in Vi}


							*gD*
gD			Goto global Declaration.  When the cursor is on a
			global variable that is defined in the file, this
			command will jump to its declaration.  This works just
			like "gd", except that the search for the keyword
			always starts in line 1.  {not in Vi}


							*CTRL-C*
CTRL-C			Interrupt current (search) command.  Use CTRL-Break on
			MS-DOS |dos-CTRL-Break|.
			In Normal mode, any pending command is aborted.


							*:noh* *:nohlsearch*
:noh[lsearch]		Stop the highlighting for the 'hlsearch' option.  It
			is automatically turned back on when using a search
			command, or setting the 'hlsearch' option.
			This command doesn't work in an autocommand, because
			the highlighting state is saved and restored when
			executing autocommands |autocmd-searchpat|.

While typing the search pattern the current match will be shown if the
'incsearch' option is on.  Remember that you still have to finish the search
command with <CR> to actually position the cursor at the displayed match.  Or
use <Esc> to abandon the search.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.  This can be suspended with the |:nohlsearch| command.


					*search-offset* *{offset}*
These commands search for the specified pattern.  With "/" and "?" an
additional offset may be given.  There are two types of offsets: line offsets
and character offsets.  {the character offsets are not in Vi}

The offset gives the cursor position relative to the found match:
    [num]	[num] lines downwards, in column 1
    +[num]	[num] lines downwards, in column 1
    -[num]	[num] lines upwards, in column 1
    e[+num]	[num] characters to the right of the end of the match
    e[-num]	[num] characters to the left of the end of the match
    s[+num]	[num] characters to the right of the start of the match
    s[-num]	[num] characters to the left of the start of the match
    b[+num]	[num] characters to the right of the start (begin) of the match
    b[-num]	[num] characters to the left of the start (begin) of the match

If a '-' or '+' is given but [num] is omitted, a count of one will be used.
When including an offset with 'e', the search becomes inclusive (the
character the cursor lands on is included in operations).

Examples:

pattern			cursor position	
/test/+1		one line below "test", in column 1
/test/e			on the last t of "test"
/test/s+2		on the 's' of "test"
/test/b-3		three characters before "test"

If one of these commands is used after an operator, the characters between
the cursor position before and after the search is affected.  However, if a
line offset is given, the whole lines between the two cursor positions are
affected.


							*//;*
A very special offset is ';' followed by another search command.  For example:

  /test 1/;/test
  /test.*/+1;?ing?

The first one first finds the next occurrence of "test 1", and then the first
occurrence of "test" after that.

This is like executing two search commands after each other, except that:
- It can be used as a single motion command after an operator.
- The direction for a following "n" or "N" command comes from the first
  search command.
- When an error occurs the cursor is not moved at all.


							*last-pattern*
The last used pattern and offset are remembered.  They can be used to repeat
the search, possibly in another direction or with another count.  Note that
two patterns are remembered: One for 'normal' search commands and one for the
substitute command ":s".  Each time an empty pattern is given, the previously
used pattern is used.

The 'magic' option sticks with the last used pattern.  If you change 'magic',
this will not change how the last used pattern will be interpreted.
The 'ignorecase' option does not do this.  When 'ignorecase' is changed, it
will result in the pattern to match other text.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.

In Vi the ":tag" command sets the last search pattern when the tag is searched
for.  In Vim this is not done, the previous search pattern is still remembered,
unless the 't' flag is present in 'cpoptions'.  The search pattern is always
put in the search history.

If the 'wrapscan' option is on (which is the default), searches wrap around
the end of the buffer.  If 'wrapscan' is not set, the backward search stops
at the beginning and the forward search stops at the end of the buffer.  If
'wrapscan' is set and the pattern was not found the error message "pattern
not found" is given, and the cursor will not be moved.  If 'wrapscan' is not
set the message becomes "search hit BOTTOM without match" when searching
forward, or "search hit TOP without match" when searching backward.  If
wrapscan is set and the search wraps around the end of the file the message
"search hit TOP, continuing at BOTTOM" or "search hit BOTTOM, continuing at
TOP" is given when searching backwards or forwards respectively.  This can be
switched off by setting the 's' flag in the 'shortmess' option.  The highlight
method 'w' is used for this message (default: standout).


							*search-range*
You cannot limit the search command "/" to a certain range of lines.  A trick
to do this anyway is to use the ":substitute" command with the 'c' flag.
Example:
  :.,300s/Pattern//gc
This command will search from the cursor position until line 300 for
"Pattern".  At the match, you will be asked to type a character.  Type 'q' to
stop at this match, type 'n' to find the next match.

The "*", "#", "g*" and "g#" commands look for a word near the cursor in this
order, the first one that is found is used:
- The keyword currently under the cursor.
- The first keyword to the right of the cursor, in the same line.
- The WORD currently under the cursor.
- The first WORD to the right of the cursor, in the same line.
The keyword may only contain letters and characters in 'iskeyword'.
The WORD may contain any non-blanks (<Tab>s and/or <Space>s).
Note that if you type with ten fingers, the characters are easy to remember:
the "#" is under your left hand middle finger (search to the left and up) and
the "*" is under your right hand middle finger (search to the right and down).

==============================================================================

2. The definition of a pattern		*search-pattern* *pattern* *[pattern]*

					*regular-expression* *regexp* *Pattern*

Patterns may contain special characters, depending on the setting of the
'magic' option.  It is recommended to always use the default setting, which is
'magic'.  This avoids portability problems.


							*/bar* */\bar*
1. A pattern is one or more branches, separated by "\|".  It matches anything
   that matches one of the branches.  Example: "foo\|beep" matches "foo" and
   "beep".  If more than one branch matches, the first one is used.

2. A branch is one or more pieces, concatenated.  It matches a match for the
   first, followed by a match for the second, etc.  Example: "foo[0-9]beep",
   first match "foo", then a digit and then "beep".

3. A piece is an atom, possibly followed by:
     'magic' 'nomagic'	

							*/star* */\star*
	*	\*	matches 0 or more of the preceding atom, as much as
			possible (maximum 32767)

							*/\+*
	\+	\+	matches 1 or more of the preceding atom, as much as
			possible (maximum 32767) {not in Vi}

							*/\=*
	\=	\=	matches 0 or 1 of the preceding atom, as much as
			possible {not in Vi}

							*/\{*
	\{n,m}  \{n,m}	matches n to m of the preceding atom, as much as
			possible {not in Vi}
	\{n}    \{n}	matches n of the preceding atom {not in Vi}
	\{n,}   \{n,}	matches at least n of the preceding atom, as much as
			possible {not in Vi}
	\{,m}   \{,m}	matches 0 to m of the preceding atom, as much as
			possible {not in Vi}
	\{}     \{}	matches 0 or more of the preceding atom, as much as
			possible (same as *) {not in Vi}

							*/\{-*
	\{-n,m}  \{-n,m} matches n to m of the preceding atom, as few as
			possible {not in Vi}
	\{-n}    \{-n}	matches n of the preceding atom {not in Vi}
	\{-n,}   \{-n,}	matches at least n of the preceding atom, as few as
			possible {not in Vi}
	\{-,m}   \{-,m}	matches 0 to m of the preceding atom, as few as
			possible {not in Vi}
	\{-}     \{-}	matches 0 or more of the preceding atom, as few as
			possible {not in Vi}

		(n and m are decimal numbers between 1 and 32767)

		If a "-" appears immediately after the "{", then a shortest
		match first algorithm is used (see example below).  In
		particular, "\{-}" is the same as "*" but uses the shortest
		match first algorithm.  BUT: A match that starts earlier is
		preferred over a shorter match: "a\{-}b" matches "aaab" in
		"xaaab".

    Examples:
       .*	.\*	matches anything, also empty string
       ^.\+$	^.\+$	matches any non-empty line
       foo\=	foo\=	matches "fo" and "foo"
       ab\{2,3}c	matches "abbc" or "abbbc"
       a\{5}		matches "aaaaa".
       ab\{2,}c		matches "abbc", "abbbc", "abbbbc", etc
       ab\{,3}c		matches "ac", "abc", "abbc" or "abbbc".
       a[bc]\{3}d	matches "abbbd", "abbcd", "acbcd", "acccd", etc.
       a\(bc\)\{1,2}d	matches "abcd" or "abcbcd"
       a[bc]\{-}[cd]	matches "abc" in "abcd"
       a[bc]*[cd]	matches "abcd" in "abcd"


4. An atom can be:
      magic   nomagic	

	^	^	at beginning of pattern or after "\|" or	*/^*
			"\(", matches start of line; at other
			positions, matches literal '^'

	\^	\^	at any position, matches literal '^'		*/\^*

	$	$	at end of pattern or in front of "\|" or	*/$*
			"\)", matches end-of-line <EOL>; at other
			positions, matches literal '$'

	\$	\$	at any position, matches literal '$'		*/\$*

	.	\.	matches any single character		  */.* */\.*

	\<	\<	matches the beginning of a word			*/\<*

	\>	\>	matches the end of a word			*/\>*

	Character classes {not in Vi}:

	\i	\i	identifier character (see 'isident' option)	*/\i*

	\I	\I	like "\i", but excluding digits			*/\I*

	\k	\k	keyword character (see 'iskeyword' option)	*/\k*

	\K	\K	like "\k", but excluding digits			*/\K*

	\f	\f	file name character (see 'isfname' option)	*/\f*

	\F	\F	like "\f", but excluding digits			*/\F*

	\p	\p	printable character (see 'isprint' option)	*/\p*

	\P	\P	like "\p", but excluding digits			*/\P*


						*whitespace* *white-space*

	\s	\s	whitespace character: <Space> and <Tab>		*/\s*

	\S	\S	non-whitespace character; opposite of \s	*/\S*

	\d	\d	digit:				[0-9]		*/\d*

	\D	\D	non-digit:			[^0-9]		*/\D*

	\x	\x	hex digit:			[0-9A-Fa-f]	*/\x*

	\X	\X	non-hex digit:			[^0-9A-Fa-f]	*/\X*

	\o	\o	octal digit:			[0-7]		*/\o*

	\O	\O	non-octal digit:		[^0-7]		*/\O*

	\w	\w	word character:			[0-9A-Za-z_]	*/\w*

	\W	\W	non-word character:		[^0-9A-Za-z_]	*/\W*

	\h	\h	head of word character:		[A-Za-z_]	*/\h*

	\H	\H	non-head of word character:	[^A-Za-z_]	*/\H*

	\a	\a	alphabetic character:		[A-Za-z]	*/\a*

	\A	\A	non-alphabetic character:	[^A-Za-z]	*/\A*

	\l	\l	lowercase character:		[a-z]		*/\l*

	\L	\L	non-lowercase character:	[^a-z]		*/\L*

	\u	\u	uppercase character:		[A-Z]		*/\u*

	\U	\U	non-uppercase character		[^A-Z]		*/\U*
			NOTE: using the atom is faster than the [] form
			NOTE: 'ignorecase' is not used by character classes
	(end of character classes)


	\e	\e	matches <Esc>					*/\e*

	\t	\t	matches <Tab>					*/\t*

	\r	\r	matches <CR>					*/\r*

	\b	\b	matches <BS>					*/\b*

	\n	\n	matches <NL> Not available yet!  Will be used	*/\n*
			for multi-line patterns

	~	\~	matches the last given substitute string    */~* */\~*

	\(\)	\(\)	A pattern enclosed by escaped parentheses      */\(\)*

			(e.g., "\(^a\)") matches that pattern		*/\)*

	\1      \1	Matches the same string that was matched by	*/\1*
			the first sub-expression in \( and \). {not in Vi}
			Example: "\([a-z]\).\1" matches "ata", "ehe", "tot",
			etc.

	\2      \2	Like "\1", but uses second sub-expression,	*/\2*
	   ...

	\9      \9	Like "\1", but uses ninth sub-expression.	*/\9*

	x	x	A single character, with no special meaning,
			matches itself

	\x	\x	A backslash followed by a single character, */\* */\\*
			with no special meaning, is reserved for future
			expansions


	[]	\[]	A range. This is a sequence of characters	*/[]*

			enclosed in "[]" or "\[]".  It matches any	*/\[]*
			single character from the sequence.  E.g., "[xyz]"
			matches any 'x', 'y' or 'z'.
			- If the sequence begins with "^", it matches any
			  single character NOT in the sequence: "[^xyz]"
			  matches anything but 'x', 'y' and 'z'.
			- If two characters in the sequence are separated by
			  '-', this is shorthand for the full list of ASCII
			  characters between them.  E.g., "[0-9]" matches any
			  decimal digit.
			- A character class expression is evaluated to the set
			  of characters belonging to that character class.  The
			  following character classes are supported:
			  Name		Contents 

*[:alnum:]*		  [:alnum:]     letters and digits

*[:alpha:]*		  [:alpha:]     letters

*[:ascii:]*		  [:ascii:]     ASCII characters

*[:blank:]*		  [:blank:]     space and tab characters

*[:cntrl:]*		  [:cntrl:]     control characters

*[:digit:]*		  [:digit:]     decimal digits

*[:graph:]*		  [:graph:]     printable characters excluding space

*[:lower:]*		  [:lower:]     lowercase letters

*[:print:]*		  [:print:]     printable characters including space

*[:punct:]*		  [:punct:]     punctuation characters

*[:space:]*		  [:space:]     whitespace characters

*[:upper:]*		  [:upper:]     uppercase letters

*[:xdigit:]*		  [:xdigit:]    hexadecimal digits

*[:return:]*		  [:return:]	the <CR> character

*[:tab:]*		  [:tab:]	the <Tab> character

*[:escape:]*		  [:escape:]	the <Esc> character

*[:backspace:]*		  [:backspace:]	the <BS> character
			  The brackets in character class expressions are
			  additional to the brackets delimiting a range.
			  For example, the following is plausible for a UNIX
			  filename: [-./[:alnum:]_~]\+
			  That is, a list of at least one character, each of
			  which is either '-', '.', '/', alphabetic, numeric,

			  '_' or '~'.				*/\]*
			- To include a literal ']', '^', '-' or '\' in the
			  sequence, put a backslash before it: "[xyz\]]",
			  "[\^xyz]", "[xy\-z]" and "[xyz\\]". (Note: POSIX
			  does not support the use of a backslash this way).
			  For ']' you can also make it the first character
			  (following a possible "^"):  "[]xyz]" or "[^]xyz]"
			  {not in Vi}.
			  For '-' you can also make it the first or last
			  character: "[-xyz]", "[^-xyz]" or "[xyz-]".
			  For '\' you can also let it be followed by any
			  character that's not in "^]-\etrb".  "[\xyz]"
			  matches '\', 'x', 'y' and 'z'.  It's better to use
			  "\\" though, future expansions may use other
			  characters after '\'.
			- The following translations are accepted when the 'l'
			  flag is not included in 'cpoptions' {not in Vi}:
				\e	<Esc>
				\t	<Tab>
				\r	<CR>
				\b	<BS>
			- Matching ranges can be slow, use one of the other
			  items above when possible.

If the 'ignorecase' option is on, the case of letters is ignored.

It is impossible to have a pattern that contains a line break (Sorry!).

Examples:
^beep(			Probably the start of the C function "beep".

[a-zA-Z]$		Any alphabetic character at the end of a line.

\<\I\i*		or
\<\h\w*
\<[a-zA-Z_][a-zA-Z0-9_]*
			An identifier (e.g., in a C program).

\(\.$\|\. \)		A period followed by <EOL> or a space.
			Note that "\(\. \|\.$\)" does not do the same,
			because '$' is not <EOL> in front of '\)'.
			This was done to remain Vi-compatible.

[.!?][])"']*\($\|[ ]\)	A search pattern that finds the end of a sentence,
			with almost the same definition as the ")" command.


Technical detail:				*NL-used-for-Nul*
<Nul> characters in the file are stored as <NL> in memory.  In the display
they are shown as "^@".  The translation is done when reading and writing
files.  To match a <Nul> with a search pattern you can just enter CTRL-@ or
"CTRL-V 000".  This is probably just what you expect.  Internally the
character is replaced with a <NL> in the search pattern.  What is unusual is
that typing CTRL-V CTRL-J also inserts a <NL>, thus also searches for a <Nul>
in the file.  {Vi cannot handle <Nul> characters in the file at all}


						*CR-used-for-NL*
When 'fileformat' is "mac", <NL> characters in the file are stored as <CR>
characters internally.  In the display they are shown as "^M".  Otherwise this
works similar to the usage of <NL> for a <Nul>.

top - main help file