The short answer is that you cannot use regular expressions to search the shell history. According to POSIX (the standard for Unix-like operating systems), you should be able to search using regular shell pattern matching (as used for filename globbing and with case statements). This feature is referred to as non-incremental search, but it currently does not seem to be correctly implemented in Bash.
POSIX specification
The POSIX specification for shell Command Line Editing (vi-mode) states that these search patterns should use regular shell pattern matching. While the ^ meta-character is used to match the start of a line, these patterns are not regular expressions.
/pattern<newline>
Move backwards through the command history, searching for the specified
pattern, beginning with the previous command line. Patterns use the pattern
matching notation described in Pattern Matching Notation, except that the '^'
character shall have special meaning when it appears
as the first character of pattern. In this case, the '^' is discarded and
the characters after the '^' shall be matched only at the beginning of a
line. Commands in the command history shall be treated as strings, not as
filenames.
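As a rough illustration of the rule quoted above, the matching could be sketched with a case statement, which uses the same pattern matching notation. The hist_match name and its two-argument interface are made up for this example; it is not how any shell implements its history search.

```shell
# hist_match PATTERN LINE
# Hypothetical helper: succeeds if LINE matches PATTERN under the quoted rule.
# The body runs in a subshell, so its variables do not leak to the caller.
hist_match() (
  pat=$1
  line=$2
  case $pat in
    '^'*)
      pat=${pat#?}                            # discard the leading '^'
      case $line in $pat*) exit 0 ;; esac     # match at the start of the line only
      ;;
    *)
      case $line in *$pat*) exit 0 ;; esac    # match anywhere in the line
      ;;
  esac
  exit 1
)

hist_match 'gi? st*' 'git status' && echo "pattern found"
hist_match '^status' 'git status' || echo "not at the start of the line"
```

Leaving $pat unquoted inside the case patterns is deliberate: that is what lets *, ? and [ act as wildcards.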
Documented Bash implementation
Bash uses the GNU Readline library to provide its interactive line-editing and history searching capabilities. The official documentation for the Readline library focuses more on Emacs mode, but a short section in its manual, Readline vi Mode, states that:
While the Readline library does not have a full set of vi editing functions,
it does contain enough to allow simple editing of the line.
The Readline vi mode behaves as specified in the POSIX standard.
Actual Bash implementation
After a number of experiments on two different systems, I found that the non-incremental searching in Bash/Readline does not work as described in its official documentation. I found that the * was treated as a literal asterisk rather than a pattern that matches multiple characters. Likewise, the ? and [ are also treated as literal characters.
For comparison, I tried using Vi-mode in tcsh and verified that it correctly implements history searching as specified in the POSIX standard.
I then downloaded and searched through the code for the Readline library and found its history searching functions use a simple substring search and don’t use any search pattern meta-characters – aside from the caret, ^ (see search.c from the git repository for the Readline library).
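The substring behaviour described above can be imitated as follows. legacy_match is a made-up name, and this is only a shell sketch of the observed behaviour, not a transcription of the C code in search.c. Quoting the expansion inside the case pattern is what makes every character literal.

```shell
# legacy_match PATTERN LINE
# Hypothetical helper: plain substring search, with '^' as the only
# special character (anchoring the search at the start of the line).
legacy_match() (
  pat=$1
  line=$2
  case $pat in
    '^'*)
      pat=${pat#?}
      case $line in "$pat"*) exit 0 ;; esac   # quoted: literal prefix
      ;;
    *)
      case $line in *"$pat"*) exit 0 ;; esac  # quoted: literal substring
      ;;
  esac
  exit 1
)

legacy_match 'l*' 'ls -l'      || echo "'l*' is not found: the * is literal"
legacy_match '*.txt' 'rm *.txt' && echo "a literal '*.txt' is found"
```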
I presume the Bash/Readline developers have yet to implement this feature. I couldn’t find a bug-list but the CHANGES file shows that they’ve been regularly fixing issues relating to Vi-mode.
Update: This feature was implemented in Readline 8.0 (released with Bash 5.0 in January 2019). As documented in its CHANGES:
New Features in Readline
a. Non-incremental vi-mode search (N, n) can search for a shell pattern, as Posix specifies (uses fnmatch(3) if available).
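Since pattern searching arrived with Readline 8.0 and Bash 5.0, a script can make a rough guess from the version string. This sketch assumes Bash is using the Readline release that was bundled with it, which is not guaranteed on every system. pattern_search_expected is a hypothetical name.

```shell
# pattern_search_expected [VERSION]
# Hypothetical helper: succeeds if VERSION (default: $BASH_VERSION) has a
# major number of 5 or more. An empty or missing version counts as "no".
pattern_search_expected() (
  version=${1:-$BASH_VERSION}
  major=${version%%.*}          # keep everything before the first dot
  [ "${major:-0}" -ge 5 ]
)

if pattern_search_expected; then
  echo "vi-mode history search probably accepts shell patterns"
else
  echo "vi-mode history search is probably substring-only"
fi
```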
[[ ... ]] tokenisation clashes with regular expressions (more on that in my answer to your follow-up question) and \ is overloaded as a shell quoting operator and a regexp operator (with some interference between the two in bash), and even when there's no apparent reason for a clash, the behaviour can be surprising. Rules can be confusing.
Who can tell what these will do without trying it (on all possible input) with any given version of bash?
[[ $a = a|b ]]
[[ $a =~ a|b ]]
[[ $a =~ a&b ]]
[[ $a =~ (a|b) ]]
[[ $a =~ ([)}]*) ]]
[[ $a =~ [/\(] ]]
[[ $a =~ \s+ ]]
[[ $a =~ ( ) ]]
[[ $a =~ [ ] ]]
[[ $a =~ ([ ]) ]]
You can't quote the regexps either: since bash 3.2, unless bash 3.1 compatibility has been enabled, quoting a regexp removes the special meaning of its RE operators. For instance,
[[ $a =~ 'a|b' ]]
Matches only if $a contains a literal a|b.
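A quick way to see this for yourself, assuming bash 3.2 or later with the 3.1 compatibility mode left off (literal_only is a made-up name):

```shell
#!/usr/bin/env bash
# Quoted regexp: the | is matched literally, not treated as alternation.
literal_only() {
  [[ $1 =~ 'a|b' ]]
}

for s in 'a' 'b' 'xa|by'; do
  if literal_only "$s"; then
    echo "$s: match"
  else
    echo "$s: no match"
  fi
done
```

Only the last string contains the three characters a|b in sequence, so only that one matches.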
Storing the regexp in a variable avoids all those problems and also makes the code compatible with ksh93 and zsh (provided you limit yourself to POSIX EREs):
regexp='a|b'
[[ $a =~ $regexp ]] # $regexp should *not* be quoted.
There's no ambiguity in the parsing/tokenising of that shell command, and the regexp that is used is the one stored in the variable without any transformation.
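The same approach scales to regexps containing spaces, brackets and parentheses, which are awkward to write inline. The sketch below is a made-up example for bash (kv_re and parse_kv are hypothetical names) that splits a "key = value" line and reads the captured groups from BASH_REMATCH.

```shell
#!/usr/bin/env bash
# POSIX ERE kept in a variable; single quotes protect it from the shell.
kv_re='^([[:alnum:]_]+)[ ]*=[ ]*(.*)$'

# parse_kv LINE: print the key and the value on separate lines,
# or return 1 if LINE is not of the form "key = value".
parse_kv() {
  [[ $1 =~ $kv_re ]] || return 1      # $kv_re is deliberately unquoted
  printf '%s\n%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
}

parse_kv 'name = Bob (x)'
parse_kv 'no equals sign here' || echo "not a key/value line"
```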
Best Answer
bash was initially designed in the late 80s as a partial clone of ksh with some interactive features from csh/tcsh. The origins of globbing have to be found in those earlier shells which it builds upon.
ksh itself is an extension of the Bourne shell. The Bourne shell itself (first released in 1979 in Unix V7) was a clean implementation from scratch, but it did not depart completely from the Thompson shell (the shell of V1 -> V6) and incorporated features from the Mashey shell.
In particular, command arguments were still separated by blanks, | was now the new pipe operator but ^ was still supported as an alternative (and also explains why you do [!a-z] and not [^a-z]), $1 was still the first argument to a script and backslash was still the escape character. So many of the regexp operators (^\|$) have a special meaning of their own in the shell.
The Thompson shell relied on an external utility for globbing. When sh found unquoted *, [ or ?s in the command, it would run the command through glob.
would end up running glob as:
and glob would end up running rm with the list of files matching that pattern.
would run glob as:
The * above has been quoted by setting the 8th bit on that character, preventing glob from treating it as a wildcard. glob would then remove that bit before calling grep.
To do the equivalent with regexps, that would have been:
Or:
to exclude dot-files.
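The division of labour between sh and glob can be imitated in bash. This is only a sketch: glob_run is a made-up name, the 8th-bit quoting is left out, and the caller quotes the pattern so that the function, rather than the calling shell, expands it. Patterns that match nothing are passed through unchanged, which is what bash does by default.

```shell
#!/usr/bin/env bash
# glob_run COMMAND [ARG...]
# Hypothetical stand-in for an external glob utility: expand wildcards in
# each ARG, then run COMMAND with the resulting list.
glob_run() {
  local cmd=$1 arg
  local -a argv=()
  shift
  local IFS=            # empty IFS: unquoted $arg is globbed but not word-split
  for arg in "$@"; do
    argv+=( $arg )      # pathname expansion happens here
  done
  "$cmd" "${argv[@]}"
}

# The single quotes stop the calling shell from expanding the pattern itself.
glob_run echo '*.txt'
```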
The need to escape the operators, as they double as shell special characters, and the fact that ., common in filenames, is a regexp operator make regexps not very appropriate for matching filenames and complicated for a beginner. In most cases, all you need is wildcards that can replace either one (?) or any number (*) of characters.
Now, different shells added different globbing operators. Nowadays, the ksh and zsh globs (and to some extent bash -O extglob, which implements a subset of ksh globs) are functionally equivalent to regexps, with a syntax that is less cumbersome to use with filenames and the current shell syntax. For instance, in zsh (with the extendedglob option), you can do:
if you want (unlikely) to match filenames that consist of sequences of a followed by .txt. Easier than echo (^a*\.txt$) (here using parentheses as a way to isolate the regex operators from the shell operators, which could have been one way shells could deal with it).
For mpg files (case insensitive) whose basename is foo, bar or a decimal number from 1 to 20...
ksh93 now can also incorporate regexps (basic, extended, perl-like or "augmented") in its globs (though it's quite buggy) and even provides a tool to convert between glob and regexp (printf %R, printf %P):
to match (non-hidden) txt files with Extended regular expressions, case-insensitively.
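For comparison, the mpg case (basename foo, bar or a decimal number from 1 to 20, case insensitive) can be approximated with bash's extglob operators. This is only a sketch: the 1 to 20 range is spelled out by hand, the pattern is kept in a variable (pat and is_wanted are names chosen for this example), and it is tested against strings rather than expanded against files.

```shell
#!/usr/bin/env bash
# foo, bar or 1..20, followed by .mpg; the numeric range is written out by hand.
pat='@(foo|bar|[1-9]|1[0-9]|20).mpg'

# is_wanted NAME (hypothetical helper). The body is a subshell, so the
# shopt settings do not affect the calling shell.
is_wanted() (
  shopt -s extglob nocasematch
  [[ $1 == $pat ]]      # unquoted: $pat is used as a pattern
)

for f in FOO.MPG bar.mpg 7.mpg 20.Mpg 21.mpg foo.mp3; do
  if is_wanted "$f"; then
    echo "$f: match"
  else
    echo "$f: no match"
  fi
done
```

To expand the same pattern against real files, the equivalent settings would be extglob together with nocaseglob.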