Updated 18-Nov-2016 (grep's behavior has changed: grep with the -P option no longer supports the ^ and $ anchors [on Ubuntu 16.04 with kernel v4.4.0-21-generic]) (wrong (non-)fix):
$ grep -Pzo "begin(.|\n)*\nend" file
begin
Some text goes here.
end
Note: for the other commands, just replace the '^' and '$' anchors with the newline character '\n'.
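To try the updated command without touching real data, here is a minimal sketch. The sample file contents and the variable names sample and match are made up for illustration, and it assumes GNU grep built with PCRE support (-P). Because -z makes grep end each printed match with a NUL byte, the sketch strips that byte with tr before storing the result.

```shell
# Hypothetical sample input: one begin/end block surrounded by other lines.
sample=$(mktemp)
printf 'foo\nbegin\nSome text goes here.\nend\nbar\n' > "$sample"

# -z reads the whole file as one record, so \n can be matched explicitly.
# Single quotes keep the shell from touching the pattern.
match=$(grep -Pzo 'begin(.|\n)*\nend' "$sample" | tr -d '\0')

printf '%s\n' "$match"
rm -f "$sample"
```

The pattern is written in single quotes here, so no shell escaping is needed.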
______________________________
With grep command:
grep -Pzo "^begin\$(.|\n)*^end$" file
If you don't want to include the patterns "begin" and "end" in the result, use grep with lookbehind and lookahead assertions.
grep -Pzo "(?<=^begin$\n)(.|\n)*(?=\n^end$)" file
You can also use the \K notation instead of a lookbehind assertion.
grep -Pzo "^begin$\n\K(.|\n)*(?=\n^end$)" file
\K discards everything matched before it from the reported match, so the text preceding \K is not printed.
\n is included in the pattern to avoid printing empty lines in the output.
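A small sketch comparing the lookbehind and \K forms, written with \n in place of the ^ and $ anchors, as the update at the top advises. The sample file and variable names are invented for the example, and GNU grep with PCRE support is assumed.

```shell
sample=$(mktemp)
printf 'foo\nbegin\nSome text goes here.\nend\nbar\n' > "$sample"

# Lookbehind: the match must be preceded by "begin" plus a newline.
with_lookbehind=$(grep -Pzo '(?<=begin\n)(.|\n)*(?=\nend)' "$sample" | tr -d '\0')

# \K: "begin\n" is matched, then dropped from the reported match.
with_k=$(grep -Pzo 'begin\n\K(.|\n)*(?=\nend)' "$sample" | tr -d '\0')

printf '%s\n' "$with_lookbehind" "$with_k"
rm -f "$sample"
```

Both variables should hold only the text between the two marker lines.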
Or, as @AvinashRaj suggests, there are simpler grep commands, as follows:
grep -Pzo "(?s)^begin$.*?^end$" file
grep -Pzo "^begin\$[\s\S]*?^end$" file
(?s) tells grep to allow the dot to match newline characters.
[\s\S] matches any character that is either whitespace or non-whitespace.
The corresponding commands that exclude "begin" and "end" from the output are as follows:
grep -Pzo "^begin$\n\K[\s\S]*?(?=\n^end$)" file # or grep -Pzo "(?<=^begin$\n)[\s\S]*?(?=\n^end$)"
grep -Pzo "(?s)(?<=^begin$\n).*?(?=\n^end$)" file
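One practical difference between (.|\n)* and the lazy .*? or [\s\S]*? forms shows up when a file holds more than one block: the greedy pattern runs from the first begin to the last end, while the lazy ones stop at the nearest end. The sketch below illustrates this under stated assumptions: GNU grep with PCRE support, made-up sample data and variable names, and \n used instead of the ^ and $ anchors. It counts matches by counting the NUL terminators that -z -o emits.

```shell
sample=$(mktemp)
printf 'begin\nA\nend\nmid\nbegin\nB\nend\n' > "$sample"

# Each match printed by -z -o ends with one NUL byte, so counting NULs counts matches.
greedy_count=$(grep -Pzo 'begin(.|\n)*\nend' "$sample" | tr -cd '\0' | wc -c)
lazy_dotall_count=$(grep -Pzo '(?s)begin.*?\nend' "$sample" | tr -cd '\0' | wc -c)
lazy_class_count=$(grep -Pzo 'begin[\s\S]*?\nend' "$sample" | tr -cd '\0' | wc -c)

printf 'greedy=%s lazy_dotall=%s lazy_class=%s\n' \
  "$greedy_count" "$lazy_dotall_count" "$lazy_class_count"
rm -f "$sample"
```

With this sample, the greedy pattern yields a single match spanning both blocks, and each lazy pattern yields one match per block.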
See the full test of all commands here (outdated, since the behavior of grep with the -P option has changed).
Note:
^ marks the beginning of a line and $ marks the end of a line. They are added around "begin" and "end" so that these match only when they are alone on a line.
In two commands I escaped $ because it is also used for command substitution ($(command)), which replaces the command with its output.
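The escaping point can be checked without running grep at all. In the sketch below (variable names are illustrative), the double-quoted pattern needs \$ so the shell does not read $( as the start of a command substitution, while the single-quoted version needs no escaping. Both end up as the same string.

```shell
# Double quotes: \$ yields a literal $, and \n is left alone for grep to interpret.
pat_double="^begin\$(.|\n)*^end$"

# Single quotes: the shell passes everything through untouched.
pat_single='^begin$(.|\n)*^end$'

printf '%s\n' "$pat_double" "$pat_single"
```

Writing patterns in single quotes is usually the simpler choice, unless the pattern must include a shell variable.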
From man grep:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
-P, --perl-regexp
Interpret PATTERN as a Perl compatible regular expression (PCRE)
-z, --null-data
Treat the input as a set of lines, each terminated by a zero byte (the ASCII
NUL character) instead of a newline. Like the -Z or --null option, this option
can be used with commands like sort -z to process arbitrary file names.
Try sed with the following regex:
$ sed -i.bak 's_\(.*\),[[:blank:]]\([[:alpha:]]\+,[[:blank:]][[:alpha:]]\+[[:blank:]][[:digit:]]\+,[^,]\+$\)_\2 \1_' file.txt
Friday, Mar 13,2015 16:59:42 blah, blah, blah
Friday, Mar 13,2015 16:51:11 yadi, yadi, yada
Here we have used sed's group substitution method to get the desired output.
\(.*\) will match up to blah, blah, blah, as we have ,[[:blank:]] to match the comma and blank after it.
\([[:alpha:]]\+,[[:blank:]][[:alpha:]]\+[[:blank:]][[:digit:]]\+,[^,]\+$\) will match the remaining portion of the line (the portion we want to put at the start).
Then we have \2 \1 to put the second group first, then a space, and then the first group.
The original file will be backed up as file.txt.bak; if you don't want that, use just -i instead of -i.bak.
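To see the substitution on its own before editing a file in place, the sketch below pipes one line through the same expression without -i. The input line is an assumption, reconstructed from the output shown above (message first, date last), and \+ in a basic regular expression is a GNU sed extension.

```shell
# Assumed input: message first, date last.
line='blah, blah, blah, Friday, Mar 13,2015 16:59:42'

# \1 = message, \2 = date; the replacement prints them in swapped order.
swapped=$(printf '%s\n' "$line" |
  sed 's_\(.*\),[[:blank:]]\([[:alpha:]]\+,[[:blank:]][[:alpha:]]\+[[:blank:]][[:digit:]]\+,[^,]\+$\)_\2 \1_')

printf '%s\n' "$swapped"
```

The underscore serves as the s command delimiter, so the commas in the pattern need no escaping.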
Although you will get the desired output, using regex/sed is not the optimal solution in this case.
EDIT: If you have a line like [Internet disconnected] Friday, Mar 13,2015 15:48:34, try this:
$ sed -i.bak 's_\(.*[^,]\),*[[:blank:]]\([[:alpha:]]\+,[[:blank:]][[:alpha:]]\+[[:blank:]][[:digit:]]\+,[^,]\+$\)_\2 \1_' file.txt
Friday, Mar 13,2015 15:48:34 [Internet disconnected]
Friday, Mar 13,2015 16:59:42 blah, blah, blah
Friday, Mar 13,2015 16:51:11 yadi, yadi, yada
In the previous regex we had \(.*\),[[:blank:]] (a comma and a whitespace character after the first matching group). To handle the new line as well, we have made the first matching group \(.*[^,]\) to ensure that it does not end with a comma, and then we match ,* i.e. zero or more commas. So, the new sed command will work for all the mentioned cases.
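A quick check of the revised expression against both line shapes, again without -i. The two input lines are assumptions inferred from the outputs above, the helper name swap_date is made up, and GNU sed is assumed for \+.

```shell
# Hypothetical helper wrapping the revised expression.
swap_date() {
  sed 's_\(.*[^,]\),*[[:blank:]]\([[:alpha:]]\+,[[:blank:]][[:alpha:]]\+[[:blank:]][[:digit:]]\+,[^,]\+$\)_\2 \1_'
}

# No comma between the message and the date.
no_comma=$(printf '%s\n' '[Internet disconnected] Friday, Mar 13,2015 15:48:34' | swap_date)

# Comma between the message and the date, as in the earlier lines.
with_comma=$(printf '%s\n' 'blah, blah, blah, Friday, Mar 13,2015 16:59:42' | swap_date)

printf '%s\n' "$no_comma" "$with_comma"
```

Because ,* also matches zero commas, the same expression covers both inputs.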
Best Answer
The symbol for the beginning of a line is ^. So, to print all lines whose first character is a (, you would want to match ^( with grep or sed.
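The grep and sed commands themselves are not shown here, so the following is only a sketch of what such commands could look like, not necessarily the author's exact ones. The sample file and variable names are made up. In a basic regular expression an unescaped ( is a literal character, so ^( needs no backslash.

```shell
sample=$(mktemp)
printf '(first\nsecond (not at start)\n(third\n' > "$sample"

# grep prints the lines that match the pattern.
grep_out=$(grep '^(' "$sample")

# sed -n suppresses the default output; /^(/p prints only matching lines.
sed_out=$(sed -n '/^(/p' "$sample")

printf '%s\n' "$grep_out" "$sed_out"
rm -f "$sample"
```

With grep -E or sed -E the parenthesis becomes a grouping operator and would have to be written as \( instead.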