Any way to have a “verbose mode” or “debug mode” with sed

debugging, gnu, sed

Is there a way to make GNU sed verbose about what it runs and what it does?
I'd like something like a "debug mode" so that I can see, for each line of input, the contents of the hold space and pattern space before and after the script runs, and so on.

Best Answer

As fra-san mentioned, GNU sed introduced a --debug option in version 4.6, which does pretty much what you’re looking for. For example, if you run:

printf '%s\n' one two  | sed --debug 'H;1h;$x;$s/\n/_/g'

the output is

SED PROGRAM:
  H
  1 h
  $ x
  $ s/\n/_/g
INPUT:   'STDIN' line 1
PATTERN: one
COMMAND: H
HOLD:    \none
COMMAND: 1 h
HOLD:    one
COMMAND: $ x
COMMAND: $ s/\n/_/g
END-OF-CYCLE:
one
INPUT:   'STDIN' line 2
PATTERN: two
COMMAND: H
HOLD:    one\ntwo
COMMAND: 1 h
COMMAND: $ x
PATTERN: one\ntwo
HOLD:    two
COMMAND: $ s/\n/_/g
MATCHED REGEX REGISTERS
  regex[0] = 3-4 '
'
PATTERN: one_two
END-OF-CYCLE:
one_two

I don’t know what distribution you use, but this version of sed (or a later one) is available in Debian 10, in Ubuntu 19.04, and derivatives; it will be available in Fedora 33.
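On a system whose sed predates 4.6, a rough substitute is the standard l command, which prints the pattern space unambiguously (embedded newlines appear as \n and the end is marked with $). The sketch below only approximates --debug: it appends l;x;l;x to the script from the example above, so that each cycle dumps the pattern space and then the hold space after the script has run. The variable name trace is purely illustrative.

```shell
# Approximate --debug with the POSIX 'l' command (a sketch, not a full replacement).
# -n suppresses the normal output so that only the 'l' dumps are shown.
# After the original script: 'l' dumps the pattern space, 'x' swaps in the
# hold space, the second 'l' dumps it, and the final 'x' swaps back.
trace=$(printf '%s\n' one two | sed -n 'H;1h;$x;$s/\n/_/g;l;x;l;x')
printf '%s\n' "$trace"
```

For each input line the first line printed is the pattern space and the second is the hold space: one$ and one$ for line 1, then one_two$ and two$ for line 2.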

Related Solutions

Sed – Replacing Lines Containing a Pattern

OUTPUT

NAME_A  12,1
NAME_B  21,2

That addresses lines beginning with a letter, pulls in the next if there is one, and substitutes a tab character for the newline.

Note that the s/\n/<tab>/ part contains a literal tab character here, though some seds also support the \t escape in its place.
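One way to avoid typing a literal tab is to let printf produce it and splice it into the script. The command this answer describes is not shown above, so the sed script below is only a reconstruction from that description (address lines starting with a letter, pull in the next line if there is one, replace the newline with a tab); the variable names tab and joined are illustrative.

```shell
# Build a tab portably instead of relying on \t support in sed.
tab=$(printf '\t')

# Assumed reconstruction of the described command: on lines that start with
# a letter, append the next line (unless this is the last line) and replace
# the embedded newline with the tab held in $tab.
joined=$(printf '%s\n' NAME_A 12,1 NAME_B 21,2 |
    sed '/^[[:alpha:]]/{$!N;s/\n/'"$tab"'/;}')
printf '%s\n' "$joined"
```

The quoting closes the single-quoted script just long enough to insert "$tab", so the rest of the script stays protected from the shell.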

To handle the case where several name lines occur in a row, you need to make it a little more robust, like this:

sed '$!N;/^[[:alpha:]].*\n[^[:alpha:]]/s/\n/    /;P;D' <<\DATA
NAME_A
NAME_B
12,1
NAME_C
21,2
DATA

OUTPUT

NAME_A
NAME_B  12,1
NAME_C  21,2

That slides through a data set always one line ahead. If two ^[[:alpha:]] lines occur one after the other, it does not mistakenly replace the newline, as you can see.
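To see the one-line look-ahead in isolation, the following sketch drops the substitution and dumps the pattern space with l on every cycle. The input lines a, b, c and the variable name window are made up for illustration.

```shell
# Show the two-line window that $!N ... D maintains.
# $!N appends the next line (except on the last line), 'l' dumps the window
# with the embedded newline shown as \n, and D deletes the first line of the
# window and restarts the cycle without reading fresh input.
window=$(printf '%s\n' a b c | sed -n '$!N;l;D')
printf '%s\n' "$window"
```

Each dump holds the current line and the one after it, so the window advances by one line per cycle; on the last line only a single line remains.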

Remove hyphenation with sed

This is something of a monster; it would probably be easier with Perl.

cat file
ba bla bla hyphe-</page>
<page>nated bla bla bla
and the output should look like

bla bla bla</page>
<page>hyphenated bla bla bla

This uses GNU sed, where -r enables extended regular expressions; some other sed implementations use -E for that instead (GNU sed accepts -E as well).

sed -nr '/[[:alpha:]]+-<\/[[:alpha:]]+>$/{
N
s!([[:alpha:]]+)-(</[[:alpha:]]+>)\n(<[[:alpha:]]+>)([[:alpha:]]+)!\2\n\3\1\4!}
p' file
ba bla bla </page>
<page>hyphenated bla bla bla
and the output should look like

bla bla bla</page>
<page>hyphenated bla bla bla
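As a variation on the command above, the sketch below uses -E instead of -r and captures the newline inside a group rather than writing \n in the replacement, since \n on the replacement side is not supported by every sed. The function name dehyphenate is hypothetical, and the script relies on sed's default printing instead of -n with an explicit p.

```shell
# dehyphenate: hypothetical wrapper around the sed script above.
# Group 3 captures the newline together with the opening tag, so the
# replacement can reuse it without needing \n on the right-hand side.
dehyphenate() {
    sed -E '/[[:alpha:]]+-<\/[[:alpha:]]+>$/{
N
s!([[:alpha:]]+)-(</[[:alpha:]]+>)(\n<[[:alpha:]]+>)([[:alpha:]]+)!\2\3\1\4!
}' "$@"
}

# Usage: feed the two sample lines on standard input.
printf '%s\n' 'ba bla bla hyphe-</page>' '<page>nated bla bla bla' | dehyphenate
```

As with the original command, the space that preceded the hyphenated word stays in front of the closing tag.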
