Patterns and file processing

awk

Let's say I have to perform these actions on an input file:

  • extract the nth field from a line starting with a given pattern (in the example: the 2nd field of the line starting with the pattern 'name')

  • print the field content at the beginning of every following line, while the line does not start with the selected pattern

  • when a new line matching the pattern is found, repeat steps 1 and 2

I'm currently doing this in Python, but I would prefer something light and fast from the command line (like awk, for example).

Sample input

name    NAME_A
inf     field_A1
name    NAME_B 
inf field_B1
inf field_B2

Expected output:

name    NAME_A
NAME_A  inf field_A1
name    NAME_B 
NAME_B  inf field_B1
NAME_B  inf field_B2

Best Answer

This is one way to do it. Note that the output format may vary depending on the field separators you use - you can set them with FS and OFS:

$ awk -v n=2 '/^name/ {a=$(n); print; next} {print a, $0}' file
name    NAME_A
NAME_A inf  field_A1
name    NAME_B 
NAME_B inf  field_B1
NAME_B inf  field_B2

Explanation

  • -v n=2: defines the number of the field to copy when the pattern is found.
  • /^name/ {a=$(n); print; next}: if the line starts with the given pattern, store the given field in a, print the line unchanged, and skip to the next input line (next), so the second block does not run for it.
  • {print a, $0}: otherwise, print the current line with the stored value first, separated by OFS.
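
To illustrate the remark about separators, here is a minimal sketch that sets OFS to a tab so the stored value is joined to the line with a tab instead of a single space. The tab-separated sample data and the temporary file are assumptions made for the demonstration; the original input's exact whitespace is not known.

```shell
# Recreate the sample input (tab-separated here; this whitespace is an assumption).
input=$(mktemp)
printf 'name\tNAME_A\ninf\tfield_A1\nname\tNAME_B\ninf\tfield_B1\ninf\tfield_B2\n' > "$input"

# OFS is what "print a, $0" places between the stored field and the line.
# A bare "print" emits $0 untouched, so the name lines keep their original spacing.
result=$(awk -v n=2 -v OFS='\t' '/^name/ {a=$(n); print; next} {print a, $0}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

FS works the same way on the input side: if the fields were separated by something other than blanks (say, commas), adding -F',' would be needed for $(n) to pick the right field.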

You can generalize the pattern part into something like:

awk -v n=2 -v pat="name" '$1==pat {a=$(n); print; next} {print a, $0}' file
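
One thing to keep in mind: the two forms are not strictly equivalent. /^name/ is a regex that matches any line beginning with "name", while $1==pat requires the first field to be exactly "name". The sketch below shows the difference on a small made-up input (the "namespace" line is hypothetical and not part of the question's data):

```shell
# Hypothetical input containing a line whose first field merely starts with "name".
input=$(mktemp)
printf 'name NAME_A\nnamespace ns_1\ninf field_A1\n' > "$input"

# Regex prefix match: "namespace" also starts with "name", so it is treated as a header line.
regex_out=$(awk -v n=2 '/^name/ {a=$(n); print; next} {print a, $0}' "$input")

# Exact comparison on the first field: only "name" counts as a header line.
exact_out=$(awk -v n=2 -v pat="name" '$1==pat {a=$(n); print; next} {print a, $0}' "$input")

printf 'regex:\n%s\n\nexact:\n%s\n' "$regex_out" "$exact_out"
rm -f "$input"
```

If prefix matching with a variable pattern is what you want, index($0, pat) == 1 tests whether the line begins with the literal string in pat.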