Text Processing – Adding a String to Every Column Except the First Using awk or sed

awk, sed, text processing

I have a file with multiple lines/rows, and each line contains a variable number of columns:

Name1 String111 String112
Name2 String121 String122 String123
Name3 String131 String132 String133 String134

And so on and so forth (no pattern as to what line has how many entries).
I would like to add the name in the first column to the beginning of every other column in that line/row, so that I end up with:

Name1 Name1String111 Name1String112
Name2 Name2String121 Name2String122 Name2String123
Name3 Name3String131 Name3String132 Name3String133 Name3String134

We can start it simple and get more complicated:

How to add a string such as "Test" to the beginning of every column?

How to add the value in column 1 to every column in that row, including column 1?

How to add the value in column 1 to every column in that row, not including column 1?

My best guesses:

I do not know how to refer to "every column", and I do not know how to make the command access the current column, so I can only add a string or the value in column 1 to a single other column:

awk -F'\t' -vOFS='\t' '{ !$1 = "hello" $2}' 
awk -F'\t' -vOFS='\t' '{ !$1 = $1 $2}'

Is there a good resource on where I can learn this syntax?

Best Answer

Just iterate over all fields starting with the second, and concatenate the first field to whatever you already have:

$ awk '{ for(i=2;i<=NF;i++){ $i = $1$i }}1' file
Name1 Name1String111 Name1String112
Name2 Name2String121 Name2String122 Name2String123
Name3 Name3String131 Name3String132 Name3String133 Name3String134

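The 1 at the end is awk shorthand for "print the current line": 1 is a pattern that is always true, and a pattern with no action runs the default action, which is print. You could write the same thing like this: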

$ awk '{ for(i=2;i<=NF;i++){ $i = $1$i }; print}' file
Name1 Name1String111 Name1String112
Name2 Name2String121 Name2String122 Name2String123
Name3 Name3String131 Name3String132 Name3String133 Name3String134
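Assigning to any field makes awk rebuild the line with the output field separator (OFS, a single space by default), and the default input splitting is on runs of blanks. The question's own attempts used tab-separated fields, so here is a minimal sketch for that case. It assumes tab-delimited input, and the helper name prefix_rest_tsv is made up for illustration:

```shell
# Hypothetical helper: prefix fields 2..NF with field 1 in tab-separated input.
# -F'\t' splits on tabs only; -v OFS='\t' rejoins the rebuilt line with tabs.
prefix_rest_tsv() {
    awk -F'\t' -v OFS='\t' '{ for (i = 2; i <= NF; i++) $i = $1 $i } 1'
}

# The second field contains a space, which survives because only tabs split.
printf 'Name1\tString 111\tString112\n' | prefix_rest_tsv
```

Without the OFS setting, the rewritten line would come out joined by single spaces even though the input used tabs.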

The basic idea above can be trivially expanded to match all of your examples. NF is the special awk variable that holds the number of fields in the current line. awk also lets you refer to a field through a variable: if you set i=5, then $i is equivalent to $5. This lets you iterate over the fields with for(i=2;i<=NF;i++) { }, which sets i to every number from 2 up to the number of fields on the line.
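As a sketch of how the loop might be adapted to the three cases listed in the question (the function names are hypothetical, chosen only for this illustration):

```shell
# 1. Prefix every column with a fixed string passed as the first argument.
prefix_fixed() {
    awk -v s="$1" '{ for (i = 1; i <= NF; i++) $i = s $i } 1'
}

# 2. Prefix every column, including column 1, with the original column 1.
#    The value is saved first: once the loop overwrites $1, later fields
#    would otherwise receive the already-doubled name.
prefix_all() {
    awk '{ first = $1; for (i = 1; i <= NF; i++) $i = first $i } 1'
}

# 3. Prefix every column except column 1 (the loop starts at 2).
prefix_rest() {
    awk '{ for (i = 2; i <= NF; i++) $i = $1 $i } 1'
}

printf 'Name1 String111 String112\n' | prefix_fixed Test
printf 'Name1 String111 String112\n' | prefix_all
printf 'Name1 String111 String112\n' | prefix_rest
```

The only differences between the three are the loop's starting index and where the prefix comes from.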

Related Solutions

text-processing – Easiest Way to Add a String at the Beginning of Every Line from Command Line

You can use sed:

sed -i 's/^/your_string /' your_file

Thanks to Stephane and Marco's comments, note that the -i option isn't POSIX. A POSIX way to do the above would be

sed 's/^/your_string /' your_file > tmp_copy && mv tmp_copy your_file

or perl:

perl -pi -e 's/^/your_string /' your_file

Explanation

The sed and perl commands both perform a regex substitution, replacing the beginning of each line (^) with your desired string. The -i switch (sed -i, perl -pi) makes sure the file is edited in place (i.e. the changes are reflected in the file instead of printed to stdout).

sed should be available on any POSIX-compliant OS and perl should be available on most modern Unices except perhaps for the ones that have gone through the effort of removing it.
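To try the substitution without touching a real file, something like the following sketch could be used. It assumes mktemp is available for creating a scratch file, and the sample lines are invented for the demonstration:

```shell
# Scratch file with made-up content, so no real file is modified.
demo=$(mktemp)
printf 'first line\nsecond line\n' > "$demo"

# Preview: without -i, sed prints the result and leaves the file unchanged.
sed 's/^/your_string /' "$demo"

# Portable update: write to a temporary copy, and replace the original
# only if sed succeeded.
sed 's/^/your_string /' "$demo" > "$demo.tmp" && mv "$demo.tmp" "$demo"
cat "$demo"
```

Running the preview first is a cheap way to check the pattern before committing the change.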

Python – How to select rows with minimum value in each group based on first column as ID

One approach would be to sort in ascending order, then note the first col2 value for each col1 and print if the current col2 value is equal to it:

sort -k1,1n -k2,2g file | awk '!($1 in a) {a[$1] = $2} $2 == a[$1]'
1   7.8e-12
1   7.8e-12
2   9.3e-13
3   3.0e-11
3   3.0e-11
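The -g option of sort is not specified by POSIX. Where it is unavailable, a single-pass awk alternative (a different technique from the pipeline above, not the answer's own method) could look like the sketch below. The function name min_rows and the sample input are hypothetical:

```shell
# Hypothetical helper: print every row whose column 2 equals the minimum
# for its column 1 group. Groups appear in order of first appearance.
min_rows() {
    awk '
    {
        key = $1; val = $2 + 0             # "+ 0" forces a numeric comparison
        if (!(key in min)) {               # first row seen for this group
            order[++n] = key
            min[key] = val; rows[key] = $0
        } else if (val < min[key]) {       # new minimum: restart the list
            min[key] = val; rows[key] = $0
        } else if (val == min[key]) {      # tie: keep this row as well
            rows[key] = rows[key] "\n" $0
        }
    }
    END { for (k = 1; k <= n; k++) print rows[order[k]] }
    '
}

# Made-up input with one tie in group 1 and one in group 3.
printf '%s\n' '1 7.8e-12' '1 2.1e-05' '1 7.8e-12' \
              '2 4.0e-10' '2 9.3e-13' \
              '3 3.0e-11' '3 3.0e-11' '3 1.2e-03' | min_rows
```

This version keeps all rows of a group in memory until the end, so the sort-based pipeline may be preferable for very large inputs.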
