I want to transform the short repeated words in columns into numbers.
In the following example I want to replace the words (with ONLY 2 LETTERS) in column 3 with numbers, so that AA is changed to 2, AB or BA to 1, and BB to 0.
The first and second columns may also contain AA, BB, AB and BA. These should not be changed.
Columns are separated by a single space (" ").
Id_animal Id_SNP Allele
ID01 rs01 AB
ID02 rs01 BA
ID03 rs01 AA
ID04 rs01 BB
The desired output is:
Id_animal Id_SNP Allele
ID01 rs01 1
ID02 rs01 1
ID03 rs01 2
ID04 rs01 0
Best Answer
The sed options and expressions work as follows:
-i.bak : edit the file in place and keep a backup of the original file as input.bak
-r : use extended regex syntax
s/ AA$/ 2/ : replace a trailing ' AA' with ' 2'
(AB|BA) : match either AB or BA
; : separates the individual substitute operations
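
Putting the pieces described above together, the full command might look like the sketch below. It is a reconstruction from the explanation, not necessarily the exact command of the original answer. It assumes GNU sed (on BSD/macOS sed, -E would be needed instead of -r) and a data file named input, a name taken from the backup name input.bak mentioned above. The temporary directory and the sample file are only there to make the example self-contained.

```shell
# Work in a scratch directory so no existing files are touched (assumption for the demo).
cd "$(mktemp -d)"

# Sample data copied from the question; the file name "input" is assumed.
cat > input <<'EOF'
Id_animal Id_SNP Allele
ID01 rs01 AB
ID02 rs01 BA
ID03 rs01 AA
ID04 rs01 BB
EOF

# -i.bak : edit in place, keeping the original as input.bak
# -r     : extended regex, so (AB|BA) needs no backslashes
# The leading space and the trailing $ restrict each match to the last column.
sed -i.bak -r 's/ AA$/ 2/; s/ (AB|BA)$/ 1/; s/ BB$/ 0/' input

# Show the converted file.
cat input
```

Because every pattern is anchored with $ and starts with a space, an AA, AB, BA or BB appearing in the first or second column is left alone; only a two-letter code that forms the whole last field is replaced.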