The aim is to add leading zeros until the part of every line before the comma consists of nine characters, and then to insert a separator character after every three digits, using sed.
Input
12345,1s4c3v6s3nh6
123456789,9h5vgbdx34dc
12,7h4f45dcvbgh
1234567,09klijnmh563
Current outcome
[vagrant@localhost ~]$ sed -e 's/\([0-9]\{3\}\),/\/\1\//g' file
12/345/1s4c3v6s3nh6
123456/789/9h5vgbdx34dc
12,7h4f45dcvbgh
1234/567/09klijnmh563
Expected outcome
000/012/345,1s4c3v6s3nh6
123/456/789,9h5vgbdx34dc
000/000/012,7h4f45dcvbgh
001/234/567,09klijnmh563
Note:
- 12345 needs to become 000012345 and 12 should result in 000000012. In short, the emphasis is on the number sequence before the comma.
- The format of the lines is always MAX_9_characters,fixed_12_characters. For example, a line such as 1234512345,1s4c3v6s3nh6 will never appear in the input file.
The problem is that I could not pad the numbers to equal length using sed. How can this be accomplished?
Best Answer
If your input doesn't contain long digit sequences in the second field, try:
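A sketch of the full command, assembled from the individual substitutions explained below. The scratch file name input.txt and the wrapper function pad_and_group are illustrative assumptions, not part of the original answer; the sample lines are the ones from the question.

```shell
# Sample input from the question, written to a scratch file
# (the name input.txt is an assumption made for this sketch).
printf '%s\n' \
  '12345,1s4c3v6s3nh6' \
  '123456789,9h5vgbdx34dc' \
  '12,7h4f45dcvbgh' \
  '1234567,09klijnmh563' > input.txt

# Hypothetical wrapper: pad the first field to nine digits, then group by three.
pad_and_group() {
  # 1. prefix the first field with a marker '#' and nine zeros
  # 2. keep only the last nine characters before the first comma
  # 3. append '/' after every run of three digits
  # 4. drop the '/' that ended up directly before the comma
  # 5. drop a '/' left at the end of the line
  sed -e 's|^[^,]*|#000000000&|' \
      -e 's|#[^,]*\(.\{9\}\),|\1,|' \
      -e 's|\([0-9]\{3\}\)|\1/|g' \
      -e 's|/\([^0-9]\)|\1|' \
      -e 's|/$||' "$1"
}

pad_and_group input.txt
```

On the four sample lines this should reproduce the expected outcome from the question. Because the third substitution is global, a run of three or more digits inside the second field also receives a /, and only one at the very end of the line is cleaned up afterwards, which is why the caveat above matters.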
Explanation
- s|^[^,]*|#000000000&| : match everything from the start of the line up to the first comma and replace it with a marker #, followed by n zeros (where n is the width we want to pad to, here 9) and the matched text itself.
- s|#[^,]*\(.\{9\}\),|\1,| : match everything from the marker to the first comma, keep only the last 9 characters before the comma, and discard the rest.
- s|\([0-9]\{3\}\)|\1/|g : add a / after each sequence of 3 digits.
- s|/\([^0-9]\)|\1|;s|/$|| : if a / is followed by a non-digit, or is at the end of the line, remove it.
Or, more easily, with perl: