Cut with 2 character delimiter

awk cut

I wanted to use cut with a two-character delimiter to process a file with many lines like this:

1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0

But cut only accepts a single-character delimiter.

Instead of cut -d'..', I'm trying awk -F'..' "{echo $1}", but it's not working.

My script:

wget -O output.txt http://www.unicode.org/Public/emoji/6.0/emoji-data.txt
sed -i '/^#/ d' output.txt                        # Remove comments
cat output.txt | cut -d' ' -f1 | while read line ;
  do echo $line | awk -F'..' "{echo $1}"
done

Best Answer

Sample test script that works for me:

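#!/bin/sh

raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

# $raw is deliberately unquoted so the shell splits it into one word per line
for r in $raw
do
    # With a single '.' as the delimiter, ".." produces an empty second field
    f1=$(printf '%s\n' "${r}" | cut -d'.' -f1)
    f2=$(printf '%s\n' "${r}" | cut -d'.' -f2)
    f3=$(printf '%s\n' "${r}" | cut -d'.' -f3)
    echo "field 1:[${f1}] field 2:[${f2}] field 3:[${f3}]"
done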

And the output is:

field 1:[1F3C6] field 2:[] field 3:[1F3CA]
field 1:[1F3CF] field 2:[] field 3:[1F3D3]
field 1:[1F3E0] field 2:[] field 3:[1F3F0]
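If the goal is only to get the two ends of each range into variables, the loop above can probably be written without calling cut at all, using POSIX parameter expansion. This is a minimal sketch under that assumption; split_range, first and last are hypothetical names chosen for illustration and do not come from the question.

```shell
#!/bin/sh

# split_range is a hypothetical helper used only in this sketch.
# It sets $first and $last from a string of the form START..END.
split_range() {
    first=${1%%..*}   # drop the longest suffix that begins with ".."
    last=${1#*..}     # drop the shortest prefix that ends with ".."
}

raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

# $raw is deliberately unquoted so the shell splits it into one word per line
for r in $raw
do
    split_range "$r"
    echo "first:[${first}] last:[${last}]"
done
```

In shell patterns a dot is a literal character, so ".." needs no escaping here. If a line contains no "..", both variables end up holding the whole line.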

Edit

After reading Stéphane Chazelas's comment and the linked Q&A, I rewrote the above to remove the loop.

I could not work out a way to remove the loop and still keep the parts as variables (for example, $f1, $f2 and $f3 in my original answer) that could be passed around. I still don't know what output the original question required.

First, still using cut:

#!/bin/sh
raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

printf '%s\n' "${raw}" | cut -d'.' -f1,3

Which will output:

1F3C6.1F3CA
1F3CF.1F3D3
1F3E0.1F3F0

The displayed . can be replaced with any string using GNU cut's --output-delimiter=STRING option.
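As a rough illustration of that option, the sketch below joins the two selected fields with ' to '. It assumes GNU cut, since --output-delimiter is a GNU coreutils extension and may be missing from other implementations; join_range is a hypothetical helper name used only for this example.

```shell
#!/bin/sh

# join_range is a hypothetical helper used only in this sketch.
# It reads START..END lines on stdin and prints START and END joined by $1.
# Assumes GNU cut: --output-delimiter is not part of POSIX cut.
join_range() {
    cut -d'.' -f1,3 --output-delimiter="$1"
}

raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

printf '%s\n' "${raw}" | join_range ' to '
```

With GNU cut, the first line should come out as 1F3C6 to 1F3CA.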

Next, with sed instead of cut in order to give more control of the output:

#!/bin/sh
raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

printf '%s\n' "${raw}" | sed 's/^\(.*\)\.\.\(.*\)$/field 1 [\1] field 2 [\2]/'

And this will render:

field 1 [1F3C6] field 2 [1F3CA]
field 1 [1F3CF] field 2 [1F3D3]
field 1 [1F3E0] field 2 [1F3F0]
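The awk attempt from the question can likely be made to work too. A field separator longer than one character is treated by awk as a regular expression, so -F'..' means "any two characters" rather than two literal dots. In addition, awk prints with print, not echo, and the double quotes around the program let the shell expand $1 before awk ever sees it. A minimal sketch with those three points changed follows; first_of_range is a hypothetical helper name used only here.

```shell
#!/bin/sh

# first_of_range is a hypothetical helper used only in this sketch.
# It reads START..END lines on stdin and prints START.
# [.][.] matches two literal dots; the single quotes keep $1 away from the shell.
first_of_range() {
    awk -F'[.][.]' '{ print $1 }'
}

raw="1F3C6..1F3CA
1F3CF..1F3D3
1F3E0..1F3F0"

printf '%s\n' "${raw}" | first_of_range
```

Because awk splits the whole input line by line, no shell loop is needed; $2 would give the end of each range in the same way.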