Sort by regular expressions


I've got a set of POSIX regular expressions*

^BEGIN:VCARD\r$
^VERSION[^A-Z]
^FN[^A-Z]
^N[^A-Z]
^NICKNAME[^A-Z]
^EMAIL[^A-Z]
^X-\([A-Z-]*\)
^TEL[^A-Z]
^ADR[^A-Z]
^ORG[^A-Z]
^TITLE[^A-Z]
^BDAY[^A-Z]
^URL[^A-Z]
^ROLE[^A-Z]
^NOTE[^A-Z]
^END:VCARD\r$

and a file with lines which each match one of the regular expressions:

BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;Ms;
URL:http://janedoe.com/
EMAIL:jdoe@example.org
EMAIL:jane.doe@janedoe.com
BDAY:1970-01-01
X-JABBER:jane.doe@example.org
X-ICQ:1234567890
END:VCARD

I'd like to sort these lines according to

  1. the line number of the regex match (so that lines starting with FN come before lines starting with N),
  2. the match group (so that X-ABC comes before X-DEF)

Ideally, the other parts of the lines should not be sorted (so the sequence of lines which start with EMAIL should be left alone). The expected result should therefore be:

BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;Ms;
EMAIL:jdoe@example.org
EMAIL:jane.doe@janedoe.com
X-ICQ:1234567890
X-JABBER:jane.doe@example.org
BDAY:1970-01-01
URL:http://janedoe.com/
END:VCARD

Is there an existing tool to do this?

Edit: Resulting implementation based on Lars Rohrbach's answer.

* This is the sequence of vCard properties in a Gmail contacts export file.

Best Answer

The usual sort command has no built-in way to specify a custom "dictionary" like yours, and while the grep command lets you provide a file of regular expressions, it won't change the order of the output. But you can get the effect with a simple for loop around grep. Here's an example that works in the bash shell:

for i in `cat fileofregexp`; do grep "$i" myinputfile; done

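This takes each regexp line from your file of regular expressions one by one and outputs every matching line of your input file, so the resulting output is sorted by your regexp order. Lines that match the same regexp keep their original relative order, because grep prints matches in file order. Note that any lines in your input file that don't match at all will not make it to the output, and a line that matches more than one regexp is printed once for each of them.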
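Since unmatched lines are dropped silently, it may be worth checking for them first. A minimal sketch, assuming one pattern per line and no empty lines in the pattern file (an empty pattern matches everything); the function name unmatched_lines is made up for illustration:

```shell
#!/bin/sh
# Print every line of the input file ($2) that matches none of the
# patterns in the pattern file ($1).
# -f reads one pattern per line, -v inverts the match.
unmatched_lines() {
    grep -v -f "$1" "$2"
}
```

Called as `unmatched_lines fileofregexp myinputfile`, it prints nothing when every line is covered by some regexp.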

Addendum: As requested, here's a version using a while loop. It reads one regexp per line, so it also copes with regexps that contain spaces:

while IFS= read -r i; do grep "$i" myinputfile; done < fileofregexp
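
The loops above cover the first criterion only: lines matched by the X- regexp come out in file order, not sorted by the match group. One way to add the second criterion is to decorate those lines with the captured text, sort stably on it, and strip it again. This is only a sketch under some assumptions: the function name sort_by_rules is made up, every regexp is anchored with ^, contains no "/" and has at most one \(...\) group, and your sort supports -s (stable sort, available in GNU and BSD sort but not required by POSIX).

```shell
#!/bin/sh
# Sketch: print the lines of the input file ($2) in the order of the
# regexps in the rules file ($1). Lines matched by a regexp with a
# \(...\) group are additionally sorted by the captured text.
# Assumptions: regexps are anchored with ^, contain no "/", and have
# at most one group.
sort_by_rules() (
    rules=$1
    input=$2
    tab=$(printf '\t')
    while IFS= read -r re; do
        case $re in
            *'\('*)
                # Decorate each match as "<group><TAB><line>", sort stably
                # on the group only, then strip the decoration again.
                sed -n "s/$re/\\1$tab&/p" "$input" |
                    LC_ALL=C sort -s -t "$tab" -k1,1 |
                    cut -f2-
                ;;
            *)
                # No group: grep already prints matches in file order.
                # A regexp without matches is not an error here.
                grep -e "$re" "$input" || true
                ;;
        esac
    done < "$rules"
)
```

Called as `sort_by_rules fileofregexp myinputfile`. Because the sort is stable and keyed on the group alone, lines with the same group keep their original order, just like the EMAIL lines do with plain grep.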