Does this sort ignore the +/- character prefix?

sort

I'm trying to sort a patch to highlight a particular change:

$ curl -s https://lists.fedorahosted.org/archives/list/sssd-devel@lists.fedorahosted.org/message/ZN6VMFN65JWV7NMG2XEHPUI2AGSLRNGW/attachment/2/0001-LDAP-Change-the-default-rfc2307-autofs-attribute-map.patch | \
grep '^[+-] *{ "' | sort

The output is:

-    { "ldap_autofs_entry_object_class", "automount", SYSDB_AUTOFS_ENTRY_OC, NULL },
+    { "ldap_autofs_entry_object_class", "nisObject", SYSDB_AUTOFS_ENTRY_OC, NULL },
-    { "ldap_autofs_entry_value", "automountInformation", SYSDB_AUTOFS_ENTRY_VALUE, NULL },
+    { "ldap_autofs_entry_value", "nisMapEntry", SYSDB_AUTOFS_ENTRY_VALUE, NULL },
+    { "ldap_autofs_map_name", "nisMapName", SYSDB_AUTOFS_MAP_NAME, NULL },
-    { "ldap_autofs_map_name", "ou", SYSDB_AUTOFS_MAP_NAME, NULL },
-    { "ldap_autofs_map_object_class", "automountMap", SYSDB_AUTOFS_MAP_OC, NULL },
+    { "ldap_autofs_map_object_class", "nisMap", SYSDB_AUTOFS_MAP_OC, NULL },

I would have expected the sort to sort by the first character +/-.

I can confirm the characters are consistently 0x2b and 0x2d:

$ curl -s https://lists.fedorahosted.org/archives/list/sssd-devel@lists.fedorahosted.org/message/ZN6VMFN65JWV7NMG2XEHPUI2AGSLRNGW/attachment/2/0001-LDAP-Change-the-default-rfc2307-autofs-attribute-map.patch | grep '^[+-] *{ "' | cut -c1 | hexdump -C

00000000  2d 0a 2d 0a 2b 0a 2b 0a  2d 0a 2b 0a 2d 0a 2b 0a  |-.-.+.+.-.+.-.+.|
00000010

sort -d gives the same result. -d says alphanumeric, and +/- are not alphanumeric. sort -n also doesn't work (I wouldn't expect it to).

I've been using Linux/Unix for longer than I care to admit, and I've never noticed this before!

…Is this expected? Is there another way using sort? (I know it can also be done in a Perl one liner…)

Best Answer

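As mentioned in Why does ls sorting ignore non-alphanumeric characters?, sort takes its collation rules from your locale, which is typically a UTF-8 locale such as en_US.UTF-8. Those rules largely ignore punctuation such as + and -, using it only to break ties between lines that are otherwise identical.

By setting LC_COLLATE=C for the sort, you can get the ASCII (byte-value) sort order: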

curl -s https://lists.fedorahosted.org/archives/list/sssd-devel@lists.fedorahosted.org/message/ZN6VMFN65JWV7NMG2XEHPUI2AGSLRNGW/attachment/2/0001-LDAP-Change-the-default-rfc2307-autofs-attribute-map.patch | \
grep '^[+-] *{ "' | LC_COLLATE=C sort
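
To see the effect without downloading the patch, here is a minimal sketch using four made-up lines. The sample function and its contents are hypothetical stand-ins, not taken from the patch. It uses LC_ALL=C instead of LC_COLLATE=C, on the assumption that LC_ALL might already be set in your environment, in which case it would override LC_COLLATE:

```shell
# Hypothetical stand-in for the grep output: two "added" and two "removed" lines.
sample() {
    printf '%s\n' '+ b' '- a' '+ a' '- b'
}

# Force byte-value collation for this one sort invocation only.
# LC_ALL takes precedence over LC_COLLATE and LANG.
sample | LC_ALL=C sort
```

In byte order, + (0x2b) sorts before - (0x2d), so the added lines come out first, followed by the removed lines.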