Windows Explorer sorting order for special characters

filenames sorting windows

What is the sorting order used in Windows Explorer?

Specifically, which special characters sort after letters?

As far as I can tell from testing, all special characters seem to be sorted before letters, but I couldn't identify the order among them (e.g., '@' comes after '%', which is not their order on the keyboard).

Screenshot

Are there any special characters that would be sorted after letters?

Best Answer

I did some testing and the overall ordering seems to be as follows...

Symbols
    Latin (ordered by Unicode value (U+xxxx))
    Greek (ordered by Unicode value (U+xxxx))
    Cyrillic (ordered by Unicode value (U+xxxx))
    Hebrew (ordered by Unicode value (U+xxxx))
    Arabic (ordered by Unicode value (U+xxxx))

Numbers
    Latin (ordered by Unicode value (U+xxxx))
    Greek (ordered by Unicode value (U+xxxx))
    Cyrillic (ordered by Unicode value (U+xxxx))
    Hebrew (ordered by Unicode value (U+xxxx))
    Arabic (ordered by Unicode value (U+xxxx))

Letters
    Latin (ordered by Unicode value (U+xxxx))
    Greek (ordered by Unicode value (U+xxxx))
    Cyrillic (ordered by Unicode value (U+xxxx))
    Hebrew (ordered by Unicode value (U+xxxx))
    Arabic (ordered by Unicode value (U+xxxx))

Sorting Rule Sequence vs Observed Order

It's worth noting that there are two ways of looking at this. Sorting rules are applied in a certain sequence, and that sequence produces the observed order. The ordering produced by earlier rules becomes nested under the ordering of later rules. This means that the first rule applied is the last (innermost) rule observed, while the last rule applied is the first (topmost) rule observed.

Sorting Rule Sequence

1.) Sort on Unicode Value (U+xxxx)
2.) Sort on culture/language
3.) Sort on Type (Symbol, Number, Letter)
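This nesting is what you get from repeated stable sorts. Here is a minimal Python sketch of the rule sequence above. It is not Explorer's actual implementation: the rank tables and the hand-tagged sample items are hypothetical, taken only from the orderings observed in this answer.

```python
# Hypothetical ranks, based only on the order observed in this answer.
TYPE_RANK = {"symbol": 0, "number": 1, "letter": 2}
SCRIPT_RANK = {"latin": 0, "greek": 1, "cyrillic": 2, "hebrew": 3, "arabic": 4}

# (character, type, script) tuples, tagged by hand for illustration.
items = [
    ("б", "letter", "cyrillic"),
    ("b", "letter", "latin"),
    ("7", "number", "latin"),
    ("a", "letter", "latin"),
    ("%", "symbol", "latin"),
    ("α", "letter", "greek"),
    ("@", "symbol", "latin"),
]

def sort_in_passes(rows):
    # sorted() is stable: each later pass keeps the order produced by
    # earlier passes among rows that compare equal on the new key.
    rows = sorted(rows, key=lambda r: ord(r[0]))          # 1.) Unicode value
    rows = sorted(rows, key=lambda r: SCRIPT_RANK[r[2]])  # 2.) culture/language
    rows = sorted(rows, key=lambda r: TYPE_RANK[r[1]])    # 3.) type
    return rows

result = [r[0] for r in sort_in_passes(items)]
print(result)  # ['%', '@', '7', 'a', 'b', 'α', 'б']
```

The last pass (type) ends up as the outermost grouping, and the first pass (Unicode value) only decides the order inside each type-language group.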

Observed Order

The highest level of grouping is by type in the following order...

1.) Symbols
2.) Numbers
3.) Letters

Therefore, any symbol from any language comes before any number from any language, while any letter from any language appears after all symbols and numbers.

The second level of grouping is by culture/language. The following order seems to apply for this:

Latin
Greek
Cyrillic
Hebrew
Arabic

The lowest rule observed is Unicode order, so items within a type-language group are ordered by Unicode value (U+xxxx).
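The observed order can also be written as a single three-part sort key. The Python sketch below rests on several assumptions of mine: it looks only at the first character of each name, classifies type from the Unicode general category, guesses the script from the start of the Unicode character name, and treats characters with no script in their name (ASCII digits and punctuation) as Latin. It approximates the grouping described here and is not a reimplementation of Explorer.

```python
import unicodedata

# Script order as observed in this answer.
SCRIPTS = ["LATIN", "GREEK", "CYRILLIC", "HEBREW", "ARABIC"]

def type_rank(ch):
    # Symbols (anything that is not a letter or number) first,
    # then numbers, then letters.
    category = unicodedata.category(ch)
    if category.startswith("L"):
        return 2
    if category.startswith("N"):
        return 1
    return 0

def script_rank(ch):
    name = unicodedata.name(ch, "")
    for rank, script in enumerate(SCRIPTS):
        if name.startswith(script):
            return rank
    return 0  # assumption: no script in the name counts as Latin

def observed_key(filename):
    # Simplification: only the first character is considered.
    ch = filename[0]
    return (type_rank(ch), script_rank(ch), ord(ch))

names = [
    "omega", "я.txt", "9.txt", "@home",
    "\u05d0.txt",  # Hebrew letter alef
    "%temp", "β.txt",
    "\u0628.txt",  # Arabic letter beh
    "_x",
]
ordered = sorted(names, key=observed_key)
print(ordered)
```

With this key, `%temp`, `@home` and `_x` come first (symbols, by code point), then `9.txt`, then the letters grouped as Latin, Greek, Cyrillic, Hebrew, Arabic.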

Related Solutions

Windows – Sort order in Windows Explorer

By default, the newer sort order considers strings in file and folder names as numeric content, not text. Numerals in folder and file names are sorted according to their numeric value.

In the following example, note how the files, whose names contain numerals, are sorted.

Windows Vista, Windows XP, and Windows Server 2003

5.txt
11.txt
88.txt

In this example, 11 and 88 are numerically higher values than 5. Therefore, 11.txt and 88.txt are listed after 5.txt when you sort the folder by name in ascending order. A character-by-character text sort would instead place 11.txt before 5.txt.
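To see the difference between the two behaviours, here is a small Python sketch of a "natural" sort key that compares digit runs as numbers. The `natural_key` helper is my own illustration of the general technique, not the comparison routine Windows uses.

```python
import re

def natural_key(name):
    # re.split with a capturing group alternates non-digit and digit runs,
    # so odd indices are always the digit runs.
    parts = re.split(r"(\d+)", name)
    key = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            key.append((0, int(part), ""))       # compare as a number
        elif part:
            key.append((1, 0, part.lower()))     # compare as text
    return key

files = ["88.txt", "5.txt", "11.txt"]

text_order = sorted(files)                      # plain character comparison
numeric_order = sorted(files, key=natural_key)  # digit runs as numbers

print(text_order)     # ['11.txt', '5.txt', '88.txt']
print(numeric_order)  # ['5.txt', '11.txt', '88.txt']
```

The tuples keep numbers and text from being compared directly with each other, which would raise a `TypeError` in Python 3.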

Source: The sort order for files and folders whose names contain numerals is different in Windows Vista, Windows XP, and Windows Server 2003 than it is in Windows 2000

How to do custom sorting using unix sort

The other answer and comment answer the question in general; here's what an implementation can look like:

$ cat order
Bahamas,3
Canada,2
United States,1

$ cat data
C,United States,WA,Tacoma,f,1
A,United States,MA,Boston,f,0
B,United States,NY,New York,f,5
A,Canada,QC,Montreal,f,2
A,Bahamas,Bahamas,Nassau,f,2
A,United States,NY,New York,f,1

$ sort -t, -k2,2 data | join -t, -1 1 -2 2 -o 1.2,2.1,2.2,2.3,2.4,2.5,2.6 order - | sort -t, -k1,1n -k4,5 -k6r -k7nr | cut -d, -f2-
A,United States,MA,Boston,f,0
B,United States,NY,New York,f,5
A,United States,NY,New York,f,1
C,United States,WA,Tacoma,f,1
A,Canada,QC,Montreal,f,2
A,Bahamas,Bahamas,Nassau,f,2
