File Sorting – How to Sort Files Alphabetically Before Processing


I use the command

find . -type f -exec sha256sum {} \; > sha256SumOutput

to hash every file in a folder hierarchy. Unfortunately, sha256sum doesn't receive the file names from find in alphabetical order. How can this be fixed?

I'd like the names to be sorted before hashing, so that the files are hashed in alphabetical order (there is a reason for this).

Best Answer

Use a pipeline with sort:

find . -type f -print0 | sort -z | xargs -r0 sha256sum > sha256SumOutput
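
One caveat: what sort treats as "alphabetical" depends on the current locale, so the same tree may come out in a different order on another machine. If the order has to be reproducible, a sketch along these lines forces plain byte order with LC_ALL=C. The temporary directory and the file names are made up for illustration, and GNU find, sort, xargs and sha256sum are assumed.

```shell
# Build a small throwaway tree (hypothetical names) to try the pipeline on.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'one'   > "$demo_dir/b"
printf 'two'   > "$demo_dir/a b"
printf 'three' > "$demo_dir/sub/a"
printf 'four'  > "$demo_dir/Zeta"

# LC_ALL=C makes sort compare raw bytes, so the result does not depend on
# the locale of whoever runs the command.
sorted_hashes=$(cd "$demo_dir" &&
    find . -type f -print0 | LC_ALL=C sort -z | xargs -r0 sha256sum)

printf '%s\n' "$sorted_hashes"
```

In byte order, uppercase letters sort before lowercase ones, so ./Zeta is listed before ./a b here. Leave LC_ALL=C out if you prefer the ordering of your own locale.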

Explanation

From man find

   -print0
        True; print the full file name on the standard output, followed
        by a null character (instead of the newline character that -print
        uses). This allows file names that contain newlines or other
        types of white space to be  correctly  interpreted by programs
        that process the find output.  This option corresponds to the -0
        option of xargs.

From man sort

   -z, --zero-terminated
        line delimiter is NUL, not newline

From man xargs

   -0   
        Input items are terminated by a null character instead of by
        whitespace, and the quotes and backslash are not special (every
        character is taken literally).  Disables the end of file string,
        which is treated like any  other  argument. Useful when input
        items might contain white space, quote marks, or backslashes.
        The GNU find -print0 option produces input suitable for this mode.
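
To see why the NUL-terminated variants matter, here is a small sketch comparing the two delimiters on a file whose name contains a newline. The directory and file names are hypothetical, and tr is assumed to accept '\0' as the NUL character, as GNU tr does.

```shell
newline_dir=$(mktemp -d)
# Two files; the second one has a newline embedded in its name.
touch "$newline_dir/plain" "$newline_dir/two
lines"

# Newline-delimited: the second name is split into two separate records.
line_count=$(find "$newline_dir" -type f | sort | wc -l)

# NUL-delimited: count the NUL terminators, one per file name.
nul_count=$(find "$newline_dir" -type f -print0 | sort -z | tr -cd '\0' | wc -c)

echo "newline-delimited records: $line_count"
echo "NUL-delimited records:     $nul_count"
```

With plain newlines, sort and xargs would see three "names" for two files; with -print0, sort -z and xargs -0 each name stays in one piece.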

Example

% ls -laog
total 4288
drwxrwxr-x  2 4329472 Aug 17 08:20 .
drwx------ 57   20480 Aug 17 08:20 ..
-rw-rw-r--  1       0 Aug 17 08:15 a
-rw-rw-r--  1       0 Aug 17 08:15 a b
-rw-rw-r--  1       0 Aug 17 08:15 b
-rw-rw-r--  1       0 Aug 17 08:15 c

% find -type f -print0 | sort -z | xargs -r0 sha256sum                  
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ./a
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ./a b
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ./b
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ./c

The values in the first column are all the same because the files in my test are empty.
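
Because the listing is sorted, two runs over an unchanged tree should produce byte-identical output, which makes the result easy to compare. The sketch below checks that. It also writes the output outside the tree, since a file such as sha256SumOutput created inside the scanned directory may itself be picked up by find. The function name hash_tree and all paths are invented for this example.

```shell
tree_dir=$(mktemp -d)   # hypothetical stand-in for the folder being hashed
out_dir=$(mktemp -d)    # outputs live outside the tree, so find never sees them
printf 'x' > "$tree_dir/c"
printf 'y' > "$tree_dir/a"

hash_tree() {
    # Hash every regular file under $1 in byte order and write the list to $2.
    (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -r0 sha256sum) > "$2"
}

hash_tree "$tree_dir" "$out_dir/run1"
hash_tree "$tree_dir" "$out_dir/run2"

# cmp -s is silent and only reports through its exit status.
if cmp -s "$out_dir/run1" "$out_dir/run2"; then
    runs_match=yes
else
    runs_match=no
fi
echo "identical listings: $runs_match"
```

If the output really has to live inside the tree, excluding it in the find expression (for example with ! -name sha256SumOutput) is another option.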
