I have a directory with a large number of files.
./I_am_a_dir_with_many_subdirs/
Within a script, I'd like to find all subdirs in it, sort them, and store the result in a bash array. So I do:
SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort`)
Executing the script, I get the following error messages:
sort: write failed: standard output: Broken pipe
sort: write error
As explained in this post: probably sort executes and closes the pipe before find completes writing to it. Thus the write() call initiated by find gets the error EPIPE ("Broken pipe"), and the OS sends find a SIGPIPE. Before the SIGPIPE reaches find, it prints the error message, then gets SIGPIPE and dies.
Questions:
- So, what does my SubdirsArray contain? The subdirs that find found, but sort left unsorted?
- If so, then what would be the way around this issue with broken pipes? Make find write its results to a temporary file and then make sort read it? I don't understand why "it's also nothing to be concerned about" if it happens within a non-interactive shell. My SubdirsArray contains something unsorted, and further in the script I assume that its elements are sorted?!
- I get two error messages:
sort: write failed: standard output: Broken pipe
sort: write error
In this thread it is suggested that sort doesn't have enough space in a temporary directory to sort all the input. But doesn't that mean that sort got something from find?!? I'm confused…
Anyways, I tried to use
SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort -T /home/temp_dir`)
but it didn't help.
P.S.
I'm not sure whether it's important, but I use find|sort in a multi-processor script: several processors execute the same command at once in subshells.
Best Answer
The problem is not between find and sort. It is sort that has a problem with its output, which suggests the shell is not willing to read such a long list into a variable.
You'll have to process the input with while read …, storing it in a temporary file if you need it more than once. This has the added advantage that it splits on newlines only, so it correctly handles filenames with spaces, which the backtick approach does not.
Unfortunately, you don't say how you want to use the result, so I can't tell you exactly how to rewrite it.
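As a rough illustration of the while read approach, here is a minimal sketch. It assumes a find that supports -mindepth and -maxdepth, as in the question. The function name process_subdirs and the use of mktemp for the temporary file are my own illustrative choices, and the printf line is only a placeholder for whatever the script really does with each directory.

```shell
# Sketch: stream the sorted subdirectory names through a loop instead of an array.
# process_subdirs is a hypothetical helper name, not something from the question.
process_subdirs() {
    top=$1
    # Temporary file holding the sorted list, so it can be re-read if needed.
    list=$(mktemp) || return 1
    find "$top" -mindepth 2 -maxdepth 2 -type d | sort > "$list"
    # IFS= keeps leading/trailing blanks, -r keeps backslashes literal;
    # each line (one directory) is read whole, so spaces in names survive.
    while IFS= read -r dir; do
        printf '%s\n' "$dir"    # placeholder for the real per-directory work
    done < "$list"
    rm -f "$list"
}
```

It would be called as process_subdirs ./I_am_a_dir_with_many_subdirs/. Names containing newlines are still not handled, since the list is newline-separated.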
Note that arrays are not part of the POSIX shell specification, and there are shells that are noticeably faster than bash but don't have them. That's why many people, including me, often avoid using them in scripts.