Shell – Split file and know how many files were generated

files, shell, split

I'm using the following lines to split a file into smaller parts:

split --line-bytes=100M -d "$input" "$output/FILENAME"
echo "$input was split into ??? 100MB files." >> demo.log

After that, I need to write to a log file how many smaller files this split generated. Is there any way to do that?

Best Answer

The easiest way is to save the names of the resulting pieces in an array, e.g.

splitarr=("$output"/FILENAME*)

and get the array length (number of elements) with ${#splitarr[@]}. This assumes the only filenames matching that pattern are those produced by the split command.
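Putting the array approach together with the log line from the question might look like the sketch below. The temporary directory, the `seq`-generated input and the 1K piece size are stand-ins chosen only so the example runs quickly; `nullglob` is an addition so that an unmatched pattern yields an empty array rather than a single literal element.

```shell
#!/usr/bin/env bash
# Hypothetical demo data: a small input file in a scratch directory.
workdir=$(mktemp -d)
input=$workdir/input.txt
output=$workdir/parts
mkdir -p "$output"
seq 1 1000 > "$input"

# Same split call as in the question, with a small size for the demo.
split --line-bytes=1K -d "$input" "$output/FILENAME"

# Collect the pieces; with nullglob an unmatched glob expands to nothing.
shopt -s nullglob
splitarr=("$output"/FILENAME*)
shopt -u nullglob

count=${#splitarr[@]}
echo "$input was split into $count files." >> "$workdir/demo.log"
cat "$workdir/demo.log"
```

As noted, the count is only right if nothing else in that directory matches the FILENAME* pattern.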


You appear to be using GNU split, so here are some other ways to do it. You could add the --verbose option (see the man page for details), count the lines that split prints to stdout, and save the result in a variable:

ct=$(split --verbose --line-bytes=100M -d "$input" "$output/FILENAME" | wc -l)
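One thing to watch with the pipeline form is that the exit status seen by the shell is that of wc, not split. A minimal sketch, assuming bash and using `pipefail` (my addition, not part of the original command) so that a failed split is not logged as a success; the scratch directory, the `seq` input and the 1K size are again placeholders:

```shell
#!/usr/bin/env bash
# Make a pipeline fail if any command in it fails, not just the last one.
set -o pipefail

# Hypothetical demo data.
workdir=$(mktemp -d)
input=$workdir/input.txt
output=$workdir/parts
mkdir -p "$output"
seq 1 1000 > "$input"

# --verbose prints one line per created piece on stdout; wc -l counts them.
if ct=$(split --verbose --line-bytes=1K -d "$input" "$output/FILENAME" | wc -l); then
    echo "$input was split into $ct files." >> "$workdir/demo.log"
else
    echo "split failed for $input" >> "$workdir/demo.log"
fi
cat "$workdir/demo.log"
```

Unlike the array approach, this counts only the files created by this particular run, so stale pieces left in the directory do not inflate the number.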

You could get the same result with the lesser-known --filter option:

ct=$(split --filter='printf %s\\n; cat >"$FILE"' --line-bytes=100M -d "$input" "$output/FILENAME" | wc -l)

Alternatively, if you know that only your split command will create files in that directory during the next N seconds, you could use inotifywatch to gather statistics for an event such as close_write:

inotifywatch . -t 20 -e close_write

will watch the current dir for close_write events for the next 20 seconds and will output something like:

Establishing watches...
Finished establishing watches, now collecting statistics.
total  close_write  filename
11     11           ./

so it's only a matter of extracting that number from the table (e.g. pipe it to awk 'END{print $2}'; also keep in mind that the first two lines are printed to stderr).
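To illustrate the extraction step without needing inotify-tools installed, the sketch below feeds the sample table from above to awk. The variable names are mine; the commented-out line shows how the live command would presumably be wired up, with stderr discarded.

```shell
#!/usr/bin/env bash
# The stdout part of the sample inotifywatch output shown above.
sample_output='total  close_write  filename
11     11           ./'

# END runs after the last line, so $2 is the close_write column of that line.
count=$(printf '%s\n' "$sample_output" | awk 'END{print $2}')
echo "$count"   # prints 11

# With the real tool (not run here), something along these lines:
# count=$(inotifywatch . -t 20 -e close_write 2>/dev/null | awk 'END{print $2}')
```

This relies on the watched directory being the last (here, the only) row of the table.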