I've got a filesystem with a couple million files, and I'd like to see the distribution of file sizes, recursively, under a particular directory. I feel like this is totally doable with some bash/awk fu, but I could use a hand. Basically I'd like something like the following:
1KB: 4123
2KB: 1920
4KB: 112
...
4MB: 238
8MB: 328
16MB: 29138
Count: 320403345
I feel like this shouldn't be too bad given a loop and some conditional log2 filesize foo, but I can't quite seem to get there.
Related Question: How can I find files that are bigger/smaller than x bytes?
Best Answer
This seems to work pretty well:
Its output looks like this:
where the number on the left is the lower limit of a range from that value to twice that value and the number on the right is the number of files in that range.
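As a minimal sketch of this kind of power-of-two bucketing (not necessarily the exact command from the answer): the version below assumes GNU find, since it relies on `-printf`, and the function name `size_histogram` is only an illustrative choice. Each line of output is the lower limit of a range followed by the number of files in that range, with empty files counted under 0.

```shell
#!/usr/bin/env bash
# Sketch: histogram of file sizes in power-of-two buckets.
# Assumes GNU find (for -printf); size_histogram is a hypothetical helper name.
size_histogram() {
    local dir="${1:-.}"
    # Emit one size in bytes per regular file, then bucket the sizes in awk.
    find "$dir" -type f -printf '%s\n' |
    awk '
        {
            # Largest power of two <= size; empty files go in bucket 0.
            lower = 0
            if ($1 > 0) {
                lower = 1
                while (lower * 2 <= $1) lower *= 2
            }
            count[lower]++
        }
        END {
            # Left column: lower limit of the range; right column: file count.
            for (b in count) printf "%12s %d\n", b, count[b]
        }
    ' | sort -n
}
```

It could be called as `size_histogram /some/dir`, or with no argument to scan the current directory. The doubling loop is used here instead of a `log($1)/log(2)` calculation because the latter can be sensitive to floating-point rounding near exact powers of two.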