Mac – Why is a directory copied with the cp command smaller than the original

command line, cp, du, mac

I am trying to copy one directory with a large number of files to another destination. I did:

cp -r src_dir another_destination/

Then I wanted to confirm that the size of the destination directory is the same as the original one:

du -s src_dir
3782288 src_dir

du -s another_destination/src_dir
3502320 another_destination/src_dir

Then I thought there might be symbolic links that the cp command does not follow, so I added the -a flag:

-a Same as -pPR options. Preserves structure and attributes of files but not directory structure.

cp -a src_dir another_destination/

but du -s gave me the same results. It is interesting that both the source and destination have the same number of files and directories:

tree src_dir | wc -l
    4293

tree another_destination/src_dir | wc -l
    4293

What am I doing wrong that I get different sizes with the du command?

UPDATE

When I try to get sizes of individual directories with the du command I get different results:

du -s src_dir/sub_dir1
1112    src_dir/sub_dir1

du -s another_destination/src_dir/sub_dir1
1168    another_destination/src_dir/sub_dir1

When I view files with ls -la, individual file sizes are the same but totals are different:

ls -la src_dir/sub_dir1
total 1168
drwxr-xr-x     5 hirurg103  staff     160 Jan 30 20:58 .
drwxr-xr-x  1109 hirurg103  staff   35488 Jan 30 21:43 ..
-rw-r--r--     1 hirurg103  staff  431953 Jan 30 20:58 file1.pdf
-rw-r--r--     1 hirurg103  staff  126667 Jan 30 20:54 file2.png
-rw-r--r--     1 hirurg103  staff    7386 Jan 30 20:49 file3.png

ls -la another_destination/src_dir/sub_dir1
total 1112
drwxr-xr-x     5 hirurg103  staff     160 Jan 30 20:58 .
drwxr-xr-x  1109 hirurg103  staff   35488 Jan 30 21:43 ..
-rw-r--r--     1 hirurg103  staff  431953 Jan 30 20:58 file1.pdf
-rw-r--r--     1 hirurg103  staff  126667 Jan 30 20:54 file2.png
-rw-r--r--     1 hirurg103  staff    7386 Jan 30 20:49 file3.png

Best Answer

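That is because du by default shows not the size of the files but the disk space they occupy. To get the sum of file sizes instead of the total disk space used, use the -b option. Note that -b is a GNU du option (the examples below were run with GNU tools); the BSD du shipped with macOS does not have it. For example: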

% printf test123 > a
% ls -l a
-rw-r--r-- 1 mnalis mnalis 7 Feb  1 19:57 a
% du -h a
4,0K    a
% du -hb a
7       a

Even though the file is only 7 bytes long, it occupies a whole 4096 bytes of disk space (in this particular example; the figure varies with the filesystem, cluster size, etc.).
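
As a rough check of this rounding against the listing in the question, here is a minimal sketch. It assumes 4096-byte allocation blocks and totals reported in 512-byte units (the traditional du default). Both values are assumptions, not something the question confirms, and units_for is a made-up helper name.

```shell
#!/bin/sh
# Assumed values, not confirmed by the question:
BLOCK=4096   # allocation block size in bytes
UNIT=512     # size of one reporting unit in bytes

# units_for SIZE: print how many reporting units a file of SIZE bytes
# occupies once it is rounded up to whole allocation blocks.
units_for() {
    size=$1
    blocks=$(( (size + BLOCK - 1) / BLOCK ))
    echo $(( blocks * BLOCK / UNIT ))
}

# File sizes taken from the ls -la listing of sub_dir1 in the question.
total=0
for size in 431953 126667 7386; do
    total=$(( total + $(units_for "$size") ))
done
echo "$total"   # prints 1112
```

Under those assumptions the three files add up to 1112 units, which matches one of the two totals shown in the question. The other total, 1168, is 56 units (28672 bytes) more allocated space for exactly the same file sizes.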

Also, some filesystems support so-called sparse files, which do not use any disk space for blocks which are all zeros. For example:

% dd if=/dev/zero of=regular.bin bs=4k count=10
10+0 records in
10+0 records out
40960 bytes (41 kB, 40 KiB) copied, 0,000131003 s, 313 MB/s
% cp --sparse=always regular.bin sparse.bin
% ls -l *.bin
-rw-r--r-- 1 mnalis mnalis 40960 Feb  1 20:04 regular.bin
-rw-r--r-- 1 mnalis mnalis 40960 Feb  1 20:04 sparse.bin
% du -h *.bin
40K     regular.bin
0       sparse.bin
% du -hb *.bin
40960   regular.bin
40960   sparse.bin

In short, to verify all files were copied, you'd use du -sb instead of du -s.
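
If the du at hand has no -b option (it is a GNU extension), the same comparison can be sketched with tools found on both GNU and BSD systems. The helper names total_bytes and same_tree are made up for this illustration. The first sums file contents by reading them, so it is slow on large trees; the second relies on diff -r, which compares contents and not only sizes.

```shell
#!/bin/sh
# total_bytes DIR: print the summed byte length of every regular file
# under DIR. It reads the file contents, so it does not depend on du flags.
total_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

# same_tree SRC DST: succeed only if diff -r finds no differences
# between the two directory trees.
same_tree() {
    diff -r "$1" "$2" >/dev/null
}

# Example usage with the directories from the question:
#   total_bytes src_dir
#   total_bytes another_destination/src_dir
#   same_tree src_dir another_destination/src_dir && echo "trees match"
```

Equal byte totals only show that the same amount of data is present; a successful same_tree is the stronger check that the copy is complete.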
