How to calculate the correct size of a loopback device filesystem image for debootstrap


I'm using debootstrap to create a rootfs for a device that I want to then write to an image file. To calculate the size needed from my rootfs, I do the following:

local SIZE_NEEDED=$(du -sb $CHROOT_DIR|awk '{print $1}')
SIZE_NEEDED=$(($SIZE_NEEDED / 1048576 + 50)) # in MB + 50 MB space
dd if=/dev/zero of=$ROOTFS_IMAGE bs=1M count=$SIZE_NEEDED

As you can see, I'm leaving 50MB of padding beyond what du calculates I need.

I then create the loopback device, create a partition table and filesystem:

LO_DEVICE=$(losetup --show -f $ROOTFS_IMAGE)
parted $LO_DEVICE mktable msdos mkpart primary ext4 0% 100%
partprobe $LO_DEVICE
local LO_ROOTFS_PARTITION="${LO_DEVICE}p1"
mkfs.ext4 -O ^64bit $LO_ROOTFS_PARTITION

It seems parted attempts to do some sector alignment (?) as the partition doesn't quite take up the whole virtual disk, but close enough.

I then mount the new partition and start writing files. But then I run out of disk space right near the end!

mount $LO_ROOTFS_PARTITION $LO_MOUNT_POINT
cp -rp $CHROOT_DIR/* $LO_MOUNT_POINT

.....
cp: cannot create directory '/root/buildimage/rootfs_mount/var': No space left on device

I suspect this is some block size conversion issue or maybe difference between MiB and MB? Because up to a certain image size, it seems that I have enough headroom with the 50MB of padding. (I want some free space in the image by default, but not a lot.) The image size isn't off by a factor-of-two so there's some creep or overhead that gets magnified as the image size gets larger and I'm not sure where it's coming from.

For context, here's the last one I did that doesn't fit:

# du -sb build/rootfs
489889774   build/rootfs

OK, 489889774 / 1024**2 = 467 MiB, plus 50 MiB = 517 MiB image size. So dd looked like:

# dd if=/dev/zero of=build/rootfs.img bs=1M count=517
517+0 records in
517+0 records out
542113792 bytes (542 MB, 517 MiB) copied, 2.02757 s, 267 MB/s
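
The arithmetic behind that dd call can be replayed in the shell. This is only a sanity check using the numbers quoted above; the variable names are made up for the illustration.

```shell
# Values copied from the du output and the padding used in the script.
rootfs_bytes=489889774
padding_mib=50

# Shell integer division truncates, so about 467.2 MiB becomes 467 MiB.
image_mib=$(( rootfs_bytes / 1048576 + padding_mib ))
image_bytes=$(( image_mib * 1048576 ))

echo "image size: ${image_mib} MiB = ${image_bytes} bytes"
```

The byte count is the same figure dd prints, which it labels as both 542 MB (decimal) and 517 MiB (binary).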

Confirmed on disk; the byte count matches what dd reported (517 MiB, which is 542 MB):

# du -sb build/rootfs.img
542113792   build/rootfs.img

The partition looks like:

# parted /dev/loop0 print
Model: Loopback device (loopback)
Disk /dev/loop0: 542MB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End    Size   Type     File system  Flags
 1      1049kB  542MB  541MB  primary  ext4
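
parted prints decimal units, and a start of 1049kB is consistent with the partition beginning at exactly 1 MiB (1048576 bytes). A rough sketch of what that leaves for the partition, assuming it runs to the very end of the image (the rounded parted output does not confirm this):

```shell
image_bytes=542113792      # size of rootfs.img from the dd output
part_start_bytes=1048576   # assumed: the 1049kB start is really 1 MiB

part_bytes=$(( image_bytes - part_start_bytes ))
part_mib=$(( part_bytes / 1048576 ))

echo "partition: ${part_bytes} bytes = ${part_mib} MiB"
```

So about 1 MiB of the 50 MiB padding is already gone before the filesystem is created.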

and mounted filesystem:

# df -h /dev/loop0p1
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0p1    492M  482M     0 100% /root/buildimage/build/rootfs_mount

So maybe there is overhead in the ext4 filesystem, possibly for superblocks, the journal, etc.? How can I account for that in my size calculation?

EDIT:

Looking into ext4 overhead such as this ServerFault question.

Also looking into mkfs.ext4 options such as -m (reserved) and various journaling and inode options. In general if I know there's a 5% overhead coming from the filesystem, I can factor that in easily enough.

EDIT #2:

Thinking that du might be under-reporting actual on-disk size requirements (e.g. a 10-byte file still takes up a 4k block, right?) I tried a few other options:

# du -sb build/rootfs        # This is what I was using
489889774   build/rootfs

# du -sm build/rootfs        # bigger
527 build/rootfs

# du -sk build/rootfs        # bigger-est
539088  build/rootfs
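
The 10-byte-file guess is easy to check on a scratch file. A small sketch, assuming GNU du; the temporary directory and file name are arbitrary:

```shell
tmpdir=$(mktemp -d)
printf '0123456789' > "$tmpdir/tiny"    # exactly 10 bytes of content

# -b reports the apparent size in bytes (the file's length).
apparent=$(du -b "$tmpdir/tiny" | awk '{print $1}')
# --block-size=1 without --apparent-size reports allocated space in bytes.
on_disk=$(du --block-size=1 "$tmpdir/tiny" | awk '{print $1}')

echo "apparent: ${apparent} bytes, allocated: ${on_disk} bytes"
rm -rf "$tmpdir"
```

The allocated figure depends on the filesystem holding the scratch file; with 4k blocks it is usually one full block, though filesystems that store tiny files inline may report less.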

Furthermore, the manpage notes that -b is equivalent to '--apparent-size --block-size=1', and the apparent size is usually smaller than the actual disk usage. So that may be most of where my math was wrong.
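
Putting the pieces together, a rough estimate might look like the sketch below. `estimate_image_mib` is a made-up helper, not part of any tool. It takes disk usage rather than apparent size, adds the desired free space and an assumed fixed ext4 overhead, then scales up for the reserved-block percentage (mkfs.ext4 -m, 5% by default) and adds the partition offset. The 24 MiB overhead used here is only the gap between the partition size parted showed (541MB, about 516 MiB) and the 492M that df reported in this one run; the real overhead grows with filesystem size, so treat the result as a starting point.

```shell
# Hypothetical helper: estimate the image size in MiB.
#   $1 = disk usage of the rootfs in KiB (e.g. from du -sk)
#   $2 = free space wanted in the finished filesystem, in MiB
#   $3 = reserved-block percentage (mkfs.ext4 -m)
#   $4 = assumed ext4 overhead in MiB (journal, inode tables, ...)
#   $5 = partition start offset in MiB
estimate_image_mib() {
    local data_mib=$(( ($1 + 1023) / 1024 ))        # round KiB up to MiB
    local payload=$(( data_mib + $2 + $4 ))
    local keep=$(( 100 - $3 ))                      # percent left after the reserve
    local fs_mib=$(( (payload * 100 + keep - 1) / keep ))   # round up
    echo $(( fs_mib + $5 ))
}

# du -sk figure from above, 50 MiB free, 5% reserved, 24 MiB overhead, 1 MiB offset
estimate_image_mib 539088 50 5 24 1
```

Creating the filesystem with -m 0 removes the reserve, which is why the same call with 0 as the third argument gives a noticeably smaller figure.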

Best Answer

Possibly the simplest solution is to heavily overprovision the space initially, copy all the files, then use resize2fs -M to reduce the size to the minimum this utility can manage. Here's an example:

dir=/home/meuh/some/dir
rm -f /tmp/image
size=$(du -sb $dir/ | awk '{print $1*2}')
truncate -s $size /tmp/image
mkfs.ext4 -m 0 -O ^64bit /tmp/image
sudo mount /tmp/image /mnt/loop
sudo chown $USER /mnt/loop
rsync -a $dir/ /mnt/loop
sync
df /mnt/loop
sudo umount /mnt/loop
e2fsck -f /tmp/image 
resize2fs -M /tmp/image 
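# New size = block count * block size. The example filesystem has 1k blocks, but larger ones usually get 4k, so read both values.
newsize=$(dumpe2fs -h /tmp/image 2>/dev/null | awk -F: '/^Block count:/{c=$2} /^Block size:/{s=$2} END{printf "%.0f\n", c*s}')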
truncate -s $newsize /tmp/image
sudo mount /tmp/image /mnt/loop
df /mnt/loop
diff -r $dir/ /mnt/loop
sudo umount /mnt/loop

Some excerpts from the output for an example directory:

+ size=13354874
Creating filesystem with 13040 1k blocks and 3264 inodes
+ df /mnt/loop
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/loop1         11599  7124      4215  63% /mnt/loop
+ resize2fs -M /tmp/image
Resizing the filesystem on /tmp/image to 8832 (1k) blocks.
+ newsize=9043968
+ truncate -s 9043968 /tmp/image
+ df /mnt/loop
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/loop1          7391  7124        91  99% /mnt/loop
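
Since resize2fs -M leaves almost no free space, a little headroom can be added back afterwards. The sketch below only does the block arithmetic; `target_blocks` is a hypothetical helper, and the commented commands are an untested outline that assumes the image file is still at least as large as the target.

```shell
# Hypothetical helper: blocks for the minimal filesystem plus some headroom.
#   $1 = block count reported by resize2fs -M
#   $2 = filesystem block size in bytes
#   $3 = extra space wanted, in MiB
target_blocks() {
    echo $(( $1 + $3 * 1048576 / $2 ))
}

# Numbers from the example output: 8832 blocks of 1k, plus 2 MiB of headroom.
blocks=$(target_blocks 8832 1024 2)
echo "$blocks"

# Untested outline, to run before the final truncate:
#   resize2fs /tmp/image "$blocks"               # a size without a suffix is in filesystem blocks
#   truncate -s $(( blocks * 1024 )) /tmp/image  # 1024 = block size of this example
```

The free space that df reports afterwards will be somewhat less than the amount requested, because a larger filesystem also needs more metadata.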