Filesystems – Append Huge Files Without Copying

files filesystems

There are 5 huge files (file1, file2, .. file5), about 10G each, and extremely little free space left on the disk, and I need to concatenate all these files into one.
There is no need to keep the original files, only the final one.

The usual way to concatenate is to run cat in sequence for files file2 .. file5:

cat file2 >> file1 ; rm file2
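
Spelled out over all five files, this approach might look like the sketch below; while each cat runs it still needs free space roughly equal to the size of the file currently being appended:

    for f in file2 file3 file4 file5; do
        cat "$f" >> file1 && rm "$f"
    done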

Unfortunately this way requires at least 10G of free space, which I don't have.
Is there a way to concatenate the files without actually copying them, by somehow telling the filesystem that file1 doesn't end at its original end but continues where file2 starts?

P.S. The filesystem is ext4, if that matters.

Best Answer

AFAIK it is (unfortunately) not possible to truncate a file from the beginning (this holds for the standard tools; for what is possible at the syscall level, see here). But by adding some complexity you can use normal truncation (together with sparse files): you can write to the end of the target file without having written all the data in between.
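
As a quick illustration of these two building blocks (sparse writes and dd-based truncation), here is a small sketch using a hypothetical scratch file named demo; the sizes are only for demonstration:

    # writing 1 MiB at an offset of 9 MiB into an empty file yields a 10 MiB
    # sparse file that only occupies about 1 MiB on disk
    dd if=/dev/urandom of=demo bs=1M count=1 seek=9
    du --apparent-size -h demo   # ~10M logical size
    du -h demo                   # ~1M actually allocated

    # with count=0, dd copies nothing but truncates (or extends) the file
    # to the seek offset
    dd if=/dev/zero of=demo bs=1M count=0 seek=5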

Let's assume first that both files are exactly 5 GiB (5120 MiB) in size and that you want to move 1 MiB at a time. You execute a loop which consists of

  1. copying one block from the end of the source file to the end of the target file (increasing the consumed disk space)
  2. truncating the source file by one block (freeing disk space)

    for ((i=5119; i>=0; i--)); do
      # copy 1 MiB from offset i of the source to offset 5120+i of the target;
      # conv=notrunc keeps dd from truncating away blocks written in earlier iterations
      dd if=sourcefile of=targetfile bs=1M skip="$i" seek="$((5120+i))" count=1 conv=notrunc
      # with count=0, dd copies nothing and just truncates the source file to i MiB
      dd if=/dev/zero of=sourcefile bs=1M count=0 seek="$i"
    done
    

But give it a try with smaller test files first, please...
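
For example, a rehearsal on small scratch files (hypothetical names, 16 MiB each) could look like this:

    # create two small test files and record the expected result
    dd if=/dev/urandom of=src.test bs=1M count=16
    dd if=/dev/urandom of=tgt.test bs=1M count=16
    cat tgt.test src.test > expected.test

    # move src.test onto the end of tgt.test, 1 MiB at a time
    for ((i=15; i>=0; i--)); do
        dd if=src.test of=tgt.test bs=1M skip="$i" seek="$((16+i))" count=1 conv=notrunc
        dd if=/dev/zero of=src.test bs=1M count=0 seek="$i"
    done

    cmp expected.test tgt.test && echo OK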

Probably the files are neither all the same size nor exact multiples of the block size. In that case the calculation of the offsets becomes more complicated, and the GNU dd flags oflag=seek_bytes and iflag=skip_bytes (which make seek= and skip= count bytes instead of blocks) should be used.
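
As a rough sketch of how those GNU dd flags fit together, the fragment below uses stat(1) to get exact byte sizes and copies only the trailing partial block of the source to its final position; the full-block loop above would still have to be adapted around it:

    tgt_size=$(stat -c %s targetfile)      # bytes already in the target
    src_size=$(stat -c %s sourcefile)      # bytes still to be moved
    rem=$(( src_size % (1024 * 1024) ))    # size of the partial tail block

    if (( rem > 0 )); then
        # copy the partial tail block, with skip= and seek= counted in bytes
        dd if=sourcefile of=targetfile bs="$rem" count=1 conv=notrunc \
           iflag=skip_bytes skip=$(( src_size - rem )) \
           oflag=seek_bytes seek=$(( tgt_size + src_size - rem ))
        # shrink the source to the last full 1 MiB boundary (bs=1 makes seek count bytes)
        dd if=/dev/zero of=sourcefile bs=1 count=0 seek=$(( src_size - rem ))
    fi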

If this is the way you want to go but you need help with the details, then ask again.

Warning

Depending on the dd block size, the resulting file will be a fragmentation nightmare.
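
If you want to see how bad it got, filefrag (from e2fsprogs) lists the extents of the result on ext4, and e4defrag can try to tidy the file up afterwards, assuming enough contiguous free space is available again by then:

    filefrag -v file1    # show the extent map of the combined file
    e4defrag file1       # optionally try to defragment it after the fact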
