Edge cases in filesystem operations during backup and file copy

backup concurrency filesystems rsync

I have a question about file access in Linux that I can't quite summarize.

Say I back up my home directory using rsync, triggered via cron. My home directory is on an ext4 file system, and I'm logged in while rsync runs.

My question is what happens if a file is modified mid-backup (while rsync is reading it). AFAIK the ext family of file systems doesn't have any safeguards against this, so the backup copy of that file would be corrupted (or rendered meaningless).

Is my theory correct, or am I missing some locking mechanism that guarantees sound backups?

Best Answer

With plain rsync, files are read the same way any other application would read them. This can lead to copies with inconsistent data. The best way to prevent inconsistent copies is to back up from an LVM snapshot, which gives rsync a frozen point-in-time view of the file system, so changes made during the copy don't affect what is read.
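As a rough sketch of what a snapshot-based backup could look like, the Python below only builds the command sequence and prints it; it does not run anything. The volume group `vg0`, logical volume `home`, snapshot name `home_snap`, mount point `/mnt/home_snap`, destination `/backup/home/` and the 1G snapshot size are all made-up values for illustration. It assumes the home directory lives on an LVM logical volume, that the volume group has free space for the snapshot, that the mount point already exists, and that the commands would be run as root.

```python
import shlex


def snapshot_backup_plan(vg, lv, dest, snap_name="home_snap",
                         snap_size="1G", mount_point="/mnt/home_snap"):
    """Return the commands for a snapshot-based rsync backup, in order.

    All names and sizes are hypothetical; adjust them to the real setup.
    """
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snap_name}"
    return [
        # Create a point-in-time snapshot of the origin volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # Mount the snapshot read-only so the backup cannot alter it.
        ["mount", "-o", "ro", snap, mount_point],
        # Copy from the frozen view instead of the live file system.
        ["rsync", "-a", f"{mount_point}/", dest],
        # Clean up: unmount and drop the snapshot.
        ["umount", mount_point],
        ["lvremove", "-f", snap],
    ]


if __name__ == "__main__":
    for cmd in snapshot_backup_plan("vg0", "home", "/backup/home/"):
        print(shlex.join(cmd))
```

Note that a snapshot only freezes what is on disk at that instant. An application that was in the middle of writing a file when the snapshot was taken can still leave that file half-written in the snapshot.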

In my experience, getting inconsistent data is rare, likely due to the way the kernel buffers writes. Only very heavily written files, such as database files, tend to end up corrupted.
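If snapshots aren't an option, one way to at least notice that a file changed while it was being copied is to compare its size and modification time before and after the copy. This is a minimal sketch of that idea, not a description of what rsync does internally; `copy_if_stable` and its `retries` parameter are hypothetical names. It is only a heuristic: a write that leaves both the size and the recorded mtime unchanged would go unnoticed.

```python
import os
import shutil


def copy_if_stable(src, dst, retries=3):
    """Copy src to dst; return True if src looked unchanged during the copy.

    Heuristic only: compares size and mtime before and after the copy,
    and retries the copy if either one changed in between.
    """
    for _ in range(retries):
        before = os.stat(src)
        shutil.copy2(src, dst)
        after = os.stat(src)
        unchanged = (
            before.st_mtime_ns == after.st_mtime_ns
            and before.st_size == after.st_size
        )
        if unchanged:
            return True
    # The file kept changing; the copy at dst may be inconsistent.
    return False
```

A backup script could log or skip files for which this returns False and pick them up again on the next run.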
