tl;dr I would like to reserve (or "claim"?) some amount of disk space before an rsync occurs so other rsync instances will only run if the disk space needed will certainly be available.
background
A job (a shell script that runs rsync) will:
- use rsync to copy a large amount of data from a source disk to a different destination disk
- do some work using the copied data
- remove the copied data
Multiple instances of the job script may run simultaneously.
In my case, once in a while, many job scripts simultaneously rsync and use all available disk space. All of the rsync instances fail (and so the jobs fail).
pseudo-code
Here is the algorithm I'm imagining:
$job = get_next_incoming_job()
$disk_dst = $job.disk_dst() # destination disk for rsync
$space_need = $job.calculate_space_needed()
_check_space: # jump label
if $space_need > space_available($disk_dst) then
    sleep $RANDOM
    goto _check_space
$handle = reserve_space($disk_dst, $space_need) # How??
# rsync will "fill-in" the reserved space - How??
rsync $job.source_data_path() $disk_dst/$job.ID/
do work using $disk_dst/$job.ID/
remove $disk_dst/$job.ID/
release_reserved_space($handle) # How??
The magic function reserve_space would instantly change the $disk_dst reported free space (value returned by space_available). Other rsync job instances would see space_available() return less space right away (and thus, delay their work until later).
Currently, space_available() (via the actual program df) will return a declining number while rsync instances run. The problem is that multiple rsync instances can run out of space while running. I'd like the rsync instances to only run when it is certain they can complete (i.e. not run out of disk space while running).
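For reference, the current df-based check and the wait loop from the pseudo-code could look roughly like the sketch below. The function names space_available and wait_for_space are illustrative, sizes are assumed to be in KiB, and the 30-second default polling interval is an arbitrary choice. This only polls the reported free space; it does not reserve anything, so the race described above remains.

```shell
#!/bin/sh
# Sketch only: space_available and wait_for_space are illustrative names.

# Print the free space, in KiB, of the filesystem holding path $1.
# `df -P` forces the POSIX one-line-per-filesystem format and -k reports
# 1024-byte blocks; the fourth column is the available space.
# Assumption: the filesystem name in column 1 contains no whitespace.
space_available() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Block until at least $2 KiB are free on the filesystem holding $1.
# $3 is the polling interval in seconds (assumed default: 30).
wait_for_space() {
    while [ "$(space_available "$1")" -lt "$2" ]; do
        sleep "${3:-30}"
    done
}
```

A job would call something like wait_for_space "$disk_dst" "$space_need" before starting rsync.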
Best Answer
If you stick to filesystem-independent tools, I can't think of a way to do this other than actually allocating the disk space, i.e. reserve would need to create a (non-sparse!) file of the requested size, and you'd need to delete this file before starting rsync.
If the files are on an ext2/ext3/ext4 volume and using root access for some operations is acceptable, you can use its reserved space feature. The reserved space is normally for root, but you can make it available to a different user or to a different group instead. Run the rsync process as that user/group and adjust the reserved space with tune2fs -m before running rsync.
There's probably a more flexible solution with ZFS or Btrfs pools but I don't know how to do it.
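The filesystem-independent approach (a non-sparse placeholder file) might be sketched as follows. The names reserve_space and release_reserved_space are the hypothetical helpers from the question, the size is assumed to be in bytes, and the .reserve.XXXXXX file name is made up for illustration. The sketch assumes fallocate(1) from util-linux is available and falls back to writing zeros otherwise; on a filesystem with transparent compression, a file of zeros may not actually occupy the requested space.

```shell
#!/bin/sh
# Sketch only: reserve_space and release_reserved_space are the hypothetical
# helpers from the question, implemented here with a placeholder file.

# Create a fully allocated (non-sparse) placeholder of $2 bytes in
# directory $1 and print its path, which serves as the "handle".
reserve_space() {
    handle=$(mktemp "$1/.reserve.XXXXXX") || return 1
    # fallocate allocates real blocks without writing data; if it is
    # missing or the filesystem does not support it, write zeros instead.
    if ! fallocate -l "$2" "$handle" 2>/dev/null; then
        head -c "$2" /dev/zero > "$handle" || { rm -f "$handle"; return 1; }
    fi
    printf '%s\n' "$handle"
}

# Give the space back by deleting the placeholder file.
release_reserved_space() {
    rm -f -- "$1"
}
```

A job would run handle=$(reserve_space "$disk_dst" "$space_need"), then call release_reserved_space "$handle" immediately before rsync. Note that once the placeholder is deleted, df reports the space as free again until rsync has written the data.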