Tool that hooks to two directories and synchronizes them as soon as something changes

directory synchronization

I have huge simulation directories on two machines, a server and a cluster, which I want to keep synchronized. Files rarely change, but when a simulation finishes, many big files change at once, and I'd like them synchronized as soon as they are closed.

Therefore I assume a cronjob using rsync is not ideal: the cronjob because it calls rsync at a fixed time interval, which I'd probably choose either too large or too small; rsync because it shouldn't have to check files for modifications, since I and the simulation job are the only ones accessing them.

So my idea would be to use inotify (see this question) to detect changes in the simulation directory in a loop and then fork an rsync for the changed files. However, I'm not sure whether that might accidentally skip files closed just when the fork happens (and I might also end up in an infinite loop, with inotify triggering rsync to sync the just-synchronized file again…). So before I try too much there, I repeat my question:

Is there a tool that hooks to two directories and synchronizes them as soon as something changes?

(basically this would be something like an offline dropbox, I guess)
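For reference, the inotify loop described above could be sketched roughly as follows. This is only an illustration under several assumptions: inotify-tools (inotifywait) and rsync are installed, the sync is one-way, and the function names (watch, relative_paths, rsync_command, sync) and the example paths are made up for this sketch. Because inotifywait keeps running in monitor mode, files closed while rsync is busy should stay queued in the pipe rather than be skipped (within the limits of the kernel's inotify queue), and since only the source side is watched, rsync writing to the target does not retrigger the loop.

```python
import os
import subprocess


def relative_paths(events, root):
    """Turn absolute paths reported by the watcher into unique paths relative to root."""
    root = os.path.abspath(root)
    seen = []
    for path in events:
        rel = os.path.relpath(os.path.abspath(path), root)
        outside = rel == os.pardir or rel.startswith(os.pardir + os.sep)
        if outside or rel in seen:
            continue  # ignore paths outside root and duplicates
        seen.append(rel)
    return seen


def rsync_command(root, target):
    """Build an rsync call that reads the list of files to copy from stdin."""
    return ["rsync", "-a", "--files-from=-", root.rstrip("/") + "/", target]


def watch(root):
    """Yield the path of every file closed after writing below root (runs until killed)."""
    cmd = ["inotifywait", "-m", "-r", "-q", "-e", "close_write",
           "--format", "%w%f", root]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


def sync(paths, root, target):
    """Copy only the given files; rsync does not scan the rest of the tree."""
    rels = relative_paths(paths, root)
    if rels:
        subprocess.run(rsync_command(root, target),
                       input="\n".join(rels) + "\n", text=True, check=True)
```

A caller could then run `for path in watch("/data/sim"): sync([path], "/data/sim", "cluster:/data/sim/")`, where both locations are placeholders. Starting one rsync per file is wasteful when many files close at once, which is the problem that batching tools address.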

Best Answer

Check out lsyncd.

Lsyncd watches local directory trees through an event monitor interface (inotify). It aggregates and combines events for a few seconds and then spawns one (or more) process(es) to synchronize the changes. By default this is rsync. Lsyncd is thus a lightweight live mirror solution that is comparatively easy to install, does not require new filesystems or block devices, and does not hamper local filesystem performance.
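The "aggregate for a few seconds, then sync once" idea can be illustrated with a small sketch. This is not lsyncd's actual implementation; the class name EventBatcher, its methods, and the 5-second default are assumptions chosen for the example. Timestamps are passed in explicitly so the logic is easy to follow.

```python
class EventBatcher:
    """Collect changed paths and release them as one batch after a delay."""

    def __init__(self, delay=5.0):
        self.delay = delay      # seconds to wait after the first event of a batch
        self._paths = {}        # dict keeps insertion order and drops duplicates
        self._first = None      # timestamp of the first event in the current batch

    def add(self, path, now):
        """Record that path changed at time now."""
        if self._first is None:
            self._first = now
        self._paths[path] = None

    def flush_if_due(self, now):
        """Return the batch once the delay has passed, else an empty list."""
        if self._first is None or now - self._first < self.delay:
            return []
        batch = list(self._paths)
        self._paths.clear()
        self._first = None
        return batch


batcher = EventBatcher(delay=5.0)
batcher.add("run1/out.dat", now=0.0)
batcher.add("run1/log.txt", now=1.0)
batcher.add("run1/out.dat", now=2.0)    # repeated event, merged with the first
early = batcher.flush_if_due(now=4.0)   # still inside the delay window
batch = batcher.flush_if_due(now=5.0)   # delay elapsed, batch is released
```

Each released batch would then be handed to a single synchronization process, so a simulation that closes many files in quick succession causes one transfer instead of one per file.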

It's not two-way, but from your question I understood that you don't need that. If you do need two-way synchronization, Unison is a good answer, except that it has no inotify support. Also, check out this question.

A third option for two-way synchronization is DRBD, a block-level real-time synchronization system included in the mainline kernel. Unfortunately, as it is almost synchronous, it requires a fast internet connection.
