Windows – Are there any Pros/Cons to the /J Robocopy option (unbuffered copying)?

buffer, io, robocopy, windows

Robocopy has a /J command line option recommended for copying large files (it copies using unbuffered I/O).

What (if any) downsides are there?
Any reason this isn't enabled by default? (That's what made me think there MIGHT be downsides.)

Best Answer

Great question.

Unbuffered I/O is a simple file copy from a source location to a destination location that bypasses the filesystem cache. Buffered I/O augments the simple copy to optimize for future reads of (and writes to) the same file by also copying the file's data into the filesystem cache, which is a region of virtual memory. Buffered I/O incurs a performance penalty the first time the file is accessed because it has to copy the data into memory; however, because memory access is faster than disk access, subsequent access to the same file should be faster. The operating system takes care of synchronizing file writes back to disk, and reads can be served directly from memory.

The usage note mentions large files vis-à-vis buffered I/O because:

  1. The up-front cost is expensive. The performance penalty with buffered I/O is substantially worse for large files.
  2. You get little in return. Large file blocks don't tend to stay in the cache for very long anyway, unless you have a ton of memory relative to the file size.
  3. It may not avoid disk I/O. Reads and writes of large file data blocks increase the probability of requiring disk I/O.
  4. You probably don't need to buffer anyway. Large files tend to be less frequently accessed in practice than smaller files.

So there is a tradeoff, and which mode is appropriate depends on your particular case. If you are zipping up a bunch of files and transmitting the zip to a backup target, unbuffered is the way to go. Copying a bunch of files that were just changed? Buffered should be faster.
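As a rough illustration of that decision, here is a minimal Python sketch that adds /J to a Robocopy command line only when the file is above a size cutoff. The helper name `robocopy_args` and the 1 GiB threshold are hypothetical; nothing above gives a specific number, so treat the cutoff as something to tune against your own memory size and workload.

```python
# Hypothetical cutoff: no specific size is given above, so this is only a
# placeholder to tune against the amount of RAM on the machine.
LARGE_FILE_BYTES = 1 * 1024**3  # 1 GiB


def robocopy_args(src_dir, dst_dir, file_name, size_bytes,
                  threshold=LARGE_FILE_BYTES):
    """Build a Robocopy argument list, adding /J only for large files.

    Follows the documented form: robocopy <source> <destination> <file> [options]
    """
    args = ["robocopy", str(src_dir), str(dst_dir), file_name]
    if size_bytes >= threshold:
        # /J = copy using unbuffered I/O (bypasses the filesystem cache)
        args.append("/J")
    return args
```

For example, `robocopy_args(r"C:\src", r"D:\dst", "backup.zip", 5 * 1024**3)` ends with `"/J"`, while the same call with a 1 KiB size does not. The resulting list could be passed to `subprocess.run` on a Windows machine.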

Finally, note that file size is not the only factor in deciding between buffered and unbuffered. As with any cache, the filesystem cache is faster but smaller than the storage behind it. It requires a cache replacement strategy that governs when to evict items to make room for new items. It loses its benefit when frequently-accessed items get evicted. For example, if you are synchronizing user home directories intraday to a separate location (i.e., while users are actively using the files), buffered I/O would benefit from files already resident in the cache, but may temporarily pollute the cache with stale files; on the other hand, unbuffered would forgo any benefit of files already cached. There is no clear winner in such a case.
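The eviction effect can be seen in a toy model. The sketch below assumes a plain least-recently-used (LRU) cache, which is a simplification and not a description of how the Windows cache manager actually works; the class, function, and block counts are all made up for illustration. It warms the cache with a small "hot" working set, optionally streams a large copy through the cache, and then counts how many hot blocks are still resident.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache that tracks block identifiers only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.blocks[block] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used


def hot_hits_after_copy(buffered, cache_blocks=100, hot_blocks=50,
                        copy_blocks=1000):
    """Return how many hot blocks are still cached after a large copy."""
    cache = LRUCache(cache_blocks)
    hot = [("hot", i) for i in range(hot_blocks)]
    for block in hot:
        cache.access(block)  # warm the cache with the working set
    if buffered:
        # A buffered copy pulls every block of the large file through the cache.
        for i in range(copy_blocks):
            cache.access(("copy", i))
    # An unbuffered copy bypasses the cache, so nothing is touched here.
    cache.hits = cache.misses = 0
    for block in hot:
        cache.access(block)
    return cache.hits
```

With these default numbers, `hot_hits_after_copy(buffered=False)` returns 50 (the whole working set survives) and `hot_hits_after_copy(buffered=True)` returns 0 (the copy evicted all of it). Real caches are smarter than this model, so the actual effect is usually less extreme.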

Note: this also applies to xcopy /J.

See Microsoft's Ask The Performance Team Blog for more.
