MacOS – Extremely poor bulk file copy speeds on AFP shares in Yosemite

afp macos network osx-server

Some hardware background: I manage a 3D graphics lab of about a dozen iMac clients running 10.10.5 and a Mini running Server 10.10.2 (Server.app v4.0.3 / build 14S350). The Mini is in a Sonnet xMac enclosure, which connects it via Thunderbolt to an Areca ARC-1883X SAS RAID controller and a SmallTree P2E10G-1-T 10Gb Ethernet card. The Areca manages two 40TB SAS RAIDs and the SmallTree card connects the Mini via Cat6a to a NetGear ProSafe XS708E 10GbE switch. The iMacs are all wired over 1GbE Cat6 to an HP 1810-48G switch, which is in turn connected over a 6Gb trunk to the NetGear switch.

My artists have been running into an issue with bulk file copies between directories of the AFP share on the Mini that they work out of. They frequently render sequences of hundreds or thousands of images, and after these images get rendered to their output folder they then have to be copied to a second directory for our compositors to work with. The copy operation absolutely CRAWLS. One example, from a half hour ago: 861 .exr files, totaling about 350MB, took about 3 hours before we killed it at ~75% and instead did it from the server's desktop via screen sharing in about 30 seconds. (Our artists do this dozens of times a day and of course can't be given access to screen share with the server, so this is not a solution.) Copies don't always hang like that, but we run into such a case at least once every day, and all bulk copies go way slower than they should. This only happens with large groups of files: we can copy a single 300MB file between directories pretty much instantaneously.

I've done some tests, and this appears to be a Yosemite client issue more than anything. I run Mountain Lion on my own laptop, so I compared 10.8 and 10.10, on wifi and wired ethernet, and in both local and network profiles, since our artists sign into network accounts. Some limited results for 300 .exr files totaling 133MB:

10.8 / Wifi / Local profile: 300 items copy in 53 sec

10.8 / Wired / Local profile: 300 items copy in 47 sec

10.10 / Wired / Local profile: 300 items copy in 223 sec

10.10 / Wired / Network profile: 300 items copy in 263 sec

Network accounts are slightly slower, but the big egregious difference seems to be 10.8 client vs 10.10 client. Again, the problem is with long lists of files and not with single monolithic files. Our straight up ethernet speeds to the server are fantastic: In both 10.8 and 10.10 Blackmagic Speed Test I get 110MB/sec+ read and write to the server, and only slightly slower on Wireless N wifi. This only becomes an issue when we need to copy long lists of files, which we need to do many times a day.
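To put the four timings above on a common scale, here is a small sketch that converts each one into files per second and MB per second. The file count (300) and total size (133MB) are the figures quoted above; the function name `rates` and the labels are just shorthand for this illustration.

```shell
#!/bin/sh
# rates LABEL SECONDS
# Prints files/s and MB/s for one test run, assuming the same
# 300-file, 133 MB test set used for every timing quoted above.
rates() {
  awk -v label="$1" -v secs="$2" -v files=300 -v mb=133 'BEGIN {
    printf "%-24s %5.2f files/s  %5.2f MB/s\n", label, files / secs, mb / secs
  }'
}

rates "10.8 wifi local"      53
rates "10.8 wired local"     47
rates "10.10 wired local"    223
rates "10.10 wired network"  263
```

Even the best case works out to only a few MB per second, far below the 110MB/sec the same link sustains for a single large file, which suggests the cost is per file rather than per byte.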

ANY help to figure out what's going wrong here would be much appreciated! This is driving us absolutely insane at this point and is killing productivity. Happy to post any requested logs or attempt any suggested system tweaks. Thank you!

Best Answer

Here's how I would attack the issue. It's not an answer, but hopefully we can crowd source ideas until you can report success or at least a way to measure things.

  1. Set up a test case client with no third party apps running at login. Reboot that client and mount the network share. Run sudo sysdiagnose Finder before you start a copy.
  2. Start a tcp trace on the network adapter you will copy the files over. If you aren't connecting over en0, use System Information to find the BSD name of the network connection.
  3. Once the trace is started, start the copy of the file in question.
  4. After 3 minutes (or less if the transfer finishes sooner), press Control+C to end the capture.
  5. Run a second sudo sysdiagnose Finder after the network capture
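For a repeatable number to compare between runs and between clients, a small timing helper for the copy in step 3 might look like the sketch below. This is an assumption on my part rather than part of the procedure above: it copies with cp from Terminal, which may not take the same code path as a Finder copy, so treat its result as a comparison point next to the Finder timing and not a replacement for it. The function name `timed_copy` is made up for this example.

```shell
#!/bin/sh
# timed_copy SRC_DIR DST_DIR
# Copies every regular file in SRC_DIR into DST_DIR one at a time and
# reports how many files were copied and the elapsed whole seconds.
timed_copy() {
  src=$1
  dst=$2
  mkdir -p "$dst" || return 1
  start=$(date +%s)
  count=0
  for f in "$src"/*; do
    [ -f "$f" ] || continue        # skip subdirectories and empty globs
    cp "$f" "$dst"/ || return 1
    count=$((count + 1))
  done
  end=$(date +%s)
  echo "copied $count files in $((end - start)) s"
}
```

Usage would be something like `timed_copy /Volumes/Share/render_out /Volumes/Share/comp_in`, where both paths are placeholders for two directories on the mounted AFP share. If cp is fast while Finder is slow on the same client, that points at Finder's handling of the copy rather than at the network or the server.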

With a transfer this slow, something is seriously amiss in the network stack, but without looking at the client logs it's going to be hard to know for sure what's halting the operation. You might also run a sysdiagnose on the server at about the same time as you do on the client, to rule out a slow server as the cause. It seems you have plenty of horsepower for the storage to move data rapidly, but getting server side logs will help too:

sudo sysdiagnose
sudo /Applications/Server.app/Contents/ServerRoot/usr/sbin/serverdiagnose

The trace is:

sudo tcpdump -i en0 -s 0 -B 524288 -w ~/Desktop/AFPslow.pcap
