I'm working on an NFS solution for RHEL6.5 clients (all VMs) with RHEL6.5 and RHEL7 hosts. Currently, the RHEL7 host with RHEL6.5 clients works fine. The trouble is with the RHEL6.5 host.
These problems might be down to aspects of the server I can't control, as the server has been having issues lately that it didn't last year. If you think that's the issue, please suggest ways I can prove this to my superiors, and begin the process of getting a new machine.
The solution was initially being crafted to use NFSv4, which was going swell. The RHEL6.5 host, however, is not as cooperative as the RHEL7 host. Mounts succeed, but file access (e.g. cp, less) does not work: in the terminal, the commands hang. Tailing the client's /var/log/messages shows:
state manager: lease expired failed on NFSv4 server nfs_master with error 10018
Per the standard, that error code is NFS4ERR_RESOURCE, documented here. I tried to resolve the resource issue by increasing the number of nfsd processes, both via the command line and by setting the appropriate config in /etc/sysconfig/nfs. It didn't help. This issue also occurs if the exported directory is mounted on the NFS server itself.
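To confirm that a change to the nfsd thread count actually took effect on the server, the running count can be read back from /proc/fs/nfsd/threads. The sketch below is illustrative and not necessarily the exact procedure used here: the helper name nfsd_thread_count and the value 16 are hypothetical, and the optional path argument exists only so the function can be exercised against a sample file.

```shell
# Print the number of nfsd threads the kernel is currently running.
# The optional argument overrides the proc path (hypothetical, for testing).
nfsd_thread_count() {
    threads_file="${1:-/proc/fs/nfsd/threads}"
    if [ ! -r "$threads_file" ]; then
        echo "cannot read $threads_file (is nfsd running?)" >&2
        return 1
    fi
    # The file holds a single integer; strip surrounding whitespace.
    tr -d '[:space:]' < "$threads_file"
    echo
}

# Persistent setting on RHEL 6, read by the nfs init script
# (16 is an example value, not a recommendation):
#   /etc/sysconfig/nfs:  RPCNFSDCOUNT=16
# One-off change on a running server (as root):
#   rpc.nfsd 16
```

If the printed count still shows the old value after a restart of the nfs service, the setting was not picked up, which is worth ruling out before blaming the hardware.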
Not shown in the logs of either the host or the client is another error, 10022, which I assume is an NFSv4 error code. It is only visible when running tcpdump on the interface that carries the NFS traffic:
IP test-host.nfs > test_client-1.3297002672: reply ok 52 getattr ERROR: unk 10022
If this error code is indeed an NFSv4 one, then it is NFS4ERR_STALE_CLIENTID, documented here.
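Since tcpdump only prints the raw number, a small lookup can make captures easier to read. This is a minimal sketch: nfs4_errname is a hypothetical helper, and it lists only a handful of status codes from the NFSv4.0 specification, including the two seen in this question.

```shell
# Map an NFSv4 status code to its symbolic name.
# Deliberately partial: only a few codes from the NFSv4.0 spec are listed.
nfs4_errname() {
    case "$1" in
        10008) echo NFS4ERR_DELAY ;;
        10011) echo NFS4ERR_EXPIRED ;;
        10013) echo NFS4ERR_GRACE ;;
        10018) echo NFS4ERR_RESOURCE ;;
        10022) echo NFS4ERR_STALE_CLIENTID ;;
        10023) echo NFS4ERR_STALE_STATEID ;;
        *)     echo "unknown ($1)"; return 1 ;;
    esac
}
```

For example, nfs4_errname 10022 prints NFS4ERR_STALE_CLIENTID, matching the code that tcpdump reported as unk 10022.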
When the mount command is changed to set nfsvers=3, actions like cp succeed and generate no errors on either the client or the host. The first attempt takes a little longer, maybe 5 seconds; subsequent actions are much faster.
At any one time, at most four clients will be mounting the export and reading from it, potentially from the same file.
So, my questions are:
- What are the server-side resources being referred to by the NFS4ERR_RESOURCE description?
- How do I resolve NFS4ERR_RESOURCE and NFS4ERR_STALE_CLIENTID errors?
- Why is NFSv3 functioning as expected, but not NFSv4?
nfs-utils version and release (for both clients and RHEL6.5 host): 1.2.3.39.el6
mount commands:
mount -n -t nfs -o ro,noexec,timeo=10,retrans=3,retry=0,soft,rsize=32768,intr,noatime
mount -n -t nfs -o nfsvers=3,ro,noexec,timeo=10,retrans=3,retry=0,soft,rsize=32768,intr,noatime
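Because the first command does not pin a protocol version, it can help to check what each mount actually ended up with. A minimal sketch, assuming a hypothetical mount point such as /mnt/export: the helper nfs_mount_info (a made-up name) prints the filesystem type and the option string recorded in /proc/mounts. The second argument is only there so the function can be run against a sample file.

```shell
# Print "<fstype> <options>" for an NFS mount at the given mount point.
# Returns non-zero if no NFS mount is found there.
nfs_mount_info() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk -v mp="$mountpoint" '
        $2 == mp && $3 ~ /^nfs4?$/ { print $3, $4; found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}
```

Running nfs_mount_info /mnt/export on a client after each of the two mount commands shows whether the type and options differ between them.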
EDIT:
Our resolution for this issue was to fall back to the NFSv3 protocol. Everything works just fine. I won't answer this question with a "just fall back to NFSv3", but this issue is probably too niche to ever see an answer.
Best Answer
Try the following options as a test:
-fstype=nfs4,rw,intr,hard,proto=tcp,port=2049,acl
Also make sure 2049/tcp on the server is open to the client. If there's a firewall in the way, it needs to pass 2049/tcp as well.
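One quick way to test reachability of 2049/tcp from a client is sketched below. It assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are available. tcp_port_open is a hypothetical helper name, and the 3-second limit is an arbitrary choice.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection; host and port are passed as
# positional arguments so they are not interpolated into the script text.
tcp_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

From a client, tcp_port_open nfs_master 2049 && echo open || echo blocked reports whether the NFS port can be reached. A "blocked" result points at a firewall or a server that is not listening, rather than at NFSv4 state handling.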