I am running ESX 3.5 U4.
My NFS server is running on Solaris 10, and has been working fine for about 1 year now.
Now the ESX host is dropping its connection to the server intermittently: it stays down for about 30-60 seconds, reconnects, then drops again later.
Updating to the latest critical patches has not helped.
There is nothing in the logs on the server that points to a cause, as far as I can see.
When it's disconnected, I can ping the host just fine using vmkping.
Any ideas?
-Ron
Verify the number of NFS server threads (NFSD_SERVERS) in /etc/default/nfs on the Solaris host.
Andrea
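For anyone following along, here is a minimal sketch of that check. It assumes the setting in question is NFSD_SERVERS and that the file uses plain KEY=value lines with # comments; the function name read_nfsd_servers is only illustrative.

```python
def read_nfsd_servers(text):
    """Return the NFSD_SERVERS value found in /etc/default/nfs contents.

    Returns None when the setting is absent or commented out.
    """
    for raw in text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a KEY=value assignment.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "NFSD_SERVERS":
            return int(value.strip())
    return None
```

On the NFS server you could feed it the file contents, e.g. read_nfsd_servers(open("/etc/default/nfs").read()), and compare the result with the number of ESX hosts mounting the export.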
So you are not seeing anything in the vmkernel logs? vmkwarning?
Regards...
Jamie
If you found this information useful, please consider awarding points for "Correct" or "Helpful".
Remember, if it's not one thing, it's your mother...
I increased the number of NFS processes on the server to 32 (the number of hosts) and things seem to be more stable.
Below is the output in vmkwarning, vmkernel is similar but nothing else new.
May 4 20:49:50 esx-server vmkernel: 73:09:09:27.765 cpu3:1028)WARNING: NFS: 257: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) not responding
May 4 21:11:49 esx-server vmkernel: 73:09:31:27.580 cpu0:3257)WARNING: NFS: 281: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) OK
May 4 21:12:19 esx-server vmkernel: 73:09:31:57.251 cpu2:1028)WARNING: NFS: 257: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) not responding
May 4 21:12:20 esx-server vmkernel: 73:09:31:58.251 cpu0:1037)WARNING: NFS: 1736: Failed to get attributes (I/O error)
May 4 21:12:20 esx-server vmkernel: 73:09:31:58.251 cpu3:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)
May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu2:1036)WARNING: NFS: 1736: Failed to get attributes (No connection)
May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu2:1036)WARNING: NFS: 1736: Failed to get attributes (No connection)
May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu0:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)
May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu0:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)
May 5 11:05:13 esx-server vmkernel: 73:23:24:52.308 cpu0:3231)WARNING: NFS: 281: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) OK
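To see how long each drop lasted, the "not responding" and "OK" warnings can be paired up. This is only a rough sketch: it assumes the log lines look exactly like the ones above, and because syslog timestamps carry no year, the year argument is a guess supplied by the caller. The name parse_outages is made up for this example.

```python
import re
from datetime import datetime

# Matches the mount-state warnings (NFS: 257 / NFS: 281 style lines) shown above.
STATE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*"
    r"NFS: \d+: Mount: \((?P<mount>[^)]+)\).*\s(?P<state>not responding|OK)\s*$"
)

def parse_outages(lines, year=2009):
    """Pair each 'not responding' warning with the next 'OK' for the same mount.

    Returns a list of (mount, went_down, came_back, seconds_down) tuples.
    The year default is an assumption; syslog lines do not include one.
    """
    down_since = {}
    outages = []
    for line in lines:
        m = STATE_RE.match(line)
        if not m:
            continue  # e.g. the "Failed to get attributes" lines
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        mount = m.group("mount")
        if m.group("state") == "not responding":
            # Keep the earliest timestamp if the warning repeats before recovery.
            down_since.setdefault(mount, ts)
        elif mount in down_since:
            start = down_since.pop(mount)
            outages.append((mount, start, ts, int((ts - start).total_seconds())))
    return outages
```

Running it over the vmkwarning file, e.g. parse_outages(open("/var/log/vmkwarning")), gives one tuple per outage. The durations are only as reliable as the log itself; if lines were trimmed or rotated away, a pair may span more than one real outage.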
I get the NFS:257 and NFS:281 messages in my vmkernel logs as well. Mine occur during a backup to an NFS mount. It looks like you may be having some contention. How many ESX hosts do you have connected to this NFS mount?
Regards...
Jamie
32 hosts...
There are two 16 node clusters and only ISOs are stored on this NFS mount.
Just storing ISOs? There shouldn't be a lot of contention for resources there. Hopefully increasing the number of NFS processes will help you out.
Regards...
Jamie
Verify the number of NFS servers in /etc/default/nfs
Andrea
I set it to 32, the number of ESX hosts using it. Is that good? More? Less?
Have not had any issues since then.
Thanks!
Set it to a greater value.
Andrea