VMware Cloud Community
rovalent
Contributor

NFS Disconnecting and Reconnecting Continuously

I am running ESX 3.5 U4.

My NFS server runs on Solaris 10 and has been working fine for about a year.

Now the ESX host intermittently drops its connection to the server for about 30-60 seconds, reconnects, then drops again later.

Updating to the latest critical patches has not helped.

I can't see anything in the server logs that points to a cause.

While it's disconnected, I can still ping the NFS server just fine using vmkping.

Any ideas?

-Ron


Accepted Solutions
AndreTheGiant
Immortal

Verify the number of NFS server threads configured in /etc/default/nfs on the Solaris host (the NFSD_SERVERS setting).
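
For anyone who wants to script that check, here is a minimal sketch in Python. It assumes the value in question is the NFSD_SERVERS line in the Solaris 10 /etc/default/nfs file and that the file uses plain NAME=value lines with # for comments; the function name is made up for illustration.

```python
import re
from typing import Optional


def nfsd_servers(config_text: str) -> Optional[int]:
    """Return the active NFSD_SERVERS value, or None if it is not set.

    Assumes NAME=value lines; lines starting with '#' are comments and
    never match. If the setting appears more than once, the last wins.
    """
    value = None
    for line in config_text.splitlines():
        match = re.match(r"\s*NFSD_SERVERS\s*=\s*(\d+)\s*$", line)
        if match:
            value = int(match.group(1))
    return value
```

On the NFS server you could call it as nfsd_servers(open("/etc/default/nfs").read()) and compare the result with the number of ESX hosts that mount the export.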

Andrea

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro

8 Replies
jamieorth
Expert

So you are not seeing anything in the vmkernel log, or in vmkwarning?

Regards...

Jamie

If you found this information useful, please consider awarding points for "Correct" or "Helpful".

Remember, if it's not one thing, it's your mother...

rovalent
Contributor

I increased the number of NFS server processes to 32 (the number of hosts), and things seem to be more stable.

Below is the output from vmkwarning. The vmkernel log is similar and shows nothing else new.

May 4 20:49:50 esx-server vmkernel: 73:09:09:27.765 cpu3:1028)WARNING: NFS: 257: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) not responding

May 4 21:11:49 esx-server vmkernel: 73:09:31:27.580 cpu0:3257)WARNING: NFS: 281: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) OK

May 4 21:12:19 esx-server vmkernel: 73:09:31:57.251 cpu2:1028)WARNING: NFS: 257: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) not responding

May 4 21:12:20 esx-server vmkernel: 73:09:31:58.251 cpu0:1037)WARNING: NFS: 1736: Failed to get attributes (I/O error)

May 4 21:12:20 esx-server vmkernel: 73:09:31:58.251 cpu3:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)

May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu2:1036)WARNING: NFS: 1736: Failed to get attributes (No connection)

May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu2:1036)WARNING: NFS: 1736: Failed to get attributes (No connection)

May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu0:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)

May 5 09:11:44 esx-server vmkernel: 73:21:31:23.617 cpu0:1035)WARNING: NFS: 1736: Failed to get attributes (No connection)

May 5 11:05:13 esx-server vmkernel: 73:23:24:52.308 cpu0:3231)WARNING: NFS: 281: Mount: (VM_Pilot) Server (IP_ADDRESS) SERVER.FQDN Volume: (/export/home/VMWare_Pilot) OK
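
To see how long each outage actually lasted, the "not responding" and "OK" lines can be paired up. The following is a rough Python sketch, not a VMware tool. It assumes the second timestamp on each line (for example 73:09:09:27.765) is the vmkernel uptime in days:hours:minutes:seconds, and the helper names are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Matches the NFS mount state lines shown above. Assumption: the field
# after "vmkernel:" is uptime formatted as days:hours:minutes:seconds.
LINE_RE = re.compile(
    r"vmkernel: (?P<d>\d+):(?P<h>\d+):(?P<m>\d+):(?P<s>\d+\.\d+)\s.*?"
    r"NFS: \d+: Mount: \((?P<mount>[^)]*)\).*\b(?P<state>not responding|OK)\s*$"
)


def uptime_seconds(d: str, h: str, m: str, s: str) -> float:
    """Convert a days:hours:minutes:seconds uptime stamp to seconds."""
    return ((int(d) * 24 + int(h)) * 60 + int(m)) * 60 + float(s)


def outages(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Return (mount, seconds) for each 'not responding' -> 'OK' pair."""
    down = {}  # mount name -> uptime at which it stopped responding
    result = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # e.g. the "Failed to get attributes" lines
        t = uptime_seconds(match["d"], match["h"], match["m"], match["s"])
        mount = match["mount"]
        if match["state"] == "not responding":
            down.setdefault(mount, t)  # keep the first report of an outage
        elif mount in down:
            result.append((mount, round(t - down.pop(mount), 3)))
    return result
```

Feeding it the lines of /var/log/vmkwarning, for example outages(open("/var/log/vmkwarning")), gives one entry per outage for each mount.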

Ron
jamieorth
Expert

I get the NFS: 257 and NFS: 281 messages in my vmkernel logs as well. Mine occur during a backup to an NFS mount. It looks like you may be having some contention. How many ESX hosts are connected to this NFS mount?

Regards...

Jamie

rovalent
Contributor

32 hosts...

There are two 16-node clusters, and only ISOs are stored on this NFS mount.

Ron
jamieorth
Expert

Just storing ISOs? There shouldn't be a lot of contention for resources there. Hopefully increasing the number of NFS processes will help you out.

Regards...

Jamie


rovalent
Contributor

I set it to 32, the number of ESX hosts using it. Is that enough, or should it be higher or lower?

Have not had any issues since then.

Thanks!

Ron
AndreTheGiant
Immortal

Set it to a greater value.

Andrea
