unable to mount NFS share on ESXi 5.1

tplane · ‎09-05-2014

I'm trying to mount an NFS share from a CentOS 6.5 box on an ESXi 5.1 host. No matter what changes I make to solve the issue, I keep getting the same error:

The interesting thing is that it doesn't matter what change I make to the NFS server or the ESXi host: I have gotten the same error from the start and it has never changed, which makes me think it's the ESXi host. I can't seem to generate any new or different errors.

Note: NFS share works fine when mounted to another CentOS server (192.168.31.20)

cfg info:

NFS server = 192.168.31.25

ESXi server = 192.168.31.15

ESXi>Security Profile > Firewall

VMkernel Port:

# cat /etc/exports

/vmdata/ 192.168.31.30(rw,sync,no_root_squash,all_squash) 192.168.31.15(rw,sync,no_root_squash,all_squash) 192.168.31.20(rw,sync,no_root_squash,all_squash)

troubleshooting info:

vmkping -c 4 192.168.31.25

PING 192.168.31.25 (192.168.31.25): 56 data bytes

64 bytes from 192.168.31.25: icmp_seq=0 ttl=64 time=1.089 ms

64 bytes from 192.168.31.25: icmp_seq=1 ttl=64 time=0.355 ms

64 bytes from 192.168.31.25: icmp_seq=2 ttl=64 time=0.305 ms

64 bytes from 192.168.31.25: icmp_seq=3 ttl=64 time=0.361 ms

~ # nc -z 192.168.31.25 111

Connection to 192.168.31.25 111 port [tcp/sunrpc] succeeded!

~ # nc -z 192.168.31.25 2049

Connection to 192.168.31.25 2049 port [tcp/nfs] succeeded!

192.168.31.25 - # tcpdump -v -i eth0 src 192.168.31.15 (0 traffic when src 192.168.31.30) Note: same traffic on every attempt no matter the cfg change.

08:31:31.707986 IP (tos 0x0, ttl 64, id 50937, offset 0, flags [DF], proto TCP (6), length 60)

vmhost1.gogetit.tplane.com.56635 > centsvr.gogetit.tplane.com.nfs: Flags [S], cksum 0xdfb1 (correct), seq 191267594, win 65535, options [mss 1460,nop,wscale 9,sackOK,TS val 76361881 ecr 0], length 0

08:31:31.708237 IP (tos 0x0, ttl 64, id 50938, offset 0, flags [DF], proto TCP (6), length 52)

vmhost1.gogetit.tplane.com.56635 > centsvr.gogetit.tplane.com.nfs: Flags [.], cksum 0xddc8 (correct), ack 2523606912, win 130, options [nop,nop,TS val 76361882 ecr 762723522], length 0

08:31:31.708994 IP (tos 0x0, ttl 64, id 50939, offset 0, flags [DF], proto TCP (6), length 52)

vmhost1.gogetit.tplane.com.56635 > centsvr.gogetit.tplane.com.nfs: Flags [F.], cksum 0xddc7 (correct), seq 0, ack 1, win 130, options [nop,nop,TS val 76361882 ecr 762723522], length 0

08:31:31.709738 IP (tos 0x0, ttl 64, id 50941, offset 0, flags [DF], proto TCP (6), length 52)

vmhost1.gogetit.tplane.com.56635 > centsvr.gogetit.tplane.com.nfs: Flags [.], cksum 0xddc5 (correct), ack 2, win 130, options [nop,nop,TS val 76361882 ecr 762723523], length 0

output from /var/log/vmkernel.log

2014-09-05T14:40:54.771Z cpu6:9042)NFS: 157: Command: (mount) Server: (192.168.31.25) IP: (192.168.31.25) Path: (/vmdata) Label: (NFS01) Options: (None)

2014-09-05T14:40:54.771Z cpu6:9042)StorageApdHandler: 692: APD Handle 4a9fd433-9baa8219 Created with lock[StorageApd0x41001a]

2014-09-05T14:41:25.707Z cpu2:9042)StorageApdHandler: 739: Freeing APD Handle [4a9fd433-9baa8219]

2014-09-05T14:41:25.707Z cpu2:9042)StorageApdHandler: 802: APD Handle freed!

2014-09-05T14:41:25.707Z cpu2:9042)NFS: 168: NFS mount 192.168.31.25:/vmdata failed: Unable to connect to NFS server.

As you can see from the log and tcpdump, I have very little to go on. I have been trying to generate a different or new error but can't.

Any help would be appreciated!

Thank you.

tplane · ‎09-05-2014

Update: just making sure I can ping from both VMkernel ports. Both vmkpings were successful.

vmk0 = 192.168.31.15

vmk1 = 192.168.31.30

~ # vmkping -I vmk0 192.168.31.25

PING 192.168.31.25 (192.168.31.25): 56 data bytes

64 bytes from 192.168.31.25: icmp_seq=0 ttl=64 time=1.569 ms

64 bytes from 192.168.31.25: icmp_seq=1 ttl=64 time=0.359 ms

64 bytes from 192.168.31.25: icmp_seq=2 ttl=64 time=0.300 ms

# tcpdump -v -i eth0 src 192.168.31.15

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

10:55:54.699794 IP (tos 0x0, ttl 64, id 60878, offset 0, flags [none], proto ICMP (1), length 84)

vmhost1.gogetit.tplane.com > centsvr.gogetit.tplane.com: ICMP echo request, id 55227, seq 0, length 64

10:55:54.701049 ARP, Ethernet (len 6), IPv4 (len 4), Reply vmhost1.gogetit.tplane.com is-at 00:1e:0b:47:fc:d2 (oui Unknown), length 46

10:55:55.701855 IP (tos 0x0, ttl 64, id 60881, offset 0, flags [none], proto ICMP (1), length 84)

vmhost1.gogetit.tplane.com > centsvr.gogetit.tplane.com: ICMP echo request, id 55227, seq 1, length 64

10:55:56.703901 IP (tos 0x0, ttl 64, id 60883, offset 0, flags [none], proto ICMP (1), length 84)

vmhost1.gogetit.tplane.com > centsvr.gogetit.tplane.com: ICMP echo request, id 55227, seq 2, length 64

~ # vmkping -I vmk1 192.168.31.25

PING 192.168.31.25 (192.168.31.25): 56 data bytes

64 bytes from 192.168.31.25: icmp_seq=0 ttl=64 time=0.780 ms

64 bytes from 192.168.31.25: icmp_seq=1 ttl=64 time=0.388 ms

64 bytes from 192.168.31.25: icmp_seq=2 ttl=64 time=0.444 ms

# tcpdump -v -i eth0 src 192.168.31.30

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

10:54:58.417596 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has centsvr.gogetit.tplane.com tell 192.168.31.30, length 46

10:54:58.417848 IP (tos 0x0, ttl 64, id 60789, offset 0, flags [none], proto ICMP (1), length 84)

192.168.31.30 > centsvr.gogetit.tplane.com: ICMP echo request, id 47547, seq 0, length 64

10:54:59.419053 IP (tos 0x0, ttl 64, id 60791, offset 0, flags [none], proto ICMP (1), length 84)

192.168.31.30 > centsvr.gogetit.tplane.com: ICMP echo request, id 47547, seq 1, length 64

10:55:00.420371 IP (tos 0x0, ttl 64, id 60793, offset 0, flags [none], proto ICMP (1), length 84)

192.168.31.30 > centsvr.gogetit.tplane.com: ICMP echo request, id 47547, seq 2, length 64

10:55:03.417984 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.31.30 is-at 00:50:56:6d:2a:e5 (oui Unknown), length 46

JPM300 · ‎09-05-2014

It looks like you have covered most of this, but figured I would post it anyways:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100735...

There was something else you had to do with Linux NFS shares. I forget what it was, but I resolved it for someone else on the forums before, so let me see if I can dig it up:

http://emcsan.wordpress.com/2011/09/01/vmware-cant-write-to-a-celerra-rw-nfs-mounted-datastore/

tplane · ‎09-05-2014

I have run through KB 1007352 and feel that I have configured all of that. I have also been running through this KB: "Troubleshooting connectivity issues to an NFS datastore on ESX and ESXi hosts".

Still no new error.

I will take a look at the NFS link and see if I missed anything.

Thanks for the help.

Any more ideas?

JPM300 · ‎09-05-2014

It seems to me that it's a permission issue or an ACL somewhere. I had a similar issue with a Windows Server 2008 R2 box that got upgraded from Server 2008 R2 Standard to Server 2008 R2 Enterprise. Once we upgraded, the key took, but it didn't really take, as we set it in the GUI and rebooted. We were just UNABLE to connect the NFS share in VMware again. Like you, we kept getting the standard "unable to connect to NFS share" error. We found an article stating that the Windows GUI doesn't always SET the key properly, so we re-set it through the command-line tool and BOOM, the NFS share instantly connected. I also helped troubleshoot some Linux issues, and it was always a permission/ACL issue in the end. Keep your focus there and see what you can dig up. With that said, it does look like you have everything set up properly.

If there is a way to open up the share to anonymous access (an "Everyone: Full Permission" type of deal), try that on CentOS and see if it makes a difference.

Let us know,

tplane · ‎09-05-2014

I have reviewed the previously mentioned NFS link and I have already done everything mentioned in the article, i.e.:

# cat /etc/exports

/vmdata/ 192.168.31.30(rw,sync,no_root_squash,all_squash) 192.168.31.15(rw,sync,no_root_squash,all_squash) 192.168.31.20(rw,sync,no_root_squash,all_squash)

My exports file covers the VMkernel IPs, root access (no_root_squash) and anonymous access (all_squash).

I also gave everyone rw access to the share dir (/vmdata) with chmod 1777 for testing. No joy.

I can also mount and rw to the share just fine from another CentOS as mentioned.

Thanks JPM300.

JPM300 · ‎09-05-2014

Hmmm, that is strange. Just wondering: what version of NFS are you using on CentOS? ESXi 5.1 can only do NFS 3. Is it possible your version of CentOS is using NFS 4?

tplane · ‎09-05-2014

I believe I'm using NFS4 but haven't gotten an error saying this is an issue, although I knew ESXi only supported NFS3. Would NFS4 be backward compatible enough to support ESXi, or do I need to uninstall NFS4 and install NFS3?
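On the version question, a Linux NFS server normally serves several protocol versions side by side, so a downgrade is often unnecessary. The following is a minimal sketch, not something taken from the thread, of how one might check on the CentOS box whether NFSv3 is enabled. It assumes the kernel exposes /proc/fs/nfsd/versions while nfsd is running; the function name nfsd_serves_v3 and the /mnt mount point in the comment are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether an nfsd "versions" string enables NFSv3.
# On a Linux NFS server the string is read from /proc/fs/nfsd/versions
# and looks like "+2 +3 +4 -4.1" ("+" = enabled, "-" = disabled).
nfsd_serves_v3() {
    # $1: contents of /proc/fs/nfsd/versions
    case " $1 " in
        *" +3 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Only query the live server when the proc file is present (nfsd running).
if [ -r /proc/fs/nfsd/versions ]; then
    if nfsd_serves_v3 "$(cat /proc/fs/nfsd/versions)"; then
        echo "NFSv3 is enabled"
    else
        echo "NFSv3 is disabled"
    fi
fi

# To mimic ESXi from a Linux client, force version 3 on the test mount
# (the /mnt mount point is an assumption):
#   mount -t nfs -o vers=3 192.168.31.25:/vmdata /mnt
```

A Linux client that negotiates NFSv4 by default may mount successfully even when the NFSv3 helper services are unreachable, so a forced v3 test mount is probably the closer comparison to what ESXi 5.1 attempts.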

tplane · ‎09-05-2014

I used this article to downgrade NFS 4 to 3 but with no joy.

https://www.ibm.com/developerworks/community/blogs/nfrsblog/entry/cc_steps_for_downgrading_the_nfs_v...

I still get the same old error.

I have seen other postings where using the wrong version of NFS generates different errors, but I don't get those errors.

I get the same old error.

more help please...

jlanders · ‎09-05-2014

I'm not quite sure why the VMkernel-NFS portgroup was created since it is on the same network as the management network and uses the same uplink. This is why the management IP sees traffic instead of VMkernel-NFS portgroup.

Anyway, you may want to check the CentOS NFS server and ensure that the rpc.mountd (892) is open on the iptables firewall.

Generally, opening ports 111 (SunRPC - TCP & UDP), 662 (statd - TCP & UDP), 875 (rquotad TCP & UDP), 892 (mountd TCP & UDP), 2049 (nfsd TCP), 32803 (lockd - TCP), 32769 (lockd - UDP) in iptables on CentOS offers the best chance for success.
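The port list above could be turned into iptables rules along the following lines. This is a hedged sketch rather than a tested firewall policy: it only prints the commands so they can be reviewed first, and restricting the rules to the ESXi address with -s is an assumption that goes beyond what the post says. The function name print_nfs_rules is hypothetical.

```shell
#!/bin/sh
# Sketch: print (not run) iptables rules for the NFS-related ports listed above.
# ESXI_IP is the ESXi management address from this thread; limiting the
# rules to that source address is an assumption.
ESXI_IP="192.168.31.15"

print_nfs_rules() {
    # Ports opened for both TCP and UDP: portmapper, statd, rquotad, mountd
    for port in 111 662 875 892; do
        for proto in tcp udp; do
            echo "iptables -I INPUT -s $ESXI_IP -p $proto --dport $port -j ACCEPT"
        done
    done
    # nfsd and lockd, with the protocols given in the list
    echo "iptables -I INPUT -s $ESXI_IP -p tcp --dport 2049 -j ACCEPT"
    echo "iptables -I INPUT -s $ESXI_IP -p tcp --dport 32803 -j ACCEPT"
    echo "iptables -I INPUT -s $ESXI_IP -p udp --dport 32769 -j ACCEPT"
}

print_nfs_rules
```

After reviewing the output, it can be piped to sh as root on the NFS server and persisted with service iptables save. Opening a port in the firewall does not by itself make a daemon listen on it, so the rules only help once the services are actually bound to these ports.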

tplane · ‎09-05-2014

Thanks I will review iptables.

I created the VMkernel port for NFS traffic because some postings recommended it for similar situations, but I also read that it creates its own problems when multiple ports sit on the same subnet. So I'll review that again.

firewall rules have me thinking so I'll work on that first.

Thanks.

tplane · ‎09-05-2014

no joy!

I updated iptables and added 662, 875, 892, 32803 and 32769. However, netstat shows that most of the ports aren't listening; only 111, 2049 and 875 are. I used nc -z 192.168.31.25 port# to verify that I can reach the ports from ESXi to the NFS server.

I also removed the 2nd VMkernel port, so now I only have the management port with an IP of 192.168.31.15.

and... still got the same old error, and log entry, and tcpdump.

frustrating....

Thank you.

any more thoughts please.
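The netstat observation above (only some of the expected ports listening) can be checked mechanically on the NFS server. The sketch below is illustrative and assumes the usual netstat -tln column layout, with the local address in the fourth field; the function name missing_tcp_ports and the EXPECTED_TCP list are hypothetical names built from the TCP ports mentioned in this thread.

```shell
#!/bin/sh
# Sketch: list which expected NFS TCP ports are absent from `netstat -tln` output.
EXPECTED_TCP="111 662 875 892 2049 32803"

missing_tcp_ports() {
    # stdin: output of `netstat -tln`; field 4 is the local address (addr:port).
    # Splitting on ":" and taking the last piece also handles IPv6 (":::111").
    listening=$(awk '$1 ~ /^tcp/ { n = split($4, a, ":"); print a[n] }')
    for port in $EXPECTED_TCP; do
        echo "$listening" | grep -qx "$port" || echo "$port"
    done
}

# Run against the live system only if netstat is available.
if command -v netstat >/dev/null 2>&1; then
    netstat -tln | missing_tcp_ports
fi
```

Any port printed is one nothing is bound to. When rpc.mountd is not pinned to a fixed port it usually registers on a dynamically chosen one, which rpcinfo -p on the server would show, and that port is unlikely to match the firewall rules.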

jlanders · ‎09-05-2014

875 is rpc.rquotad, not rpc.mountd; rpc.mountd (892) is the one you'll need in this setup.

Have you removed the comment (i.e. #) characters in front of all of the _PORT= lines in /etc/sysconfig/nfs on your CentOS NFS server?

On your CentOS NFS server, what does the 'service nfs restart' command show?

tplane · ‎09-06-2014

output:

# service nfs restart

Shutting down NFS daemon: [ OK ]

Shutting down NFS mountd: [ OK ]

Shutting down NFS quotas: [ OK ]

Shutting down NFS services: [ OK ]

Shutting down RPC idmapd: [ OK ]

Starting NFS services: [ OK ]

Starting NFS quotas: [ OK ]

Starting NFS mountd: [ OK ]

Starting NFS daemon: [ OK ]

Starting RPC idmapd: [ OK ]

Bingo... it mounted!

What fixed it was uncommenting these port lines in /etc/sysconfig/nfs:

# Port rpc.mountd should listen on.

MOUNTD_PORT=892

# Port rpc.statd should listen on.

STATD_PORT=662

# TCP port rpc.lockd should listen on.

LOCKD_TCPPORT=32803

# UDP port rpc.lockd should listen on.

LOCKD_UDPPORT=32769

I will do some more testing to verify rw functionality but at least its mounted now.

Thank you for the help!
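For anyone repeating the fix, the uncommenting step could be scripted roughly as follows. This is a sketch under assumptions: it touches only the four settings shown in the post above, it expects the stock lines to look like #MOUNTD_PORT=892, and the function name uncomment_nfs_ports is hypothetical.

```shell
#!/bin/sh
# Sketch: uncomment the four port settings named in the post above.
uncomment_nfs_ports() {
    # $1: path to the config file (/etc/sysconfig/nfs on CentOS 6).
    # sed keeps a copy of the original next to it with a .bak suffix.
    # Descriptive comment lines ("# Port rpc.mountd should ...") are untouched
    # because they do not start with one of the setting names.
    sed -i.bak \
        -e 's/^#[[:space:]]*\(MOUNTD_PORT=\)/\1/' \
        -e 's/^#[[:space:]]*\(STATD_PORT=\)/\1/' \
        -e 's/^#[[:space:]]*\(LOCKD_TCPPORT=\)/\1/' \
        -e 's/^#[[:space:]]*\(LOCKD_UDPPORT=\)/\1/' \
        "$1"
}

# Example (run as root on the NFS server, then restart the services):
#   uncomment_nfs_ports /etc/sysconfig/nfs
#   service nfs restart
```

On CentOS 6 the statd and lockd settings may only take effect after service nfslock restart as well. Running rpcinfo -p afterwards should show mountd registered on 892 if the change was picked up.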

JPM300 · ‎09-06-2014

Glad you got it working

Yeah, I wouldn't have thought about the # symbol problem.

tplane · ‎09-07-2014

To finish up: I had a slight permission issue left over from all the work/testing over the past couple of days, but all seems to be good now.

Cleaned up exports file:

# cat /etc/exports

/vmdata/ 192.168.31.15(rw,sync,no_root_squash,no_all_squash)

Thanks again to all for the help!
