VMware Cloud Community
wikusvanderwalt
Contributor

ESXi 5.1 Jumbo frames with D-Link DGS-1500-28

Hi all - I hope you can help, as I'm a bit stuck with my network config and would like to find out what I'm doing wrong.
Environment:
ESXi1 (5.1.0 914609)
ESXi2 (5.1.0 914609)
NAS QNAP TS-869 pro
I have 2 x ESXi servers in a DRS cluster and a QNAP NAS which exports iSCSI and NFS storage.
The physical connectivity looks like this:
NAS connected to DGS on Ports 3+4.  Static Link aggregation trunk (group1)
ESX1 connected to DGS on ports 5+6.  No link aggregation
ESX2 connected to DGS on ports 7+8. No link aggregation
1. D-link switch is configured with Jumbo frames enabled.
2. NAS is configured with Jumbo frames (9000) and teaming using balance XOR - This seems to be working fine.
3. ESX1 is configured with Jumbo frames on vswitch (9000) and teaming using originating virtual port ID. 1 x active and 1 x standby vnic.
4. ESX2 is configured with Jumbo frames on vswitch (9000) and teaming using originating virtual port ID. 1 x active and 1 x standby vnic.
5. Ports 3-8 are configured to tag VLAN 100, and ports 3+4 are part of link aggregation group 1 for the NAS team.
The ESXi networking and NAS config use VLAN 100 and I am able to communicate with the management interfaces on the NAS and the VMkernel interfaces on ESXi. All ESXi servers connect to the NFS exports without a problem and I can ping and vmkping -s 9000 between the ESXi hosts and the NAS.
This is a simple flat network setup and everything is on the same broadcast domain (VLAN 100), but I cannot get vMotion to work between the ESXi hosts - it starts and then stops at 14% until it times out. Is there any specific or detailed D-Link documentation about a supported configuration that anyone can point me to? I ran into these problems after I introduced the D-Link switch and configured jumbo frames.
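For reference, this is roughly how I set the MTU on the ESXi side for steps 3 and 4 (the vSwitch0/vmk1 names are just what I use in my lab, adjust as needed):

esxcli network vswitch standard set --vswitch-name=vSwitch0 --mtu=9000
esxcli network ip interface set --interface-name=vmk1 --mtu=9000

The first command sets the vSwitch MTU and the second sets the VMkernel interface MTU.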
Any help is much appreciated.
Regards,
Wikus
11 Replies
rickardnobel
Champion

wikusvanderwalt wrote:

3. ESX1 is configured with Jumbo frames on vswitch (9000) and teaming using originating virtual port ID. 1 x active and 1 x standby vnic.

Have you set 9000 on the vmkernel interface as well as on the vSwitch?

I can ping and vmkping -s 9000 between the ESXi hosts and the NAS.

That does not really prove that jumbo frames are working end-to-end; see this for which parameters to use with vmkping: http://rickardnobel.se/troubleshoot-jumbo-frames-with-vmkping
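In short, the ping needs the do-not-fragment flag and a payload of 8972 bytes, and it is also worth confirming the MTU on the VMkernel interfaces themselves. Something along these lines from each host (the IP address is a placeholder):

esxcli network ip interface list
vmkping -d -s 8972 <IP of NAS or other host>

The first command shows whether the vmk interfaces really report MTU 9000; the second sends jumbo-sized pings that are not allowed to be fragmented.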

My VMware blog: www.rickardnobel.se
wikusvanderwalt
Contributor

Hi Rickard,

Thanks for your reply.

I can confirm that I configured jumbo frames (9000) on the ESXi vSwitch and vmk interfaces. The same goes for the NAS.

Thanks for the link. I ran the following vmkping commands from both ESXi servers and all were successful:

vmkping -s 8972 -d esx1

vmkping -s 8972 -d esx2

vmkping -s 8972 -d nas

I think this implies that the jumbo frames are not too large for the path and that there is end-to-end jumbo frame connectivity?

Wikus

spravtek
Expert

wikusvanderwalt
Contributor

Hi Spravtek,

I saw that link as part of my search and tried a few of the 802.1p settings, which relate to QoS. By default the feature was not enabled on the switch, but I've since enabled it. Unfortunately it's not made any difference.

Wikus

rickardnobel
Champion

wikusvanderwalt wrote:

I ran the following vmkping commands from both ESXi servers and all were successful:

vmkping -s 8972 -d esx1

vmkping -s 8972 -d esx2

vmkping -s 8972 -d nas

I think this implies that the jumbo frames are not too large for the path and that there is end-to-end jumbo frame connectivity?

Yes, this means that you have end-to-end connectivity with jumbo sized frames.
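For the record, the arithmetic works out exactly: 8972 bytes of ICMP payload + 8 bytes of ICMP header + 20 bytes of IP header = 9000 bytes, and because -d forbids fragmentation, the replies prove that every device in the path forwards full 9000-byte frames.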

Could you post a screenshot of your networking configuration? (From the Networking tab in vSphere Client).

My VMware blog: www.rickardnobel.se
wikusvanderwalt
Contributor

Hi Rickard - this is just my home lab, so no worries about sharing IP addresses/VLANs.

[Attachment: networking.png]

rickardnobel
Champion

Why is vmnic2 on standby on both hosts? It does not actually need to be configured that way. If you have Port ID selected as the NIC teaming policy, both NICs can be active and your VM traffic will be spread over both. This would both increase performance and reduce complexity.

If you do not have a very specific reason for this, I would recommend putting all interfaces as active.
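If you prefer doing it from the command line instead of the vSphere Client, something along these lines should work (the vSwitch and vmnic names are assumptions, use your own):

esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic0,vmnic2 --load-balancing=portid

This puts both uplinks in the active list and keeps the load balancing on originating virtual port ID.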

My VMware blog: www.rickardnobel.se
wikusvanderwalt
Contributor

I selected it as I wanted to ensure that the teaming was not the problem. I can re-enable it, no problem.

Originally I created link aggregation groups which trunked the VLANs, but I ran into this vMotion issue. I've been working my way back since then.

I just called D-Link support about this and the guy mentioned that there could be some issues with a Safeguard feature on the switch - I've since disabled it. He also pointed me towards a site in Taiwan where I downloaded the switch's latest firmware.

He mentioned that the switch has known issues with multicast traffic and that I need to look at that. I need to research what vMotion traffic consists of. Do you know if vMotion is unicast or multicast?

rickardnobel
Champion

wikusvanderwalt wrote:

I selected it as I wanted to ensure that the teaming was not the problem. I can re-enable it, no problem.

It would be good to enable it, since it might be that the link is "up" from the switch side but passive on the ESXi side, which causes confusion. Let the physical switch ports be just plain ports, but with VLAN tagging enabled.
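While you change this, you can double-check the ESXi side with two read-only commands; the first lists the uplinks and MTU per vSwitch, the second lists the port groups and their VLAN IDs:

esxcli network vswitch standard list
esxcli network vswitch standard portgroup list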

He also pointed me towards a site in Taiwan where I downloaded the switch's latest firmware.

It is often good to have the latest firmware, and it could be some kind of switch bug, though perhaps not likely. However, if you have the possibility to upgrade, then do it.

He mentioned that the switch has known issues with multicast traffic and that I need to look at that. I need to research what vMotion traffic consists of. Do you know if vMotion is unicast or multicast?

vMotion is unicast, so it should not be a problem in this case.

By the way, did the setup work originally before you enabled the Jumbo Frames setting?

My VMware blog: www.rickardnobel.se
wikusvanderwalt
Contributor

Hi Rickard

I've now made both NICs active and upgraded the firmware. All the ports still have VLAN 100 tagged as before.

Before this D-Link switch I used a standard unmanaged Netgear switch, so there was no option for trunking VLANs. Everything worked fine when using the unmanaged switch. I got this switch because it supports jumbo frames, VLANs and some basic L3 functions that I wanted to use for setting up a vCloud lab.

http://communities.vmware.com/thread/418671?start=0&tstart=0

I've also recently started using Update Manager and upgraded to 5.1 (914609) in the last month. The upgrade happened before I got the new switch, so I don't know if this is related. The above link also references lots of people with similar issues. One post suggests it could be the NFS export permissions, but strangely enough Storage vMotion works! Does that mean anything to you?

rickardnobel
Champion

wikusvanderwalt wrote:

The above link also references lots of people with similar issues. One post suggests it could be the NFS export permissions, but strangely enough Storage vMotion works! Does that mean anything to you?

The NFS export permissions must be set to read/write together with no_root_squash; however, without these your NFS should not work at all. If you are able to boot your VMs and write data to them, then the NFS is likely fine.
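On a Linux-based NAS such as the QNAP, the underlying export line typically looks something like this (the share path and subnet here are only examples):

/share/VMstore 192.168.100.0/24(rw,no_root_squash,sync)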

vMotion could fail for several reasons, and there do seem to be some possibly still unresolved bugs around this in ESXi 5.1. One suggestion in the link was to remove the vMotion checkbox from the VMkernel adapter and then re-enable it shortly afterwards. You might try this on both hosts. (It "should" not help if everything is working as expected.)
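If you would rather do the toggle from the ESXi shell than from the vSphere Client, vim-cmd should be able to do it (assuming vMotion is enabled on vmk1; adjust to your interface):

vim-cmd hostsvc/vmotion/vnic_unset vmk1
vim-cmd hostsvc/vmotion/vnic_set vmk1

The first command disables vMotion on that VMkernel interface, the second enables it again.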

My VMware blog: www.rickardnobel.se