VMware Cloud Community
bookbinder
Contributor

Setting up VMDq on ESX 4.0

I have an Intel 10 Gigabit AF DA Dual Port Server Adapter that I recently installed in an ESX 4.0 host. I am trying to set up VMDq on the server so that it is optimized for jumbo frames. I found a document on how to do this in ESX 3.5 but nothing for ESX 4.0. Any ideas where I can find an updated document for ESX 4.0?

Thanks,

Eric

21 Replies
VMentor
Contributor

Hi,

I have the same question. I'm currently playing with a Dell PowerEdge R710 that has 2x Intel Gigabit ET quad port NICs (Intel 82576 controller) installed, and I haven't found the trick to enable VMDq on ESX 4.0 and ESXi 4.0. I've set the options as described in the document for ESX 3.5 (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009010) but nothing happens. When I run ethtool -S vmnic* I can see only one RX queue.
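
For completeness, this is roughly what I am checking from the service console (vmnic2 is just an example):

esxcfg-module -g igb                  # show the options currently set on the igb module
ethtool -S vmnic2 | grep rx_queue     # per-queue RX counters; only queue 0 ever shows traffic for me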

Is there anybody out there who is running VMDq on ESX 4.0?

regards, Simon

kurtbunker
Contributor

I am having an issue with the same R710, dual Intel AF DA adapter deployments, and I am troubleshooting with Intel, VMware, Dell and Cisco (Nexus upstream for 10G switching). I am wondering if VMDq (NetQueue) could be my issue. I am seeing instability and NICs going offline, and I am wondering if VMDq has support issues like TOE/TSO did in ESX 3.5. Anyone else successfully running an R710 cluster with Intel AF DA cards? There were some known issues with the ixgbe 10G Intel driver in late 2009, which VUM patched for NIC teaming with 10G, that could be the issue as well.

Thanks,

Kurt

TheHevy
Contributor

NetQueue and VMDq are enabled by default in 4.x, so you don't need to enable either of them. Just make sure that you have one of the following network controllers on your Intel(R) Ethernet 10GbE Server Adapter.

10GbE

Intel® Ethernet Controller 82598

Intel® Ethernet Controller 82599


To verify that VMDq has been successfully enabled:

Verify NetQueue has been enabled: # cat /etc/vmware/esx.conf

Confirm the following line has been added into the file: /vmkernel/netNetqueueEnabled = "TRUE"

Verify the options configured for the ixgbe module: # esxcfg-module -g ixgbe

The output is similar to: ixgbe enabled = 1 options = 'InterruptType=2,2 VMDQ=16,16'

You will see VMDQ=16,16 on a dual port 82598 and a dual port 82599.

If VMDQ does not seem to be enabled, make sure you have the latest ixgbe driver and check KB 1004278 for more information.
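
For reference, the whole check from the service console looks roughly like this (vmnic4 is just an example uplink; substitute your own):

grep -i netqueue /etc/vmware/esx.conf    # expect /vmkernel/netNetqueueEnabled = "TRUE"
esxcfg-module -g ixgbe                   # shows any options explicitly set on the module
ethtool -S vmnic4 | grep rx_queue        # per-queue RX counters; more than one queue should appear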

While the following Intel Ethernet Controllers do support VMDq in hardware, ESX 4 support is currently being developed and may be added at a future date.

1GbE

Intel® Ethernet Controller 82575

Intel® Ethernet Controller 82576

Intel® Ethernet Controller 82580

For more details check out the Intel websites -

Controllers:

Adapters:

Message was edited by: TheHevy

I corrected a mistake on the number of queues that will show up on the dual port 82599. While the silicon supports 64 per port, only 16 are used.

kurtbunker
Contributor


The Intel AF DA 10G adapters are 82598 controllers, and I have one of the build-outs testing the "beta" ixgbe driver for it that has not been released to VUM as of yet. This has yielded different types of instability, like VMotion causing the adapter to go down. I am wondering if the best practices around vSwitching that I worked out with VMware engineers are what is causing the issue. The R710s have 2 of these cards and thus 4 10G ports. ESX 4.0/4.0 U1 supports up to 4 x 10G ports and 2 adapters per the VMware Configuration Maximums guide. However, this is truly pushing the line. The only reason I went with this design was the low cost of Intel AF DA 10G cards with native SFP+ for Cisco Nexus. You would need two for redundant fabrics, so why not use both ports (one port of each card for each of the two vSwitches, LAN/VMotion & iSCSI SAN)?

Below is a link to an older Cisco Nexus/VMware 10G white paper where they outline using only one vSwitch. I know for a fact that each of the deployments that is having issues could sustain all their traffic on half of one adapter, but it would be a waste of 2 ports in my mind. I also notice that they recommend having the kernel and mgmt port groups be Active/Passive, while the "LAN" port group is Active/Active. I thought this strange last year when I read it, so I took it to VMware, and they said that having two vSwitches with redundant 10G VMNICs using Originating Port ID load balancing would be fine (the port groups inherit settings from the vSwitch). It is much simpler and would seem to be better... but now I have a simple setup that is having major network instability issues across multiple projects.

I am fairly certain that my issues are ixgbe driver or Intel adapter specific, but I am not going to rule out anything at the configuration level. I have done health checks for folks where TOE/TSO was a major issue in ESX 3.x, so that is one thing I want to look at next, even though vSphere supposedly supports it.

Any thoughts?

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-496511.pdf

emusln
Contributor

In ESX 4.0, the 10G driver comes up in VMDq mode automatically - you don't have to turn it on like in ESX 3.5.

For jumbo frames, just enable it on the vswitch (e.g. esxcfg-vswitch -m 9000 vSwitch2) and in the VM itself (e.g. ifconfig eth1 mtu 9000).
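
A rough end-to-end sketch, assuming vSwitch2 carries the jumbo traffic and eth1 is the guest NIC (names are just examples):

esxcfg-vswitch -m 9000 vSwitch2     # raise the vSwitch MTU
esxcfg-vswitch -l                   # confirm the MTU column now shows 9000
esxcfg-nics -l                      # the physical uplink MTU should also show 9000
# inside the guest:
ifconfig eth1 mtu 9000

As far as I know, a VMkernel port that needs jumbo frames has to be created with the MTU set, e.g. esxcfg-vmknic -a -i <ip> -n <netmask> -m 9000 <portgroup>.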

sln

tPops
Contributor

We have the Intel X520A dual port 10G cards, which use the Intel 82599EB controller, in an ESX 4.0 U1 host. After I install the latest ixgbe driver, the ixgbe module is enabled by default, but the VMDQ settings are blank, i.e.:

# esxcfg-module -g ixgbe

ixgbe enabled = 1 options = ''

You mention that I should see "VMDQ=64,64" in the options output when using the 82599 controller, but the VMware KB article 1004278, which seems mostly geared to the 82598 controller, says: A value for VMDQ must exist to indicate the number of receive queues. A value of 16 for VMDQ sets the number of receive queues to the maximum. The Intel 82598 10 Gigabit Ethernet Controller provides 32 transmit queues and 64 receive queues per port, which can be mapped to a maximum of 16 processor cores. The range of values for the VMDQ parameter is 1 to 16.

It looks to me from the Intel docs that you are right and there is a maximum of 64 VMDqs per port on the 82599.

I guess what I'm asking is: what is the maximum number of VMDqs I should expect to get with an 82599 controller, and how do I configure VMware to see/use that maximum? Should I just follow the KB to the letter and set VMDQ to 16,16, or should I expect more from the 82599?
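
In case it helps, this is roughly what I am planning to try next, following KB 1004278 (treat it as a sketch; the options only take effect after a reboot):

esxcfg-module -s "InterruptType=2,2 VMDQ=16,16" ixgbe
# after the reboot:
esxcfg-module -g ixgbe                # should now show the options string
ethtool -S vmnic4 | grep rx_queue     # count the RX queues actually exposed (vmnic4 is just an example)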

TheHevy
Contributor

I corrected the mistake in my post. While the 82599 supports 64 queues per port, VMware only uses 16.

Sorry for the confusion.

tPops
Contributor

No worries. Thanks for your post, it was very helpful.

judbarron
Contributor

@kurt

I have serious issues with individual VMs dropping off the network randomly with the X520, and previously with the Intel AF DA cards. I found a workaround, disabling NetQueue and MSI-X support by changing the ixgbe module options, which has made the systems much more reliable.

I still see some odd occasional watchdog reset messages from the ixgbe driver in the vmkernel log from time to time, though.

I have a case open with VMware support regarding the network disconnects of guests, and they are aware that the NetQueue / MSI-X disable option seems to have cleared up the problem on my end (essentially running esxcfg-module -s "InterruptType=0,0 VMDQ=0,0 MQ=0,0 RSS=0,0" ixgbe).

As for NetQueue and jumbo frames, I read somewhere that you are supposed to modify the two settings that were mentioned in this KB article about NetQueue.
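
For anyone wanting to try the same workaround, the rough sequence is (a sketch only):

esxcfg-module -s "InterruptType=0,0 VMDQ=0,0 MQ=0,0 RSS=0,0" ixgbe
esxcfg-module -g ixgbe      # confirm the options stuck
# reboot the host so the ixgbe module reloads with the new options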

TheHevy
Contributor

I just got word that the Intel and VMware engineering teams are looking into this issue. I will see what I can find out and post an update as soon as I have more information.

judbarron
Contributor

Thanks TheHevy! I've been working with VMware support on this, trying to track it down; it's just slow going so far. I figured it might help other folks out to post what worked for me to keep my production VMs on the network. Any light you can shine on this will be appreciated!

emusln
Contributor

What version of the ixgbe driver are you using? Are you using the ESX 4.0 inbox driver or an updated driver?

judbarron
Contributor

I have tried this with both the 2.0.44.14.4 and the 2.0.38.2.3 drivers posted on the VMware driver download page. The 1.x driver that comes on the ESX install media doesn't recognize the X520 card at all. I believe both of these drivers are async.

kurtbunker
Contributor


I just had two of my clients implement vPC and IP hashing on their Nexus switches and vSwitches, and it appears to have resolved the issue for at least one of them; for the other we will know soon. I have been working with Intel and Cisco engineers, and both have stressed how important it is to use vPC with Nexus switches. I am going to draw further conclusions from this long, drawn-out troubleshooting and say that it is best practice to always use IP hashing, whether it be EtherChannel on 1G switches or vPC on 10G. The configuration is more laborious and harder to understand for those with no switch experience, but well worth it when you look at load balancing in esxtop vs. Originating Port ID. Using vPC will imply an NX-OS upgrade for some who are running older builds, but it is well worth it in the long run.

In case anyone is interested in a workaround outside of the MSI-X/VMDq turn-down: just move your VMotion to your onboard 1G NICs, off your 10G cards, and you should see the issue go away. Obviously not a long-term fix, but it will get you through the hard times until you can upgrade your NX-OS and implement vPC.
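
If you want to eyeball the difference yourself, esxtop makes it pretty obvious (an informal check, not a benchmark):

esxtop          # then press 'n' for the network view
# compare MbTX/s and MbRX/s per vmnic: with Originating Port ID a busy VM sticks to one uplink,
# while with IP hash the load spreads across the uplinks in the channel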

Kurt

judbarron
Contributor

Just to follow up on this: it appears that the stock settings enable VMDq, but the RX queues don't seem to get fully utilized for me. I see 8 RX and 8 TX queues by default with the X520; all TX queues on every adapter listed have packets on them, but only 2 or 3 of the RX queues on each adapter seem to get packets.

I've seen the InterruptThrottleRate=1 option cause all kinds of problems, BTW, so don't set that.

vPC with IP hash also seems to be full of weirdness in my environment. Additionally, you can't use Beacon Probing with IP hash turned on.

emusln
Contributor

Be aware that ESX 4.0 doesn't assign an RX queue to a VM until that VM has a high level of traffic for 5 or 10 seconds. Until then, the VM's traffic goes through the default queue. When the traffic level drops back down for a minute or so, the VM gets reassigned to the default queue.

judbarron
Contributor

I'm not sure that explains why rx_queue_3_bytes or higher-numbered queues never show any bytes or packets on any of the servers unless I set the ixgbe options for the number of queues. In that case they all get used.

emusln
Contributor

You should see no difference between no options and setting VMDQ=8,8. In either case, an easy way to see the activity is to have one console window running a command like "watch -n 2 -d 'ethtool -S vmnicX | grep packet'", then have 8 VMs running and start a netperf on each of them individually. You can also have another console window running "tail -f /var/log/vmkernel" and watch the ixgbe driver announce the queue assignments and removals. This might be helpful in seeing how ESX uses the NetQueue resources that ixgbe makes available.

judbarron
Contributor

I'm going to be rebooting my hosts soon to update to the latest ixgbe driver that was recently released on vmware.com. When I do, I'll take a host and set the VMDQ= MQ= RSS= options, and you'll see all the RX queues in use.
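
Roughly what I mean is something like this (just a sketch; I'm only showing the VMDQ piece, since the valid MQ/RSS values depend on the driver version):

esxcfg-module -s "InterruptType=2,2 VMDQ=8,8" ixgbe    # 8 queues per port on the dual-port card
# reboot, then:
ethtool -S vmnic4 | grep rx_queue                      # all 8 RX queues should start showing traffic (vmnic4 is just an example)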

Right now, with the defaults, I see the following kind of data from ethtool -S on each adapter, on all hosts:

tx_queue_0_packets: 94648571
tx_queue_0_bytes: 28586001747
tx_queue_1_packets: 149908577
tx_queue_1_bytes: 124968907338
tx_queue_2_packets: 80909523
tx_queue_2_bytes: 55412940672
tx_queue_3_packets: 233581804
tx_queue_3_bytes: 220851492967
tx_queue_4_packets: 103926318
tx_queue_4_bytes: 40701558988
tx_queue_5_packets: 79434552
tx_queue_5_bytes: 43394588594
tx_queue_6_packets: 38678487
tx_queue_6_bytes: 16926347834
tx_queue_7_packets: 4278749
tx_queue_7_bytes: 3652405607
rx_queue_0_packets: 597094334
rx_queue_0_bytes: 430552200810
rx_queue_1_packets: 125552738
rx_queue_1_bytes: 141803068945
rx_queue_2_packets: 5011479
rx_queue_2_bytes: 4618800561
rx_queue_3_packets: 0
rx_queue_3_bytes: 0
rx_queue_4_packets: 0
rx_queue_4_bytes: 0
rx_queue_5_packets: 0
rx_queue_5_bytes: 0
rx_queue_6_packets: 0
rx_queue_6_bytes: 0
rx_queue_7_packets: 0
rx_queue_7_bytes: 0
