I have an Intel 10 Gigabit AF DA Dual Port Server Adapter that I recently installed in an ESX 4.0 host. I am trying to set up VMDq on the server so that it is optimized for jumbo frames. I found a document on how to do this in ESX 3.5 but nothing for ESX 4.0. Any ideas where I can find an updated document for ESX 4.0?
Thanks,
Eric
Hi,
I have the same question. I'm currently playing with a Dell PowerEdge R710 that has 2x Intel Gigabit ET quad port NICs (Intel 82576 controller) installed and I didn't find the trick to enable VMDq on ESX 4.0 and ESXi 4.0. I've set the options as described in the document for ESX 3.5 (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009010) but nothing happens. When I run ethtool -S vmnic* I can see only one RX-queue.
Is there anybody out there who is running VMDq on ESX 4.0?
regards, Simon
I am having an issue with the same R710 / dual Intel AF DA adapter deployments, and I am troubleshooting with Intel, VMware, Dell, and Cisco (Nexus upstream for 10G switching). I am wondering if VMDq (NetQueue) could be my issue. I am seeing instability and NICs going offline, and I am wondering if VMDq has support issues like TOE/TSO did in ESX 3.5. Anyone else successfully running an R710 cluster with Intel AF DA cards? There were some known issues with the ixgbe 10G Intel driver in late 2009, patched via VUM, around NIC teaming with 10G that could be the issue as well.
Thanks,
Kurt
NetQueue and VMDq are enabled by default in 4.x, so you don't need to enable either of them. Just make sure that you have one of the following network controllers on your Intel(R) Ethernet 10GbE Server Adapter.
10GbE
Intel® Ethernet Controller 82598
Intel® Ethernet Controller 82599
To verify that VMDq has been successfully enabled:
Verify NetQueue has been enabled: # cat /etc/vmware/esx.conf
Confirm the following line has been added into the file: /vmkernel/netNetqueueEnabled = "TRUE"
Verify the options configured for the ixgbe module: # esxcfg-module -g ixgbe
The output is similar to: ixgbe enabled = 1 options = 'InterruptType=2,2 VMDQ=16,16'
You will see VMDQ=16,16 on a dual port 82598 and dual port 82599.
If VMDQ does not seem to be enabled, make sure you have the latest ixgbe driver and check KB 1004278 for more information.
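The verification steps above can be sketched as a short sequence in the ESX 4.x service console (vmnic numbers will vary per host; the expected output strings come from the steps above):

```shell
# 1. Confirm NetQueue is on (it is the default in 4.x):
grep netNetqueueEnabled /etc/vmware/esx.conf
# expect: /vmkernel/netNetqueueEnabled = "TRUE"

# 2. Check the options the ixgbe module loaded with:
esxcfg-module -g ixgbe
# expect something like: ixgbe enabled = 1 options = 'InterruptType=2,2 VMDQ=16,16'

# 3. Count the receive queues actually exposed on a 10G port:
ethtool -S vmnic2 | grep -c 'rx_queue_.*_packets'
```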
While the following Intel Ethernet Controllers do support
VMDq in hardware, currently ESX 4 support is being developed and may be added
at a future date.
1GbE
Intel® Ethernet Controller 82575
Intel® Ethernet Controller 82576
Intel® Ethernet Controller 82580
For more details check out the Intel websites -
Message was edited by: TheHevy
I corrected a mistake on the number of queues that will show up on the dual port 82599. While the silicon supports 64 per port, only 16 are used.
The Intel AF DA 10G adapters are 82598 controllers, and I have one of the build-outs testing the "beta" ixgbe driver for it that has not been released to VUM yet. This has yielded different types of instability, like VMotion causing the adapter to go down. I am wondering if the best practices around vSwitching that I worked out with VMware engineers are what is causing the issue. The R710s have 2 of these cards and thus 4 10G ports. ESX 4.0/4.0 U1 supports up to 4 x 10G ports and 2 adapters per the VMware Configuration Maximums guide. However, this is truly pushing the line. The only reason I went with this design was the low cost of Intel AF DA 10G cards with native SFP+ for Cisco Nexus. You would need two for redundant fabrics, so why not use both ports (one port of each card for each of the two vSwitches: LAN/VMotion and iSCSI SAN).
Below is a link to an older Cisco Nexus/VMware 10G white paper where they outline using only one vSwitch. I know for a fact that each of the deployments that is having issues could sustain all its traffic on half of one adapter, but that would be a waste of 2 ports in my mind. I also notice that they recommend having the kernel and mgmt ports be Active/Passive, while the "LAN" port group is Active/Active. I thought this strange last year when I read it, so I took it to VMware, and they said that having two vSwitches with redundant 10G VMNICs using Originating Port ID load balancing would be fine (the port groups inherit settings from the vSwitch). It is much simpler and would seem to be better... but now I have a simple setup that is having major network instability issues across multiple projects.
I am fairly certain that my issues are ixgbe driver or Intel adapter specific, but I am not going to rule out anything at the configuration level. I have done health checks for folks where TOE/TSO was a major issue in ESX 3.x, so that is one thing I want to look at next even though vSphere supposedly supports it.
Any thoughts?
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-496511.pdf
In ESX 4.0, the 10g driver comes up in VMDq mode automatically - you don't have to turn it on like in ESX 3.5.
For jumbo frames, just enable it on the vswitch (e.g. esxcfg-vswitch -m 9000 vSwitch2) and in the VM itself (e.g. ifconfig eth1 mtu 9000).
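Putting those two jumbo-frame steps together, a minimal sketch (vSwitch2 and eth1 are example names; adjust for your setup):

```shell
# Set MTU 9000 on the vSwitch, then verify it took:
esxcfg-vswitch -m 9000 vSwitch2
esxcfg-vswitch -l          # the MTU column for vSwitch2 should now read 9000

# Inside the guest (Linux example):
#   ifconfig eth1 mtu 9000
# The physical switch ports must also allow jumbo frames end to end.
```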
sln
We have the Intel X520A dual port 10G cards, which use the Intel 82599EB controller, in an ESX 4.0 U1 host. After I install the latest ixgbe driver, the ixgbe module is enabled by default, but the VMDQ settings are blank, i.e.:
ixgbe enabled = 1 options = ''
You mention that I should see "VMDQ=64,64" in the options output when using the 82599 controller, but the VMware KB article, 1004278, which seems mostly geared to the 82598 controller, says: "A value for VMDQ must exist to indicate the number of receive queues. A value of 16 for VMDQ sets the number of receive queues to the maximum. The Intel 82598 10 Gigabit Ethernet Controller provides 32 transmit queues and 64 receive queues per port, which can be mapped to a maximum of 16 processor cores. The range of values for the VMDQ parameter is 1 to 16."
It looks to me from the Intel docs that you are right and there is a maximum of 64 VMDqs per port on the 82599.
I guess what I'm asking is: what is the maximum number of VMDqs I should expect to get with an 82599 controller, and how do I configure VMware to see/use that max? Should I just follow the KB to the letter and set VMDQ to 16,16, or should I expect more from the 82599?
I corrected the mistake in my post. While the 82599 supports 64 queues per port VMware only uses 16.
Sorry for the confusion.
No worries, and thanks for your post; it was very helpful.
@kurt
I have serious issues with individual VMs dropping off the network randomly with the X520, and previously with the Intel AF DA cards. I found a workaround, disabling NetQueue and MSI-X support by changing the ixgbe module options, which has made the systems much more reliable.
I still see some odd occasional watchdog reset messages on the ixgbe driver in the vmkernel log from time to time though.
I have a case open with VMware support regarding the network disconnects of guests, and they are aware that the NetQueue / MSI-X disable option seems to have cleared up the problem on my end (essentially running esxcfg-module -s "InterruptType=0,0 VMDQ=0,0 MQ=0,0 RSS=0,0" ixgbe).
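For anyone wanting to try the same workaround, the sequence looks roughly like this (per the post above; the comma-separated values cover both ports of a dual-port NIC):

```shell
# Disable MSI-X interrupts, VMDq, multiqueue, and RSS on the ixgbe module:
esxcfg-module -s "InterruptType=0,0 VMDQ=0,0 MQ=0,0 RSS=0,0" ixgbe

# Confirm the options string was stored:
esxcfg-module -g ixgbe

# A host reboot (or module reload) is needed before the new options apply.
```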
As for NetQueue and jumbo frames, I read somewhere that you are supposed to modify the two settings that were mentioned in this KB article about NetQueue.
I just got word that the Intel and VMware engineering teams are looking into this issue. I will see what I can find out and post an update as soon as I have more information.
Thanks TheHevy! I've been working with VMware support on this, trying to track it down; it's just slow going so far. I figured it might help other folks to post what worked for me to keep my production VMs on the network. Any light you can shine on this will be appreciated!
What version of the ixgbe driver are you using? Are you using the ESX 4.0 inbox driver or an updated driver?
I have tried this with both the 2.0.44.14.4 and the 2.0.38.2.3 drivers posted on the VMware driver download page. The 1.x driver that comes on the ESX install media doesn't recognize the X520 card at all. I believe both of these drivers are async.
I just had two of my clients implement vPC and IP hashing on their Nexus switches/vSwitches, and it appears to have resolved the issue for at least one of them; we will know about the other soon. I have been working with Intel and Cisco engineers, and both have stressed how important it is to use vPC with Nexus switches. I am going to draw a conclusion from this long, drawn-out troubleshooting and say that it is best practice to always use IP hashing, whether it be EtherChannel on 1G switches or vPC on 10G. The configuration is more laborious and can be difficult to understand for those with no switch experience, but it is well worth it when you look at load balancing in esxtop vs. Originating Port ID. Using vPC will imply an NX-OS upgrade for some who are running older builds, but it is well worth it in the long run.
In case anyone is interested in a workaround outside of the MSI-X/VMDq turn-down: just move your VMotion off your 10G cards to your onboard 1G NICs, and you should see the issue go away. Obviously not a long-term fix, but it will get you through until you can upgrade your NX-OS and implement vPC.
Kurt
Just to follow up on this: it appears that the stock settings enable VMDq, but the RX queues don't seem to get fully utilized for me. I see 8 RX and 8 TX queues by default with the X520; all TX queues on all adapters listed have packets on them, but only 2 or 3 of the RX queues on each adapter seem to get packets.
I've seen issues with the InterruptThrottleRate=1 option causing all kinds of problems, BTW, so don't set that.
vPC with IP hash also seems to be full of weirdness in my environment. Additionally, you can't use Beacon Probing with IP hash turned on.
Be aware that ESX 4.0 doesn't assign RX queues to a VM until that VM has had a high level of traffic for 5 or 10 seconds. Until then, the VM traffic will go through the default queue. When the traffic level goes back down for a minute or so, the VM will get reassigned to the default queue.
I'm not sure that explains why rx_queue_3_bytes or higher-numbered queues never show any bytes or packets on any of the servers unless I set the ixgbe options for the number of queues; in that case they all get used.
You should see no difference between no options and setting VMDQ=8,8. In either case, an easy way to see the activity is to have one console window running a command like "watch -d 2 'ethtool -S vmnicX | grep packet'", then have 8 VMs running and start a netperf on each of them individually. You can also have another console window running "tail -f /var/log/vmkernel" and watch the ixgbe driver announce the queue assignments and removals. This might be helpful in seeing how ESX uses the NetQueue resources that ixgbe makes available.
I'm going to be rebooting my hosts soon to update to the latest ixgbe driver that was recently released on vmware.com, and when I do, I'll take a host and set the VMDQ=, MQ=, and RSS= options, and you'll see all the RX queues in use.
Right now, with the defaults, I see the following kind of data from ethtool -S on each adapter, on all hosts:
tx_queue_0_packets: 94648571
tx_queue_0_bytes: 28586001747
tx_queue_1_packets: 149908577
tx_queue_1_bytes: 124968907338
tx_queue_2_packets: 80909523
tx_queue_2_bytes: 55412940672
tx_queue_3_packets: 233581804
tx_queue_3_bytes: 220851492967
tx_queue_4_packets: 103926318
tx_queue_4_bytes: 40701558988
tx_queue_5_packets: 79434552
tx_queue_5_bytes: 43394588594
tx_queue_6_packets: 38678487
tx_queue_6_bytes: 16926347834
tx_queue_7_packets: 4278749
tx_queue_7_bytes: 3652405607
rx_queue_0_packets: 597094334
rx_queue_0_bytes: 430552200810
rx_queue_1_packets: 125552738
rx_queue_1_bytes: 141803068945
rx_queue_2_packets: 5011479
rx_queue_2_bytes: 4618800561
rx_queue_3_packets: 0
rx_queue_3_bytes: 0
rx_queue_4_packets: 0
rx_queue_4_bytes: 0
rx_queue_5_packets: 0
rx_queue_5_bytes: 0
rx_queue_6_packets: 0
rx_queue_6_bytes: 0
rx_queue_7_packets: 0
rx_queue_7_bytes: 0
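For what it's worth, a quick helper for summarizing dumps like the one above. This is a hypothetical sketch (`queue_summary` is my own name, not an ESX tool), assuming the standard `ethtool -S` key format used by ixgbe; it counts how many RX/TX queues have actually seen traffic:

```python
import re

def queue_summary(ethtool_output: str) -> dict:
    """Count active (nonzero-packet) queues per direction in `ethtool -S` text."""
    counts = {"tx": [0, 0], "rx": [0, 0]}  # per direction: [active, total]
    for line in ethtool_output.splitlines():
        m = re.match(r"\s*(tx|rx)_queue_(\d+)_packets:\s*(\d+)", line)
        if m:
            direction, packets = m.group(1), int(m.group(3))
            counts[direction][1] += 1          # another queue seen
            if packets > 0:
                counts[direction][0] += 1      # queue has carried traffic
    return {d: {"active": a, "total": t} for d, (a, t) in counts.items()}

sample = """\
tx_queue_0_packets: 94648571
rx_queue_0_packets: 597094334
rx_queue_3_packets: 0
"""
print(queue_summary(sample))
# {'tx': {'active': 1, 'total': 1}, 'rx': {'active': 1, 'total': 2}}
```

On the dump above, that would report 8 TX queues active out of 8, but only 3 RX queues active out of 8, matching what I'm describing.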