We installed 10 gig NICs in one of our ESX 4.0 U2 clusters, which has a total of 8 ESX hosts. After installing the 10 gig NICs and verifying settings, we started to migrate several VMs onto the ESX host. After about 20 minutes, we noticed other apps and servers in our environment starting to drop connections, etc. As shown in the attached graphs, our network usage was off the charts.
After the issue had been present for an hour or so, we migrated everything off that host and all apps went back to normal. We can't get very granular on 24-hour-old data in vCenter (Thanks VMware) -- Any ideas?
Do you have jumbo frames enabled on your vSwitch(es)? Can you post the output of
esxcfg-vswitch -l
and
esxcfg-nics -l
Switch Name    Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch0       128        26          128               1500  vmnic12,vmnic2

  PortGroup Name    VLAN ID  Used Ports  Uplinks
  SPY               0        0           vmnic12,vmnic2
  DC-SERVERS        691      2           vmnic12,vmnic2
  NETWORK           0        17          vmnic12,vmnic2
  Service Console   680      1           vmnic2
  ISCSI             696      1           vmnic12,vmnic2
  VMOTION           683      1           vmnic12
  NETAPP            681      1           vmnic12,vmnic2

Switch Name    Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch1       64         4           64                1500  vmnic1

  PortGroup Name    VLAN ID  Used Ports  Uplinks
  DMZ               0        2           vmnic1

Switch Name    Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch2       64         3           64                1500  vmnic0

  PortGroup Name    VLAN ID  Used Ports  Uplinks
  EDGE              0        1           vmnic0
and
esxcfg-nics -l
Name     PCI       Driver  Link  Speed      Duplex  MAC Address        MTU   Description
vmnic0   02:00.00  bnx2    Up    100Mbps    Full    00:1a:64:36:df:2a  1500  Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic1   02:00.01  bnx2    Up    100Mbps    Full    00:1a:64:36:df:2c  1500  Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic12  2a:00.00  ixgbe   Up    10000Mbps  Full    00:1b:21:6e:05:1e  1500  Intel Corporation 82599EB 10 Gigabit Network Connection
vmnic2   06:00.00  ixgbe   Up    10000Mbps  Full    00:1b:21:6e:05:68  1500  Intel Corporation 82599EB 10 Gigabit Network Connection
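For anyone comparing several hosts, here is a minimal Python sketch that reads `esxcfg-nics -l` output like the listing above and summarizes link speed and MTU per NIC. The names `Nic`, `parse_nics`, and `slower_than`, and the 1000 Mbps threshold, are made up for illustration. The column layout is assumed to match what was posted in this thread.

```python
from typing import List, NamedTuple

# Sample copied from the esxcfg-nics -l output posted in this thread.
SAMPLE = """\
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic0 02:00.00 bnx2 Up 100Mbps Full 00:1a:64:36:df:2a 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic1 02:00.01 bnx2 Up 100Mbps Full 00:1a:64:36:df:2c 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
vmnic12 2a:00.00 ixgbe Up 10000Mbps Full 00:1b:21:6e:05:1e 1500 Intel Corporation 82599EB 10 Gigabit Network Connection
vmnic2 06:00.00 ixgbe Up 10000Mbps Full 00:1b:21:6e:05:68 1500 Intel Corporation 82599EB 10 Gigabit Network Connection
"""


class Nic(NamedTuple):
    name: str
    driver: str
    link: str
    speed_mbps: int
    duplex: str
    mtu: int
    description: str


def parse_nics(text: str) -> List[Nic]:
    """Parse NIC rows; the header and any malformed lines are skipped."""
    nics = []
    for line in text.splitlines():
        # Eight whitespace-separated columns, then a free-text description.
        fields = line.split(None, 8)
        if len(fields) < 9 or not fields[0].startswith("vmnic"):
            continue
        name, _pci, driver, link, speed, duplex, _mac, mtu, desc = fields
        nics.append(
            Nic(name, driver, link, int(speed.replace("Mbps", "")), duplex, int(mtu), desc)
        )
    return nics


def slower_than(nics: List[Nic], min_mbps: int) -> List[str]:
    """Names of NICs whose negotiated speed is below min_mbps (hypothetical threshold)."""
    return [n.name for n in nics if n.speed_mbps < min_mbps]


if __name__ == "__main__":
    nics = parse_nics(SAMPLE)
    for n in nics:
        print(f"{n.name}: {n.speed_mbps} Mbps, MTU {n.mtu}, driver {n.driver}")
    print("Below 1000 Mbps:", slower_than(nics, 1000))
```

On the listing above this reports an MTU of 1500 on every NIC and flags vmnic0 and vmnic1, which show 100Mbps links on 1000Base-T hardware. Whether that is intentional is not stated in the thread.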
Is there a reason why you aren't using jumbo frames? I would talk to your network team to find out whether the switches you are connected to support jumbo frames and have them enabled, and I would enable jumbo frames on your end to see if this helps.
To create a Jumbo Frames-enabled vSwitch:
1. Log in directly to the ESX host console.
2. To set the MTU size for the vSwitch, run the command:
esxcfg-vswitch -m <MTU> <vSwitch>
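After changing the MTU, the `esxcfg-vswitch -l` listing can be re-read to confirm the value took effect. Here is a rough Python sketch of that check, assuming the listing has the same layout as the output posted earlier in this thread. The function name `vswitch_mtus` is hypothetical, and the sample is trimmed to the switch rows plus one port group.

```python
from typing import Dict

# Trimmed sample based on the esxcfg-vswitch -l output posted in this thread.
SAMPLE = """\
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 26 128 1500 vmnic12,vmnic2
PortGroup Name VLAN ID Used Ports Uplinks
VMOTION 683 1 vmnic12
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch1 64 4 64 1500 vmnic1
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch2 64 3 64 1500 vmnic0
"""


def vswitch_mtus(text: str) -> Dict[str, int]:
    """Map each vSwitch name to its MTU, using the row after each 'Switch Name' header."""
    lines = text.splitlines()
    mtus = {}
    for header, row in zip(lines, lines[1:]):
        if header.split()[:2] != ["Switch", "Name"]:
            continue
        fields = row.split()
        # Columns: name, num ports, used ports, configured ports, MTU, uplinks.
        if len(fields) >= 5 and fields[4].isdigit():
            mtus[fields[0]] = int(fields[4])
    return mtus


if __name__ == "__main__":
    for name, mtu in vswitch_mtus(SAMPLE).items():
        status = "jumbo" if mtu > 1500 else "standard"
        print(f"{name}: MTU {mtu} ({status})")
```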
Our Network admins are hesitant to enable jumbo frames even though our Cisco switches support them. They want more details on WHY they should enable it since they have never had to in the past.
Do you have any documentation or best practices on enabling jumbo frames?
Here's what I have:
http://blogs.vmware.com/networking/2010/05/vsphere-loves-10gige.html
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-496511.pdf
http://download.intel.com/support/network/sb/10gbe_vsphere_wp_final.pdf
I'm not saying switching to jumbo frames will resolve your issue, but if you are going to use 10GbE, it would make sense to enable jumbo frames.
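On the network team's "why" question, some back-of-the-envelope arithmetic may help. The Python sketch below compares a 1500-byte and a 9000-byte MTU on a 10 Gb/s link. It assumes plain IPv4 and TCP headers with no options, untagged Ethernet frames (an 802.1Q tag adds 4 bytes per frame), and a link running at full line rate. The function names are illustrative only.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# preamble + start delimiter (8) + MAC header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD = 38
# IPv4 header (20) + TCP header (20), assuming no options.
IP_TCP_HEADERS = 40


def tcp_efficiency(mtu: int) -> float:
    """Fraction of wire bytes that carry TCP payload at the given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


def frames_per_second(mtu: int, link_bps: int = 10_000_000_000) -> float:
    """Full-size frames per second needed to fill the link at the given MTU."""
    return link_bps / 8 / (mtu + ETH_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: payload efficiency {tcp_efficiency(mtu):.2%}, "
            f"{frames_per_second(mtu):,.0f} frames/s at line rate"
        )
```

Under these assumptions payload efficiency goes from roughly 95% to roughly 99%, and the number of frames the hosts and switches must handle per second at line rate drops by close to a factor of six. That only holds if every hop in the path (vSwitch, physical NIC, switch ports, and the far end) is set to the larger MTU.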