VMware Cloud Community
Alwahidi
Contributor

vSphere 4.1 vDS performance impact

I have built my vSphere environment on HP G7 blade servers with Flex-10 Virtual Connect. The first couple of weeks were perfect. After some time I started having performance issues with all my VMs (including vCenter): slow response when accessing VMs, high network latency, and a large number of consecutive errors reporting that the paths to all datastores frequently change power status.

I removed two hosts from the vDS for testing's sake and waited to see whether the problem was isolated, but the problem remained.

Then I opened a case with the FC fabric vendor (EMC). They recommended changing the auto-negotiation setting of the switch ports from "Auto" to "Static". This did solve the problem, and the datastore connectivity message disappeared for a while. Today I returned the two hosts that I had removed to the vDS; a couple of minutes later the message appeared again and the same performance issue affecting the VMs resurfaced.

Any clue would be appreciated.

P.S. I will attach the relevant logs by tomorrow.

Abdul M. Alwahidi
6 Replies
peetz
Leadership

Hi,

there have been a lot of bugs with the firmware and drivers of the Emulex adapters in the HP G7 blades, especially related to vDS and VLAN tagging.

Be sure that you have the current firmware (available here: http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?swItem=co-99913-1) and use the recommended drivers for ESXi 4.1 that are listed here: http://kb.vmware.com/kb/2007397

We have been using these versions for about two months now and have not had any problems since.

- Andreas

Twitter: @VFrontDe, @ESXiPatches | https://esxi-patches.v-front.de | https://vibsdepot.v-front.de
Alwahidi
Contributor

It seems the problem isn't related to the vDS after all. I removed the host from the vDS and from vCenter and still have the same issue.

How can I find the exact versions of my storage and network adapters from within ESXi 4.1 U2? I know the server is using the be2net driver for the network, but I cannot determine its version, nor the model and driver version of my storage adapter. Since I'm using VUM in my setup, I can at least see that no driver update patches are downloaded. Does this mean that they are up to date? Do I now have to focus on my Flex firmware version?

Attached are the error message I get in ESXi and some relevant logs.

Abdul M. Alwahidi
peetz
Leadership

Hi,

please check the driver and firmware versions in the vSphere Client. Look at the "Hardware Status" tab, expand "Software Components", and look there for the "be2net device firmware" and "be2net driver" versions.

To install the current drivers you need to download them from VMware (links are in the KB article). Extract the offline_bundle files from the downloads and import them into VUM. You can then install them through VUM.

- Andreas

mitchellm3
Enthusiast

To check your firmware on HP blades:

Network adapters: from the CLI, run

ethtool -i vmnic0

If using FlexFabric (Emulex CNA):

cat /proc/scsi/lpfc820/1

If using a QLogic mezzanine:

cat /proc/scsi/qla2xxx/3
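As a rough sketch of how the ethtool -i output could be reduced to the values that matter here, the snippet below pulls out the driver name, driver version, and firmware version. The function name nic_versions and the sample values are assumptions for illustration only, not output captured from a real host. It also presumes a shell with awk available; if the ESXi console lacks it, saved output can be processed on an admin workstation instead.

```shell
#!/bin/sh
# Hypothetical helper: print "driver version firmware" from
# `ethtool -i` style output read on stdin.
nic_versions() {
    awk -F': *' '
        $1 == "driver"           { drv = $2 }
        $1 == "version"          { ver = $2 }
        $1 == "firmware-version" { fw  = $2 }
        END { printf "%s %s %s\n", drv, ver, fw }
    '
}

# Illustrative sample input (assumed values, not from a real host).
sample='driver: be2net
version: 4.0.306.0
firmware-version: 4.0.360.15
bus-info: 0000:02:00.0'

printf '%s\n' "$sample" | nic_versions
# prints: be2net 4.0.306.0 4.0.360.15
```

On a host, the same function would be fed directly, for example: ethtool -i vmnic0 | nic_versions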

I just had to update my last FlexFabric farm due to a SAN upgrade and saw the same datastore errors, so I'm assuming you are using FlexFabric. These are the levels we brought ourselves to in order to get rid of the errors:

Update nc533i OneConnect firmware to 4.0.360.15  (HP site)

Update Emulex driver to 8.2.105.34  (VMware site)

Update be2net driver to 4.0.306.0  (VMware site)

Upgrade ESXi 4.1 hosts to build 502767 (update manager)
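To check an installed version against levels like the ones above, a field-by-field numeric comparison is safer than a plain string comparison, which would order "4.0.90" after "4.0.306". The following is a minimal sketch under that assumption: the helper names version_ge and check are hypothetical, the component labels are free-form, and the "installed" values in the example calls are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if dotted version $1 >= $2,
# comparing numeric fields left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# check <label> <installed> <required>
check() {
    if version_ge "$2" "$3"; then
        echo "$1 $2 ok"
    else
        echo "$1 $2 below $3"
    fi
}

# Required levels are the ones listed above; installed values are invented.
check be2net 4.0.306.0 4.0.306.0     # prints: be2net 4.0.306.0 ok
check emulex-fc 8.2.1.30 8.2.105.34  # prints: emulex-fc 8.2.1.30 below 8.2.105.34
```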

We have decided not to use the FlexFabric fabric piece in our environment and instead use the QLogic mezzanine cards and Brocade switches. FlexFabric has burnt too many bridges. Flex-10/FlexFabric networking has been pretty solid.

Alwahidi
Contributor

Thanks Mitchell.

I checked the Ethernet card and HBA firmware versions along with the adapter driver versions; they are all at the latest levels.

What I did notice, though, was high congestion on the CX array where I'm placing my datastores. I will dig deeper into this and keep you posted.

Abdul M. Alwahidi
Alwahidi
Contributor

After opening a case with EMC, it turned out the array was having a lot of trespass occurrences due to an inconsistency in how the datastore LUN was distributed over the disk spindles. We created a new LUN over less loaded spindles and Storage vMotioned the VMs to the new datastore. Everything looks steady for now.

Thanks.

Abdul M. Alwahidi