wilsonlopes00
Contributor

vSphere 5.5 and Emulex OneConnect 10Gb NIC trouble

I have installed ESXi 5.5 on a server with Emulex OneConnect 10Gb NICs.

I have installed the latest driver for this NIC: elxnet-10.0.575.9-1OEM.550.0.0.1331820.x86_64.vib.

After some network activity from the virtual machines, the interfaces go down, even though the switch ports are up.

vmnic4  0000:05:00.00 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:16 9000   Emulex Corporation OneConnect 10Gb NIC

vmnic5  0000:05:00.01 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:18 9000   Emulex Corporation OneConnect 10Gb NIC
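For anyone who wants to check a whole host for this state, here is a minimal Python sketch that picks out the NICs reported as Down from a listing in the column layout shown above. The column order (name, PCI address, driver, link, speed, duplex, MAC, MTU, description) is an assumption taken from these two lines only; the names `Nic`, `parse_nic_line` and `down_nics` are made up for illustration.

```python
from typing import List, NamedTuple


class Nic(NamedTuple):
    name: str
    pci: str
    driver: str
    link: str
    speed: str
    duplex: str
    mac: str
    mtu: int
    description: str


# The two lines quoted in this post.
SAMPLE_LISTING = """\
vmnic4  0000:05:00.00 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:16 9000   Emulex Corporation OneConnect 10Gb NIC
vmnic5  0000:05:00.01 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:18 9000   Emulex Corporation OneConnect 10Gb NIC
"""


def parse_nic_line(line: str) -> Nic:
    # Assumed layout: eight space-free columns, then a free-text description.
    parts = line.split(None, 8)
    if len(parts) < 9:
        raise ValueError("unexpected NIC line: %r" % line)
    name, pci, driver, link, speed, duplex, mac, mtu, desc = parts
    return Nic(name, pci, driver, link, speed, duplex, mac, int(mtu), desc.strip())


def down_nics(listing: str, driver: str = "elxnet") -> List[str]:
    """Return the names of NICs bound to `driver` whose link column is Down."""
    names = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        nic = parse_nic_line(line)
        if nic.driver == driver and nic.link.lower() == "down":
            names.append(nic.name)
    return names


if __name__ == "__main__":
    print(down_nics(SAMPLE_LISTING))
```

A header row or differently shaped output would need to be skipped or handled before calling `parse_nic_line`.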

Here are the logs:

2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!

2013-11-19T15:49:12.396Z cpu2:33376)elxnet: elxnet_detectDumpUe:249: 0000:005:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

2013-11-19T15:49:12.396Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set

2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.1: UE Detected!!

2013-11-19T15:49:12.892Z cpu5:33377)elxnet: elxnet_detectDumpUe:249: 0000:005:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.1: UE lo: MPU bit set
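To find these events in a larger vmkernel log, something like the following Python sketch could group the `elxnet_detectDumpUe` messages by PCI device. The regular expression is written against the six lines above only, so treat the pattern as an assumption about the log format; `ue_events` is a hypothetical helper name.

```python
import re
from typing import Dict, List, Tuple

# The six log lines quoted in this post.
SAMPLE_LOG = """\
2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!
2013-11-19T15:49:12.396Z cpu2:33376)elxnet: elxnet_detectDumpUe:249: 0000:005:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.
2013-11-19T15:49:12.396Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set
2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.1: UE Detected!!
2013-11-19T15:49:12.892Z cpu5:33377)elxnet: elxnet_detectDumpUe:249: 0000:005:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.
2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.1: UE lo: MPU bit set
"""

# Assumed shape: "<timestamp> ...elxnet_detectDumpUe:<line>: <pci device>: <message>"
UE_LINE = re.compile(
    r"^(?P<ts>\S+) .*elxnet_detectDumpUe:\d+: "
    r"(?P<dev>[0-9A-Fa-f:.]+): (?P<msg>.+)$"
)


def ue_events(log_text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Map each PCI device to its (timestamp, message) UE log entries."""
    events: Dict[str, List[Tuple[str, str]]] = {}
    for line in log_text.splitlines():
        match = UE_LINE.match(line.strip())
        if match:
            events.setdefault(match.group("dev"), []).append(
                (match.group("ts"), match.group("msg"))
            )
    return events


if __name__ == "__main__":
    for dev, entries in sorted(ue_events(SAMPLE_LOG).items()):
        print(dev, "first seen at", entries[0][0])
```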

Has anyone had similar trouble?

122 Replies
Ryanotown22
Contributor

I am using elxnet with the updated driver 10.2.298.5 and firmware 10.2.340.19. I seem to be getting disconnects and dropped packets under high load, such as during vMotions. Sometimes vMotions fail or take an extremely long time to complete. We are running 5.5 Update 2. I opened a case with HP and VMware for help on this and will post any response I get.

AlbertWT
Virtuoso

Ryanotown22, yes, me too. I'm having a VM network problem where TCP retransmissions and TCP resets are intermittently quite high.

I'm running ESXi 5.1 Update 1 on my HP BL 465c G7 and G8 blades. Please let us know how you go with the case you logged with VMware, as I'm keen to know what the culprit could be.

/* Any kind of comment or input would be greatly appreciated */
AlbertWT
Virtuoso

wilber822, have you tried the driver be2net-10.2.293.0-1869542.zip?

I was seeing similar symptoms: ESXi hosts disconnecting at random, plus other NIC-related errors that caused the VMs to disconnect and left the host stuck.

This is what I found from the logs.

VMware ESXi 5.1.0 Update 1

*** vmkernel.log ***

2015-02-12T12:06:16.604Z cpu12:33621979)vmnic6: UE happened ...   <---  Unrecoverable Error on the NIC, this has caused the NICs to crash

2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_low: 0x20

2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_hi: 0x0

2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_low_mask: 0x4000140

2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_hi_mask: 0x0
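As a rough way to read these registers, here is a small Python sketch that lists which bit positions are set in the status and mask values above. It assumes that bits set in the mask are meant to be ignored when deciding whether an error is reported; that semantics is an assumption on my part, and the thread does not say which hardware block each bit position stands for. `set_bits` and `unmasked_ue_bits` are illustrative names.

```python
from typing import List

# Values copied from the vmkernel.log excerpt above.
UE_STATUS_LOW = 0x20
UE_STATUS_LOW_MASK = 0x4000140


def set_bits(value: int, width: int = 32) -> List[int]:
    """Return the positions (0 = least significant) of the bits set in value."""
    return [bit for bit in range(width) if (value >> bit) & 1]


def unmasked_ue_bits(status: int, mask: int, width: int = 32) -> List[int]:
    """Bits set in status but not in mask.

    Assumption: a 1 in the mask means "ignore this bit".
    """
    keep = ~mask & ((1 << width) - 1)
    return set_bits(status & keep, width)


if __name__ == "__main__":
    print("status bits:", set_bits(UE_STATUS_LOW))
    print("masked bits:", set_bits(UE_STATUS_LOW_MASK))
    print("reported bits:", unmasked_ue_bits(UE_STATUS_LOW, UE_STATUS_LOW_MASK))
```

Under that assumption, the only status bit in the excerpt (bit 5) is not covered by the mask.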

AlbertWT
Virtuoso

PeteSu, let us know how you go with the new ESXi network driver, and which firmware version turns out to be stable 🙂

jessem
Enthusiast

Yes, I am another customer having problems with this exact setup. During the evening we get paged because vCenter sees hosts as disconnected, but in reality packets are being dropped. When I run a ping test from vCenter to any host, packet loss averages about 14%.

I'm running vCenter 5.5 U2d with ESXi 5.5 hosts on HP's 5.5 build 2302651.

The drivers/firmware we had running were:

firmware: 10.2.340.19

driver: 10.2.298.5

We downgraded the driver to 10.0.783.13 but still experienced the same issue. I am trying to find firmware 4.9.416.01, but I can't find the download anymore, and HP's site is down for that 554FLB card.

PeteSu
Contributor

AlbertWT, so far, driver 4.9.288.0 and firmware 4.9.416.0 in legacy mode on our NC553i seem to be stable. There is no receive packet loss, unlike with the native-mode 10.0.725.2 and 10.2.298.5 drivers.

Emulex has 10.2.445.0 posted on their website, and others in this thread have reported some success with development driver 10.2.261.625.

I haven't been able to get HP to provide an ETA on when they'll have the new Emulex drivers certified. Considering that we've had issues with both the 10.0.x and 10.2.x drivers in native mode, if legacy mode and 4.9.x provide more stability in our environment, then we'll stick with that for now.

Of course, your mileage may vary, so you should test on a development host before applying to the rest of your environment. If you haven't already done so, I would also recommend opening a support case with HP. With more people reporting issues with Emulex-based VCs, maybe they'll hurry up and certify the new drivers (and hopefully test thoroughly this time).

There are 2 HP advisories for the 4.9.416 firmware, so you should check whether they apply to your VC model.

http://h20566.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=4145106&docId=emr_na-c04218016&docLocale...

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04326096

wilber822
Enthusiast

Hi PeteSu,

Thanks for sharing your case.

The debug driver we used targets the problem "Emulex NIC lost network connectivity on ESXi 5.5". It has run stably in our environment.

I have to give HP support negative feedback: I asked them about a week ago when they will have a GA release for my issue, and there has been no response so far.

Hopefully they will give you an ETA soon.

On the other hand, it looks like a lot of people have problems with the Emulex 10.x drivers on ESXi 5.5. I think this significantly impacts production.

https://www.zhengwu.org
wilber822
Enthusiast

Hi AlbertWT

My understanding is that whenever you see an Unrecoverable Error on the NIC, it is similar to my issue. You should also observe a high InPauseFrame count on the Virtual Connect module if you are using an HP blade system.

The UE error has not yet been fixed by Emulex for particular NIC modules.

MartynThomas
Contributor

Could any of you guys who have the same UE issue with the NC55x cards let me know what version of OA and VC firmware you're running?

Cheers,

Martyn

AlbertWT
Virtuoso

Hi Jess,

Try this site: ftp://ftp.hp.com/pub/softlib2/software1/pubsw-generic/p520687518/v101011 and let us know how the firmware downgrade performs.

AlbertWT
Virtuoso

Pete,

Thanks for the reply. The HP advisory page that you sent me, http://h20566.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=4145106&docId=emr_na-c04218016&docLocale... suggests that there is a problem with Emulex be2net firmware version 4.9.416.2,

but according to page 17 of the latest (February 2015) HP-VMware recipe book:

[Image: Recipe.jpg]

the stable version is 4.9.416.2.

Source: Recommended Firmware and Driver for HP http://vibsdepot.hp.com/hpq/recipes/ (Feb2015VMwareRecipeSPP201409_16.4.pdf)

Ryanotown22
Contributor

I have been moved on to level 2 support and am waiting for an update from them. Hosts keep pausing or freezing, with network RX drops, stuck vMotions, and All Paths Down (APD) to our NFS storage. I am on the latest firmware and drivers provided by HP on the HP ESXi 5.5 Update 2 disk and the HP SPP disk. These are all elxnet.

HP DL360p G8
Emulex HP NC552SFP Dual Port 10GbE
Firmware 10.2.340.19
Driver 10.2.298.5

Emulex HP FlexFabric 10Gb 2 Port 554FLR-SFP+
Firmware 10.2.340.19
Driver 10.2.298.5

BL 460c G8
Emulex HP FlexFabric 554FLB 10Gb 2 Port
Firmware 10.2.340.19
Driver 10.2.298.5

BL 460c G7
Emulex HP NC553i Dual Port FlexFabric 10Gb Converged Network Adapter
Firmware 10.2.340.19
Driver 10.2.298.5

jessem
Enthusiast

Thanks everyone for chiming in.

So far we have been running error-free for 2 days with the following on the Emulex 554FLB:

elxnet driver: 10.0.783.13

elxnet device firmware: 4.9.416.0

...obviously not running legacy mode.

Ryanotown22
Contributor


I was just told to downgrade to the versions below. HP support said another customer downgraded to them and had no issues. They also said that Emulex will release new firmware in the next few weeks that fixes this, and that it will be included in HP's March recipe book. I am going to try this now and check stability.

Firmware 4.9.416.0

Firmware:   http://h20564.www2.hp.com/hpsc/swd/public/detail?sp4ts.oid=5215387&swItemId=co_131997_1&swEnvOid=54

Driver 10.0.725.2

Driver: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI55-EMULEX-ELXNET-1007252&productId=353

AlbertWT
Virtuoso

Thanks, jessem, for the detailed reply. However, the Emulex model in my HP BL 465c G7 is the NC551i, not the 554FLB that is in my HP BL 465c G8.

The HP support engineer working on my case suggested that I install the following versions, based on the February 2015 HP-VMware recipe PDF:

firmware version 4.9.416.2

driver version 10.2.293.0

Here are the versions currently running on the HP BL 465c G8 with the 554FLB:

~ # ethtool -i vmnic0

driver: be2net

version: 10.2.293.0

firmware-version: 10.2.340.19

bus-info: 0000:04:00.0
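To compare what a host is running against a target set of versions, a minimal Python sketch could parse the `ethtool -i` output above into a dictionary and report the fields that differ. The target values used here are the ones HP support suggested in this post; `parse_ethtool_info` and `version_mismatches` are hypothetical helper names, not part of any tool.

```python
from typing import Dict, Optional, Tuple

# Output quoted in this post (the shell prompt line is ignored by the parser).
SAMPLE_ETHTOOL = """\
~ # ethtool -i vmnic0
driver: be2net
version: 10.2.293.0
firmware-version: 10.2.340.19
bus-info: 0000:04:00.0
"""

# Versions suggested by HP support, as quoted earlier in this post.
SUGGESTED = {"version": "10.2.293.0", "firmware-version": "4.9.416.2"}


def parse_ethtool_info(text: str) -> Dict[str, str]:
    """Turn 'key: value' lines into a dict, splitting on the first colon only."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def version_mismatches(
    info: Dict[str, str], wanted: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {field: (running, wanted)} for every field that differs."""
    return {
        field: (info.get(field), target)
        for field, target in wanted.items()
        if info.get(field) != target
    }


if __name__ == "__main__":
    running = parse_ethtool_info(SAMPLE_ETHTOOL)
    for field, (have, want) in version_mismatches(running, SUGGESTED).items():
        print("%s: running %s, suggested %s" % (field, have, want))
```

This only does an exact string comparison; it makes no claim about which version is newer or better.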

Cheers,

jessem
Enthusiast

Are you running 5.5 with that driver on your G8?

jessem
Enthusiast

All,

Unfortunately, this package didn't work for us. So now I guess we will downgrade the driver one more level.

AlbertWT
Virtuoso

Hi Jesse,

Not yet; my ESXi hosts are all on 5.1 Update 2.

AlbertWT
Virtuoso

Let us know how it goes, mate, after you've downgraded the driver and firmware sets.

Ryanotown22
Contributor

I downgraded, but recently had an odd issue where Linux VMs with NFS mounted inside the guest had their mounts disconnected. Not sure if it was related, and there seemed to be nothing in the logs about it.
