VMware Cloud Community
wilsonlopes00
Contributor

vSphere 5.5 and Emulex OneConnect 10Gb NIC trouble

I have installed ESXi 5.5 on a server with Emulex OneConnect 10Gb NICs.

I have installed the latest driver for this NIC: elxnet-10.0.575.9-1OEM.550.0.0.1331820.x86_64.vib.

After some network activity from the virtual machines, the interfaces go down even though the switch ports are up.

vmnic4  0000:05:00.00 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:16 9000   Emulex Corporation OneConnect 10Gb NIC

vmnic5  0000:05:00.01 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:18 9000   Emulex Corporation OneConnect 10Gb NIC
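
For anyone checking several hosts for the same symptom, rows like the two above can be filtered with a short script. This is only a sketch in Python: it assumes the rows come from `esxcli network nic list` and that the columns are name, PCI address, driver, link, speed, duplex, MAC, MTU and description, which matches the two rows quoted here but is not confirmed for other builds. The function names are made up for illustration.

```python
# Sample rows copied from the post; the column order is an assumption.
SAMPLE = """\
vmnic4  0000:05:00.00 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:16 9000   Emulex Corporation OneConnect 10Gb NIC
vmnic5  0000:05:00.01 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:18 9000   Emulex Corporation OneConnect 10Gb NIC
"""


def parse_nic_rows(text):
    """Turn each vmnic row into a dict of named fields (hypothetical helper)."""
    nics = []
    for line in text.splitlines():
        # Split on whitespace at most 8 times so the description stays whole.
        parts = line.split(None, 8)
        if len(parts) < 9 or not parts[0].startswith("vmnic"):
            continue  # skip headers, separators, and truncated rows
        name, pci, driver, link, speed, duplex, mac, mtu, desc = parts
        nics.append({
            "name": name, "pci": pci, "driver": driver, "link": link,
            "speed": speed, "duplex": duplex, "mac": mac,
            "mtu": int(mtu), "description": desc,
        })
    return nics


def down_nics(nics, driver="elxnet"):
    """Return the names of NICs bound to the given driver whose link is Down."""
    return [n["name"] for n in nics
            if n["driver"] == driver and n["link"] == "Down"]


if __name__ == "__main__":
    print(down_nics(parse_nic_rows(SAMPLE)))
```

Filtering on the driver name keeps NICs handled by other drivers out of the report.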

Here are the logs:

2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!

2013-11-19T15:49:12.396Z cpu2:33376)elxnet: elxnet_detectDumpUe:249: 0000:005:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

2013-11-19T15:49:12.396Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set

2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.1: UE Detected!!

2013-11-19T15:49:12.892Z cpu5:33377)elxnet: elxnet_detectDumpUe:249: 0000:005:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.1: UE lo: MPU bit set
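
To see quickly which ports reported an unrecoverable error and when, log lines in this format can be scanned with a regular expression. The following Python sketch is built only from the six lines quoted above; the pattern and the function name `find_ue_events` are assumptions for illustration and may need adjusting for other driver versions.

```python
import re

# Pattern derived from the quoted lines: timestamp, cpu/world id, optional
# "WARNING: " prefix, the elxnet function name, the PCI address, the message.
UE_PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)(?:WARNING: )?"
    r"elxnet: elxnet_detectDumpUe:\d+: "
    r"(?P<pci>[0-9a-fA-F:.]+): (?P<msg>.*)$"
)

SAMPLE_LOG = [
    "2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!",
    "2013-11-19T15:49:12.396Z cpu2:33376)elxnet: elxnet_detectDumpUe:249: 0000:005:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.",
    "2013-11-19T15:49:12.396Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set",
    "2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.1: UE Detected!!",
]


def find_ue_events(lines):
    """Map each PCI address to the timestamps of its 'UE Detected' lines."""
    events = {}
    for line in lines:
        match = UE_PATTERN.match(line.strip())
        if match and "UE Detected" in match.group("msg"):
            events.setdefault(match.group("pci"), []).append(match.group("ts"))
    return events


if __name__ == "__main__":
    for pci, stamps in find_ue_events(SAMPLE_LOG).items():
        print(pci, stamps)
```

On a host, the lines would come from the vmkernel log (often /var/log/vmkernel.log on ESXi, stated here as an assumption) instead of the sample list.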

Has anyone had similar trouble?

122 Replies
wilber822
Enthusiast

Has anybody experienced the same problem? Did you get it fixed?

https://www.zhengwu.org
ignos
Contributor

In our case upgrading the Emulex card firmware fixed this issue.

dsohayda
Enthusiast

HP support sent us this link as a workaround for issues with our BL460c G7 running driver 10.2.298.5 and firmware 10.2.340.19 on the OneConnect 10Gb Emulex NC553i:

VMware KB: Emulex OneConnect network cards missing with elxnet driver 10.0.725.2 and later in ESXi 5...

AlbertWT
Virtuoso

So is there any update on this problem?

/* Please feel free to provide any comments or input you may have. */
wilber822
Enthusiast

In my case, HP told me that Emulex found something in the OneCapture logs. They are working on a solution now.

It's been 3 months since the first outage.

https://www.zhengwu.org
AlbertWT
Virtuoso

Wilber, thanks for the update.

Please let us know here which patch or update needs to be applied.

/* Please feel free to provide any comments or input you may have. */
Alistar
Expert

We had the same issue in our environment a few times as well. On a few hosts, the firmware and driver upgrade helped to alleviate the issue. A "network stress test", which consisted of copying a large VMDK to another ESXi host from the SSH shell, revealed whether the issue was remedied or not (the UE usually happened within 15 minutes). On one server, however, it was indeed a hardware fault, and after we had the NIC replaced the errors stopped appearing.

Stop by my blog if you'd like 🙂 I dabble in vSphere troubleshooting, PowerCLI scripting and NetApp storage - and I share my journeys at http://vmxp.wordpress.com/
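
The pass/fail check for a stress test like the one described above can be automated once the test start time is known. This is a rough Python sketch, not the poster's actual procedure: it assumes log lines carry a leading UTC timestamp in the format quoted at the top of the thread, and the helper name `first_ue_after` and the 15-minute default window are illustrative choices.

```python
from datetime import datetime, timedelta

# Timestamp layout taken from the log lines quoted in the first post.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def first_ue_after(lines, test_start, window=timedelta(minutes=15)):
    """Return the delay to the first 'UE Detected' line inside the window.

    Returns None when no such line falls between test_start and
    test_start + window.
    """
    for line in lines:
        if "UE Detected" not in line:
            continue
        stamp = datetime.strptime(line.split()[0], TS_FORMAT)
        delay = stamp - test_start
        if timedelta(0) <= delay <= window:
            return delay
    return None


if __name__ == "__main__":
    log = [
        "2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: "
        "elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!",
    ]
    # Hypothetical start time of the VMDK copy.
    start = datetime(2013, 11, 19, 15, 40, 0)
    print(first_ue_after(log, start))
```

A result of None after the window has elapsed suggests, but does not prove, that the host survived the test; several posters here could no longer reproduce the fault at all.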
wilber822
Enthusiast

HP sent me a debug driver to collect additional logs if the issue happens again.

But somehow neither the debug driver nor the problem driver triggers the issue any more. I'm still trying to reproduce it.

https://www.zhengwu.org
MartynThomas
Contributor

Sorry to dig up an old thread, but did you ever receive a resolution to this issue?

I recently pushed out the recommended firmware and drivers listed on the HCL to a number of blades in my estate and I've encountered the same issue. I've had the same fault across a number of blades so I fail to see this as a hardware fault, and the very same blades remain stable if we use the unsupported be2net driver instead.

Thanks,

Martyn

jhwagner
Contributor

I have a slightly different problem with the Emulex 10Gb NIC. I'm running HP BL460c G7s in the c7000 chassis with the Emulex 10Gb Ethernet cards. My ESXi 5.5 hosts stay connected fine, but I have virtual machines on several different VLANs, and the virtual machines appear to lose the ability to talk at random times. Sometimes I have to disconnect their NIC and reconnect it to fix the problem; other times I can only get them to talk by migrating them to a different host. I was using the be2net driver, but HP had me change to elxnet and update the driver to 10.2.298.5 and the firmware to 10.2.340.19. This did not resolve my problems. VMware hasn't been able to help either. Is anyone else experiencing this, or has anyone found a solution?

MartynThomas
Contributor

I have that issue too, although ironically I don't see the issue at my other sites using identical hardware, firmware and software builds.

wilber822
Enthusiast

Guys, Emulex confirmed that it's a driver issue. They gave me a beta driver containing a possible fix, but I'm not able to reproduce the issue any more, even with the original driver. So I can't say it's fixed...

https://www.zhengwu.org
AlbertWT
Virtuoso

Let us know the result here, wilber822.

/* Please feel free to provide any comments or input you may have. */
LordChares
Contributor

Hey wilber,

Please let us know how it goes.

I'm having a similar issue.

Of the 2 internal copper interfaces of an HS23, one connects correctly and the other never connects. Even though it gets link and negotiates speed, it never gets a DHCP response, and it has no connectivity with a static IP either.

I checked all switches, even replaced the switch modules with a base config, and also updated all firmware (blades, chassis, switches, SAN switches, etc.), to no avail.

Running ESXi 5.5u2, fully patched.

jhwagner
Contributor

I've tried it on our current test build of 5.5 U1, and I've tried it on 5.5 U2 with the drivers and firmware that HP suggested. No luck. I've even tried running the be2net drivers, again without any luck. If I can't get this fixed soon it will be a deal breaker, and I'll have to reload our entire environment with 5.1. I'm already running a small 5.1 environment for our other network and, interestingly enough, I don't have this problem there even though I'm using the exact same equipment. Go figure.

Hoping you guys can help me figure it out soon. I've had tickets open since August.

LordChares
Contributor

Hey Guys,

Any updates?

Has anybody had the same problem as me? One interface works; the other gets link and negotiates speed, but doesn't get an IP and can't communicate.

Cheers

MartynThomas
Contributor

I'm really struggling with this at the moment. There appears to be so much conflict between the supported/recommended firmware and driver combinations.

HP are saying the following:

Elxnet Driver version: 10.2.298.5

Firmware version: 10.2.340.19

According to http://vibsdepot.hp.com/hpq/recipes/HP-VMware-Recipe.pdf

VMware are saying, use the following:

Elxnet Driver version: 10.2.298.5

Firmware version: 10.2.298.21

According to the HCL: VMware Compatibility Guide: I/O Device Search


On the other hand, they're also saying use this:

Elxnet Driver version: 10.2.298.5

Firmware version: 10.2.340.10

According to the HP Flex-10 / Flex-Fabric Doc: http://partnerweb.vmware.com/programs/hcl/ESX_Flex_config.pdf

Emulex are claiming that the following are recommended:

Firmware version: 10.2.323.39 or 10.2.340.19

According to VMware Recommended Software Matrix

Emulex have refused to help me as my device is not a true Emulex product, and have referred me to HP. HP are struggling to understand the problem and keep pointing me to firmware I already have. VMware want me to try the firmware recommended on their HCL; however, I can't seem to find that firmware in order to try it.

Does anyone have either firmware 10.2.298.21 or 10.2.340.10?

Cheers,

Martyn
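
To make the overlap between these recommendations easier to see, the firmware versions listed in the post above can be grouped by source. This Python sketch uses only the versions quoted there; the source labels, the function names, and the "at least two sources" threshold are illustrative assumptions, not an official support statement.

```python
from collections import defaultdict

# Firmware versions exactly as quoted in the post; labels are shorthand.
RECOMMENDED_FIRMWARE = {
    "HP VMware Recipe": ["10.2.340.19"],
    "VMware HCL": ["10.2.298.21"],
    "HP Flex-10 / Flex-Fabric doc": ["10.2.340.10"],
    "Emulex": ["10.2.323.39", "10.2.340.19"],
}


def sources_by_firmware(recommendations):
    """Invert the mapping: firmware version -> list of sources naming it."""
    by_fw = defaultdict(list)
    for source, versions in recommendations.items():
        for version in versions:
            by_fw[version].append(source)
    return dict(by_fw)


def version_key(version):
    """Sort dotted versions numerically rather than as plain strings."""
    return tuple(int(part) for part in version.split("."))


def agreed_versions(recommendations, minimum=2):
    """Return versions recommended by at least `minimum` sources, sorted."""
    by_fw = sources_by_firmware(recommendations)
    return sorted(
        (v for v, sources in by_fw.items() if len(sources) >= minimum),
        key=version_key,
    )


if __name__ == "__main__":
    for version, sources in sorted(
        sources_by_firmware(RECOMMENDED_FIRMWARE).items(),
        key=lambda item: version_key(item[0]),
    ):
        print(version, "<-", ", ".join(sources))
    print("Named by two or more sources:", agreed_versions(RECOMMENDED_FIRMWARE))
```

Every source listed pairs its firmware with the same elxnet driver, 10.2.298.5, so only the firmware column is compared here.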

wilber822
Enthusiast

Hi LordChares,

I don't think your issue is similar to mine. 🙂

https://www.zhengwu.org
wilber822
Enthusiast

Hi jhwagner,

I'm considering rolling back to ESXi 5.1 U1, since we have also had a lot of other network and storage problems after upgrading to ESXi 5.5 U1. It's definitely an unstable version.

https://www.zhengwu.org
wilber822
Enthusiast

Hi MartynThomas,

I used the combination below and got the problem.

Elxnet Driver version: 10.2.298.5

Firmware version: 10.2.340.19

HP told me, after working with Emulex, that it's a driver issue, not a firmware issue.

Now I'm testing driver version 10.2.261.6251-1OEM.550.0.0.1331820, which HP provided with DEBUG options and a possible fix. They told me I need to collect a vm-support bundle if the issue occurs again, as the logs will contain additional info.

Unfortunately, I cannot reproduce the issue.

Then I reinstalled driver 10.2.298.5, but I still cannot reproduce the issue. I simulated high network utilization on the VM network and the vmk port, but no luck. 😞

https://www.zhengwu.org