VMware vSphere

PSOD caused by bnx2 driver for FCoE adapter

    Posted Jun 24, 2019 08:31 PM

    Hello,

    I opened a ticket with VMware support regarding a host PSOD. We have had 3 random host crashes over the past 5 weeks.

    Support is telling me the PSOD is the result of the bnx2fc driver causing deadlock contention. We have HP ProLiant DL380 Gen9 servers with 4 CNA adapters, two for network and two for storage. With FCoE enabled on the ports, ESXi tries to discover the FCoE VLAN and targets; when it doesn't discover an FCoE connection, it keeps having the driver retry discovery. Support is indicating that during this discovery process bnx2fc causes deadlock contention, and that this is what causes the PSOD, due to faulty driver code.

    I am being told there is no KB or advisory and that it's only documented internally. When searching, I found the following KB. Based on what support sent me, the symptoms and workaround are almost identical to what it outlines: https://kb.vmware.com/s/article/2120523

    I also found other third-party sites discussing the issue:

    https://www.teimouri.net/retry-vlan-discovery/

    None of these mention the bnx2fc driver causing contention during discovery. I would think that if this driver were known to cause PSODs, it would be documented. I wanted to post to see if anyone had feedback. Support is recommending I run the following command: esxcfg-module -s "bnx2fc_default_vlan=0" bnx2fc

    My understanding is that this command is used for managing kernel module drivers. If we set the default VLAN to 0, what would the ramifications be for adapters that are connected to storage?
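
    For reference, here is a rough sketch of how I would expect to apply and verify the change on a host. It is not something support gave me. The `run` helper and the `APPLY` variable are my own hypothetical additions so the script only prints the commands unless explicitly told to execute them; the module name and option string are the ones support quoted. As far as I know, `esxcfg-module -s` replaces the module's whole option string, so I would record the current options first, and I am assuming the new value only takes effect after the module is reloaded, which in practice means a host reboot.

```shell
#!/bin/sh
# Dry-run sketch: prints each command unless APPLY=1 is set on the ESXi host.
MODULE="bnx2fc"
OPTION="bnx2fc_default_vlan=0"   # option string quoted by support

# Hypothetical helper: execute the command only when APPLY=1, else echo it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# 1. Record the options currently set on the module (needed for rollback).
run esxcfg-module -g "$MODULE"

# 2. List the parameters the driver exposes, with their descriptions.
run esxcli system module parameters list -m "$MODULE"

# 3. Apply the option string recommended by support.
run esxcfg-module -s "$OPTION" "$MODULE"

# 4. Confirm the option string was stored.
run esxcfg-module -g "$MODULE"
```

    Running it without `APPLY=1` just lists the four commands, which makes it easy to review before touching a production host.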

    Any insight is appreciated.

    Thanks,