We have updated the driver and firmware for the Broadcom 10Gb NICs.
Since then we have seen the following log spew on the host:
2016-01-14T10:11:25.323Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
2016-01-14T10:11:26.186Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
2016-01-14T10:11:27.189Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
2016-01-14T10:11:27.989Z cpu12:33599)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
Has anyone experienced something similar, or does anyone know whether this is a concern?
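To gauge how often these messages occur and on which NIC, the log lines can be tallied with a small script. This is only a minimal sketch: it assumes lines in the same format as the excerpts in this thread, and the names `BNX2X_LINE` and `count_bnx2x_messages` are made up for illustration, not part of any VMware or QLogic tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpts in this thread:
# 2016-01-14T10:11:25.323Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
BNX2X_LINE = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)<\d+>"
    r"\[(?P<func>bnx2x_\w+):\d+\((?P<nic>vmnic\d+)\)\](?P<msg>.*)$"
)


def count_bnx2x_messages(lines):
    """Count bnx2x driver messages per (nic, function, message)."""
    counts = Counter()
    for line in lines:
        match = BNX2X_LINE.match(line.strip())
        if match:
            key = (match["nic"], match["func"], match["msg"].strip())
            counts[key] += 1
    return counts


# Sample input: the four lines posted above.
sample = [
    "2016-01-14T10:11:25.323Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge",
    "2016-01-14T10:11:26.186Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge",
    "2016-01-14T10:11:27.189Z cpu12:33130)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge",
    "2016-01-14T10:11:27.989Z cpu12:33599)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge",
]

for (nic, func, msg), n in count_bnx2x_messages(sample).items():
    print(f"{nic} {func}: {msg!r} x{n}")
```

The same function can be fed the lines of a vmkernel log copied off the host; lines that do not match the assumed format are simply ignored.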
We have experienced weird problems with uploads (stalled) on some VMs with heavy network utilization. The ESXi hosts were HP blades with driver 2.712.70.v55.3 (fw 7.13.23), which is the recommended version in the HP recipe (https://vibsdepot.hpe.com/hpq/recipes/HPE-VMware-Recipe.pdf).
The ESXi logs with version 2.712.70.v55.3 are full of:
2016-06-27T16:28:23.868Z cpu14:32868)<3>[bnx2x_esx_init_rx_ring:1946(vmnic0)]disabling TPA for queue
2016-06-27T16:28:23.890Z cpu14:32868)<3>[bnx2x_dynamic_alloc_rx_queue_single:789(vmnic0)]Could not start queue:1
2016-06-27T16:28:28.866Z cpu0:32868)<3>[bnx2x_alloc_rx_sge:734(vmnic0)]Can't alloc sge
We are testing the newer version 2.713.10.v55.4 and it seems promising. I'm planning to open a support case with VMware and HP.
I have found "Slow backup performance", which mentions these errors in connection with slow performance.
After a year, we are experiencing similar behavior again after updates. ESXi is 6.5 Update 1 (6.5.0 #1 SMP Release build-5969303 Jul 6 2017) with the latest HP driver:
Bus Info: 0000:04:00.1
Firmware Version: bc 7.13.75
I have found that a newer driver should exist, but the download is not available at "Download VMware vSphere".
I have also found a blog post which mentions some network adapters bricked by this driver: http://www.thevirtualist.org/bricked-qlogic-broadcom-bcm57840-driver-update/
Maybe that is the reason for the call out?
Is anyone experiencing the same behavior again?
Please see this link for the newer driver - https://my.vmware.com/group/vmware/details?downloadGroup=DT-ESX60-QLOGIC-BNX2X-271330V608&productId=...
I have the same issue on Broadcom cards in my hosts with this in the logs and it is actually causing APD in my iSCSI adapters. I am going to try the newer driver to see what happens as we have the latest firmware - 7.14.37.
Thank you for the link to the driver; I finally have the release notes. I had found the driver itself earlier in a Dell ISO image and extracted it, but without a change log.
In the release notes we can see:
2. Problem: CQ93244: VM traffic stops after error recovery.
Change: Invoke stop_queue() before disable TPA, to avoid memory leak.
Free Rx filter table during Recovery unload, to avoid memory leak.
We can see that memory leak in the logs, just before the first bnx2x_esx_init_rx_ring error:
2017-09-19T21:18:34.615Z cpu3:65645)vmklinux: alloc_pages:1010: This message has repeated 1 times: gfp_mask=0xd0, order=0x0, vmk_PktSlabAllocPage returned 'Out of memory'
2017-09-19T21:18:34.615Z cpu3:65645)<3>[bnx2x_esx_init_rx_ring:1960(vmnic0)]was only able to allocate 192 rx sges
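To check whether other hosts show the same ordering (an allocation failure logged before the first bnx2x_esx_init_rx_ring message), something like the following could be run over a copy of the log. It is a rough sketch based only on the two lines above; the function name is hypothetical and the plain substring matching is an assumption about what the relevant lines look like.

```python
def oom_before_first_rx_ring_error(lines):
    """Return the 'Out of memory' lines that appear before the first
    bnx2x_esx_init_rx_ring message; empty list if that message never appears."""
    oom_lines = []
    for line in lines:
        if "bnx2x_esx_init_rx_ring" in line:
            # First rx ring message reached: report what preceded it.
            return oom_lines
        if "Out of memory" in line:
            oom_lines.append(line)
    return []


# Sample input: the two lines posted above.
sample = [
    "2017-09-19T21:18:34.615Z cpu3:65645)vmklinux: alloc_pages:1010: This message has repeated 1 times: "
    "gfp_mask=0xd0, order=0x0, vmk_PktSlabAllocPage returned 'Out of memory'",
    "2017-09-19T21:18:34.615Z cpu3:65645)<3>[bnx2x_esx_init_rx_ring:1960(vmnic0)]was only able to allocate 192 rx sges",
]

for line in oom_before_first_rx_ring_error(sample):
    print(line)
```

An empty result only means the pattern was not found in that log excerpt; it does not rule out the driver issue.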
It seems the HPE ISO image and the HPE driver release contain this problematic driver. Dell has a newer, fixed driver.
"HPE has released SPP 2017.07.2 as a replacement for SPP 2017.07.1 due to a recently discovered Software/Firmware issue with the HPE QLogic NX2 10/20GbE Multifunction Drivers for VMware vSphere 6.0/6.5, version 2017.07.07"