Hello,
We are encountering weird behaviour with some of our servers.
They are all running ESXi 6.7, fully patched, on R720xd hardware with the latest firmware/BIOS.
Without any warning, one of the cards in the switch starts to drop all transmit packets (esxtop goes to 100% for a second), and it is not consistent.
We changed cables, same issue. The CDP information is no longer shown on the switch, and the port doesn't show any errors or drops.
We did a packet capture: RX traffic is still working, but nothing goes out through this NIC.
The network card is an Intel Corporation 82576 Gigabit Network Connection using the igb driver.
The only way to recover is to restart the host or bring the port on the switch down/up.
Oli
Does it happen on all the hosts with all the NIC adapters? Or problem with just one host or a specific NIC adapter or a specific switch port?
Cheers,
Supreet
Hello Supreet,
It happens on 2 hosts out of 5, to be more precise. The second one hasn't failed since we changed all the cables.
On the most affected host, it tends to always be the same NIC, or at least another NIC on the same 4-port hardware card.
We didn't have the problem while running 6.5, but before the upgrade we also updated all the firmware, which doesn't help pinpoint the root cause unless I find evidence in a log or capture.
It looks like a driver/firmware issue at the Intel card level. Can you provide the firmware and driver versions of the card, along with the card model (run vmkchdev -l | grep -i <vmnic name>)?
Here you go,
0000:45:00.1 8086:10e8 8086:a02c vmkernel vmnic7
Oli
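The vmkchdev line above packs the PCI identifiers that a compatibility-guide search needs: the second field is the vendor:device ID pair and the third is the subvendor:subdevice ID pair. A minimal sketch of pulling them apart in plain POSIX shell, assuming the five-field layout shown in the post (the variable names are just illustrative):

```shell
# Sample line copied from the thread (output of: vmkchdev -l | grep -i vmnic7)
line='0000:45:00.1 8086:10e8 8086:a02c vmkernel vmnic7'

# Split the five whitespace-separated fields.
read -r pci_addr vid_did svid_ssid owner nic <<EOF
$line
EOF

# Split each "xxxx:yyyy" pair on the colon.
vid=${vid_did%%:*}
did=${vid_did##*:}
svid=${svid_ssid%%:*}
ssid=${svid_ssid##*:}

echo "$nic ($pci_addr): VID=$vid DID=$did SVID=$svid SSID=$ssid"
```

For the line in this thread it prints VID=8086 DID=10e8 SVID=8086 SSID=a02c, which are the four values to enter when searching for the card.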
Here is some extra information:
ethtool -i vmnic7
driver: igb
version: 5.3.3
firmware-version: 1.5, 0x0001616a
bus-info: 0000:45:00.1
I checked the VMware compatibility page and it seems we are good.
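When comparing several hosts, it can help to reduce the ethtool -i output to just the driver name, driver version and firmware version, since those are the values a compatibility listing is checked against. A rough sketch, assuming the "key: value" layout pasted above; get_field is a hypothetical helper, not part of ethtool:

```shell
# Sample output copied from the thread (ethtool -i vmnic7)
info='driver: igb
version: 5.3.3
firmware-version: 1.5, 0x0001616a
bus-info: 0000:45:00.1'

# get_field (hypothetical helper): print the value that follows "<key>: ".
# The pattern is anchored at line start, so "version" does not match
# the "firmware-version" line.
get_field() {
    printf '%s\n' "$info" | sed -n "s/^$1: //p"
}

driver=$(get_field driver)
drv_version=$(get_field version)
fw_version=$(get_field firmware-version)

echo "driver=$driver version=$drv_version firmware=$fw_version"
```

On a live host the info variable would be filled from the command itself, for example info=$(ethtool -i vmnic7), instead of the pasted sample.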
Using this info (0000:45:00.1 8086:10e8 8086:a02c vmkernel vmnic7), I searched the HCL and I see that this card is not supported for ESXi 6.7.
Hardware compatibility does matter in such scenarios. To isolate if it is something to do with the hardware, swap the affected NIC switch ports with a non-affected host and observe if the non-affected host becomes affected.
Cheers,
Supreet
Thanks!
I missed the Dell filter indeed. Since the card is unsupported,
we will replace them with x series cards we have lying around, which are supported.
Thanks all for your help
Oli