Well I admit this is something that has been bothering me for a while.
I have a Windows 10 VM that occasionally transfers lots of data to another VM (Linux + Samba) on the same standalone ESXi 6.5 host.
All of a sudden I get disconnected from RDP (or any other remote software I might use). This is not a Windows issue: the whole host drops off the network, and the host and every VM running on it become unreachable. My only option is to log into the KVM and trigger an ESXi restart from the console.
I have spent weeks trying to troubleshoot this. I initially blamed the network driver (igbn), which I upgraded to the latest version, with no luck.
When the host is unreachable over the network, this is what I see via ALT+F12. I did change the power policy, so only the last two lines are relevant.
Syslog:
Vmkernel:
Management host:
Can anybody please shed some light on what might be going on here and what I could attempt to fix this annoying issue?
Thanks!
The first step here is to get the thread into the correct area. I've reported the thread, so a moderator should move it for you.
Admittedly I couldn't find the right place to post... glad this is now in the right place.
Can I please get some help in troubleshooting this issue now?
Thanks
VMTN is a user community forum, so it's now down to forum members to offer help.
Hello.
We need the following details:
The build of version 6.5 you are using.
The type, model and brand of the server (if it is a brand like Lenovo, HPE, Dell and more).
The type and model of the Ethernet card.
The firmware and driver version of the Ethernet card you are using.
Hello, thanks for the answer. Here are the requested details:
- Software:
Standalone ESXi: 6.5.0 Update 3 (Build 18678235)
- Custom build:
Manufacturer: Supermicro
Model: H8DG6/H8DGi
CPU: 32 CPUs x AMD Opteron(tm) Processor 6380
Memory: 63.98 GB
[root@BLACK:~] esxcli network nic list
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  --------------------------------------------------
vmnic0  0000:02:00.0  igb     Up            Up            1000  Full    00:25:90:dc:92:e8  1500  Intel Corporation 82576 Gigabit Network Connection
vmnic1  0000:02:00.1  igb     Up            Down             0  Half    00:25:90:dc:92:e9  1500  Intel Corporation 82576 Gigabit Network Connection
[root@BLACK:~] esxcli network nic get -n vmnic0
Advertised Auto Negotiation: true
Advertised Link Modes: 10BaseT/Half, 10BaseT/Full, 100BaseT/Half, 100BaseT/Full, 1000BaseT/Full
Auto Negotiation: true
Cable Type: Twisted Pair
Current Message Level: 7
Driver Info:
   Bus Info: 0000:02:00.0
   Driver: igb
   Firmware Version: 1.5.3
   Version: 5.3.2
Link Detected: true
Link Status: Up
Name: vmnic0
PHYAddress: 1
Pause Autonegotiate: true
Pause RX: true
Pause TX: true
Supported Ports: TP
Supports Auto Negotiation: true
Supports Pause: true
Supports Wakeon: true
Transceiver: internal
Virtual Address: 00:50:56:51:46:81
Wakeon: MagicPacket(tm)
Thanks!
Hello.
It seems that the Intel 82576 card has had quite a few problems on servers from different brands, in both version 6.5 and 6.7. These include driver issues (version 5.X) and firmware levels lower than 1.9 (your NIC has firmware 1.5).
You could try an older 4.X driver.
Attached link
You could also try setting the speed on the card and on the switch port to 1 Gbps instead of auto.
The server manufacturer (Supermicro) should provide a utility to update the firmware, including that of the network cards.
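To check a host against the two conditions named in this reply (a 5.x igb driver, NIC firmware below 1.9), the output of `esxcli network nic get -n vmnic0` can be parsed on any machine with Python. This is only a rough sketch, not a VMware tool: the function names `parse_nic_get`, `version_tuple` and `risk_flags` are made up for illustration, and the thresholds are simply the ones quoted in this reply, not an official compatibility rule.

```python
import re


def parse_nic_get(text):
    """Turn the 'Key: value' lines of `esxcli network nic get` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():  # skips section headers such as "Driver Info:"
            info[key.strip()] = value.strip()
    return info


def version_tuple(text):
    """'5.3.2' -> (5, 3, 2); returns () when no leading version number is found."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def risk_flags(info, min_firmware=(1, 9)):
    """Flag the conditions from this thread (assumed thresholds, not a VMware rule)."""
    flags = []
    driver = version_tuple(info.get("Version", ""))
    firmware = version_tuple(info.get("Firmware Version", ""))
    if info.get("Driver") == "igb" and driver[:1] == (5,):
        flags.append("igb driver 5.x")
    if firmware and firmware < min_firmware:
        flags.append("firmware below 1.9")
    return flags


# Values copied from the vmnic0 output posted earlier in the thread.
sample = """\
Driver Info:
   Bus Info: 0000:02:00.0
   Driver: igb
   Firmware Version: 1.5.3
   Version: 5.3.2
"""
print(risk_flags(parse_nic_get(sample)))
```

With the values posted for vmnic0, both flags are raised. A host running a 4.x driver with firmware 1.9 or later would return an empty list.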
Ok thanks.
I did some searching as well and found this interesting link:
https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=io&productid=12667
Interesting because 5.3.2 is NOT a driver VMware would recommend for ESXi 6.5. I'd really like to know how this version ended up on my system.
Anyway, before finding the above link I had discovered and downloaded the following three drivers:
[root@BLACK:/vmfs/volumes/59300515-a5ee2b49-81a2-00e081c9ed1e/ssh/igb] ls -la *.vib
-rw-r--r-- 1 root root 98012 Jul 24 2013 net-igb-4.2.16.8-1OEM.550.0.0.1198611.x86_64.vib
-rw-r--r-- 1 root root 104192 Jun 29 2016 net-igb-5.3.2-99.x86_64.vib
-rw-r--r-- 1 root root 118534 Mar 26 2017 net-igb_5.3.3-1OEM.600.0.0.2494585.vib
5.3.2 is the same version that is currently installed, so I downloaded it only as a backup.
I tried 5.3.3, but it fails just like 5.3.2 did, so no luck.
4.2.16.8 though... seems to be a good one! I've loaded the server with lots of I/O, and where I would have expected it to become unreachable over the network within 5 minutes, a good 60 minutes later it is still responding!
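For anyone wanting to repeat the driver swap, here is a small Python sketch that prints the ESXi shell commands one would typically run, in order. The path and file name are the ones from the `ls` listing above. The helper name `install_plan` is hypothetical, entering maintenance mode first is a precaution I am assuming rather than something tested in this thread (on a standalone host it only completes once the VMs are powered off), and an unsigned or community VIB may need extra options not shown here.

```python
import shlex

# Directory and file name copied from the `ls` listing in this post.
VIB_DIR = "/vmfs/volumes/59300515-a5ee2b49-81a2-00e081c9ed1e/ssh/igb"
VIB_FILE = "net-igb-4.2.16.8-1OEM.550.0.0.1198611.x86_64.vib"


def install_plan(vib_path):
    """Return, in order, the ESXi shell commands for installing one local VIB."""
    if not vib_path.startswith("/"):
        # This sketch only handles local files, which esxcli expects as absolute paths.
        raise ValueError("use an absolute path to the VIB")
    return [
        "esxcli software vib list | grep igb",               # note the current version
        "esxcli system maintenanceMode set --enable true",   # precaution (assumption)
        "esxcli software vib install -v " + shlex.quote(vib_path),
        "reboot",                                             # the driver loads at boot
        "esxcli system maintenanceMode set --enable false",  # once the host is back up
    ]


for command in install_plan(VIB_DIR + "/" + VIB_FILE):
    print(command)
```

The script only prints the commands so they can be reviewed before being typed into the ESXi shell. After the reboot, `esxcli network nic get -n vmnic0` should report the new driver version.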
By the way, the physical switch the server is connected to is unmanaged, so I've forced 1 Gbps at the vSwitch level internally, as suggested.
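The same setting can presumably also be applied from the ESXi shell on the physical uplink with `esxcli network nic set`. The Python sketch below just builds that command string. The helper names are hypothetical, `vmnic0` is the uplink from the earlier output, and the allowed speed/duplex pairs are the link modes that vmnic0 advertised there.

```python
# Link modes advertised by vmnic0 in the `esxcli network nic get` output above.
ADVERTISED = {(10, "half"), (10, "full"), (100, "half"), (100, "full"), (1000, "full")}


def force_link_command(nic="vmnic0", speed=1000, duplex="full"):
    """Build the esxcli command that pins an uplink to a fixed speed and duplex."""
    if (speed, duplex) not in ADVERTISED:
        raise ValueError("link mode not advertised by this NIC")
    return "esxcli network nic set -n {} -S {} -D {}".format(nic, speed, duplex)


def auto_negotiate_command(nic="vmnic0"):
    """Build the esxcli command that returns the uplink to auto-negotiation."""
    return "esxcli network nic set -n {} -a".format(nic)


print(force_link_command())
print(auto_negotiate_command())
```

Keeping the auto-negotiation command at hand makes it easy to revert if the forced setting causes link problems with the unmanaged switch.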
I would be very keen to upgrade the firmware of these cards, but there's no reference to it whatsoever on the Supermicro website. The only firmware available is for the BIOS and IPMI, which are already at the latest version.
https://www.supermicro.com/Aplus/motherboard/Opteron6000/SR56x0/H8DG6-F.cfm