-
1. Re: unresponsive host
brunofernandez1 Jun 1, 2015 12:09 AM (in response to shaheed_a_m)
I would recommend restarting the management agents:
VMware KB: Restarting the Management agents on an ESXi or ESX host
What happens if you right-click the disconnected server in vCenter and choose Reconnect? Does it ask you for credentials?
Maybe the host lost its certificates during the upgrade to vSphere 6.0, and vCenter now thinks it could be a different server.
If so, you have to reconnect the hosts manually.
-
2. Re: unresponsive host
shaheed_a_m Jun 1, 2015 1:10 AM (in response to brunofernandez1)
Hi,
I haven't tried to disconnect and reconnect the host, but I did right-click it and select Connect. This did nothing.
When I have the problem again, I'll try restarting the management agents.
-
3. Re: unresponsive host
schulzman Jun 2, 2015 12:17 AM (in response to shaheed_a_m)
Hello all!
I had the same issue a few days ago in two different environments: one standalone free ESXi 6.0 hypervisor and one host in a two-node cluster managed by vCenter Server Appliance 6.0.
I tried to reconnect the host, but it didn't work for me.
At the DCUI I tried to enter my password, but the host did not respond. Only a reboot solved the problem. After that everything was fine.
I'm running ESXi 6.0 on a Fujitsu RX200 S6 and an RX200 S7.
Please let me know if there is a fix for this issue.
Regards,
schulzman
-
4. Re: unresponsive host
RichardBush Jun 2, 2015 6:16 PM (in response to shaheed_a_m)
Hi,
I have had something similar on an upgraded test host: the server would randomly disconnect, and a reboot resolved it. Eventually the host wouldn't reconnect to vCenter at all.
The fix for me was to uninstall the vpxa agent, restart the host, then reconnect it to vCenter (as though connecting a new host).
R
-
5. Re: unresponsive host
vijayalka Jul 3, 2015 10:36 AM (in response to RichardBush)
Could you please confirm how you uninstalled the vpxa agent?
-
6. Re: unresponsive host
vijayalka Jul 3, 2015 10:45 AM (in response to shaheed_a_m)
Link that can be used for uninstalling the vpxa agent
-
8. Re: unresponsive host
sdnbtech Jul 21, 2015 10:58 AM (in response to shaheed_a_m)
If you're seeing the messages below in your vmkernel.log at the time of the disconnect, it could be related to an issue that will one day be described at the link below (it is not live at this time). We see this after a random amount of time, and nothing VMware technical support tried helped except rebooting the host.
http://kb.vmware.com/kb/2124669
vmkernel.log:
2015-07-19T08:22:35.552Z cpu0:33257)WARNING: LinNet: netdev_watchdog:3678: NETDEV WATCHDOG: vmnic4: transmit timed out
2015-07-19T08:22:35.552Z cpu0:33257)WARNING: at vmkdrivers/src_92/vmklinux_92/vmware/linux_net.c:3707/netdev_watchdog()(inside vmklinux)
2015-07-19T08:22:35.552Z cpu0:33257)Backtrace for current CPU #0,worldID=33257, rbp=0x430609af4380
2015-07-19T08:22:35.552Z cpu0:33257)0x4390cf49be10:[0x418029896b4e]vmk_LogBacktraceMessage@vmkernel#nover+0x22 stack: 0x430609af4380, 0
2015-07-19T08:22:35.552Z cpu0:33257)0x4390cf49be30:[0x418029f1e7b7]watchdog_work_cb@com.vmware.driverAPI#9.2+0x27f stack: 0x430609ac3ce
2015-07-19T08:22:35.552Z cpu0:33257)0x4390cf49bea0:[0x418029f44a5f]vmklnx_workqueue_callout@com.vmware.driverAPI#9.2+0xd7 stack: 0x4306
2015-07-19T08:22:35.552Z cpu0:33257)0x4390cf49bf30:[0x41802984f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0, 0x430609ac3ce0, 0x27, 0x0,
2015-07-19T08:22:35.552Z cpu0:33257)0x4390cf49bfd0:[0x418029a1231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0,
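To check whether a host has logged the same watchdog warning, the log can be scanned for that message. The following is only a rough sketch in Python, not a VMware tool: the function names and the regular expression are my own, derived solely from the excerpt above, and the log path is passed in by the caller rather than assumed.

```python
import re

# Pattern derived from the excerpt in this thread (an assumption, not an
# official format description). \s+ after the line number lets the message
# wrap onto the next line, as it sometimes does when pasted.
WATCHDOG_RE = re.compile(
    r"^(?P<ts>\S+Z) cpu\d+:\d+\)WARNING: LinNet: netdev_watchdog:\d+:\s+"
    r"NETDEV WATCHDOG: (?P<nic>vmnic\d+): transmit timed out",
    re.MULTILINE,
)


def find_tx_timeouts(log_text):
    """Return a list of (timestamp, nic) pairs, one per watchdog timeout."""
    return [(m.group("ts"), m.group("nic")) for m in WATCHDOG_RE.finditer(log_text)]


def find_tx_timeouts_in_file(path):
    """Read a vmkernel.log copy from `path` and scan it for timeouts."""
    with open(path, errors="replace") as fh:
        return find_tx_timeouts(fh.read())
```

Calling `find_tx_timeouts_in_file()` on a copy of vmkernel.log returns an empty list if the warning never occurred; otherwise each entry shows when it happened and on which vmnic.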
-
9. Re: unresponsive host
DSchef Aug 11, 2015 10:21 AM (in response to sdnbtech)
sdnbtech, have you heard or seen any updates on the issue you described? VMware confirmed that engineering is working on a solution, but in the few weeks since then I haven't been able to get an update on the status of a fix. A host downgrade to 5.5 was the only recommendation, aside from rebooting the 6.0 hosts each time networking drops.
-
10. Re: unresponsive host
cesprov Aug 12, 2015 8:34 AM (in response to DSchef)
I seem to be having very similar issues:
2015-08-11T11:14:53.340Z cpu23:33256)WARNING: LinNet: netdev_watchdog:3678: NETDEV WATCHDOG: vmnic4: transmit timed out
2015-08-11T11:14:53.340Z cpu23:33256)<6>ixgbe 0000:41:00.0: vmnic4: Fake Tx hang detected with timeout of 160 seconds
When this happens, both ports on a dual-port NIC die at the same time and only a reboot fixes it. I opened an SR with VMware support, referencing this thread and the not-yet-existing KB posted above, and will follow up if/when I hear something back.
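To see whether the hung vmnics are ports of the same physical card, the driver lines can be grouped by PCI address: functions such as `.0` and `.1` under the same domain:bus:device belong to one adapter. This is a minimal Python sketch under that assumption. The regular expression is inferred only from the single ixgbe line quoted above, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Inferred from one log line in this thread, e.g.
#   ...)<6>ixgbe 0000:41:00.0: vmnic4: Fake Tx hang detected with timeout of 160 seconds
# Other drivers or ESXi builds may format this differently.
TX_HANG_RE = re.compile(
    r"^(?P<ts>\S+Z) cpu\d+:\d+\)<\d>(?P<driver>\w+) "
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2})\.(?P<func>[0-7]): "
    r"(?P<nic>vmnic\d+): Fake Tx hang detected",
    re.MULTILINE,
)


def hangs_by_adapter(log_text):
    """Map each PCI device (domain:bus:device) to the vmnics that reported a hang."""
    adapters = defaultdict(set)
    for m in TX_HANG_RE.finditer(log_text):
        adapters[m.group("pci")].add(m.group("nic"))
    return dict(adapters)
```

If one key in the result maps to two vmnics, both ports of that card reported the hang, which matches the behaviour described above.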
-
11. Re: unresponsive host
aragab Aug 13, 2015 10:41 AM (in response to shaheed_a_m)
Troubleshooting a non-responsive host without looking at the logs is not really effective. You can open a service request with VMware.
-
12. Re: unresponsive host
Jimmy15 Aug 13, 2015 1:24 PM (in response to shaheed_a_m)
Please share the log details; without logs it is hard to find the root cause. Storage might also be the reason: the APD recovery issue is still unresolved in 6.0.
What about the VMs on the host? Are they still running when the host goes unresponsive? Even a time sync problem can make a host disconnect.
-
13. Re: unresponsive host
cesprov Aug 14, 2015 8:51 AM (in response to cesprov)
Confirmed what sdnbtech stated above: the "transmit timed out" error is a known issue. There is no ETA for a fix yet, and support was not very forthcoming with details. I was basically told to downgrade if this issue is affecting me, as there is no workaround. The engineer I spoke to says he sees this at least once a week.
-
14. Re: unresponsive host
sdnbtech Aug 24, 2015 11:33 AM (in response to cesprov)
I checked this morning and there are a few options:
1) Apply a debug build of ESXi that will still be affected by the problem but will gather more information for the development team.
2) Run a script at each boot of each ESXi server; they believe it fixes the issue entirely, but it can cause performance degradation.
3) Downgrade to 5.5 or below.
My case has now been open 60 days regarding this issue. It's very disappointing.