Problem with ESXi 5 or Cisco servers (hosts disconnected)
Hi, since we replaced the servers in our ESXi 5 cluster, hosts sometimes disconnect.
Server specs: 3x Cisco UCS C200 M2 (2x Xeon X5675 @ 3.06 GHz and 96 GB memory)
Sometimes, without warning, a host disconnects in vCenter.
We can still ping the host but cannot SSH to it or log in with the vSphere Client directly. If I try to log in at the local console, it freezes as soon as I type the password and press ENTER, and the only fix is a hard restart (which loses the local logs).
While the host is disconnected, the guests show as disconnected but remain fully operational (remote desktop, ping). We can't vMotion them or change their settings while they are shown as (disconnected).
Has anyone seen this problem before? It's really annoying.
Let me know if you need more information.
Thanks and have a nice day,
Martin Bergeron
Hi
Have you done some basic troubleshooting, such as checking the services and port numbers? Also, don't forget to check whether a service is being blocked by a virus or a firewall.
____________________
Always desire to learn something useful.
You may be facing some local disk problems. Add a vMA to your environment and use it to collect the logs from the ESX servers: http://kb.vmware.com/kb/1024122
Also, I am quite sure there is a syslog server included with the vCenter installation. Use it if vMA does not work.
I had a very similar issue, and the KB article http://kb.vmware.com/kb/1030265 resolved it. You can try it on a test ESX host to check whether it makes the problem stop.
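Since the local logs are lost on a hard restart, the point of the advice above is to get log lines off the host before it hangs. As a rough illustration only (this is not vMA and not the syslog collector shipped with vCenter), here is a minimal Python sketch of a UDP syslog receiver that appends whatever it receives to a file. The names `SyslogHandler`, `make_server`, and the log path are hypothetical, and it assumes the host has been configured to send syslog over UDP to this machine (in ESXi 5 that is normally done through the `Syslog.global.logHost` advanced setting).

```python
import socketserver
from pathlib import Path


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received UDP syslog datagram to the server's log file."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        line = data.decode("utf-8", errors="replace").rstrip()
        sender = self.client_address[0]
        with self.server.log_path.open("a", encoding="utf-8") as fh:
            fh.write(f"{sender} {line}\n")


def make_server(log_path, host="0.0.0.0", port=514):
    """Create the listener (hypothetical helper).

    Binding to port 514 usually requires administrator rights.
    """
    server = socketserver.UDPServer((host, port), SyslogHandler)
    server.log_path = Path(log_path)  # read by SyslogHandler.handle
    return server
```

To run it, call `make_server("esxi-hosts.log").serve_forever()` on a machine the hosts can reach. A real collector would also rotate files and handle TCP, so treat this only as a stopgap for catching the last messages before a freeze.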
Thanks,
It really does seem to be an HDD controller or HBA problem, so I tried the solution in http://kb.vmware.com/kb/1030265, but I need to wait a few days to make sure the servers don't go down again.
I'll let you know in a few days.
Thanks again!
9 days and still up...
Hi Martin,
We have the same issue. After reading your post, we applied the solution of disabling interrupt remapping.
The problem is that we don't see the ALERT message in the logs.
Is your environment still OK?
Did you find the ALERT message in your logs?
Thanks and Best Regards,
Raul de la Flor
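For anyone who wants to check collected logs for the ALERT message the same way, here is a small Python sketch. The thread does not quote the exact message text, so this only assumes the line contains the word `ALERT`; the function names and the file path in the usage note are hypothetical.

```python
import re
from typing import Iterable, List

# Assumption: we only know the line contains the word "ALERT";
# the exact message text is not given in this thread.
ALERT_PATTERN = re.compile(r"\bALERT\b")


def find_alerts(lines: Iterable[str]) -> List[str]:
    """Return every line that contains the standalone word ALERT."""
    return [line.rstrip("\n") for line in lines if ALERT_PATTERN.search(line)]


def scan_file(path: str) -> List[str]:
    """Scan one log file, tolerating any undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_alerts(fh)
```

Calling `scan_file("collected-host.log")` on a copy of the host log returns the matching lines. An empty result only means no such line was captured, which is also what you would see if the logs were lost in a cold reboot.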
Hi Raul,
I've been up for 12 days now. I never saw the ALERT because, when the problem occurred and the host was cold rebooted, all the logs were lost.
But everything seems to be OK now.
I'll update the thread in a few days to let you know.
Disabling interrupt remapping on the host did the trick for me.
23 days now without a disconnect.
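For reference, the change itself is a single kernel setting followed by a reboot. The sketch below just assembles the ESXi 5 command lines, as best I recall them from KB 1030265, so they can be pasted into the ESXi shell. The function name is hypothetical, and the exact syntax should be verified against the KB before use on a production host.

```python
def interrupt_remapping_commands():
    """Return the ESXi 5 shell commands for the iovDisableIR workaround.

    Assumption: syntax as recalled from VMware KB 1030265; check the KB
    for your exact ESXi version. The host must be rebooted after the
    'disable' command for the setting to take effect.
    """
    base = "esxcli system settings kernel"
    return {
        # Turn interrupt remapping off (takes effect after a reboot).
        "disable": f"{base} set --setting=iovDisableIR -v TRUE",
        # Show the configured and runtime values of the setting.
        "verify": f"{base} list -o iovDisableIR",
    }


# Print the commands so they can be copied into an SSH session on the host.
for step, command in interrupt_remapping_commands().items():
    print(f"{step}: {command}")
```

Trying this on one test host first, as suggested earlier in the thread, keeps the risk low if the setting does not help in your environment.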
Thanks a lot!! 15 days with no disconnections!!!
Good to hear... but bad on Cisco/VMware's part; they really need to address this.
ER 89872: Cisco BIOS issue found. From Cisco: "I have an update on this. Today we were able to connect the dots and found that we recently fixed a problem related to interrupt remapping in the VT-d. Apparently, the workaround of disabling interrupt remapping may not always solve the problem.
In the past we have seen many different manifestations of this issue: adapter disconnects and sometimes even PSODs."
Contact Cisco to obtain the necessary BIOS updates.