td3201
Contributor

host disconnected -- may be down?

I have a host showing as disconnected in VC. I cannot SSH into it, and of course my remote console isn't working either. VC shows that there are guests running on it, and I can get to those guests, but I don't believe they are actually running on that host. Is there a way, from the guest, to tell which host it is actually running on, or can I trust what VC shows?

Thanks!

8 Replies
Troy_Clavell
Immortal

Within the VIC, click on the guest, then click on the Summary tab; it will tell you which host it is running on.

td3201
Contributor

I left out an important detail: all the guests that show as part of this host show as disconnected too. When I look at the summary of one of those guests, it says it is on the disconnected host.

Smitty23
Enthusiast

I'd say they are on that host if they are showing as disconnected. Any time I've experienced this issue, I go to the console on my ESX server, log in, and restart the management agents. This should bring the server back to connected in VC and allow you to export the logs.

Troy_Clavell
Immortal

Can you ping a guest or the host in question? Even though the guests show up as disconnected, you should still be able to see the Summary tab. If you can't SSH into the host in question, then you may have to reboot it.

But if the guests are alive, try restarting your VCMS service; that may refresh the guests.

espi3030
Expert

I would try to ping the host or the VMs. Can you connect the VIC directly to that host? If you can afford the downtime, I would migrate the VMs to another host or shut them down, then SSH into the host in question and restart the management agent with:

service vmware-vpxa restart

Hope this helps!

Smitty23
Enthusiast

I definitely wouldn't reboot this host if your VMs are accessible. If your ESX management IP is pingable, it most likely means the hostd process has crashed. You can just log in to the host and restart the management services. Don't take downtime on your VMs; simply restart the management services on the ESX host. After you do this, you can send the logs to VMware and they'll find out why hostd crashed in the first place.
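
For reference, here is a minimal sketch of restarting the management services from the service console. It assumes a classic ESX 3.x host, where the host agent (hostd) is controlled by the mgmt-vmware init script and the VirtualCenter agent by vmware-vpxa. The function name restart_mgmt_agents and the RUN variable are hypothetical, added only so the commands can be previewed before they are run. One caution: if automatic VM startup/shutdown is enabled on the host, restarting mgmt-vmware has been reported to restart guests, so check that setting first.

```shell
#!/bin/sh
# Sketch only. RUN is an assumption of this example, not a VMware feature:
#   RUN=echo  -> print the commands instead of running them
#   RUN unset -> run them for real (needs root on the service console)
restart_mgmt_agents() {
    # hostd, the host agent that the VI Client and VirtualCenter talk to
    ${RUN:-} service mgmt-vmware restart || return 1
    # vpxa, the VirtualCenter agent on the host
    ${RUN:-} service vmware-vpxa restart
}
```

Running `RUN=echo restart_mgmt_agents` prints the two commands without touching the host; running `restart_mgmt_agents` as root executes them in that order.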

td3201
Contributor

I got into the console finally:

16:03:33:18.894 cpu0:1024)VMNIX: <0>scsi: device set offline - command error recovery failed: host 1 channel 0 id 0 lun 0

16:03:33:19.139 cpu0:1024)VMNIX: <0>journal commit I/O error

The ESX splash page is still there though. I attached a screenshot for fun.

MikeMikeMike
Contributor

Hi td,

This is a SCSI device not ready / offline class error, indicating an issue with a SCSI drive. The drive can be local to the computer or remote, connected by a block-mode protocol (SCSI-3, FCP, etc.). If you are using a SAN-attached array, confirm that the Fibre Channel interconnects are OK, with no sharp bends, and firmly seated in the SFPs. Then check the firmware of your HBA, the SAN switch, and the storage array to confirm you are current. If it's a local drive, make sure it's not getting too hot.

You should never see this error.
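
To see which device was taken offline, the host/channel/id/lun numbers can be pulled out of messages like the two posted above. This is a rough sketch that assumes the console or log text has been saved to a file. The function name scsi_offline_targets and the file name console.txt are made up for illustration.

```shell
#!/bin/sh
# Read log text on stdin and print "host channel id lun" for every
# "scsi: device set offline" message. Other lines are ignored.
scsi_offline_targets() {
    sed -n 's/.*scsi: device set offline.*host \([0-9][0-9]*\) channel \([0-9][0-9]*\) id \([0-9][0-9]*\) lun \([0-9][0-9]*\).*/\1 \2 \3 \4/p'
}
```

Usage would be something like `scsi_offline_targets < console.txt`. Each output line names one affected target, which can then be matched against the local controller or the SAN LUN presented to that HBA.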

MikeMikeMike
