VMware Cloud Community
Moif_Murphy
Enthusiast

Unresponsive 5.1 host - possible management services issue

Hello,

I was enabling SNMP via SSH and unfortunately the host has become unresponsive. I can ping the host, I can SSH on to the host and I can see the VMs are still running on the host via esxtop so I'm not overly worried. I'm planning to reboot the host this evening but I want to confirm that HA will take over and move the VMs before shutting down the host. I'm beginning to lean towards a 'no' in that respect.

I can't connect to the host via the vSphere C# client. I've tried services.sh restart but it seems to halt on:

~ # /sbin/services.sh restart

Running vmware-fdm stop

Stopping vmware-fdm:success

Running bfa_cfg.sh stop

Running xorg stop

Running wsman stop

Stopping openwsmand

Openwsmand is not running.

Running sfcbd stop

This operation is not supported.

Please use /etc/init.d/sfcbd-watchdog stop

Running snmpd stop

root: snmpd is not running.

Running sfcbd-watchdog stop

sh: bad number

sh: you need to specify whom to kill

sfcbd is not running.

I've also tried esxcli system maintenanceMode set --enabled=true but I get nothing from that. If I tail -f hostd.log I get the following:

2013-01-09T13:07:01.805Z [343C2B90 verbose 'SoapAdapter'] Responded to service state request

2013-01-09T13:07:05.043Z [32D40B90 verbose 'ThreadPool'] usage : total=24 max=74 workrun=22 iorun=2 workQ=178 ioQ=0 maxrun=29 maxQ=180 cur=I

2013-01-09T13:07:05.043Z [32D40B90 verbose 'ThreadPool'] usage : total=24 max=74 workrun=22 iorun=2 workQ=178 ioQ=0 maxrun=29 maxQ=180 cur=I

2013-01-09T13:07:05.043Z [32D40B90 verbose 'Default'] CloseSession called for session-ec6d-8ec1-5bac-7b6991c2c0c6
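
One possible reading of the ThreadPool lines above: workQ=178 sits right next to maxQ=180, which would fit a hostd work queue that is backed up and not answering clients. The field meanings are inferred from their names only (whether maxQ is a limit or a high-water mark is not stated), and the helper name `threadpool_queue` is hypothetical. A minimal sketch for pulling that pair out of the log:

```shell
# Hypothetical helper: read hostd log lines on stdin and print "workQ/maxQ"
# for every ThreadPool usage line. Other lines are ignored.
threadpool_queue() {
  sed -n 's|.*ThreadPool.* workQ=\([0-9]*\) .* maxQ=\([0-9]*\) .*|\1/\2|p'
}

# Example using the line quoted in the post:
echo "2013-01-09T13:07:05.043Z [32D40B90 verbose 'ThreadPool'] usage : total=24 max=74 workrun=22 iorun=2 workQ=178 ioQ=0 maxrun=29 maxQ=180 cur=I" | threadpool_queue
```

On the host this could be fed from the log itself, for example `threadpool_queue < hostd.log | tail -n 5`, to see whether the queue drains over time.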

Also I'm not sure if this is of any consequence but:

tail -f vpxa.log

2013-01-09T13:27:11.018Z [6F477B90 verbose 'vpxavpxaAlarm' opID=SWI-b21bd4db] [VpxaAlarm] VM with vmid = 23 not found

2013-01-09T13:27:21.019Z [6F477B90 verbose 'vpxavpxaMoVm' opID=SWI-b21bd4db] [VpxaMoVm::CheckMoVm] did not find a VM with ID 23 in the vmList

I'm also unable to connect to the host via PowerCLI.

I've read that I could check to see if xinetd is running, but so far I've been unsuccessful in that respect. So I'm a little stumped, and any help would be greatly appreciated.

Thanks

8 Replies
Gkeerthy
Expert

If you shut down the ESX host, HA will kick in and the VMs will be restarted on another host, I am sure.

Give more details from the vmkernel log: tail it and see what error appears when you try to start the VC agents.

Please don't forget to award point for 'Correct' or 'Helpful', if you found the comment useful. (vExpert, VCP-Cloud. VCAP5-DCD, VCP4, VCP5, MCSE, MCITP)
Gkeerthy
Expert

Also refer to the link below to restart and troubleshoot the agents.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100349...
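
Since `services.sh restart` stalled partway through in the original post, one option is to restart only the two management agents through their init scripts. This is a sketch, not a confirmed fix for this host: the wrapper name `restart_mgmt_agents` and the `RUN` dry-run switch are hypothetical, and by default the script only prints the commands.

```shell
# Dry-run by default: RUN is "echo", so the commands are printed, not executed.
# On the ESXi host, set RUN to an empty string (RUN= ) to run them for real.
RUN=${RUN-echo}

restart_mgmt_agents() {
  $RUN /etc/init.d/hostd restart   # host agent, which the vSphere Client connects to
  $RUN /etc/init.d/vpxa restart    # vCenter agent
}

restart_mgmt_agents
```

Restarting the agents one at a time also makes it easier to see which of them hangs, and to tail the matching log (hostd.log or vpxa.log) while it starts.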

Moif_Murphy
Enthusiast

Thanks,

I've had a good look through all of those, nothing there has helped. I'm going to attempt an out of hours restart and look at reinstalling ESXi.

spravtek
Expert

I know you're planning a restart to solve the issue...

But does the command stat -f filesystem_name give you anything?

Moif_Murphy
Enthusiast

Could you be more specific?

I tried:

/var/log # stat -f /vmfs/volumes/4ea55615-9746ca85-f8dd-001b24937f92/GFI.vmx

stat: can't read file system information for '/vmfs/volumes/4ea55615-9746ca85-f8dd-001b24937f92/GFI.vmx': No such file or directory

spravtek
Expert

I was trying to refer to a KB I'd read recently, but I couldn't find it at the time I wrote my previous post ...

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=204070...

It came to mind because you mentioned SNMP, but since your SNMP isn't really configured yet (if I haven't misunderstood your post), it could be a long shot, of course ...

Moif_Murphy
Enthusiast

Ahhh I see.

No, you haven't misunderstood at all. I was configuring SNMP via SSH on the other hosts (/etc/vmware/snmp.xml) and all was going swimmingly until I hit this particular host, when things went south.

spravtek
Expert

I remembered that command and the fact you could see something like Inodes: Total: 0 Free: 0

But now I've found the KB, so that's easier ...

Anyway, couldn't hurt to check it out, as I said, might be a long shot. ;)
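
For the `stat -f` check discussed in this thread: the command reports on the file system that contains whatever path it is given, so a mount point such as `/` is enough and does not depend on a particular file existing. A minimal sketch follows; the helper name `inode_summary` is hypothetical, and treating `/` and `/var` as the file systems worth checking is an assumption, not something the thread confirms.

```shell
# Hypothetical helper: print the inode summary line for the file system
# that holds each given path. With no arguments it checks "/".
inode_summary() {
  [ $# -eq 0 ] && set -- /
  for path in "$@"; do
    echo "$path:"
    stat -f "$path" | grep -i 'inodes'
  done
}

inode_summary / /var
```

A line reading `Inodes: Total: 0 Free: 0`, as mentioned above, would be the symptom to look for.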
