VMware Cloud Community
anujkatkar
Contributor

watchdog-hostd process crashes after startup

Hi,

The issue below pertains to a standalone ESX Server 3.0 host, which has 2 VMs running on it.

I have an issue with the mgmt-vmware startup. When I start mgmt-vmware, I get the /var/log/messages entries below. I checked the PID files vmware-hostd.PID and watchdog-hostd.PID under /var/run/vmware and killed those processes before restarting mgmt-vmware.
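
For anyone repeating the PID file check described above, here is a minimal sketch of it. It assumes a Python 3 interpreter is available, which may not be the case on the ESX service console, and the helper name `pid_file_status` is made up for illustration. Only the directory and the two file names come from the post.

```python
import os

PID_DIR = "/var/run/vmware"  # directory named in the post
PID_FILES = ("vmware-hostd.PID", "watchdog-hostd.PID")  # file names from the post


def pid_file_status(path):
    """Classify a PID file as 'missing', 'unreadable', 'stale' or 'running'.

    Hypothetical helper, not part of any VMware tooling.
    """
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "unreadable"
    if pid <= 0:
        # 0 and negative values address process groups, not one process
        return "unreadable"
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks existence
    except ProcessLookupError:
        return "stale"  # file exists but no such process is left
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"


if __name__ == "__main__":
    for name in PID_FILES:
        print(name, pid_file_status(os.path.join(PID_DIR, name)))
```

A "stale" result would mean the PID file was left behind by a process that is no longer there, which is the case worth cleaning up before restarting mgmt-vmware.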

Interestingly, the watchdog-hostd and vmware-hostd processes crash when I try to connect using a VI Client.

I have searched through Google and the VMware site for any relevant info, but none of the articles I came across mention this issue.

log entries on mgmt-vmware startup -->

Oct 11 10:10:58 GVT4-NSW watchdog-hostd: PID file /var/run/vmware/watchdog-hostd.PID not found
Oct 11 10:10:58 GVT4-NSW VMware[init]: [3055] Begin '/usr/sbin/vmware-hostd -u -a', min-uptime = 60, max-quick-failures = 5, max-total-failures = 1000000
Oct 11 10:10:58 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:11:40 GVT4-NSW last message repeated 2 times
Oct 11 10:11:40 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:11:40 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:12:00 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:12:00 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:12:01 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:12:21 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:12:21 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:12:21 GVT4-NSW VMware[init]: connect: No such file or directory.

log entries when trying to connect using VI client just before the processes crash -->

Oct 11 10:12:40 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:12:40 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:12:41 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:13:00 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:13:00 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:13:00 GVT4-NSW VMware[init]: connect: No such file or directory.
Oct 11 10:13:20 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'
Oct 11 10:13:20 GVT4-NSW VMware[init]: connect: No such file or directory.
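
To make the cadence of the watchdog cleanup runs in the excerpts above easier to see, here is a small sketch that measures the seconds between consecutive "Executing cleanup command" lines. The function name `cleanup_intervals` and the `year` parameter are assumptions for illustration; syslog lines carry no year, so 2011 is borrowed from the hostd.log timestamps in this thread.

```python
from datetime import datetime

CLEANUP_MARKER = "Executing cleanup command"


def cleanup_intervals(lines, year=2011):
    """Return the seconds between consecutive watchdog cleanup runs.

    Hypothetical helper. `year` is an assumption, because syslog
    timestamps ("Oct 11 10:11:40") do not include one.
    """
    stamps = []
    for line in lines:
        if CLEANUP_MARKER in line:
            # the first 15 characters hold "Mon DD HH:MM:SS"
            stamps.append(
                datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


# Lines copied from the /var/log/messages excerpt in this post.
sample = [
    "Oct 11 10:11:40 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'",
    "Oct 11 10:11:40 GVT4-NSW VMware[init]: connect: No such file or directory.",
    "Oct 11 10:12:00 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'",
    "Oct 11 10:12:21 GVT4-NSW watchdog-hostd: Executing cleanup command '/usr/sbin/hostd-support'",
]

print(cleanup_intervals(sample))  # -> [20, 21]
```

Run against the full excerpts, this shows the cleanup command firing roughly every 20 seconds.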

Any ideas at all?

2 Replies
anujkatkar
Contributor

part of the hostd.log file -->

[2011-10-11 10:13:17.280 'Hostsvc::AutoStartManager' 3076448384 verbose] Powering on VM 16 with delay 120
[2011-10-11 10:13:17.282 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 verbose] Power on request recieved
[2011-10-11 10:13:17.282 'Vmsvc' 86141872 verbose] Registered Foundry callback on 2
[2011-10-11 10:13:17.282 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 info] Adding task: haTask-16-vim.VirtualMachine.powerOn-0
[2011-10-11 10:13:17.282 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 verbose] VM State transition requested to VM_STATE_POWERING_ON
[2011-10-11 10:13:17.282 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 verbose] Event generated
[2011-10-11 10:13:17.284 'ha-eventmgr' 86141872 info] Event 1 : ogw1 on  GVT4-NSW in ha-datacenter is resumed
[2011-10-11 10:13:17.284 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 error] Invalid transition requested (VM_STATE_ON -> VM_STATE_POWERING_ON): Invalid state
[2011-10-11 10:13:17.285 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 info] Failed operation
[2011-10-11 10:13:17.285 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 86141872 verbose] Removing task: haTask-16-vim.VirtualMachine.powerOn-0
[2011-10-11 10:13:17.285 'Vmomi' 86141872 info] Activation [N5Vmomi10ActivationE:0xa48cca8] : Invoke done [powerOn] on [vim.VirtualMachine:16]
[2011-10-11 10:13:17.285 'Vmomi' 86141872 info] Throw vim.fault.InvalidPowerState
[2011-10-11 10:13:17.285 'Vmomi' 86141872 info] Result:
(vim.fault.InvalidPowerState) {
   requestedState = "poweredOn",
   existingState = "poweredOn"
   msg = ""
}
[2011-10-11 10:13:17.286 'Hostsvc::AutoStartManager' 3076448384 error] Error in initializing PowerOp for VM 16 : vim.fault.InvalidPowerState
[2011-10-11 10:13:17.286 'Hostsvc::AutoStartManager' 3076448384 verbose] Powering on VM 32 with delay 120
[2011-10-11 10:13:17.287 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 verbose] Power on request recieved
[2011-10-11 10:13:17.288 'Vmsvc' 109857712 verbose] Registered Foundry callback on 2
[2011-10-11 10:13:17.288 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 info] Adding task: haTask-32-vim.VirtualMachine.powerOn-1
[2011-10-11 10:13:17.288 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 verbose] VM State transition requested to VM_STATE_POWERING_ON
[2011-10-11 10:13:17.288 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 verbose] Event generated
[2011-10-11 10:13:17.289 'ha-eventmgr' 109857712 info] Event 2 : ogw2 on  GVT4-NSW in ha-datacenter is resumed
[2011-10-11 10:13:17.289 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 error] Invalid transition requested (VM_STATE_ON -> VM_STATE_POWERING_ON): Invalid state
[2011-10-11 10:13:17.289 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 info] Failed operation
[2011-10-11 10:13:17.290 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 109857712 verbose] Removing task: haTask-32-vim.VirtualMachine.powerOn-1
[2011-10-11 10:13:17.290 'Vmomi' 109857712 info] Activation [N5Vmomi10ActivationE:0xa48c818] : Invoke done [powerOn] on [vim.VirtualMachine:32]
[2011-10-11 10:13:17.290 'Vmomi' 109857712 info] Throw vim.fault.InvalidPowerState
[2011-10-11 10:13:17.290 'Vmomi' 109857712 info] Result:
(vim.fault.InvalidPowerState) {
   requestedState = "poweredOn",
   existingState = "poweredOn"
   msg = ""
}
[2011-10-11 10:13:17.290 'Hostsvc::AutoStartManager' 3076448384 error] Error in initializing PowerOp for VM 32 : vim.fault.InvalidPowerState
[2011-10-11 10:13:17.291 'App' 3076448384 info] BEGIN SERVICES
[2011-10-11 10:13:20.301 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw1/ogw-3-vm.vmx' 18955184 verbose] Retrieved current power state from foundry 1
[2011-10-11 10:13:20.343 'vm:/vmfs/volumes/44ed40d9-6e21ace8-f8f5-00145e323622/ogw2/ogw-3-vm.vmx' 18955184 verbose] Retrieved current power state from foundry 1
[2011-10-11 10:13:20.351 'Vmsvc' 18955184 info] Vmsvc: Running vm count: 2 out of 2
[2011-10-11 10:13:20.351 'Statssvc' 18955184 verbose] Adding vm 16 to poweredOnVms list
[2011-10-11 10:13:20.351 'Statssvc' 18955184 verbose] Adding vm 32 to poweredOnVms list
[2011-10-11 10:13:20.375 'App' 18955184 error]

Exception: Assert failed!
[2011-10-11 10:13:20.376 'App' 18955184 error] Backtrace:
[00] eip 0x10cdd1e
[01] eip 0x1024d29
[02] eip 0xfdf295
[03] eip 0xfdf93e
[04] eip 0xfdf9f9
[05] eip 0xfdfac5
[06] eip 0x83a625e
[07] eip 0x837c997
[08] eip 0x837c397
[09] eip 0x837f60a
[10] eip 0x10dd472
[11] eip 0x10d5b08
[12] eip 0x10d9561
[13] eip 0x10dc012
[14] eip 0xca0dd8
[15] eip 0x78b1d1a

[2011-10-11 10:13:20.379 'App' 18955184 error] Backtrace:
[00] eip 0x10cdd1e
[01] eip 0x1024d29
[02] eip 0x10cd965
[03] eip 0xfdf94b
[04] eip 0xfdf9f9
[05] eip 0xfdfac5
[06] eip 0x83a625e
[07] eip 0x837c997
[08] eip 0x837c397
[09] eip 0x837f60a
[10] eip 0x10dd472
[11] eip 0x10d5b08
[12] eip 0x10d9561
[13] eip 0x10dc012
[14] eip 0xca0dd8
[15] eip 0x78b1d1a
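
Since the excerpt above mixes verbose, info and error entries, a quick way to see which components logged at error level is to filter on the level field. The sketch below assumes each entry has the shape `[timestamp 'source' thread-id level] message` and sits on one unwrapped line, as in this paste. The regex and the name `errors_by_source` are illustrative and not an official hostd.log parser.

```python
import re
from collections import Counter

# Assumed entry shape: [YYYY-MM-DD HH:MM:SS.mmm 'source' thread-id level] message
ENTRY = re.compile(
    r"^\[(?P<ts>\S+ \S+) '(?P<source>[^']+)' (?P<thread>\d+) (?P<level>\w+)\]"
    r"\s*(?P<msg>.*)$"
)


def errors_by_source(lines):
    """Count error-level entries per logging source (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group("level") == "error":
            counts[match.group("source")] += 1
    return counts


# Entries copied from the hostd.log excerpt in this thread, unwrapped.
sample = [
    "[2011-10-11 10:13:17.286 'Hostsvc::AutoStartManager' 3076448384 error] "
    "Error in initializing PowerOp for VM 16 : vim.fault.InvalidPowerState",
    "[2011-10-11 10:13:17.291 'App' 3076448384 info] BEGIN SERVICES",
    "[2011-10-11 10:13:20.375 'App' 18955184 error]",
    "[2011-10-11 10:13:20.376 'App' 18955184 error] Backtrace:",
]

for source, count in errors_by_source(sample).items():
    print(source, count)
```

On this excerpt the output separates the AutoStartManager power-on errors from the 'App' entries that accompany the "Assert failed!" exception and its backtraces.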

anujkatkar
Contributor

The hostd.log does state the error:

[2011-10-11 10:13:17.290 'Hostsvc::AutoStartManager' 3076448384 error]  Error in initializing PowerOp for VM 32 : vim.fault.InvalidP

But how do I fix that?

Cheers

Anuj
