VMware Cloud Community
JimKnopf99

ESXi disconnected

Hi,

I have a host that is disconnected from vCenter. The VMs are running fine. I am also unable to connect to it directly with the vSphere Client.

What I have done so far:

1. Checked availability of the server (ping, DNS, etc.)

2. Restarted the management network

3. Restarted the management agents

4. Opened an SSH session to see if there is a storage issue. But no commands work (e.g. esxcli storage); I receive the error message "Connect to localhost failed: Connection failure"

5. KB http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100340...

6. KB http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101908...
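
The reachability check in step 1 can be scripted so that it also tests the management port, since a host can answer ping while its management service does not accept connections. This is only a minimal sketch: the function name check_host is made up for illustration, and port 443 is assumed because the vpxd log talks to TCP:esx01:443.

```python
import socket

def check_host(hostname, port=443, timeout=5.0):
    """Resolve hostname, then try a TCP connect to the given port.

    Returns (resolved_ip, port_open). resolved_ip is None if DNS fails.
    """
    try:
        ip = socket.gethostbyname(hostname)
    except socket.gaierror:
        return None, False
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False

# Example call (hostname taken from the log excerpt):
#   check_host("esx01")
```

A result with a valid IP but a closed port would suggest the network is fine and the problem sits with the management service on the host.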

In the vpxd log I see these error messages:

warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] GetChanges host:esx01.(10.0.0.172) [GetChangesTime] took 49233 ms
warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] DoHostSync:0000000009A10BA0 [DoHostSyncTime] took 49233 ms
warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] DoHostSync failed for host host-21941
warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Host sync failed to host-21941
warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHost::FixNotRespondingHost] Returning false since host is already fixed!
warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Failed to fix not responding host host-21941
warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [VpxDrmRetrieveDomainConfigInfoGetInv] took 31 ms
warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [VpxDrmRetrieveDomainConfigInfoLockAcq2] took 31 ms
warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [CallingProposeActions] took 63 ms
warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [AskForDrmRecommendations] took 109 ms
warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [AskAndRefreshDrmRecommendations] took 187 ms

2012-07-26T09:46:15.959+02:00 [03940 error 'vmomi.soapStub[74]'] Resetting stub adapter for server TCP:esx01:443 : service state request failed: class Vmacore::Http::MalformedHeaderException(Remote server closed connection after 0 response bytes read)
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vmodl.query.PropertyCollector.Filter:session[52575fa4-6855-7c25-1af2-946f645d0570]52d3c04c-cb3b-d69a-ef57-743a5d720680, method=destroy
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=fetchQuickStats
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] ClientAdapterBase::InvokeOnSoap: (esx01.roland-domaene.intra, vmodl.query.PropertyCollector.Filter.destroy) [SoapRpcTime] took 39998 ms
2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdClientAdapter] Got vmacore exception: Operation was canceled
2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdClientAdapter] Backtrace:
--> backtrace[00] rip 000000018013deba (no symbol)
--> backtrace[01] rip 0000000180101518 (no symbol)
--> backtrace[02] rip 0000000180101a5e (no symbol)
--> backtrace[03] rip 000000018008930b (no symbol)
--> backtrace[04] rip 000000018003ef36 (no symbol)
--> backtrace[05] rip 0000000180046304 (no symbol)
--> backtrace[06] rip 00000000003dde06 (no symbol)
--> backtrace[07] rip 00000000003de68d (no symbol)
--> backtrace[08] rip 00000000003dfceb (no symbol)
--> backtrace[09] rip 00000000003e030f (no symbol)
--> backtrace[10] rip 000000018004426d (no symbol)
--> backtrace[11] rip 000000018004791e (no symbol)
--> backtrace[12] rip 000000018005d31e (no symbol)
--> backtrace[13] rip 000000018021a501 (no symbol)
--> backtrace[14] rip 0000000180119dac (no symbol)
--> backtrace[15] rip 000000018021a501 (no symbol)
--> backtrace[16] rip 0000000180161c9f (no symbol)
--> backtrace[17] rip 0000000180154d6c (no symbol)
--> backtrace[18] rip 0000000180154eed (no symbol)
--> backtrace[19] rip 0000000180155c70 (no symbol)
--> backtrace[20] rip 000000018014e575 (no symbol)
--> backtrace[21] rip 0000000072b52fdf (no symbol)
--> backtrace[22] rip 0000000072b53080 (no symbol)
--> backtrace[23] rip 000000007676652d (no symbol)
--> backtrace[24] rip 0000000076e5c521 (no symbol)
-->
2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] GetChanges host:esx01 (10.0.0.172) [GetChangesTime] took 90012 ms
2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] DoHostSync:0000000009A10BA0 [DoHostSyncTime] took 90012 ms
2012-07-26T09:46:15.959+02:00 [15572 warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] DoHostSync failed for host host-21941
2012-07-26T09:46:15.959+02:00 [15572 warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Host sync failed to host-21941
2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-21941, marking host as notResponding
2012-07-26T09:46:15.974+02:00 [10044 error 'Default' opID=f8ce1d2a] (Log recursion level 2) Operation was canceled
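
To pick the slow calls out of a long vpxd log like the one above, the "[Label] took N ms" profiler entries can be filtered by duration. This is a rough sketch: the function name slow_operations and the 10-second threshold are arbitrary choices, and the pattern is derived only from the lines pasted here.

```python
import re

# Matches profiler entries such as "[GetChangesTime] took 90012 ms"
TOOK_RE = re.compile(r"\[(?P<label>\w+)\] took (?P<ms>\d+) ms")

def slow_operations(lines, threshold_ms=10000):
    """Return (label, ms) pairs for entries at or above threshold_ms."""
    hits = []
    for line in lines:
        m = TOOK_RE.search(line)
        if m and int(m.group("ms")) >= threshold_ms:
            hits.append((m.group("label"), int(m.group("ms"))))
    return hits
```

On the excerpt above this would keep the GetChangesTime, DoHostSyncTime and SoapRpcTime entries (roughly 40 to 90 seconds) and drop the DRS entries, which are all well under a second.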

Could it be a storage issue on that host? In vmkernel.log I saw these messages:

Device naa.600a0b800011115500006f7d91d1a84b:1 detected to be a snapshot:
2012-07-26T08:40:41.050Z cpu12:6256373)LVM: 8452:   queried disk ID: <type 2, len 22, lun 7, devType 0, scsi 0, h(id) 11899336569075500378>
2012-07-26T08:40:41.050Z cpu12:6256373)LVM: 8459:   on-disk disk ID: <type 2, len 22, lun 7, devType 0, scsi 0, h(id) 10020257680356322126>
2012-07-26T08:40:41.050Z cpu12:6256373)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12

NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124012a06c0, 4626953) to dev "naa.600a0b8000111155000073ac74df1c4b" on path "vmhba3:C0:T3:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2012-07-26T09:01:08.686Z cpu7:2055)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000073ac74df1c4b" state in doubt; requested fast path state update...
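
The "Failed: H:0x2 D:0x0 P:0x0" part of the NMP line above can be decoded into host, device and plugin status. The sketch below assumes the host status names from VMware's KB article on SCSI host-side NMP errors and the standard SCSI opcodes; both tables are partial and should be checked against the official references. The function name decode_nmp_failure is made up for illustration.

```python
import re

# Assumed from VMware's KB on SCSI host-side NMP errors (partial list)
HOST_STATUS = {
    0x0: "OK", 0x1: "NO_CONNECT", 0x2: "BUS_BUSY", 0x3: "TIMEOUT",
    0x4: "BAD_TARGET", 0x5: "ABORT", 0x7: "ERROR", 0x8: "RESET",
}
# A few standard SCSI opcodes (partial list)
SCSI_OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)", 0x88: "READ(16)", 0x8A: "WRITE(16)"}

NMP_RE = re.compile(
    r'Cmd (?P<cmd>0x[0-9a-fA-F]+) .*?to dev "(?P<dev>[^"]+)" '
    r'on path "(?P<path>[^"]+)" Failed: '
    r'H:(?P<h>0x[0-9a-fA-F]+) D:(?P<d>0x[0-9a-fA-F]+) P:(?P<p>0x[0-9a-fA-F]+)'
)

def decode_nmp_failure(line):
    """Return a dict describing an NMP 'Failed:' line, or None if no match."""
    m = NMP_RE.search(line)
    if m is None:
        return None
    cmd, host = int(m.group("cmd"), 16), int(m.group("h"), 16)
    return {
        "command": SCSI_OPCODES.get(cmd, hex(cmd)),
        "device": m.group("dev"),
        "path": m.group("path"),
        "host_status": HOST_STATUS.get(host, hex(host)),
        "device_status": int(m.group("d"), 16),
        "plugin_status": int(m.group("p"), 16),
    }
```

If the table is right, the line above is a WRITE(10) on path vmhba3:C0:T3:L2 that came back with host status BUS_BUSY and a clean device status, which would point more towards the path or transport than towards the LUN itself.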

I have more of them. But when I try to see which LUN is detected as a snapshot with the command "esxcli storage vmfs snapshot list", I receive the error I described above (point 4).

There are servers running on it that I don't want to restart, if possible.

Thanks for any hints.

Frank

If you find this information useful, please award points for "correct" or "helpful".
JimKnopf99

I found more bad messages in vmkwarning:

WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000073ac74df1c4b" state in doubt; requested fast path state update...
2012-07-26T09:54:58.833Z cpu14:2062)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b800011115500003a32a8f5374e" state in doubt; requested fast path state update...
2012-07-26T09:54:58.858Z cpu13:2061)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000026e34119fd4b" state in doubt; requested fast path state update...
2012-07-26T09:54:58.883Z cpu15:2063)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b800011115500009bcd80fa1c4b" state in doubt; requested fast path state update...
2012-07-26T09:54:58.913Z cpu11:2059)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.5001438102d04f00" state in doubt; requested fast path state update...
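
To see whether the "state in doubt" warnings hit one LUN or many, they can be counted per device. This is a small sketch with a made-up function name (devices_in_doubt); the pattern is taken only from the lines pasted above.

```python
import re
from collections import Counter

DOUBT_RE = re.compile(r'NMP device "([^"]+)" state in doubt')

def devices_in_doubt(lines):
    """Count 'state in doubt' warnings per NMP device identifier."""
    counts = Counter()
    for line in lines:
        m = DOUBT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

In the excerpt above, five different devices are flagged within about a tenth of a second. Warnings spread across many devices at once would fit a problem shared by all paths better than a fault on one LUN, though that is only a guess from this short sample.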

And esxcli commands are not working on the host ;-(

JimKnopf99

I don't know why, but I restarted the management network once more and now I am able to reconnect the host.

But I want to know why that happened. So if someone has an idea, it would be appreciated.

Frank
