3 Replies Latest reply on May 17, 2016 7:51 AM by kconway858

    ESXi disconnected

    JimKnopf99 Master

      Hi,

       

      I have a host that is disconnected from vCenter. The VMs are running fine, but I cannot connect to the host directly with the vSphere Client either.

       

      What I have done so far:

       

      1. Check availability of the Server (ping, dns etc)

      2. Restart management Network

      3. Restart management agents

      4. Open an SSH session to see if there is any storage issue. But no commands are working (like esxcli storage). I receive the error message "Connect to localhost failed: Connection failure"

      5. KB http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1003409

      6. KB http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1019082

       

      In the vpxd log I see the following error messages:

       

      warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] GetChanges host:esx01.(10.0.0.172) [GetChangesTime] took 49233 ms
      warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] DoHostSync:0000000009A10BA0 [DoHostSyncTime] took 49233 ms
      warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] DoHostSync failed for host host-21941
      warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Host sync failed to host-21941
      warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHost::FixNotRespondingHost] Returning false since host is already fixed!
      warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Failed to fix not responding host host-21941
      warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [VpxDrmRetrieveDomainConfigInfoGetInv] took 31 ms
      warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [VpxDrmRetrieveDomainConfigInfoLockAcq2] took 31 ms
      warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [CallingProposeActions] took 63 ms
      warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [AskForDrmRecommendations] took 109 ms
      warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] Time taken for DRS under LRO [AskAndRefreshDrmRecommendations] took 187 ms

       

      2012-07-26T09:46:15.959+02:00 [03940 error 'vmomi.soapStub[74]'] Resetting stub adapter for server TCP:esx01:443 : service state request failed: class Vmacore::Http::MalformedHeaderException(Remote server closed connection after 0 response bytes read)
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vmodl.query.PropertyCollector.Filter:session[52575fa4-6855-7c25-1af2-946f645d0570]52d3c04c-cb3b-d69a-ef57-743a5d720680, method=destroy
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=fetchQuickStats
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [03940 warning 'vmomi.soapStub[74]'] Terminating invocation: server=TCP:esx01:443, moref=vpxapi.VpxaService:vpxa, method=queryBatchPerformanceStatistics
      2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] ClientAdapterBase::InvokeOnSoap: (esx01.roland-domaene.intra, vmodl.query.PropertyCollector.Filter.destroy) [SoapRpcTime] took 39998 ms
      2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdClientAdapter] Got vmacore exception: Operation was canceled
      2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdClientAdapter] Backtrace:
      --> backtrace[00] rip 000000018013deba (no symbol)
      --> backtrace[01] rip 0000000180101518 (no symbol)
      --> backtrace[02] rip 0000000180101a5e (no symbol)
      --> backtrace[03] rip 000000018008930b (no symbol)
      --> backtrace[04] rip 000000018003ef36 (no symbol)
      --> backtrace[05] rip 0000000180046304 (no symbol)
      --> backtrace[06] rip 00000000003dde06 (no symbol)
      --> backtrace[07] rip 00000000003de68d (no symbol)
      --> backtrace[08] rip 00000000003dfceb (no symbol)
      --> backtrace[09] rip 00000000003e030f (no symbol)
      --> backtrace[10] rip 000000018004426d (no symbol)
      --> backtrace[11] rip 000000018004791e (no symbol)
      --> backtrace[12] rip 000000018005d31e (no symbol)
      --> backtrace[13] rip 000000018021a501 (no symbol)
      --> backtrace[14] rip 0000000180119dac (no symbol)
      --> backtrace[15] rip 000000018021a501 (no symbol)
      --> backtrace[16] rip 0000000180161c9f (no symbol)
      --> backtrace[17] rip 0000000180154d6c (no symbol)
      --> backtrace[18] rip 0000000180154eed (no symbol)
      --> backtrace[19] rip 0000000180155c70 (no symbol)
      --> backtrace[20] rip 000000018014e575 (no symbol)
      --> backtrace[21] rip 0000000072b52fdf (no symbol)
      --> backtrace[22] rip 0000000072b53080 (no symbol)
      --> backtrace[23] rip 000000007676652d (no symbol)
      --> backtrace[24] rip 0000000076e5c521 (no symbol)
      -->
      2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] GetChanges host:esx01 (10.0.0.172) [GetChangesTime] took 90012 ms
      2012-07-26T09:46:15.959+02:00 [15572 warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] DoHostSync:0000000009A10BA0 [DoHostSyncTime] took 90012 ms
      2012-07-26T09:46:15.959+02:00 [15572 warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] DoHostSync failed for host host-21941
      2012-07-26T09:46:15.959+02:00 [15572 warning 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] Host sync failed to host-21941
      2012-07-26T09:46:15.959+02:00 [15572 error 'Default' opID=HB-host-21941@116475-472da3c0] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-21941, marking host as notResponding
      2012-07-26T09:46:15.974+02:00 [10044 error 'Default' opID=f8ce1d2a] (Log recursion level 2) Operation was canceled
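
To pick the slow calls out of a long vpxd log, something like the following might help. It is only a minimal sketch: the function name slow_operations and the 10-second threshold are my own choices, and the regular expression assumes the "[TimerName] took N ms" suffix seen in the VpxProfiler lines above.

```python
import re

# Matches the trailing "[SomeTimer] took 12345 ms" part of a VpxProfiler line.
TOOK_RE = re.compile(r"\[(?P<timer>\w+)\] took (?P<ms>\d+) ms")


def slow_operations(lines, threshold_ms=10_000):
    """Return (timer, milliseconds) pairs at or above threshold_ms."""
    hits = []
    for line in lines:
        match = TOOK_RE.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append((match.group("timer"), int(match.group("ms"))))
    return hits


# Sample lines shortened from the log excerpt in this post.
sample = [
    "warning 'VpxProfiler' opID=HB-host-21941@116475-472da3c0] [VpxdHostSync] "
    "GetChanges host:esx01 (10.0.0.172) [GetChangesTime] took 90012 ms",
    "warning 'VpxProfiler' opID=task-internal-1-442dae5b-42] "
    "Time taken for DRS under LRO [CallingProposeActions] took 63 ms",
    "warning 'Default' opID=HB-host-21941@116475-472da3c0] "
    "[VpxdInvtHostSyncHostLRO] Host sync failed to host-21941",
]

for timer, ms in slow_operations(sample):
    print(f"{timer}: {ms} ms")
```

With the sample above only the GetChangesTime entry is reported, which matches what the log shows: the host sync calls take 40 to 90 seconds while the DRS timers stay in the millisecond range.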

       

      Could it be a storage issue on that host? In vmkernel.log I saw these messages:

       

      Device naa.600a0b800011115500006f7d91d1a84b:1 detected to be a snapshot:
      2012-07-26T08:40:41.050Z cpu12:6256373)LVM: 8452:   queried disk ID: <type 2, len 22, lun 7, devType 0, scsi 0, h(id) 11899336569075500378>
      2012-07-26T08:40:41.050Z cpu12:6256373)LVM: 8459:   on-disk disk ID: <type 2, len 22, lun 7, devType 0, scsi 0, h(id) 10020257680356322126>
      2012-07-26T08:40:41.050Z cpu12:6256373)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12

       

       

      NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124012a06c0, 4626953) to dev "naa.600a0b8000111155000073ac74df1c4b" on path "vmhba3:C0:T3:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
      2012-07-26T09:01:08.686Z cpu7:2055)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000073ac74df1c4b" state in doubt; requested fast path state update...
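
For reading the NMP failure line, a small decoder along these lines may be useful. It is a sketch under assumptions: Cmd 0x2a is the SCSI WRITE(10) opcode, but the host status names in the table are written from memory of VMware's KB on interpreting SCSI sense codes and should be checked against that KB. The function name decode_nmp_failure is hypothetical.

```python
import re

# Host status names as I remember them from VMware's KB on SCSI sense codes.
# Assumption: verify this table against the KB before relying on it.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x5: "ABORT",
    0x7: "ERROR",
    0x8: "RESET",
}

# Pulls the opcode, device, path and H/D/P status out of an NMP failure line.
NMP_RE = re.compile(
    r'Cmd (?P<cmd>0x[0-9a-f]+) .*?to dev "(?P<dev>[^"]+)" '
    r'on path "(?P<path>[^"]+)" Failed: '
    r"H:(?P<h>0x[0-9a-f]+) D:(?P<d>0x[0-9a-f]+) P:(?P<p>0x[0-9a-f]+)"
)


def decode_nmp_failure(line):
    """Return the parsed fields of an NMP failure line, or None if no match."""
    match = NMP_RE.search(line)
    if match is None:
        return None
    host = int(match.group("h"), 16)
    return {
        "cmd": int(match.group("cmd"), 16),
        "device": match.group("dev"),
        "path": match.group("path"),
        "host_status": HOST_STATUS.get(host, "unknown"),
        "device_status": int(match.group("d"), 16),
        "plugin_status": int(match.group("p"), 16),
    }


# The failure line from vmkernel.log quoted in this post.
sample = (
    'NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124012a06c0, 4626953) '
    'to dev "naa.600a0b8000111155000073ac74df1c4b" on path "vmhba3:C0:T3:L2" '
    'Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL'
)

print(decode_nmp_failure(sample))
```

If the table is right, H:0x2 with D:0x0 and P:0x0 means the failure was reported at the host (HBA/fabric) level rather than by the device itself, which would point at the path vmhba3:C0:T3:L2 and not at the LUN contents.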

       

       

      There are more of these. But when I try to see which LUN is detected as a snapshot with the command "esxcli storage vmfs snapshot list", I receive the error I mentioned above (point 4).

       

      There are servers running on this host that I do not want to restart, if possible.

       

      Thanks for any hints.

      Frank

        • 1. Re: ESXi disconnected
          JimKnopf99 Master

          I found more bad messages in vmkwarning:

           

          RNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000073ac74df1c4b" state in doubt; requested fast path state update...
          2012-07-26T09:54:58.833Z cpu14:2062)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b800011115500003a32a8f5374e" state in doubt; requested fast path state update...
          2012-07-26T09:54:58.858Z cpu13:2061)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b8000111155000026e34119fd4b" state in doubt; requested fast path state update...
          2012-07-26T09:54:58.883Z cpu15:2063)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600a0b800011115500009bcd80fa1c4b" state in doubt; requested fast path state update...
          2012-07-26T09:54:58.913Z cpu11:2059)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.5001438102d04f00" state in doubt; requested fast path state update...
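
To see whether the "state in doubt" warnings hit a single LUN or many of them, the device IDs can be counted. This is a minimal sketch; the function name devices_in_doubt is hypothetical and the pattern only assumes the message wording shown above.

```python
import re
from collections import Counter

# Captures the device ID from a "state in doubt" warning.
DOUBT_RE = re.compile(r'NMP device "(?P<dev>[^"]+)" state in doubt')


def devices_in_doubt(lines):
    """Count how often each device appears in "state in doubt" warnings."""
    counts = Counter()
    for line in lines:
        match = DOUBT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Sample lines shortened from the vmkwarning excerpt in this post.
sample = [
    'WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device '
    '"naa.600a0b800011115500003a32a8f5374e" state in doubt; requested fast path state update...',
    'WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device '
    '"naa.5001438102d04f00" state in doubt; requested fast path state update...',
    'WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device '
    '"naa.600a0b800011115500003a32a8f5374e" state in doubt; requested fast path state update...',
]

for device, count in devices_in_doubt(sample).most_common():
    print(f"{count:3d}  {device}")
```

In the excerpt above, five different devices are reported within about 100 ms of each other. Warnings spread across many devices at the same moment usually suggest a shared component (HBA, fabric, or array controller) rather than one bad LUN, though that is an inference and not something the log states.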

           

          And the esxcli commands are still not working on the host ;-(

          • 2. Re: ESXi disconnected
            JimKnopf99 Master

            I don't know why, but I restarted the management network once more and now I am able to reconnect the host.

            But I want to know why that happened. So if someone has an idea, it would be appreciated.

             

            Frank