Hey,
we are running an ESXi 5.0 U1 cluster with two hosts and shared iSCSI storage.
This morning one host stopped working, and the whole host was in a very strange state.
The host answered pings, but we could not connect with the vSphere Client and could not log in on the GUI.
After a hard reset the host came back up and is running again.
Now I am looking at vpxa.log and see some very strange errors:
2014-05-08T09:14:24.178Z [57408B90 error 'Default'] [VpxaHalStatsHostagent::QueryHost] Did not get any entity metrics from the host, hence dropping result
2014-05-08T09:14:24.178Z [57408B90 verbose 'Default'] [PollCurrentStats] Skipping stat update due to stale sample from hostd.
2014-05-08T09:14:24.225Z [5730EB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4574): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T09:14:44.176Z [5730EB90 error 'Default'] [VpxaHalStatsHostagent::QueryHost] Did not get any entity metrics from the host, hence dropping result
2014-05-08T09:14:44.176Z [5730EB90 verbose 'Default'] [PollCurrentStats] Skipping stat update due to stale sample from hostd.
2014-05-08T09:15:24.229Z [572ABB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4575): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T09:16:30.867Z [5744AB90 verbose 'VpxProfiler'] [1+] CheckEnvBrowserChanges
2014-05-08T09:17:24.235Z [572CCB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4577): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T09:45:04.269Z [FFC02780 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking queryStats on vim.PerformanceManager:ha-perfmgr: 'Operation timed out', backtrace:
--> [00] rip 1c8aeb83
--> [01] rip 1c706f2e
--> [02] rip 1c69d662
--> [03] rip 1c760abb
--> [04] rip 1c761149
--> [05] rip 1cd08de5
--> [06] rip 1cd09719
--> [07] rip 092cdb33
--> [08] rip 092cdfa9
--> [09] rip 1cd3d7d1
--> [10] rip 0a701aca
--> [11] rip 094ba095
--> [12] rip 094c527c
--> [13] rip 094c5e20
--> [14] rip 094a5eb8
--> [15] rip 09254b5c
--> [16] rip 09585020
--> [17] rip 095853cb
--> [18] rip 09580542
--> [19] rip 1c8d2c08
--> [20] rip 1c8d2ce4
--> [21] rip 1c8cc533
--> [22] rip 1c8ccd3a
--> [23] rip 1c6ba32b
--> [24] rip 09256a84
--> [25] rip 0942eb87
--> [26] rip 1d7f9efc
--> [27] rip 09253291
-->
2014-05-08T09:45:04.271Z [FFC02780 error 'Default'] [PollCurrentStats] Failed to fetch current stats. Fault:
2014-05-08T09:45:04.271Z [FFC02780 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms
2014-05-08T09:45:24.330Z [FFCA8B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4605): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T09:46:30.977Z [5744AB90 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking GetEnvironmentBrowser on vim.ComputeResource:ha-compute-res: 'Operation timed out', backtrace:
--> [00] rip 1c8aeb83
--> [01] rip 1c706f2e
--> [02] rip 1c69d662
--> [03] rip 1c760abb
--> [04] rip 1c761149
--> [05] rip 1cd08de5
--> [06] rip 1cd09719
--> [07] rip 092cdb33
--> [08] rip 092cdfa9
--> [09] rip 1cd3d7d1
--> [10] rip 0a6d8f58
--> [11] rip 09490be2
--> [12] rip 09490efe
--> [13] rip 0937b225
--> [14] rip 093a48d9
--> [15] rip 0936f737
--> [16] rip 09280928
--> [17] rip 092809a8
--> [18] rip 0927e88d
--> [19] rip 09254b5c
--> [20] rip 09585020
--> [21] rip 095853cb
--> [22] rip 09580542
--> [23] rip 1c8d2c08
--> [24] rip 1c8d2ce4
--> [25] rip 1c8cc533
--> [26] rip 1c8cd0d8
--> [27] rip 09254b5c
--> [28] rip 1c8c1679
--> [29] rip 1c39e852
--> [30] rip 1d8b84ce
-->
2014-05-08T09:46:30.978Z [5744AB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4607): configStatus:vpxa issue posted
2014-05-08T09:46:30.978Z [5744AB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4608): Event:vpxa issue posted
2014-05-08T09:46:30.978Z [5744AB90 error 'Default'] [VpxaMoService] Exception vmodl.fault.HostCommunication while fetching EnvBrowser info [00] rip 1c8aeb83
--> [01] rip 1c706f2e
--> [02] rip 1c69d662
--> [03] rip 092cdbef
--> [04] rip 092cdfa9
--> [05] rip 1cd3d7d1
--> [06] rip 0a6d8f58
--> [07] rip 09490be2
--> [08] rip 09490efe
--> [09] rip 0937b225
--> [10] rip 093a48d9
--> [11] rip 0936f737
--> [12] rip 09280928
--> [13] rip 092809a8
--> [14] rip 0927e88d
--> [15] rip 09254b5c
--> [16] rip 09585020
--> [17] rip 095853cb
--> [18] rip 09580542
--> [19] rip 1c8d2c08
--> [20] rip 1c8d2ce4
--> [21] rip 1c8cc533
--> [22] rip 1c8cd0d8
--> [23] rip 09254b5c
--> [24] rip 1c8c1679
--> [25] rip 1c39e852
--> [26] rip 1d8b84ce
-->
2014-05-08T09:46:30.979Z [5744AB90 verbose 'VpxProfiler'] [1-] CheckEnvBrowserChanges (took 1800112 ms)
2014-05-08T09:46:30.979Z [5744AB90 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800112 ms
2014-05-08T09:55:58.712Z [57383B90 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking refresh on vim.Datastore:5289e0b9-040ffdf4-ca26-c4641338ed98: 'Operation timed out', backtrace:
--> [00] rip 1c8aeb83
--> [01] rip 1c706f2e
--> [02] rip 1c69d662
--> [03] rip 1c760abb
--> [04] rip 1c761149
--> [05] rip 1cd08de5
--> [06] rip 1cd09719
--> [07] rip 092cdb33
--> [08] rip 092cdfa9
--> [09] rip 1cd3d7d1
--> [10] rip 0a6d7cc4
--> [11] rip 09516e4b
--> [12] rip 09292294
--> [13] rip 09292eb5
--> [14] rip 0927e83b
--> [15] rip 09254b5c
--> [16] rip 09585020
--> [17] rip 095853cb
--> [18] rip 09580542
--> [19] rip 1c8d2c08
--> [20] rip 1c8d2ce4
--> [21] rip 1c8cc533
--> [22] rip 1c8cd0d8
--> [23] rip 09254b5c
--> [24] rip 1c8c1679
--> [25] rip 1c39e852
--> [26] rip 1d8b84ce
-->
2014-05-08T09:55:58.714Z [57383B90 error 'Default'] Exception during datastore refresh: vmodl.fault.HostCommunication
2014-05-08T09:55:58.714Z [57383B90 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800112 ms
2014-05-08T09:56:24.377Z [57383B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4618): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 2 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 3 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 4 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 5 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 6 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 7 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 8 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 9 push into vmlist
2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaInvtVm_ScheduleVmSpaceRefresh] Refreshing 8 VMs
2014-05-08T10:13:24.435Z [57429B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4635): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T10:14:24.438Z [573A4B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4636): Event:VpxaHalEvent::CheckQueuedEvents
2014-05-08T10:15:04.384Z [FFC02780 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking queryStats on vim.PerformanceManager:ha-perfmgr: 'Operation timed out', backtrace:
--> [00] rip 1c8aeb83
--> [01] rip 1c706f2e
--> [02] rip 1c69d662
--> [03] rip 1c760abb
--> [04] rip 1c761149
--> [05] rip 1cd08de5
--> [06] rip 1cd09719
--> [07] rip 092cdb33
--> [08] rip 092cdfa9
--> [09] rip 1cd3d7d1
--> [10] rip 0a701aca
--> [11] rip 094ba095
--> [12] rip 094c527c
--> [13] rip 094c5e20
--> [14] rip 094a5eb8
--> [15] rip 09254b5c
--> [16] rip 09585020
--> [17] rip 095853cb
--> [18] rip 09580542
--> [19] rip 1c8d2c08
--> [20] rip 1c8d2ce4
--> [21] rip 1c8cc533
--> [22] rip 1c8ccd3a
--> [23] rip 1c6ba32b
--> [24] rip 09256a84
--> [25] rip 0942eb87
--> [26] rip 1d7f9efc
--> [27] rip 09253291
-->
2014-05-08T10:15:04.385Z [FFC02780 error 'Default'] [PollCurrentStats] Failed to fetch current stats. Fault:
2014-05-08T10:15:04.385Z [FFC02780 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms
Could anybody tell me what is wrong here?
What are those "rip ..." messages? I have never seen them before.
Could this be a hardware error?
Best regards,
Bernd
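Hello,
the "rip" lines are the backtrace that is logged with each exception (one instruction-pointer address per stack frame), not separate error codes.
To help you, please upload the logs from the failed ESXi host so that we can analyze them.
Is the ESXi host patched to the latest level? Are the drivers at the latest level?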
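As a rough aid for reading the vpxa.log excerpt in the first post, here is a minimal Python sketch that lists every call that ended in an exception, together with the "TotalTime" duration logged afterwards on the same thread. The regular expressions are derived only from the lines quoted in this thread, and the function name `summarize_timeouts` and the `SAMPLE` list are made up for illustration; other vpxa.log versions may format these lines differently.

```python
import re

# Pattern for the exception lines quoted in the first post (an assumption
# based on this excerpt only, not a documented log format).
EXC_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>[0-9A-F]+) error '[^']*'\] "
    r"\[VpxaClientAdapter::InvokeCommon\] Got exception while invoking "
    r"(?P<method>\S+) on (?P<target>\S+): '(?P<fault>[^']*)'"
)

# Pattern for the profiler line that reports how long the call took.
TOOK_RE = re.compile(
    r"^\S+ \[(?P<thread>[0-9A-F]+) warning 'VpxProfiler'\] "
    r"VpxUtil_InvokeWithOpId \[TotalTime\] took (?P<ms>\d+) ms"
)


def summarize_timeouts(lines):
    """Return one dict per failed call, with its duration if one was logged."""
    pending = {}  # thread id -> record still waiting for its duration line
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("-->"):
            continue  # backtrace frames carry no extra information here
        m = EXC_RE.match(line)
        if m:
            rec = {
                "timestamp": m["ts"],
                "method": m["method"],
                "target": m["target"],
                "fault": m["fault"],
                "duration_ms": None,
            }
            records.append(rec)
            pending[m["thread"]] = rec
            continue
        m = TOOK_RE.match(line)
        if m and m["thread"] in pending:
            pending.pop(m["thread"])["duration_ms"] = int(m["ms"])
    return records


# A few lines copied from the excerpt in the first post.
SAMPLE = [
    "2014-05-08T09:45:04.269Z [FFC02780 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking queryStats on vim.PerformanceManager:ha-perfmgr: 'Operation timed out', backtrace:",
    "--> [00] rip 1c8aeb83",
    "2014-05-08T09:45:04.271Z [FFC02780 error 'Default'] [PollCurrentStats] Failed to fetch current stats. Fault:",
    "2014-05-08T09:45:04.271Z [FFC02780 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms",
]

if __name__ == "__main__":
    for rec in summarize_timeouts(SAMPLE):
        print(rec["timestamp"], rec["method"], rec["target"],
              rec["fault"], rec["duration_ms"])
```

In the excerpt above, all the timed-out calls (queryStats, GetEnvironmentBrowser and the datastore refresh) report a TotalTime of about 1,800,000 ms, so each call waited roughly 30 minutes before giving up.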
Hello,
Reviewing the logs, I can see that the problem comes from the iSCSI configuration:
iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410026fd5510 network resource pool netsched.pools.persist.iscsi associated
2014-05-04T22:30:10.250Z cpu3:4803)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410026fd5510 network tracker id 1 tracker.iSCSI.172.21.7.11 associated
2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: vmhba41:CH:0 T:0 CN:0: Failed to receive data: Connection closed by peer
2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]
2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]
2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: vmhba41:CH:0 T:0 CN:0: Connection rx notifying failure: Failed to Receive. State=Bound
2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]
2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]
2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba41:CH:0 T:0 CN:0: iSCSI connection is being marked "OFFLINE" (Event:2)
2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]
2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]
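One thing that may be worth checking in the excerpt above is that the local address (172.21.6.11) and the target address (172.21.7.11) differ in the third octet. Whether that means the iSCSI traffic is routed depends on the subnet mask, which is not given in this thread. The following Python sketch pulls the endpoints out of the "Conn [...]" lines and tests them against a prefix length that you supply; the function names and the `SAMPLE` list are made up for illustration, and the /24 and /23 values in the usage note are assumptions, not facts about this environment.

```python
import ipaddress
import re

# Pattern for the "Conn [CID: ... L: ip:port R: ip:port]" part of the
# vmkernel lines quoted above (derived from this excerpt only).
CONN_RE = re.compile(
    r"Conn \[CID: (?P<cid>\d+) "
    r"L: (?P<local>[\d.]+):(?P<lport>\d+) "
    r"R: (?P<remote>[\d.]+):(?P<rport>\d+)\]"
)


def connection_endpoints(lines):
    """Return the set of (local_ip, remote_ip, remote_port) tuples seen."""
    endpoints = set()
    for line in lines:
        m = CONN_RE.search(line)
        if m:
            endpoints.add((m["local"], m["remote"], int(m["rport"])))
    return endpoints


def same_subnet(local_ip, remote_ip, prefix_len):
    """True if remote_ip lies in the network of local_ip/prefix_len."""
    net = ipaddress.ip_network(f"{local_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(remote_ip) in net


# Two lines copied from the excerpt above.
SAMPLE = [
    "2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]",
    "2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]",
]

if __name__ == "__main__":
    for local, remote, port in sorted(connection_endpoints(SAMPLE)):
        # The prefix length 24 is an assumption; use the real mask of the
        # iSCSI VMkernel port instead.
        print(local, "->", f"{remote}:{port}",
              "same /24:", same_subnet(local, remote, 24))
```

With a /24 mask the two addresses would be in different subnets, while with a /23 mask they would share one, so the real mask of the iSCSI VMkernel port needs to be checked on the host before drawing any conclusion.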
Please work through the following VMware KB articles:
Configuring and troubleshooting basic software iSCSI setup (1008083)
Unable to see the shared LUN presented from the EqualLogic iSCSI array for VMware ESXi (1016381)
ESX/ESXi hosts randomly drop and reconnect iSCSI connections to an EqualLogic array (2004432)
Troubleshooting iSCSI LUN connectivity issues on ESX/ESXi hosts (1003681)
Adding iSCSI storage to UCS and vSphere:
http://benking84.wordpress.com/2013/06/05/adding-iscsi-storage-to-ucs-and-vmware/
I recommend installing the Cisco custom ESXi 5.0 U3 ISO, as it comes with all the drivers preinstalled and preconfigured:
http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/release/notes/OL_26617.html
