VMware Cloud Community
berndmaier
Contributor

Error with ESXi Host - Version 5.0 u1 - Hardware problem?

Hey,

We are running an ESXi 5.0 U1 cluster with two hosts and shared iSCSI storage.

This morning one host stopped working, and the whole host was in a very "strange state".

The host answered pings, but we could not connect with the vSphere Client and we could not log in on the GUI...

After a hard reset we brought the host back up and running.

Now I am looking at vpxa.log and see a very strange error:

2014-05-08T09:14:24.178Z [57408B90 error 'Default'] [VpxaHalStatsHostagent::QueryHost] Did not get any entity metrics from the host, hence dropping result

2014-05-08T09:14:24.178Z [57408B90 verbose 'Default'] [PollCurrentStats] Skipping stat update due to stale sample from hostd.

2014-05-08T09:14:24.225Z [5730EB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4574): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T09:14:44.176Z [5730EB90 error 'Default'] [VpxaHalStatsHostagent::QueryHost] Did not get any entity metrics from the host, hence dropping result

2014-05-08T09:14:44.176Z [5730EB90 verbose 'Default'] [PollCurrentStats] Skipping stat update due to stale sample from hostd.

2014-05-08T09:15:24.229Z [572ABB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4575): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T09:16:30.867Z [5744AB90 verbose 'VpxProfiler'] [1+] CheckEnvBrowserChanges

2014-05-08T09:17:24.235Z [572CCB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4577): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T09:45:04.269Z [FFC02780 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking queryStats on vim.PerformanceManager:ha-perfmgr: 'Operation timed out', backtrace:

--> [00] rip 1c8aeb83

--> [01] rip 1c706f2e

--> [02] rip 1c69d662

--> [03] rip 1c760abb

--> [04] rip 1c761149

--> [05] rip 1cd08de5

--> [06] rip 1cd09719

--> [07] rip 092cdb33

--> [08] rip 092cdfa9

--> [09] rip 1cd3d7d1

--> [10] rip 0a701aca

--> [11] rip 094ba095

--> [12] rip 094c527c

--> [13] rip 094c5e20

--> [14] rip 094a5eb8

--> [15] rip 09254b5c

--> [16] rip 09585020

--> [17] rip 095853cb

--> [18] rip 09580542

--> [19] rip 1c8d2c08

--> [20] rip 1c8d2ce4

--> [21] rip 1c8cc533

--> [22] rip 1c8ccd3a

--> [23] rip 1c6ba32b

--> [24] rip 09256a84

--> [25] rip 0942eb87

--> [26] rip 1d7f9efc

--> [27] rip 09253291

-->

2014-05-08T09:45:04.271Z [FFC02780 error 'Default'] [PollCurrentStats] Failed to fetch current stats.  Fault:

2014-05-08T09:45:04.271Z [FFC02780 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms

2014-05-08T09:45:24.330Z [FFCA8B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4605): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T09:46:30.977Z [5744AB90 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking GetEnvironmentBrowser on vim.ComputeResource:ha-compute-res: 'Operation timed out', backtrace:

--> [00] rip 1c8aeb83

--> [01] rip 1c706f2e

--> [02] rip 1c69d662

--> [03] rip 1c760abb

--> [04] rip 1c761149

--> [05] rip 1cd08de5

--> [06] rip 1cd09719

--> [07] rip 092cdb33

--> [08] rip 092cdfa9

--> [09] rip 1cd3d7d1

--> [10] rip 0a6d8f58

--> [11] rip 09490be2

--> [12] rip 09490efe

--> [13] rip 0937b225

--> [14] rip 093a48d9

--> [15] rip 0936f737

--> [16] rip 09280928

--> [17] rip 092809a8

--> [18] rip 0927e88d

--> [19] rip 09254b5c

--> [20] rip 09585020

--> [21] rip 095853cb

--> [22] rip 09580542

--> [23] rip 1c8d2c08

--> [24] rip 1c8d2ce4

--> [25] rip 1c8cc533

--> [26] rip 1c8cd0d8

--> [27] rip 09254b5c

--> [28] rip 1c8c1679

--> [29] rip 1c39e852

--> [30] rip 1d8b84ce

-->

2014-05-08T09:46:30.978Z [5744AB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4607): configStatus:vpxa issue posted

2014-05-08T09:46:30.978Z [5744AB90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4608): Event:vpxa issue posted

2014-05-08T09:46:30.978Z [5744AB90 error 'Default'] [VpxaMoService] Exception vmodl.fault.HostCommunication while fetching EnvBrowser info [00] rip 1c8aeb83

--> [01] rip 1c706f2e

--> [02] rip 1c69d662

--> [03] rip 092cdbef

--> [04] rip 092cdfa9

--> [05] rip 1cd3d7d1

--> [06] rip 0a6d8f58

--> [07] rip 09490be2

--> [08] rip 09490efe

--> [09] rip 0937b225

--> [10] rip 093a48d9

--> [11] rip 0936f737

--> [12] rip 09280928

--> [13] rip 092809a8

--> [14] rip 0927e88d

--> [15] rip 09254b5c

--> [16] rip 09585020

--> [17] rip 095853cb

--> [18] rip 09580542

--> [19] rip 1c8d2c08

--> [20] rip 1c8d2ce4

--> [21] rip 1c8cc533

--> [22] rip 1c8cd0d8

--> [23] rip 09254b5c

--> [24] rip 1c8c1679

--> [25] rip 1c39e852

--> [26] rip 1d8b84ce

-->

2014-05-08T09:46:30.979Z [5744AB90 verbose 'VpxProfiler'] [1-] CheckEnvBrowserChanges (took 1800112 ms)

2014-05-08T09:46:30.979Z [5744AB90 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800112 ms

2014-05-08T09:55:58.712Z [57383B90 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking refresh on vim.Datastore:5289e0b9-040ffdf4-ca26-c4641338ed98: 'Operation timed out', backtrace:

--> [00] rip 1c8aeb83

--> [01] rip 1c706f2e

--> [02] rip 1c69d662

--> [03] rip 1c760abb

--> [04] rip 1c761149

--> [05] rip 1cd08de5

--> [06] rip 1cd09719

--> [07] rip 092cdb33

--> [08] rip 092cdfa9

--> [09] rip 1cd3d7d1

--> [10] rip 0a6d7cc4

--> [11] rip 09516e4b

--> [12] rip 09292294

--> [13] rip 09292eb5

--> [14] rip 0927e83b

--> [15] rip 09254b5c

--> [16] rip 09585020

--> [17] rip 095853cb

--> [18] rip 09580542

--> [19] rip 1c8d2c08

--> [20] rip 1c8d2ce4

--> [21] rip 1c8cc533

--> [22] rip 1c8cd0d8

--> [23] rip 09254b5c

--> [24] rip 1c8c1679

--> [25] rip 1c39e852

--> [26] rip 1d8b84ce

-->

2014-05-08T09:55:58.714Z [57383B90 error 'Default'] Exception during datastore  refresh: vmodl.fault.HostCommunication

2014-05-08T09:55:58.714Z [57383B90 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800112 ms

2014-05-08T09:56:24.377Z [57383B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4618): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 2 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 3 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 4 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 5 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 6 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 7 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 8 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaMoVm::GetVmList] vm 9 push into vmlist

2014-05-08T10:13:02.925Z [FFC87B90 verbose 'Default'] [VpxaInvtVm_ScheduleVmSpaceRefresh] Refreshing 8 VMs

2014-05-08T10:13:24.435Z [57429B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4635): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T10:14:24.438Z [573A4B90 verbose 'Default'] [VpxaInvtHost] Increment master gen. no to (4636): Event:VpxaHalEvent::CheckQueuedEvents

2014-05-08T10:15:04.384Z [FFC02780 error 'Default'] [VpxaClientAdapter::InvokeCommon] Got exception while invoking queryStats on vim.PerformanceManager:ha-perfmgr: 'Operation timed out', backtrace:

--> [00] rip 1c8aeb83

--> [01] rip 1c706f2e

--> [02] rip 1c69d662

--> [03] rip 1c760abb

--> [04] rip 1c761149

--> [05] rip 1cd08de5

--> [06] rip 1cd09719

--> [07] rip 092cdb33

--> [08] rip 092cdfa9

--> [09] rip 1cd3d7d1

--> [10] rip 0a701aca

--> [11] rip 094ba095

--> [12] rip 094c527c

--> [13] rip 094c5e20

--> [14] rip 094a5eb8

--> [15] rip 09254b5c

--> [16] rip 09585020

--> [17] rip 095853cb

--> [18] rip 09580542

--> [19] rip 1c8d2c08

--> [20] rip 1c8d2ce4

--> [21] rip 1c8cc533

--> [22] rip 1c8ccd3a

--> [23] rip 1c6ba32b

--> [24] rip 09256a84

--> [25] rip 0942eb87

--> [26] rip 1d7f9efc

--> [27] rip 09253291

-->

2014-05-08T10:15:04.385Z [FFC02780 error 'Default'] [PollCurrentStats] Failed to fetch current stats.  Fault:

2014-05-08T10:15:04.385Z [FFC02780 warning 'VpxProfiler'] VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms

Could anybody say what's wrong here?

What are those "RIP...." messages? I have never seen them before...

Could this be a hardware error?

best regards

Bernd

DanielOprea
Hot Shot

Hello,

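The "rip" lines are not separate errors; they are the stack backtrace (instruction pointer addresses) that vpxa logs along with each exception.

To help you, please upload the logs from the failed ESXi host so we can analyze them.

Is the ESXi host patched to the latest level? Are the drivers at the latest level?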
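
As a side note, the vpxa.log lines quoted in the first post can be condensed with a short script. This is only a minimal sketch in Python, assuming the log format is exactly as pasted above; the function name `summarize` and the regular expressions are my own and not part of any VMware tool. It skips the "--> [NN] rip ..." backtrace lines, lists which calls timed out, and converts the "took N ms" values to minutes:

```python
import re

# Assumed entry format, taken from the pasted excerpt:
# 2014-05-08T09:45:04.269Z [FFC02780 error 'Default'] message...
ENTRY = re.compile(r"^(\S+Z) \[(\w+) (\w+) '([^']+)'\] (.*)$")
INVOKE = re.compile(r"while invoking (\w+) on (\S+): '([^']+)'")
TOOK = re.compile(r"took (\d+) ms")


def summarize(lines):
    """Return (timeouts, durations) found in vpxa.log lines.

    timeouts:  (timestamp, method, object, fault text) for failed invocations
    durations: (timestamp, minutes) for 'took N ms' profiler warnings
    Backtrace lines ("--> [NN] rip ...") do not match ENTRY and are skipped.
    """
    timeouts, durations = [], []
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue
        ts, _thread, level, _logger, msg = m.groups()
        inv = INVOKE.search(msg)
        if level == "error" and inv:
            timeouts.append((ts, inv.group(1), inv.group(2), inv.group(3)))
        took = TOOK.search(msg)
        if level == "warning" and took:
            durations.append((ts, int(took.group(1)) / 60000.0))
    return timeouts, durations


# Three lines copied from the post, used here as sample input.
sample = [
    "2014-05-08T09:45:04.269Z [FFC02780 error 'Default'] "
    "[VpxaClientAdapter::InvokeCommon] Got exception while invoking "
    "queryStats on vim.PerformanceManager:ha-perfmgr: "
    "'Operation timed out', backtrace:",
    "--> [00] rip 1c8aeb83",
    "2014-05-08T09:45:04.271Z [FFC02780 warning 'VpxProfiler'] "
    "VpxUtil_InvokeWithOpId [TotalTime] took 1800114 ms",
]

timeouts, durations = summarize(sample)
for ts, method, obj, fault in timeouts:
    print(ts, method, obj, fault)
for ts, minutes in durations:
    print(ts, round(minutes, 1), "min")
```

For the values in the post, 1800114 ms and 1800112 ms both work out to roughly 30.0 minutes, so each of those calls waited about half an hour before failing.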

PLEASE CONSIDER AWARDING any HELPFUL or CORRECT answer. Thanks!!
Blogs: https://danieloprea.blogspot.com/
berndmaier
Contributor

Hey,

thanks for your reply.

No, we have only installed Update 1 for 5.0.0.

And no, we have not made any driver upgrades.

The hardware is a Cisco UCS C250 M2.

I have attached the log files in a zip file.

best regards,

bernd

DanielOprea
Hot Shot

Hello,

Reviewing the logs, I have seen that the problem comes from the iSCSI configuration:

iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410026fd5510 network resource pool netsched.pools.persist.iscsi associated

2014-05-04T22:30:10.250Z cpu3:4803)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410026fd5510 network tracker id 1 tracker.iSCSI.172.21.7.11 associated

2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: vmhba41:CH:0 T:0 CN:0: Failed to receive data: Connection closed by peer

2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]

2014-05-04T22:30:10.251Z cpu3:4803)WARNING: iscsi_vmk: iscsivmk_ConnReceiveAtomic: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]

2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: vmhba41:CH:0 T:0 CN:0: Connection rx notifying failure: Failed to Receive. State=Bound

2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]

2014-05-04T22:30:10.251Z cpu3:4803)iscsi_vmk: iscsivmk_ConnRxNotifyFailure: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]

2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba41:CH:0 T:0 CN:0: iSCSI connection is being marked "OFFLINE" (Event:2)

2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: TARGET: (null) TPGT: 0 TSIH: 0]

2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 172.21.6.11:51422 R: 172.21.7.11:3260]
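
To see how often this happens across the whole vmkernel log, the OFFLINE events can be pulled out with a few lines of Python. This is a rough sketch that assumes the message layout is exactly as in the excerpt above; the function name `offline_events` and the patterns are hypothetical and not part of any VMware tooling:

```python
import re

# Patterns written against the excerpt above (assumed layout).
OFFLINE = re.compile(
    r'^(\S+Z) .*iscsivmk_StopConnection: (\S+ T:\d+ CN:\d+): '
    r'iSCSI connection is being marked "OFFLINE"'
)
CONN = re.compile(r"Conn \[CID: (\d+) L: ([\d.]+):(\d+) R: ([\d.]+):(\d+)\]")


def offline_events(lines):
    """Collect one record per OFFLINE event: time, adapter path, endpoints."""
    events = []
    for line in lines:
        line = line.strip()
        m = OFFLINE.match(line)
        if m:
            events.append(
                {"time": m.group(1), "path": m.group(2),
                 "local": None, "remote": None}
            )
            continue
        # The matching "Conn [...]" line follows the OFFLINE line in the log.
        c = CONN.search(line)
        if (c and "iscsivmk_StopConnection" in line
                and events and events[-1]["remote"] is None):
            events[-1]["local"] = f"{c.group(2)}:{c.group(3)}"
            events[-1]["remote"] = f"{c.group(4)}:{c.group(5)}"
    return events


# Two lines copied from the excerpt, used as sample input.
sample = [
    '2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: '
    'iscsivmk_StopConnection: vmhba41:CH:0 T:0 CN:0: iSCSI connection '
    'is being marked "OFFLINE" (Event:2)',
    '2014-05-04T22:30:10.508Z cpu16:4803)WARNING: iscsi_vmk: '
    'iscsivmk_StopConnection: Conn [CID: 0 L: 172.21.6.11:51422 '
    'R: 172.21.7.11:3260]',
]

for event in offline_events(sample):
    print(event)
```

The local and remote addresses it reports (172.21.6.11 and 172.21.7.11 in this excerpt) are worth comparing against the subnet mask configured on the iSCSI vmkernel port.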


Please apply the following KBs:

Configuring and troubleshooting basic software iSCSI setup (1008083)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100808...

Unable to see the shared LUN presented from the EqualLogic iSCSI array for VMware ESXi (1016381)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101638...

ESX/ESXi hosts randomly drop and reconnect iSCSI connections to an EqualLogic array (2004432)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200443...

Troubleshooting iSCSI LUN connectivity issues on ESX/ESXi hosts (1003681)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100368...

Adding iSCSI storage to UCS and vSphere

http://benking84.wordpress.com/2013/06/05/adding-iscsi-storage-to-ucs-and-vmware/
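
For a first connectivity check while working through the links above, something like the following can help. It is a minimal sketch in Python, assuming it is run from a machine on the storage network that has Python 3. It does not go through the ESXi vmkernel port, so it only shows whether the target portal answers on TCP port 3260 at all. The function name `tcp_port_open` is hypothetical; 172.21.7.11 is the target address from the log excerpt above.

```python
import socket


def tcp_port_open(host, port=3260, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Example call (commented out so the snippet has no network side effects):
# print(tcp_port_open("172.21.7.11"))
```

A result of False only says that the portal was not reachable from that machine; it does not by itself tell whether the cause is the array, the network, or the host configuration.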

I recommend installing the Cisco custom ESXi 5.0 U3 ISO, as it comes with all the drivers preinstalled and preconfigured:

http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/release/notes/OL_26617.html
