VMware Cloud Community
Skmr
Enthusiast

I am seeing one host not responding on vCenter

Hi,

One of my hosts is showing as not responding in vCenter in my test environment.

Interestingly, I am still able to ping it.

Sravan_k
Expert

A few quick questions:

If this is high priority, please open an SR with VMware.

Please tell me your ESXi version and vCenter version. Are any other hosts experiencing this issue? Who is the hardware vendor?

If you would rather not post your logs in this thread, please message them to me.

Regards,

VKmr.  

bhards4
Hot Shot

Hi,

Please check the following.

1) Is the ESXi host accessible over SSH?

2) Is the ESXi host accessible as a standalone host from the vSphere Client?

3) What is the status of the VMs? Are they online or orphaned?

4) Have you tried to reconnect the host from vCenter? If yes, what error are you getting?

5) Please look in the ESXi hostd.log and check whether it mentions hostd being non-responsive.

6) If possible, attach hostd.log, vobd.log, and vmkwarning.log.

-Sachin
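
For the log review in step 5, a small filter can save some scrolling. The following is only a minimal sketch: the search terms in `KEYWORDS` are hypothetical examples and are not taken from VMware documentation, and `find_matches` and `scan_file` are illustrative helper names.

```python
import re

# Hypothetical search terms; replace them with whatever you are looking for.
KEYWORDS = ("not responding", "unresponsive", "timed out")


def find_matches(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword."""
    # Build one case-insensitive pattern that matches any of the keywords.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def scan_file(path, keywords=KEYWORDS):
    """Run find_matches over a log file copied off the host."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_matches(handle, keywords)
```

Calling `scan_file("hostd.log")` on a copy of the log returns the matching lines with their line numbers, which can then be pasted into the thread.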

Skmr
Enthusiast

Hi Vkmr,

Thanks for responding. This happened in my development environment, so it is not high priority.

I moved all VMs to another host and put this one in maintenance mode. I am analyzing the hostd logs as bhards4 suggested; so far I haven't found anything.

I am using ESXi 6.0 U1 and my hardware vendor is Cisco. I have messaged you the hardware and VMware logs.

I think we have some issues with the DIMMs.

Skmr
Enthusiast

1) Is the ESXi host accessible over SSH?

No, I cannot connect.

2) Is the ESXi host accessible as a standalone host from the vSphere Client?

No, but after a restart I can.

3) What is the status of the VMs? Are they online or orphaned?

They are showing as disconnected.

4) Have you tried to reconnect the host from vCenter? If yes, what error are you getting?

After the restart it was added back automatically.

5) Please look in the ESXi hostd.log and check whether it mentions hostd being non-responsive.

I am looking into it. Based on my environment, I think there is some issue with the DIMMs.

6) If possible, attach hostd.log, vobd.log, and vmkwarning.log.

I will send you all the VMware support logs for this host.

Sravan_k
Expert

Hi Pkmr,

Thanks for providing the information. I agree with you that there may be some issue with the DIMMs, so I will start with the chassis logs.

Also, when you launch your KVM, do you see a purple screen? I need this information to analyze your VMware logs.

Regards,

Vkmr.

Skmr
Enthusiast

I think it was purple screen

admin
Immortal

VMware ESX/ESXi host that is in a Not Responding state

ESXi

  1. Verify that the ESXi host is in a powered ON state. For more information, see Determining why an ESXi/ESX host was powered off or restarted (1019238).
  2. Verify that the ESXi host can be reconnected, or if reconnecting the ESXi host resolves the issue. For more information, see Changing an ESXi or ESX host's connection status in vCenter Server (1003480).
  3. Verify that the ESXi host is able to respond back to vCenter Server at the correct IP address. If vCenter Server does not receive heartbeats from the ESXi host, it goes into a not responding state. To verify if the correct Managed IP Address is set, see Verifying the vCenter Server Managed IP Address (1008030) and ESXi 5.0 hosts are marked as Not Responding 60 seconds after being added to vCenter Server (2020100). See also, ESXi/ESX host disconnects from vCenter Server after adding or connecting it to the inventory (204063... and ESX/ESXi host keeps disconnecting and reconnecting when heartbeats are not received by vCenter Serve....
  4. Verify that network connectivity exists from vCenter Server to the ESXi host with the IP and FQDN. For more information, see Testing network connectivity with the ping command (1003486).
  5. Verify that you can connect from vCenter Server to the ESXi host on TCP/UDP port 902. If the host was upgraded from version 2.x and you cannot connect on port 902, then verify that you can connect on port 905. For more information, see Testing port connectivity with Telnet (1003487).
  6. Verify if restarting the ESXi Management Agents resolves the issue. For more information, see Restarting the Management agents on an ESXi or ESX host (1003490).
  7. Verify if the hostd process has stopped responding on the affected ESXi host. For more information, see Troubleshooting vmware-hostd service if it fails or stops responding on an ESX/ESXi host (1002849) 
  8. Verify if the vpxa agent has stopped responding on the affected ESXi host. For more information, see Troubleshooting the vCenter Server Agent when it does not start (1006128)
  9. Verify if the ESXi host has experienced a Purple Diagnostic Screen. For more information, see Interpreting an ESX/ESXi host purple diagnostic screen (1004250) 
  10. ESXi hosts can disconnect from vCenter Server due to underlying storage issues. For more information, see Identifying Fibre Channel, iSCSI, and NFS storage issues on ESXi/ESX hosts (1003659).
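
Steps 4 and 5 above (name resolution and port 902) can be checked from the vCenter Server machine with a few lines of Python. This is a minimal sketch using only the standard library. It covers the TCP side only, so it says nothing about UDP traffic on port 902, and the function names are illustrative.

```python
import socket


def resolve_host(name):
    """Return the sorted IP addresses that name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def tcp_port_open(host, port=902, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `resolve_host("esxi01.example.com")` followed by `tcp_port_open("esxi01.example.com", 902)` tests a host by FQDN; the host name here is a placeholder. Repeating the call with the IP address separates DNS problems from network problems.
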
Sravan_k
Expert

Cool, thanks for the quick update. So far I haven't found any issues with the DIMMs. Let me dig into it more.

TomHowarth
Leadership

Can you follow the process in the following KB and extract the details of the PSOD?

VMware Knowledge Base

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410
Sravan_k
Expert

Hi pkmr,

I have finished analyzing your PSOD issue. Please follow this KB for the resolution: https://kb.vmware.com/s/article/2140848?language=en_US

Solution: please upgrade to the latest update of ESXi 6.0, or at least to ESXi 6.0 U2, to fix this bug.

Reason for issue:

The PCPU becomes too busy logging all the correctable error messages to perform routine background tasks, which leads ESXi to assume that the PCPU is unresponsive. This finally causes a purple diagnostic screen, and the host shows as not responding in vCenter.

Please let me know if you have any questions.
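
To get a rough idea of whether a host is being flooded with correctable error messages, the matching log lines can be counted per minute. This is a minimal sketch under two assumptions that should be checked against the actual log: that each line starts with an ISO 8601 timestamp such as `2016-01-01T10:00:01.000Z`, and that the messages of interest contain the word "correctable". The function names are illustrative.

```python
import re
from collections import Counter

# Assumed marker for the messages of interest. The word boundary keeps
# "uncorrectable" from being counted as a match.
NEEDLE = re.compile(r"\bcorrectable\b", re.IGNORECASE)


def count_per_minute(lines, needle=NEEDLE):
    """Count matching lines, grouped by the minute in their timestamp."""
    counts = Counter()
    for line in lines:
        # Assumes the line starts with YYYY-MM-DDTHH:MM (the first 16 characters).
        if len(line) >= 16 and needle.search(line):
            counts[line[:16]] += 1
    return counts


def busiest_minutes(counts, top=5):
    """Return the top (minute, count) pairs, highest count first."""
    return counts.most_common(top)
```

Feeding the lines of a log copied off the host into `count_per_minute` and then calling `busiest_minutes` on the result shows which minutes carried the most matching messages. Comparing those figures between the affected host and a healthy one gives a sense of scale.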

Skmr
Enthusiast

Hi Vkmr,

I saw your message and it was clear to me. Thanks for taking the time to look at my issue.

Do you know why it [the bug] happened only on this host and not on the others, even though a few other hosts are running the same ESXi 6.0 U1 version?

Thank you

pkmr.
