VMware Cloud Community
inck
Contributor

ESXi host not responding

Hello everyone!

I have several ESXi hosts and one vCenter appliance, all running version 6.5. All hosts are in a DRS cluster with automatic migration disabled; when it was enabled, the situation described below happened more frequently.

There is a huge problem in this environment: the hosts hardly ever stay in the Connected state. Most of the time they are in the Not responding state.

During this time all VMs work properly, and I can log in to the ESXi SSH shell and to vCenter itself.

I _cannot_ log in to:

  1. DCUI - it hangs on the password screen
  2. WebUI - it either hangs on a blank blue screen or says "Connection to ESXi host timed out"

Every esxcli command hangs, and so does the df -h command.

I cannot restart management services.

If I reboot the system, it comes back online and works for some time, but then the problem happens again.

Sometimes the hosts flap and are shown as Connected in vCenter for several minutes.


Accepted Solutions
ujjwal2018
VMware Employee

Hello,

Please check vobd.log and make sure you are not seeing any vmnic flapping or APD issues.

grep -i vmnic /var/log/vobd.log

grep -i "All Paths Down" /var/log/vobd.log

grep -i "All Paths Down" /var/log/vmkernel.log

Please share the output of the above, if possible.
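
If the raw output is too long to post, the three checks above could be condensed into per-file match counts first. The following is only a minimal sketch: the function names `count_matches` and `summarize_logs` are made up for illustration, the search strings are simply the ones used in the grep commands above, and the exact wording of the log messages may differ between ESXi builds.

```shell
#!/bin/sh
# Count case-insensitive matches of a pattern in a log file.
# Prints 0 if the file is missing or unreadable.
# (count_matches is a hypothetical helper, not part of ESXi.)
count_matches() {
    pattern=$1
    logfile=$2
    if [ -r "$logfile" ]; then
        grep -ci "$pattern" "$logfile"
    else
        echo 0
    fi
}

# Print one summary line per check. The log directory defaults to
# /var/log, the location used in the commands above.
summarize_logs() {
    logdir=${1:-/var/log}
    echo "vobd.log vmnic events: $(count_matches 'vmnic' "$logdir/vobd.log")"
    echo "vobd.log APD events: $(count_matches 'All Paths Down' "$logdir/vobd.log")"
    echo "vmkernel.log APD events: $(count_matches 'All Paths Down' "$logdir/vmkernel.log")"
}

summarize_logs "$@"
```

A non-zero count only says that matching lines exist; the matching lines themselves still need to be read to tell a brief link flap from a sustained APD condition.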

You can also refer to the KB articles below:

1. VMware ESX/ESXi 4.x/5.x and 6.x hosts in All-Paths-Down (APD) condition may appear as Not Responding in VMware vCenter Server - https://kb.vmware.com/s/article/1030980

2. Permanent Device Loss (PDL) and All-Paths-Down (APD) in vSphere 5.x and 6.x - https://kb.vmware.com/s/article/2004684 

Regards,

UJ

Deso1ator
Enthusiast

Hosts in the Not responding state usually point to a storage-related issue. How is the performance of your SAN? Are you running the latest recommended firmware and drivers for your hosts?

sk84
Expert

I agree with the previous answers. Usually a storage problem is the cause of unresponsive shells and UIs. Maybe there is an APD or PDL event in the logs. Otherwise I would check the IOPS load on the storage.

Best regards,

Sebastian

inck
Contributor

Hello,

Thank you for your answer. I have checked vobd.log, and it shows some problems that I'll look into. vmkernel.log does not.

But here are some "new" facts: I have blocked all network access from the vCenter server to the ESXi hosts, and now all hosts are 'up' and reachable via the WebUI and every other interface, and all esxcli commands and df -h run normally. How can this be related to the issue?

Thanks.

rullybikeco
Contributor

Hi inck,

I'm having the same problem as you. Could you share the steps you took to resolve it?

Thank you

Rully
