VMware Cloud Community
gcsc
Enthusiast

ESXi 5.5 DCUI unresponsive

I have two Dell M620 blade servers running ESXi 5.5, both connected to vCenter. Last night I tried to connect to the DCUI on host 2 through the DRAC, but the keyboard didn't work (neither F2 nor F12). At that point I assumed it was a problem with the DRAC, so I went down and plugged a keyboard directly into the host, but the same thing happened. I then decided to put the host into maintenance mode and reboot it, but that didn't help either. The host came back out of maintenance mode, I was able to vMotion VMs back to it, and everything appears to be working fine. There are no errors or alerts in the vSphere Client, and I'm stumped as to which logs to check or what to look for. There's nothing on Google, so it's either rare or stupidly obvious. Has anyone else experienced this before?


Accepted Solutions
gcsc
Enthusiast

I reverted the bootbank from 5.5.0-2.33.2068190 to 5.50-2068190, which didn't work, and I also couldn't get the host back into the cluster, so I ended up reinstalling and reconfiguring it.

SureshKumarMuth
Commander

Hi,

If the issue is caused by the OS, there should be entries in the host logs or some other indication of why the host froze. If the host stopped responding suddenly and you can't find anything in the logs, I would recommend checking the hardware for problems. Run a CPU- or memory-intensive diagnostic test on the host hardware and update the BIOS to the latest version.

Regards,

Suresh

vNEX
Expert

Welcome to the community,

As a workaround, you can try connecting via SSH and running the DCUI from that session.

Just type dcui followed by Enter. To quit back to the console, press Ctrl+C.

If this works, I would suspect an issue with the blade enclosure. Do you see this issue on the other blade server as well?


_________________________________________________________________________________________ If you found this or any other answer helpful, please consider to award points. (use Correct or Helpful buttons) Regards, P.
gcsc
Enthusiast

The first host is working fine, it's only the second host that has the problem. Last time it was known to be working was about 3 months ago when updates were run.

I was able to make the change I needed after finding this

http://www.virtuallyghetto.com/2011/07/how-to-add-splash-of-remote-color-to.html

However, I'm thinking this could be part of a bigger problem, so I'm trawling through the host logs at https://host-ip/host, but unfortunately they don't make a whole lot of sense to me.

vNEX
Expert

OK, can you please post vmkernel.log for analysis? I would also be interested in the esxtop CPU stats output.

This behavior could also be caused by faulty hardware or by a driver/firmware issue.

In the meantime, verify your firmware versions and update to the latest if possible.

The simplest thing to try is to revert to the previous state by booting the host from the alternate bootbank (press Shift+R during the ESXi boot sequence) and see whether anything changes. That would show whether the later updates introduced some inconsistency.

gcsc
Enthusiast

The blade and blade chassis firmware was updated when ESXi on the hosts was updated, around November last year.

I've attached the esxtop CPU output and the vmkernel log. I renamed the servers to VMserver.

The esxtop CPU output is showing high CPU usage for dcui.213978.

Before you ask: no, there aren't 2 keyboards connected, lol.

DCUI – High CPU usage on ESXi | Tom Fojta's Blog
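
For anyone who wants to pick the busy world out of a saved esxtop capture rather than reading it by eye, here is a minimal sketch. It assumes the capture was taken in batch mode (for example esxtop -b -n 1 redirected to a file) and that the CSV header names the per-group CPU columns in the form \\host\Group Cpu(ID:name)\% Used. The function name busiest_groups and the 50% threshold are made up for illustration, and the script is meant to be run with Python 3 on a workstation against the copied file.

```python
import csv
import io
import re

# Assumed column naming in esxtop batch output:
#   \\<host>\Group Cpu(<id>:<name>)\% Used
GROUP_CPU_USED = re.compile(r"\\Group Cpu\((\d+):([^)]*)\)\\% Used$")


def busiest_groups(csv_text, threshold=50.0):
    """Return (group name, % Used) pairs at or above threshold.

    Only the last sample row of the capture is inspected.
    The result is sorted from busiest to least busy.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        return []  # header only, or empty capture
    header, sample = rows[0], rows[-1]
    hits = []
    for column, value in zip(header, sample):
        match = GROUP_CPU_USED.search(column)
        if not match:
            continue  # not a per-group "% Used" column
        try:
            used = float(value)
        except ValueError:
            continue  # skip blank or non-numeric cells
        if used >= threshold:
            hits.append((match.group(2), used))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Calling busiest_groups(open("esxtop.csv").read()) on such a capture (esxtop.csv being a hypothetical file name) should list a runaway dcui world near the top if it is the one consuming CPU. The threshold is arbitrary and may need adjusting, since % Used for a group can exceed 100 on multi-core hosts.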

vNEX
Expert

Regarding the DCUI CPU usage, have a look at this KB:

VMware KB: ESXi 5.0 fails to respond or is disconnected from vCenter Server because of the locked es...

In the meantime I will try to dive into the log.

gcsc
Enthusiast

Thanks for the help :)

The esx.conf.LOCK file doesn't exist. I assume this means there's no lock on the esx.conf file?

vNEX
Expert

That means we are out of luck... ;-)

Unfortunately I'm not at the office right now to inspect your logs. What about rolling back to the previous image using the alternate bootbank?

gcsc
Enthusiast

It's on my to-do list. I'll let you know how it goes.

gcsc
Enthusiast

When I revert to the alt bootbank image, does it become the primary bootbank? Do I lose the current primary bootbank?

vNEX
Expert

If I am not mistaken, when you revert, the current primary bootbank simply becomes the secondary one, so you can revert to it again.

Anyway, before such an operation you can back up the host configuration. For more on that, see:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=204214...

gcsc
Enthusiast

Is there any way to check what's in the alt bootbank, such as the ESXi version/build or even when it was created?

vNEX
Expert

When you hit Shift+R during boot you will enter the Hypervisor Recovery screen. Under Installed Hypervisors you will see the current (default) build and the build of the alternate image.
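
It may also be possible to check this without rebooting. The sketch below assumes each bootbank holds a boot.cfg file made of key=value lines, including a build= line, and that the two files are readable at /bootbank/boot.cfg and /altbootbank/boot.cfg (or have been copied somewhere else). The helper names parse_boot_cfg and describe_bootbank are made up for illustration, and which keys are present may differ between ESXi versions.

```python
import os


def parse_boot_cfg(text):
    """Parse boot.cfg-style text (key=value lines) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        # Split on the first "=" only, since values may contain "=" too.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def describe_bootbank(path):
    """Return the fields of interest from one boot.cfg file."""
    with open(path) as handle:
        cfg = parse_boot_cfg(handle.read())
    # Keys that are absent come back as None rather than raising.
    return {key: cfg.get(key) for key in ("build", "updated", "bootstate")}


if __name__ == "__main__":
    # Assumed locations; adjust if the files were copied off the host.
    for path in ("/bootbank/boot.cfg", "/altbootbank/boot.cfg"):
        if os.path.exists(path):
            print("%s %s" % (path, describe_bootbank(path)))
        else:
            print("%s not found" % path)
```

Comparing the build values of the two files should show which image a Shift+R rollback would bring back. The sketch only reads the files and changes nothing on the host.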
