spex
Expert

ESXi freezing with U4 on HP server

Since moving to ESXi 3.5 U4 with all patches (installed from the HP U3 image, then updated), our ESX hosts have frozen three times.

The ESX host still answers pings and offers a login at the console, but after you enter login data it never responds with a new screen. The VMs running on that host sometimes still respond to pings. HA does not detect the situation because the ESX host still responds to pings. The only thing we can do is power off the server and reboot.

Has anyone else had this problem? We have seen the issue on two different hosts (DL380 and DL580).

Regards

Spex

10 Replies
Dave_Mishchenko
Immortal

At the console, can you press ALT+F1 and ALT+F12 to see if there are any errors there?

spex
Expert

We were not able to switch screens. I have already rebooted the ESX server and am gathering diagnostic data for support.

Regards

Spex

spex
Expert

We found that the console does not get enough CPU. Although overall CPU usage on the ESX host was under 30%, service console response times were very bad on a new server, and we expected another freeze shortly.

We were able to work around the problem by raising the CPU reservation for the console to 500 MHz. At the moment the server looks good. We are crossing our fingers and hope VMware finds the cause.

Regards Spex

donnieq
Enthusiast

I've seen this same issue on HP BL460c systems using HP ESXi 3.5 U4 on USB drives. Are you using USB drives or SAS/SCSI drives? I suspected the USB drives were still faulty despite having been replaced by HP. Looking at the performance data, the server seems to be OK with the current reservations; however, I'm going to increase them a bit to see what happens.

As for the console, login to the F2 Yellow/Black screen freezes; however, I can use ALT+F1 and get into unsupported mode and work with the hypervisor without issue. Typically, the VM guests go down, the host is disconnected from vCenter, yet the host is pingable and somewhat usable via the CLI.
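
Because a frozen host in this state still answers pings, a ping-only monitor will not flag it. Below is a minimal sketch, not a tested monitoring solution, of a check that separates "pingable" from "management interface reachable". It assumes the host's management service listens on TCP port 443; the function names `tcp_probe` and `classify_host` are made up for illustration. Note that a completed TCP connection only shows that the port accepts connections; a hung agent might still accept them, so a stricter check would need to send a real request.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_host(ping_ok: bool, mgmt_ok: bool) -> str:
    """Combine an ICMP result with a management-port result (hypothetical labels)."""
    if ping_ok and mgmt_ok:
        return "healthy"
    if ping_ok:
        # The state described in this thread: pings answered, host otherwise stuck.
        return "pingable but management unresponsive"
    return "down"
```

For example, `classify_host(ping_ok=True, mgmt_ok=tcp_probe("esx01.example.com", 443))` would report the second state for a host that answers pings but refuses or times out on the management port (`esx01.example.com` is a placeholder hostname).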

spex
Expert

This weekend two ESX servers crashed again. ESX and the VMs were not responding, but ESX was still pingable...

We are going back to an older software level.

Regards

Spex

spex
Expert

We use SCSI drives for ESXi.

I wonder why more people are not having problems with u - what is special about our configuration?

Regards

Spex

spex
Expert

Since downgrading to U3 we have had no more problems. Our priority 1 ticket with VMware has not solved the problem so far...

Regards

Spex

donnieq
Enthusiast

We were able to regain stability by rebuilding our VMware ESXi hosts with the 3.5.0 U4 image directly from HP: . Thereafter, we applied the 10-Apr and 29-Apr patches via VMware Update Manager. We also moved to SAS drives rather than USB drives. Finally, at this time our VMware ESXi environment is not monitored by HP SIM; it was suspected that HP SIM was faulting the ESXi kernel when communicating with the host agents. We'll be adding the environment back to HP SIM after each host is rebuilt. I'll be certain to post any other details when they become available.

Don

Mikeluff
Contributor

I have the same issue on all my DL385 G2 and G5 hardware. I will try rebuilding as you describe.

donnieq
Enthusiast

Additionally, we received an indication from VMware support today that these errors, and quite possibly the instability, are due to an issue with CIM. VMware provided steps to disable CIM, and thus far the errors have not returned. We'll continue to monitor the stability on the BL460c G1s.

Don
