Since we started using ESXi 3.5 U4 with all patches (HP version of U3, then updated), the ESX host has frozen three times.
The ESX host still answers pings and offers a login at the console, but after you enter your login data it does not respond with a new screen. The VMs running on that host sometimes still respond to pings. HA does not detect the situation because the ESX host still responds to pings. The only thing we can do is power off the server and reboot.
Has anyone else had this problem? We had the issue on two different hosts (DL380 and DL580).
We found that the console does not get enough CPU. Although overall CPU usage on the ESX host was under 30%, the service console response times were very bad on a new server, and we expected another freeze shortly.
We were able to solve the problem by raising the CPU reservation for the console to 500 MHz. At the moment the servers look good. We are crossing our fingers and hope VMware finds the cause.
I've seen this same issue on HP BL460c systems using HP ESXi 3.5 U4 on USB drives. Are you using USB drives or SAS/SCSI drives? I suspected the USB drives were still faulty despite having been replaced by HP. Looking at the performance data, the server seems to be OK with the current reservations; however, I'm going to increase them a bit to see what happens.
As for the console, logging in at the F2 yellow/black screen freezes; however, I can use ALT+F1 to get into unsupported mode and work with the hypervisor without issue. Typically, the VM guests go down and the host is disconnected from vCenter, yet the host is pingable and somewhat usable via the CLI.
We were able to regain stability by rebuilding our VMware ESXi hosts with the 3.5.0 U4 image directly from HP. Thereafter, we applied the 10-Apr and 29-Apr patches via VMware Update Manager. We also moved to SAS drives rather than USB drives. Finally, at this time our VMware ESXi environment is not monitored by HP SIM, because it was suspected that HP SIM was faulting the ESXi kernel when communicating with the host agents. We'll be adding the environment back to HP SIM after each host is rebuilt. I'll be certain to post any other details when they become available.
Additionally, we received an indication from VMware support today that these errors, and quite possibly the instability, are due to an issue with CIM. VMware provided steps to disable CIM, and thus far the errors have not returned. We'll continue to monitor the stability on the BL460c G1s.
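
Since a host in this frozen state still answers pings, a check run from a separate monitoring machine that waits for an actual reply from a TCP service may flag the hang where a plain ping check does not. What follows is only a rough sketch in Python, not anything provided by VMware or HP. The function names are hypothetical, and it assumes the service you point it at sends some greeting right after the connection is opened, which you would need to verify for your own hosts.

```python
import socket


def service_answers(host, port, timeout=5.0):
    """Return True only if the service sends at least one byte after connecting.

    A hung host may still complete the TCP handshake, so a successful
    connect alone is not treated as proof of life here.
    Assumption: the service greets the client first (banner-style protocol).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            return len(conn.recv(1)) > 0
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def classify_host(ping_ok, service_ok):
    """Combine a ping result with a service check into a coarse state."""
    if ping_ok and service_ok:
        return "healthy"
    if ping_ok:
        # Pingable but silent: the situation described in this thread.
        return "hung"
    return "down"
```

For example, `classify_host(True, service_answers("esx01.example.com", 902))` would report "hung" for a host that pings but never sends a greeting. The host name and port 902 are placeholders chosen for illustration, so substitute a service that is known to answer on your hosts.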