VMware

5 Replies Last post: Feb 16, 2009 8:10 AM by Texiwill  

ESX 3.5 Frozen posted: Feb 15, 2009 3:47 AM

wolf (Enthusiast, 43 posts since Jan 27, 2005)
Hi,

we are experiencing this issue on an IBM x440.

  • ESX 3.5 latest patch (up to end of Jan)
  • Host free (no vms running)

After some hours the machine freezes (the physical console still shows the usual Alt+F1 message, but nothing responds) and the only way to get it back is to power it off.
There are no messages in /var/log/messages or /var/log/vmkernel: just a hole from the time of the freeze to the next reboot.

Any suggestions on where to look? Do you think it is a hardware issue?

We found an "SMI Hdlr - 00150900 SERR / PERR on PCI Bus" entry around the time it froze.

Thanks
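
When the logs simply go silent like this, the freeze time can be narrowed down by finding the widest gap between consecutive log timestamps. Below is a minimal sketch, not an official tool. It assumes each line starts with a syslog-style stamp such as "Feb 15 03:47:01", that month names are in English, and that the year is supplied by hand because these stamps do not carry one. The function names are made up for illustration, and the script is meant to be run with Python 3 on a copy of the log files.

```python
from datetime import datetime


def parse_syslog_time(line, year):
    """Parse a leading 'Feb 15 03:47:01' stamp; return None if the line has none."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gap(lines, year=2009):
    """Return (gap, last_stamp_before, first_stamp_after) for the widest silence.

    Returns None when fewer than two timestamped lines are found.
    """
    best = None
    prev = None
    for line in lines:
        stamp = parse_syslog_time(line, year)
        if stamp is None:
            continue  # skip lines without a recognizable timestamp
        if prev is not None:
            gap = stamp - prev
            if best is None or gap > best[0]:
                best = (gap, prev, stamp)
        prev = stamp
    return best
```

To use it, open a copy of the log and pass the file object in, for example largest_gap(open("vmkernel")). The second element of the result is the last entry written before the host went quiet, which is the time to compare against any hardware event log.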

Re: ESX 3.5 Frozen

1. Feb 15, 2009 4:01 AM in response to: wolf
Lightbulb (Virtuoso, 1,408 posts since Aug 15, 2008)
Hardware would be a good place to start. When systems go down (or lock up) without any logged cause, I usually start swapping out components. First, though, check that no recent changes have been made, e.g. patching or a reconfiguration of the system.

I do not have recent experience with IBM servers. You should check the support page for this series and see if there are any KB articles that might be related to this issue. Also, if need be, engage the vendor.

Re: ESX 3.5 Frozen

2. Feb 15, 2009 5:45 AM in response to: wolf
benma (Hot Shot, 199 posts since Jul 2, 2008)

Have you installed the IBM Director Agent?

If yes, check your server for hardware errors. There are no VMs running on your host, so you can also run a memtest.
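
If the hardware event log (or any other log) can be exported as plain text, a quick filter for bus and machine-check keywords helps spot entries like the SERR / PERR one quoted in the original post. This is only a rough sketch: apart from SERR and PERR, the keyword list is an assumption and should be adjusted for the hardware in question, and the function name is made up for illustration.

```python
import re

# SERR and PERR come from the original post; the other keywords are
# assumptions and should be extended or trimmed for your hardware.
PATTERN = re.compile(r"\b(SERR|PERR|NMI|machine check|MCE)\b", re.IGNORECASE)


def hardware_hints(lines):
    """Yield (line_number, text) for every line that mentions a suspect keyword."""
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            yield number, line.rstrip("\n")
```

Calling list(hardware_hints(open("eventlog.txt"))) on an exported log (the file name here is hypothetical) returns the matching lines with their line numbers, which can then be lined up against the time of the freeze.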

Re: ESX 3.5 Frozen

4. Feb 15, 2009 12:56 PM in response to: wolf
Lightbulb (Virtuoso, 1,408 posts since Aug 15, 2008)
So do I understand correctly?

1. You have one ESX host affected by this lockup issue and one not?

2. Both were recently patched to the same patch level and both displayed unusual behavior?

As for the missing Health status, that might be resolved by removing the ESX host from VC and adding it back.

You may want to give VMware a call to see if there are any known issues with your model and the patches you recently applied.

Re: ESX 3.5 Frozen

5. Feb 16, 2009 8:10 AM in response to: Lightbulb
Texiwill (Guru, User Moderators, vExpert, 10,432 posts since Jan 13, 2004)
Hello,

Some of the updates actually require recent firmware, etc. or they could have uncovered issues with your hardware. Yes, it can happen. To that end first vet the hardware.

0) Open up the box and look for obvious issues (I had a broken heatsink one time), reseat cards and memory.
1) Update Firmware/BIOS on the hosts
2) Verify BIOS has appropriate settings for ESX
3) Run memtest86+ for at least 24-48 hours
4) Run Vendor supplied hardware diags for at least 24-48 hours, ignore tests for disk resets if possible

If all that pans out and you did upgrade firmware/BIOS, reboot the hosts into ESX and see if things work once more.

If not, contact your hardware vendor as well as your VMware Support Rep, as it sounds like you have other issues going on. Also, keep a terminal/monitor on the box that does not time out, so that it captures any crashes, etc.


Best regards,
Edward L. Haletky
VMware Communities User Moderator
====
Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.
Blue Gears and SearchVMware Pro Blogs -- Top Virtualization Security Links -- Virtualization Security Round Table Podcast
