VMware Cloud Community
MBrownHenn
Contributor

Random VM Lockups.

Random VM lockups. Do screen savers, power settings and Wake on LAN matter to a VM? Give a solution and I will give you points! Confirm one of my proposed solutions and I will give you points!

Environment:

ESX hosts: 3.0.2 build 89841

VIS/VIC: 2.0.2 build 75762 (Update 3)

VMTools: build 86539 on all VM's

VM OS: Windows 2003 Enterprise Edition and Standard Edition, SP1 and SP2.

I have had several support tickets with VMware over the past 18 months on this issue. So far we are still suffering with Random VM Lockups.

What happens:

At 8:00 PM, but not always on the same day of the week, a select few VM's lock up and stop responding to anything EXCEPT a ping. The console is non-responsive and no management or RDP services work to connect to the VM.
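The symptom described above (ping answers, RDP does not) can be checked from a monitoring box. The following is only a minimal sketch, not something from the support tickets: the function names are hypothetical, port 3389 is the default RDP port, and the ping flags assume the stock Windows or Linux `ping` command.

```python
import socket
import subprocess
import sys


def tcp_port_open(host, port=3389, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def ping_ok(host, timeout_s=2):
    """Return True if one ICMP echo gets a reply (uses the system ping command)."""
    if sys.platform.startswith("win"):
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def classify(ping_alive, rdp_alive):
    """Map the two probe results onto the states discussed in this thread."""
    if ping_alive and rdp_alive:
        return "healthy"
    if ping_alive and not rdp_alive:
        return "suspected lockup"  # answers ping only, like the affected VM's
    if not ping_alive and not rdp_alive:
        return "unreachable"
    return "icmp blocked"  # RDP works but ping does not


def check(host):
    """Probe one VM by name or IP and return its state."""
    return classify(ping_ok(host), tcp_port_open(host))
```

Running `check()` against each affected VM every few minutes around 8:00 PM would at least timestamp when a lockup starts, which the guest logs apparently do not.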

What we know and have done so far :

  • We have been through 3 versions of ESX and 4 to 5 versions of the VMTools, and applied 3 updates to VirtualCenter, all with no effect. This is not happening on all VM's, only a select few. The behavior follows the VM's, not the hosts. Significant changes and updates have occurred in the VMware ESX/VirtualCenter Infrastructure with no change to this VM lockup issue.

  • So far there has been no action taken to change the OS configuration of the affected VM's. We are working on a solution, but nothing has been applied yet. See below for proposed OS changes. The template was updated and VM's built from the newer templates do not suffer this ill.

  • When a VM locks up, it happens at 8:00 PM - every time.

  • NetBackup is not installed on all the affected VM's

  • The VM's do not do SAV scans at 8:00 PM consistently

  • The VM's do not have Defrag scheduled at 8:00 PM

  • Local logs on the VM reveal nothing

  • ESX and VC logs show no unusual activity

  • It can happen just about any day/night of the week, but most frequently on Fridays and Sundays (?).

  • It happens on the "older" mammal servers, specifically those built from the Q107 template. The VM Song is also affected; Song was built in Jan 07, which is close to when all the other VM's that are suffering this problem were built. This is consistent with a configuration problem either in the VMTools or the OS on the Q1 2007 template.
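The day-of-week and time-of-day pattern in the list above could be confirmed by tallying a record of lockup times. This is a rough sketch only: the `tally` helper is hypothetical, and the timestamps in the usage example are made-up placeholders, not real incident data from this environment.

```python
from collections import Counter
from datetime import datetime

# Fixed English names so the output does not depend on the system locale.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def tally(timestamps):
    """Count lockups per weekday and per hour.

    timestamps: iterable of strings formatted as 'YYYY-MM-DD HH:MM'.
    Returns (by_day, by_hour) as two Counter objects.
    """
    by_day = Counter()
    by_hour = Counter()
    for text in timestamps:
        moment = datetime.strptime(text, "%Y-%m-%d %H:%M")
        by_day[DAYS[moment.weekday()]] += 1
        by_hour[moment.hour] += 1
    return by_day, by_hour


if __name__ == "__main__":
    # Placeholder values for illustration only; replace with the real incident log.
    sample = ["2008-06-06 20:00", "2008-06-08 20:00", "2008-06-13 20:00"]
    by_day, by_hour = tally(sample)
    print(by_day.most_common())
    print(by_hour.most_common())
```

If every entry lands in hour 20 but the weekdays spread out, that points toward a daily scheduled job that only sometimes triggers the hang, rather than a weekly one.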

Proposed Actions/Solutions

  • Apply the latest ESX host patches on 6/28. New patches were released on 6/3 and one on 6/13. However, none of them address this issue specifically. The patches are still needed due to security reasons and critical configuration updates.

  • Apply Update 4 to VirtualCenter on 6/28. Again, there is nothing specific in VC Update 4 that would address this, but it is recommended.

  • VMware Recommended - Configure VM's with no screen saver and no power settings (no reboot required). This will eliminate a possible cause. Jude, Jerry, Bob and I are working on a Group Policy solution for this. The VM's that are locking up are set to have a screen saver kick in after 10 minutes, password protect it, turn off the monitor after 20 minutes and "power off" the server when the power button is pressed. These settings should all be null on a VM. The VM's built from newer templates do not have these settings configured and have not suffered the lock-up problem. But why at 8:00 PM only? My hypothesis is that there is some unknown activity (not SAV, not NetBackup, not Defrag) that kicks off and conflicts with these power/screen saver settings at 8:00 PM.

  • VMware Recommended - Configure the VM properties settings to Wake on LAN (requires shutdown and restart of the VM) and select the network adapter to use in the VM properties Options Tab under Power Management. This setting is also consistently "unchecked" on all affected VM's. If the VM OS goes into standby mode (why it would I have no idea!) then VMware cannot wake it up unless this is set correctly.

  • ExpertsExchange Recommended - Uninstall the VMTools and then reinstall from scratch (requires three (3) VM reboots and significant manual intervention). The sledge hammer approach. The problem with this is that we never really know what the cause was in VMTools (if it helps at all) and if we are changing a setting with a rip-n-reinstall or if we are actually solving a problem with the VMTools install files.

  • Rebuild all the affected VM's from newer VM templates and have the applications reinstalled.

  • Continue to do nothing on the affected VM's. So far this option has not brought about the desired changes.
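Before rolling out the Group Policy change for the screen saver item above, it may help to audit which VM's still carry the old settings. The sketch below is an assumption-laden illustration: it reads the per-user values under `HKEY_CURRENT_USER\Control Panel\Desktop` (`ScreenSaveActive`, `ScreenSaverIsSecure`, `ScreenSaveTimeOut`), needs a Python interpreter on the guest, and the helper names and the `DESIRED` table are hypothetical. It does not cover the monitor power-off or power button settings.

```python
# Target state assumed from the recommendation above: no screen saver, no password lock.
DESIRED = {"ScreenSaveActive": "0", "ScreenSaverIsSecure": "0"}


def find_violations(settings):
    """Return a list of messages for values that differ from DESIRED.

    settings: dict of registry value name -> string value.
    Values that are absent are treated as not configured and are not reported.
    """
    problems = []
    for name, wanted in DESIRED.items():
        actual = settings.get(name)
        if actual is not None and actual != wanted:
            problems.append(f"{name} is {actual!r}, expected {wanted!r}")
    return problems


def read_screensaver_settings():
    """Read the current user's screen saver values (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms

    values = {}
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r"Control Panel\Desktop") as key:
        for name in ("ScreenSaveActive", "ScreenSaverIsSecure", "ScreenSaveTimeOut"):
            try:
                values[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value not set for this user
    return values
```

On a guest, `find_violations(read_screensaver_settings())` returning an empty list would suggest the policy has taken effect for the logged-on account. Settings pushed by Group Policy may live under a different key, so this is a spot check rather than proof.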

Thanks.

Michael T. Brown

Senior Systems Engineer, MCSE

Hennepin County EITS

Direct: 612.596.9674

Mobile: 612.919-4859

2 Replies
Troy_Clavell
Immortal

We had these issues on our XP devices, and what solved it for us was to disable all the power options.

See this thread:

http://communities.vmware.com/thread/132522?tstart=0

bradley4681
Expert

I normally disable all the power options, screen savers, and unneeded services in all my VM's. I try to go with a less-is-more strategy, i.e. the less crap I have in there, the more stable it is.

Cheers,

Bradley Sessions

If you found this or other information useful, please consider awarding points for "Correct" or "Helpful".
