VMware Cloud Community
Stewd
Contributor

Windows 2008 R2 Randomly Locking Up (ESXi 4.0 Update 1)

I am having issues with Windows 2008 R2 locking up randomly, even after applying patches (including Update 1). I have verified with the vSphere Host Update Utility that the host patches are up to date as of Feb. 19, 2010, and I updated VMware Tools on the Windows 2008 R2 server after applying them. I can still ping the server but, as other people who have reported the issue describe, the VMware console stops responding and I can't connect with Remote Desktop. I have also tried to restart the server remotely with the "shutdown" command, but that doesn't work either.
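For reference, a remote restart with the built-in Windows "shutdown" command is usually issued along the lines sketched below. The post does not say which options were used, so the host name EXCH01 and the choice of switches (/r restart, /m target machine, /t delay, /f force-close applications) are assumptions. The sketch only builds the command line; it does not run it.

```python
from typing import List


def remote_restart_command(host: str, delay_seconds: int = 0, force: bool = True) -> List[str]:
    """Build the argument list for restarting a remote Windows machine.

    `host` is a plain machine name (hypothetical example: "EXCH01");
    the leading backslashes required by /m are added here.
    """
    if not host or host.startswith("\\"):
        raise ValueError("pass the bare host name, without leading backslashes")
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")

    # shutdown /r /m \\HOST /t N  -> restart HOST after N seconds
    command = ["shutdown", "/r", "/m", "\\\\" + host, "/t", str(delay_seconds)]
    if force:
        # /f closes running applications without warning users
        command.append("/f")
    return command
```

On an administrative Windows workstation the resulting list could be handed to subprocess.run. If the guest is frozen in the way described here, the command would be expected to fail or time out, which matches what I saw.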

Thanks in advance

Stew Davis

32 Replies
bcoover
Contributor

I am having the same problem, and this is bordering on ridiculous. I have numerous 2008 R2 VMs that freeze/hang/lockup randomly and do so with 100% CPU utilization. If you look at the console (I try to leave task manager running) there is no apparent process culprit, and it freezes hard, i.e., the video literally freezes. Strangely, this only occurs on the boxes running Exchange 2010. I have tried removing almost everything from the machines except for Exchange and none of that makes any difference. The only way to recover them is to do a hard reset of the VM. Sometimes the VMs freeze a few minutes after reboot, sometimes it takes a couple of days.
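Because the freeze leaves nothing on screen to inspect and the VM has to be hard reset, one option is to write periodic samples to disk so that the last one before the hang survives. The following is only a sketch under assumptions: the log path, the function names, and the idea of the caller supplying the sample text (for example, captured output of the Windows tasklist command) are all hypothetical and not something described in this thread.

```python
import os
from datetime import datetime, timezone


def append_sample(log_path: str, sample: str) -> None:
    """Append one timestamped sample and force it to disk.

    flush() plus os.fsync() asks the OS to write the data out immediately,
    so the line should still be there after a hard reset of the VM.
    """
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{sample}\n")
        log.flush()
        os.fsync(log.fileno())


def last_sample(log_path: str) -> str:
    """Return the most recent line in the log, or '' if the log is empty."""
    with open(log_path, encoding="utf-8") as log:
        lines = log.read().splitlines()
    return lines[-1] if lines else ""
```

Calling append_sample from a scheduled task every few seconds, then reading last_sample after the reset, would show what was recorded shortly before the hang. Whether the culprit shows up in such samples at all is an open question.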

I've tried every remedy I could find, and none of them helps:

Fix/remove SVGA driver

Change hard disk type

Change network adapters

Remove VMware Tools

Remove all AV software

Reload OS

Reload Exchange 2010 (not running UM role)

Tried 1 and 2 vCPU configurations

My setup is an HP DL380 G5 with Intel X5460 processors (8 cores), and the physical hosts are very underutilized. This is extremely frustrating and is making me consider switching to Hyper-V. Please help!!!

iLikeMoney
Enthusiast

Just curious, are you using version 4 hardware or 7? I'm going to be rolling one of these R2s out soon myself and have been using version 4 on all of my templates.

tWiZzLeR
Enthusiast

It's a video driver issue, please check out this post: http://communities.vmware.com/message/1458813#1458813

Per that post, you must manually browse to the "C:\Program Files\Common Files\VMware\Drivers\wddm_video" folder to install the WDDM driver. If you only browse to the "C:\Program Files\Common Files\VMware\Drivers" folder, as suggested in the KB article, then when Windows searches for a driver it will find the incorrect VMware SVGA II driver in the "C:\Program Files\Common Files\VMware\Drivers\video" folder and install it. My guess is that this happens because v comes before w alphabetically and Windows installs the first driver that it finds.

Again, you must manually browse to "C:\Program Files\Common Files\VMware\Drivers\wddm_video" to install the WDDM driver in Windows 2008 R2.

So far I have not experienced the lockup issue after performing both of the steps above.
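To make the path point concrete, here is a small sketch. The folder names come from the post above; the function names are made up for illustration, and the alphabetical-order behaviour is only the poster's guess, not documented Windows driver-search behaviour.

```python
from pathlib import PureWindowsPath
from typing import Iterable

# Parent folder named in the KB article, per the post above.
DRIVERS_ROOT = PureWindowsPath(r"C:\Program Files\Common Files\VMware\Drivers")


def wddm_driver_dir(root: PureWindowsPath = DRIVERS_ROOT) -> PureWindowsPath:
    """Return the exact folder to browse to when installing the WDDM driver."""
    return root / "wddm_video"


def first_alphabetically(folder_names: Iterable[str]) -> str:
    """Model the guess that the first folder in alphabetical order wins."""
    return sorted(folder_names, key=str.lower)[0]
```

Under that guess, first_alphabetically(["wddm_video", "video"]) picks "video", the folder holding the SVGA II driver, which is why browsing to the full wddm_video path is suggested instead of the parent folder.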

bcoover
Contributor

I've tried all of the different video combinations (SVGA, Microsoft's generic VGA, WDDM, etc.), and none of them seems to make any difference. Nor does the problem arise from the console; I really only use RDP to connect to these servers. I'm still hoping VMware acknowledges that there is a serious incompatibility with R2 and gets it fixed pronto. The issue appears to have surfaced last summer, and although fixes for a few other culprits (like the video driver) seemed to solve it for some people, they haven't solved the problems I've seen.

bcoover
Contributor

These are all on hardware version 7.

Best Regards,

 

Brett Coover  | CTO & Managing Partner| CCIE, CISSP

Direct: 813.727.8747 | Office: 888.698.1718 x708 | www.extropy.com | brett.coover@extropy.com

LinkedIN: http://www.linkedin.com/in/brettcoover

tWiZzLeR
Enthusiast

Sorry, I guess that I should have read the original post a little closer.

However, I can say that I have 3 ESX hosts and a SAN and I'm now running two Windows 2008 R2 servers in production, including one that is a domain controller, without issue. I haven't had any lockup, slowness, etc. (so long as we're using the correct WDDM driver).

I'm wondering if you might be having a different issue that is not VMware-specific. For example, AD group policies that are incompatible with 2008 R2, or network/DNS issues. Do you have a physical 2008 R2 server on the same network using the same settings?

bcoover
Contributor

Yes, there are both physical and virtual servers, and I'm running about 30 2008 R2 VMs. There isn't any commonality across the machines that are failing except that they all happen to be running Exchange 2010, which is supposedly supported...

Stewd
Contributor

I was having the same problem with Exchange 2010 on Windows 2008 R2. I had read some previous posts about not loading the VMware Tools video driver, although Update 1 was supposed to fix this. Since I stopped running the video driver from VMware Tools, the server has not locked up. Some of my other R2 servers were experiencing the same problem, but it seemed to happen more frequently on the Exchange 2010 server. So far, omitting the video driver from VMware Tools seems to have resolved the issue.

JHeckmann
Contributor

Hello Brett,

Did you solve the problem? I have the same problem as you with 2008 R2 and Exchange 2010. It worked fine for 4 weeks, then the Exchange VM froze and only a reset helped. Now the VM freezes once per day. The BlackBerry Enterprise Server (2008 R2) runs fine on the same ESXi host.

ESXi 4.0 Update 1 is installed.

Thanks in advance

Joachim Heckmann

bcoover
Contributor

Nope, still no better experiences. I have since tried a few more things with no better results, here is the current list:

• I've only been able to duplicate it under these exact circumstances: ESXi 4.0.1, DL380 G5 (tried on machines with different Intel processors), 2008 R2 (x64), Exchange 2010

• I don't have problems running it under VMware Fusion or Workstation

• Network card changes don't affect it (it definitely is not the NIC problem, because it happens even if the machines have no NIC)

• Already tried the video thing (it definitely isn't this, as I tried every possible combination)

• Changed every other conceivable piece of hardware and software

• It just runs fine and at random times just freezes with high CPU utilization

• Tried disabling/enabling independent disks, doesn't help

• Tried changing storage types, didn't help (Local SAS, external SCSI, USB, etc)

• Tried changing number of vCPUs, memory, etc

• Tried Exchange 2010 with and without DAGs and Failover Cluster Service

• Disabled IPv6 and associated services

I've tried running debugs, packet captures, etc., and I can find no patterns. This is very frustrating because I am running a bunch of other 2008 R2 machines and not a single one has problems except for the Exchange 2010 boxes. I had a sneaking suspicion that this had something to do with clustering and the fact that Microsoft clustering isn't supported on VMware hosts that are themselves clustered (HA, vMotion, DRS, etc.). However, I took those pieces out of the equation and we are still seeing the problem, so I have to conclude by process of elimination that something in the combination of Exchange 2010, Server 2008 R2 and ESXi 4.0/4.0.1 freezes the guests. Currently I am trying to get Exchange 2010 loaded on 2008 SP2 (non-R2) to see if that freezes too... wish me luck!

I think that we have a confirmed VMware bug here, but I don't currently have a support contract that will allow me to open a ticket, who's up for that?

bcoover
Contributor

Workaround update!

OK, here's the deal: Exchange 2010 works fine now that I've installed the stack on Server 2008 SP2 (not R2). No freezes, and in fact no problems of any kind. If anything it feels faster to me in terms of GUI consoles, responsiveness, etc. I even got Forefront and DAGs to work just fine on this platform.

What does this mean? I think it means that we have positively identified a serious bug that both VMware and Microsoft should be taking seriously. If Microsoft caused this to promote their Hyper-V product, then that's a definite misstep. If VMware knows this is a problem and isn't officially acknowledging it yet, then that is also a big misstep. We have positively identified that on some platforms, Exchange 2010 on Server 2008 R2 simply will not work properly. There is NO existing change, variable or patch that helps (vCPU, memory, video, network, storage, etc.). So far I've only heard reports from individuals running this on HP DL380 G5 systems. That might point to a possible culprit, but it is hard to say, because this hardware is one of the most popular platforms on earth. I have confirmed that the issue is identical on Intel 5160 and 5460 processors.

So to summarize: EWE = URFM (ESXi 4.0.1 + Windows server 2008 R2 + Exchange 2010 = Unusable Randomly Freezing Machines)

I've also posted this to my blog along with some other things I've figured out about these platforms: http://www.extropy.com/forums/knowledge-bases-extropedia/microsoft/exchange-2010-server-2008-r2-vmwa...

pjones1981
Contributor

Hi Guys,

I have recently created a test Lab prior to a rollout of Exchange 2010.

The VMware ESX servers are running vSphere 4 Update 2 and the VMs have the latest VMware Tools installed.

Windows Server 2008 R2 installed.

Exchange 2010 installed.

I am having problems with VMs freezing or hanging, and the only way to recover (as above) is to reset the VM. Currently this is only a test, but I would like to roll this out on an R2 platform.

Has anyone found a fix for this problem yet other than running SP2?

Many thanks in advance.

koolkat11
Contributor

Anyone have the answer to this?

My server had been running fine for 4 months. The number of users on it increased last month, and since then I've had this freezing issue: entirely random, no event logs. Before finding this forum I had already spent many days cleaning up the server, removing third-party apps, moving it to a more powerful server, etc.

This really sucks........

NikolaA
Contributor

Is there any update on this?

Whitmore29
Contributor

I'll echo this... any update? I'm having this exact issue with ESXi 4.1 + Server 2008 R2 and Exchange 2010.

supportreach
Contributor

We have found that leaving the server without any user logged in gives better stability. Let me know if this helps anyone else...

When I had automatic login enabled, the server would crash every week, sometimes every day. Now it has been fine for quite some time.

I am using pingability.com and/or monastic.com to check my OWA and alert me when the Exchange server is "frozen"... this way I get a text on my phone and an email, so I know to fix it ASAP.
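For anyone who would rather script the OWA check than use a hosted service, a minimal sketch follows. It relies on what is reported in this thread, namely that a frozen server still answers ping, so the useful signal is whether OWA answers an HTTP request. The URL, the timeout, and the status labels are hypothetical choices, and a self-signed OWA certificate would make the check report a failure unless certificate handling is added.

```python
import urllib.error
import urllib.request


def owa_responds(url: str, timeout: float = 10.0) -> bool:
    """Return True if the web server answers the request within `timeout`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError as err:
        # The server answered with an error status (e.g. 401 before login),
        # so it is alive; only 5xx is treated as a failure here.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Timeout, connection refused, DNS failure, TLS failure.
        return False


def classify(ping_ok: bool, owa_ok: bool) -> str:
    """Combine a ping result and an OWA result into a coarse status."""
    if owa_ok:
        return "healthy"
    if ping_ok:
        # Matches the symptom in this thread: pingable but unresponsive.
        return "suspected freeze"
    return "down"
```

A scheduled job could call owa_responds("https://mail.example.com/owa") every few minutes and send an alert when classify returns "suspected freeze". How the ping result is obtained and how the alert is sent are left out of the sketch.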

Whitmore29
Contributor

This seems to be true in my case as well. Lockups seem to occur during maintenance work via RDP or shortly afterwards. We use Nagios to notify us on ping outage and address the issue as quickly as possible; however, this is still a HUGE inconvenience... VMware needs to address this issue for their customers.

Gasaraki
Enthusiast

Same problem. I just finished installing Exchange 2010 on Server 2008 R2 with all patches, the newest VMware Tools, and all Windows Updates.

The machine is unresponsive when doing anything despite plenty of free resources, certain areas of the OS are not accessible, there are login problems with domain admin users... etc, etc, etc.

I'm running ESXi Advanced 4.1 U1.

Don't know what the hell is wrong with this machine.

EDIT: Just tried the video driver change. No effect on the problem.

hansherlighed
Contributor

Just curious...

What antivirus (if any) have you installed on the server? I know of some issues with ESET installed on some servers causing random freezes like the ones you are describing.

I'm having another strange issue since updating ESXi to the latest 4.1 U1 patch.

Now I'm not able to start the VM in safe mode. The VM boots as it should and I can get to the Windows logon screen, but then the VM restarts. There are NO entries in the Windows Event log and nothing useful in the "message" or "hostd" log files in ESXi.

At first I thought something in the current VM had to be corrupted, but I have now tried a completely fresh install of Server 2008 R2 using the native ESXi template for Server 2008 R2 and nothing else. No Windows updates, nothing, just a fresh install and then a reboot using F8 to get into safe mode. Still it just reboots. I have tried installing VMware Tools on the new VM to see if that would help, but no.

Can any of you confirm that you have the same problem with ESXi 4.1 U1 and Server 2008 R2?

I have no problems with Server 2003, Server 2008 or Win7; it's only Server 2008 R2.
