VMware Cloud Community
Robby68
Contributor

Windows Server 2012 VM becomes unresponsive / VMware Tools "Not Running"

I have 2 ESXi 6.0.0 build 3073146 hosts running on IBM x3400 series servers with local storage, on which I have installed a number of virtual machines.

All VMs are running Windows Server 2012 Std.

Occasionally and completely at random, some virtual machines become unresponsive. VMware Tools (v. 9.10.5, build-2981885) shows as "Not Running" in the Summary pane and the virtual machine is inaccessible via RDP or the console. The only thing I can do in that state is power off the VM.

I tried uninstalling and reinstalling VMware Tools, and tried changing the network card (currently VMXNET 3) to E1000 or E1000E, but the issue persists.

The issue started appearing after the upgrade from ESXi 5.5 to the first available version of ESXi 6.0.

I had no issues for a couple of weeks, then suddenly the problem came back. This happens only on a subset (2 out of a total of 8) of the virtual machines running.

Has anyone had a similar issue? Any suggestions on how to solve it?

Thanks

Roberto

(Bologna - Italy)

21 Replies
chill
Hot Shot

I have experienced this numerous times with Windows 2012, Windows 2012R2 and sometimes even Windows 2008 R2.

The only solution so far has been to hard reset.

I'm hoping someone responds to this with a solution

If you find this information helpful or correct, please consider awarding points.
jdevall
Contributor

We are having the same problem. It only appears to happen on 2012 R2 VMs. It is completely random and even occurs on servers with little to no utilization. The only option is to power cycle the server. It has happened on 10-15 different VMs so far.

Sauske
Contributor

Hello, I have the same problem with Server 2012 R2 and VMware Tools.

I'm running ESXi 6.0 U2 for HP servers, and after I install VMware Tools it behaves unpredictably. On some servers everything is fine until the server reboots, and then Tools shows as "Not Running" in the Summary pane. Whether it works or not seems random. I even tried installing an older version of VMware Tools, but it still doesn't work properly.

Thanks to jdevall for his reply: powering off the VM and then turning it back on gets VMware Tools working (and it keeps working after VM reboots), so right now that seems to be the only solution.

cesprov
Enthusiast

What NIC drivers are you guys using in the VMs?  The newer e1000e driver had issues initially and I'm still not even sure they have been addressed yet.  Nothing has been updated on this link either:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21099...

I've been on several of the 6.0.x builds (my current = 6.0U1a) and I haven't had any issues, but I am using the VMXNET3 driver in all my VMs.  I believe when you choose Windows 2012 as the OS during a new VM install, it defaults to the e1000e driver.

prakkumar
Contributor

I am facing exactly the same problem, and in my case the VM does NOT have an e1000 NIC. It still becomes unresponsive almost every day, and I have no option other than to reset the VM.

Details of one of the server running with this issue:

OS : Windows Server 2012 R2

VM Version : vmx-10

CPU : 4 vCPU

Memory : 16 Gigs

NIC : VMXNET3

Host : ESXi 5.5.0, 2456374. (HP Proliant BL460c Gen9, 24 CPU x 2.497 GHz)

Looking forward to someone posting a resolution to this.

Prakash Kumar
eric86
Contributor

I am facing this issue on several Server 2012 R2 VMs as well, with very similar specs to prakkumar's, except on ESXi 6.0.0.

Has anyone found a resolution?

lukehsieh
Contributor

We have 1 VM (out of hundreds) in our environment currently experiencing this issue. 


The host is a Cisco UCSB-B200-M3, running ESXi 6.0.0 build 3073146.

The VM is Server 2008 R2, with 2 sockets / 3 cores and 12 GB RAM.

NIC: VMXNET3

VMware Tools is 9.10.5 build 2981885.

This VM was recently migrated over from Hyper-V using VMware Converter. Initially we thought it might be McAfee related (as the security team was upgrading McAfee in our environment at the same time), but the server is still freezing with McAfee removed. We go through days where the VM is stable, and then other days where it freezes up several times throughout the day.

arjun142
Contributor

I had the same issue on Windows Server 2012, but in my case it was resolved after reinstalling VMware Tools.

LucianoPatrão

Hi

There is an issue in 5.5 and 6.0 with Windows 2008 and 2012. Check whether you have the latest ESXi updates.

There is also a workaround, as well as a fix to apply in the guest OS.

Check the KB for this issue: https://kb.vmware.com/kb/2092807

Maybe this can fix your problems

Hope this can help

Luciano Patrão

VCP-DCV, VCAP-DCV Design 2023, VCP-Cloud 2023
vExpert vSAN, NSX, Cloud Provider, Veeam Vanguard
Solutions Architect - Tech Lead for VMware / Virtual Backups

________________________________
If helpful Please award points
Thank You
Blog: https://www.provirtualzone.com | Twitter: @Luciano_PT
nlopezs
Contributor

We are having similar issues with our Windows 2012 R2 VMs on vCenter 6 / ESXi 6. I opened a ticket with VMware support. The support engineer looked at our host logs and suggested removing the VMware VSS driver from VMware Tools on the affected machine. He thinks that backups might be causing the issue. We are still testing.

I am also capturing memory dumps of unresponsive machines by suspending the machine then downloading .vmss and .vmem files from the datastore.

This kb describes the process:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=20039...

Grab vmss2core from VMware Labs to convert those files into memory.dmp.

Vmss2core – VMware Labs

The resulting file can then be analyzed using WinDbg, which is part of the Windows 10 SDK.

Windows 10 SDK - Windows app development
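
The conversion step above could be scripted roughly as follows. This is only a sketch: the `-W8` (Windows 8 / Server 2012 and later) and `-W` (older Windows guests) switches are how I recall the vmss2core usage, so check them against the tool's own help output before relying on them. The file names are hypothetical placeholders for the `.vmss` and `.vmem` files downloaded from the datastore.

```python
import subprocess


def vmss2core_command(vmss, vmem, exe="vmss2core.exe", windows8_or_later=True):
    """Build the vmss2core command line for a suspended Windows guest.

    Assumption: -W8 targets Windows 8 / Server 2012 and later guests,
    and -W targets older Windows guests. Verify with the tool's help.
    """
    flag = "-W8" if windows8_or_later else "-W"
    return [exe, flag, str(vmss), str(vmem)]


def convert(vmss, vmem, **kwargs):
    """Run vmss2core; the dump is written to the current directory."""
    cmd = vmss2core_command(vmss, vmem, **kwargs)
    print("Running:", " ".join(cmd))
    return subprocess.run(cmd, check=True)


# Hypothetical file names, shown without executing anything:
print(vmss2core_command("server01.vmss", "server01.vmem"))
```

Keeping the command construction separate from the execution makes it easy to review the exact command line before running it against a large memory image.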

getvc
Contributor

Hi,

Are you sure the VM itself hung, rather than the guest OS panicking? Are there any logs you could share?

I had the same issue with Windows 2012. It was using the e1000 driver, and the log showed that VMware Tools had some issue.

I got a guest OS dump, and Microsoft said it was a driver panic and asked me to check with the hardware vendor.

I suspect the network driver caused the panic.

If you could share logs with us, that would help us learn more about the issue.

Best Regards,

fgl
Enthusiast

Question to all that are having the problem.  Do you have SEP (Symantec Endpoint Protection) anti-virus installed on these servers?

I had a similar issue, and after a lot of tracing and log reviewing I noticed that every one of my servers that froze had received a SEP definition update, and within 3 minutes the server froze and had to be hard powered off and back on. If you have SEP installed, check the SEP client log under Applications and Services Logs in Event Viewer and see whether you notice a gap between when the server froze and when you rebooted it. The time of that log entry will correspond, within 3 minutes or so, to the point where the System and Application logs stop having entries until you rebooted the server.

My resolution was to uninstall SEP from the servers, and I have not had any more freezes since. I don't know if something changed in SEP, but my servers have had SEP on them for years and never encountered this problem until early February. After that I was getting 1-2 frozen servers each week until I uninstalled SEP, and I have not had another freeze since early March.

If somebody thinks it's something else I'm all ears, but SEP was the only commonality (a SEP update within 3 minutes of the freeze) my servers had. The one thing I want to point out is that all my unresponsive servers were still pingable, but nothing else was responding: no Ctrl-Alt-Del, no RDC, nothing.
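
The log correlation described in this post (a SEP definition update followed within about 3 minutes by a long silence in the event logs) can be sketched in a few lines. This is a rough illustration only: it assumes the timestamps have already been exported from Event Viewer into Python `datetime` values, and the function name, the 30-minute silence threshold, and the sample times are all made up for the example.

```python
from datetime import datetime, timedelta


def find_freeze_candidates(event_times, sep_update_times,
                           min_gap=timedelta(minutes=30),
                           window=timedelta(minutes=3)):
    """Return (last_event, next_event, sep_update) tuples.

    A candidate is a silence of at least `min_gap` in the event log
    whose last entry lies within `window` of a SEP definition update.
    """
    events = sorted(event_times)
    candidates = []
    for prev, nxt in zip(events, events[1:]):
        if nxt - prev < min_gap:
            continue  # normal logging activity, not a freeze
        for upd in sep_update_times:
            if abs(prev - upd) <= window:
                candidates.append((prev, nxt, upd))
                break
    return candidates


# Hypothetical sample data: logging stops at 10:02 and resumes at 14:00.
day = datetime(2016, 3, 1)
events = [day.replace(hour=10, minute=m) for m in (0, 1, 2)]
events += [day.replace(hour=14, minute=0), day.replace(hour=14, minute=5),
           day.replace(hour=20, minute=0)]
updates = [day.replace(hour=10, minute=1, second=30)]

for last, resumed, upd in find_freeze_candidates(events, updates):
    print(f"Silence from {last} to {resumed}, SEP update at {upd}")
```

In the sample data the second silence (14:05 to 20:00) is not reported, because no definition update falls near its start.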

eric86
Contributor

I think you just solved this! At least for me. The SEP definition update events occurred minutes before the freeze-ups, just like you said.

But I noticed something else. All the affected servers were running version 12.1.4xxx. Upgrading to 12.1.6xxx should solve this.

a11112f
Contributor

I'm running SEP 12.1.6 MP4 on many Server 2012 R2 VMs and I have not seen any problems. I'm also running the latest ESXi 6 Update 2.

I just checked, and for 90% of my servers I don't have a scan or Auto-Protect exception for the VMware folder either.

What I am having problems with is the VMware Snapshot Provider service not stopping after server startup (it is set to manual start). This ruins my backups, as it takes out the VMware Tools service.

cyclopsio
Contributor

We are having the same issue (Windows 2012 R2 / SEP client 12.1.4).

"All the affected servers were running version 12.1.4xxx. Upgrading to 12.1.6xxx should solve this."

Is there any document that mentions this issue?

Thanks a lot

copelsimo1
Contributor

Hi to all.

In my company we have the same issue: servers (2012/2012 R2) randomly becoming unresponsive.

We have ESXi 6.0 Update 2.

We opened several support requests (VMware, Microsoft, etc.), but no one could tell us why this happened, and no one had a solution.

Then, cross-referencing tables from different consoles, I noticed that all the unresponsive servers had the same SEP version (12.1.2.x).

So I opened a technical case with Symantec, and in the meantime I started to distribute the latest update of the SEP client (at the time, 12.1.6.x). This update requires a system reboot, so only 30-40% of the systems were updated in the first step.

Symantec told me we had an old version of SEP, and asked us for a full Microsoft dump to analyze (but this requires a reboot too!) as well as to update all client versions.

Not one server with the latest SEP version (12.1.6.x) became unresponsive.

In the end, Symantec confirmed that the problem was in the SEP version:

- Fix ID: 3590578
  Symptom: System freezes due to a deadlock in File System Auto-Protect driver after updating virus definitions.
  Solution: Modified File System Auto-Protect driver to avoid this deadlock.

So, UPGRADING SEP TO THE LATEST VERSION SOLVED THE PROBLEM.

I hope to have helped.

Simone

Alba(CN)

n_c_vmware
Contributor

We have this issue too. VMware support said there is a bug and it should be fixed in the June release. They don't have any official documentation on this bug and aren't ready to share it publicly. They collected 2 dump files, analyzed them, and came to this conclusion.

To temporarily fix snapshot quiescing of the guest OS, you can do this:

1. Stop the VMware Tools service
2. Stop the VMware Snapshot Provider service if it's running
3. Stop the Volume Shadow Copy service if it's running
4. Restart the Cryptographic Service
5. Start the VMware Tools service (automatically starts Snapshot Provider and VSS)
6. Attempt to create a quiesced snapshot
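
Steps 1 to 5 above could be scripted inside the guest along these lines. Treat it as a sketch under assumptions: the short service names (`VMTools`, `vmvss`, `VSS`, `CryptSvc`) are what I would expect on a typical Windows guest and should be confirmed with `sc query` first, and the helper names are invented for the example. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumed Windows service names; confirm with `sc query` on the guest.
TOOLS = "VMTools"             # VMware Tools
SNAPSHOT_PROVIDER = "vmvss"   # VMware Snapshot Provider
SHADOW_COPY = "VSS"           # Volume Shadow Copy
CRYPTO = "CryptSvc"           # Cryptographic Services


def build_steps():
    """Return the net stop/start commands for steps 1 to 5, in order."""
    return [
        ["net", "stop", TOOLS, "/y"],
        ["net", "stop", SNAPSHOT_PROVIDER, "/y"],
        ["net", "stop", SHADOW_COPY, "/y"],
        ["net", "stop", CRYPTO, "/y"],   # restart = stop, then start
        ["net", "start", CRYPTO],
        ["net", "start", TOOLS],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            # A stop can fail simply because the service is not running
            # (steps 2 and 3 say "if it's running"), so do not raise.
            subprocess.run(cmd, check=False)


run_steps(build_steps())  # dry run: prints the command sequence only
```

Step 6, the quiesced snapshot itself, is left out because it is triggered from the vSphere side rather than from inside the guest.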
rocordov
Contributor

Has anyone resolved this?

kamalive
Contributor

Thank you so much for this. I just had the same issue, for the 4th time. After reading your post I can confirm that we are having the same issue. I will try the update. If the issue comes up again, I'll post back.
