VMware Cloud Community
s1xth
VMware Employee

W2K3 Domain Controller Becomes Unresponsive - No Errors in Event Log

So I have a weird issue with one of my domain controllers, and it has happened twice now. The machine just becomes unresponsive. I try to RDP to it and it fails, but I CAN ping the server and get a response. When I go to the console the screen is just green: no login prompt, nothing. I have to shut the machine down (hard shutdown) and power it back up. Thankfully this is not a primary DC, just a GC and DNS server, but I can't seem to figure out what is causing this. It's really odd because it was working fine for almost a month, and then this started. Anyone have any ideas? Ever seen something like this?

Thanks in advance!

http://www.virtualizationimpact.com http://www.handsonvirtualization.com Twitter: @jfranconi

Accepted Solutions
CiscoKid
Enthusiast

If it has been a proven success with other domain controllers, both virtual and physical, I would say that you should be fine with that strategy. It's VSS or the guest quiescing that causes the most issues with backups of DC VM guests.

8 Replies
mlubinski
Expert

Please check whether this VM was migrated (either manually or by DRS) before you noticed this issue. If so, it could be caused by storage problems, a high load of VMs on one datastore, or not using VMware Tools.

Also, please try to find out what time it started, and take a look at /var/log/vmkwarning and /var/log/vmkernel; you should probably see some messages about swapping for this VM.
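
A minimal sketch of that log check in Python, assuming the log has been copied off the host as plain text. The function name, the VM name `dc02`, and the sample lines are made up for illustration; real vmkernel message formats vary by ESX version, so this only does a plain keyword search.

```python
def find_swap_messages(log_lines, vm_name=None):
    """Return log lines that mention swapping, optionally limited to one VM.

    The match is a case-insensitive keyword search; it does not assume
    any particular vmkernel message format.
    """
    hits = []
    for line in log_lines:
        lowered = line.lower()
        if "swap" not in lowered:
            continue
        if vm_name is not None and vm_name.lower() not in lowered:
            continue
        hits.append(line.rstrip("\n"))
    return hits


# Hypothetical sample lines, made up for illustration only.
sample = [
    "Jan 10 02:14:01 esx01 vmkernel: example message about swap for dc02\n",
    "Jan 10 02:14:05 esx01 vmkernel: unrelated example message\n",
    "Jan 10 02:15:09 esx01 vmkernel: example Swap activity for web01\n",
]

matches = find_swap_messages(sample, vm_name="dc02")
print(matches)
```

Against a real log, the same function can be given an open file object (for example the copied /var/log/vmkernel) instead of the sample list, and the hits compared with the time the VM stopped responding.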

If you found this or any other answer useful please consider the use of the Helpful or correct buttons to award points
CiscoKid
Enthusiast

Are you performing any kind of snapshotting and/or VCB backups on the domain controller? If so, it could be EDB corruption of the AD DB.

s1xth
VMware Employee

CiscoKid - weird that you mentioned that. About a half hour ago I found something in the event log on the machine: a bad VSS error, and immediately after it everything became unresponsive. I took a look at my logs from a week ago (when it last happened) and the same error was there. I have an Acronis backup job configured with VSS enabled, and it looks like something is causing VSS to not work correctly, which gets the AD DB 'whacked' and locks the machine. I am hoping that is what it is; it seems very possible. I am going to disable the VSS part of the job and see how things go. I DOUBT this is a VM issue (always an MS issue!!! ehhh!)
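
For anyone wanting to do the same comparison across several hangs, here is a rough Python sketch. It assumes the event log has already been exported into a list of records with `time`, `source`, and `type` fields; that record layout, the function name, and the sample dates are all assumptions for illustration, not something taken from the actual logs.

```python
from datetime import datetime, timedelta


def vss_errors_before(events, hang_time, window_minutes=60):
    """Return VSS error events logged within the window before hang_time."""
    window_start = hang_time - timedelta(minutes=window_minutes)
    return [
        e for e in events
        if e["source"] == "VSS"
        and e["type"] == "Error"
        and window_start <= e["time"] <= hang_time
    ]


# Hypothetical events, made up for illustration only.
events = [
    {"time": datetime(2010, 1, 10, 1, 55), "source": "VSS", "type": "Error"},
    {"time": datetime(2010, 1, 10, 1, 58), "source": "NTDS", "type": "Information"},
    {"time": datetime(2010, 1, 3, 1, 56), "source": "VSS", "type": "Error"},
]

hang = datetime(2010, 1, 10, 2, 0)
recent = vss_errors_before(events, hang)
print(len(recent))
```

If every hang has a VSS error in the window just before it, that supports the backup job as the trigger, though it does not prove it.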

CiscoKid
Enthusiast

We experienced a similar issue and, long story short, we lost 2 of 3 domain controllers that were VM guests. We ended up abandoning snapshots via VCB/vRanger on all of our DCs because of the corruption they caused. I would be willing to bet that this is your issue and that it has nothing to do with it being a VM guest. FYI, we currently use Quest Archive Manager for DC backups and nothing that requires quiescing the OS to perform backups. This may also affect Exchange databases, since Exchange also uses the ESE type of database structure. Please reward points if you feel that this information is helpful. Thanks.

s1xth
VMware Employee

Would you not recommend using NTBackup to back up the system state? I am using that with no errors. I disabled VSS in Acronis on the DCs that are VMs.

CiscoKid
Enthusiast

If it has been a proven success with other domain controllers, both virtual and physical, I would say that you should be fine with that strategy. It's VSS or the guest quiescing that causes the most issues with backups of DC VM guests.

TomHowarth
Leadership

For DCs in environments I design, I tend to recommend that the system state is backed up to another machine in the domain. DCs, although important, can in a virtual world be left out of standard backup procedures; recovery is as simple as laying down a new VM --> DCPromo (non-catastrophic) or laying down a new VM --> DCPromo --> Restore System State (catastrophic).

It is the recovery of the system state that is important, not the physical Guest.
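
As a rough illustration of a system state backup to another machine, the Python sketch below only builds and prints an NTBackup command line; it does not run anything. The `backup systemstate`, `/J`, and `/F` arguments are part of NTBackup's command-line syntax on Windows Server 2003, while the helper name, the job name, and the server and share in the path are hypothetical placeholders.

```python
import subprocess


def build_system_state_backup_cmd(job_name, backup_file):
    """Build the argument list for an NTBackup system state backup to a file."""
    return ["ntbackup", "backup", "systemstate", "/J", job_name, "/F", backup_file]


# Hypothetical job name and UNC path; replace with real values.
cmd = build_system_state_backup_cmd(
    "DC system state",
    r"\\backupserver\dcbackups\dc02-systemstate.bkf",
)

# list2cmdline quotes arguments that contain spaces, as cmd.exe expects.
print(subprocess.list2cmdline(cmd))
```

The printed line could then be placed in a scheduled task on the DC, so the .bkf file lands on a different machine in the domain.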

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment (http://my.safaribooksonline.com/9780136083214)
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410

s1xth
VMware Employee

Thank you all for your assistance and insight on this issue. I think I have found the problem with the VSS-related errors. I have awarded points to all. Thanks again!
