We have a number of Windows VMs that are behaving very strangely. The servers do not respond to ping or RDP until you log on via the console in Virtual Centre; as soon as you do, they start responding again.
I have a feeling that this has come from a particular template, but I cannot for the life of me work out how. My concern is that this will happen with one of our new Production VMs and damage the reputation of our virtual environment, a reputation we've spent so long building.
Any help would be much appreciated.
Thanks in advance
I have seen issues like this caused by the Windows Firewall. Are you using the firewall? Disable it in your template if you can. We had issues with a GPO that was meant to set the firewall to allow ping and RDP through...
I'm afraid that was our first port of call. It's disabled on all machines this happens to and is disabled on the template as well.
It's almost as if it's going to sleep and only the console logon will wake it up.
I'm having the same issues on an ISA 2004 server. I figured it's a firewall thing, with some service not starting until you log in. But you see this problem on a number of VMs, which (my guess) are not all ISA servers?
I assume you have this problem on Windows servers. Perhaps you could try looking at the services of such a server from another machine (VM or physical): right-click My Computer, choose Manage, then right-click Computer Management (Local) and choose "Connect to another computer". Hopefully connecting WILL work, and you can then see inside the VM whether all services are running as expected...
Also, make sure all services run under the system account; if they don't, that might also cause these kinds of problems...
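Before connecting with Computer Management, it may help to confirm from another machine whether the VM accepts TCP connections at all. The following is only a rough sketch: it assumes the Windows default ports (3389 for RDP, 135 for the RPC endpoint mapper, 445 for SMB), and the function names are made up for illustration.

```python
import socket

# Default Windows ports; adjust if your servers use non-standard ones.
PORTS = {"RDP": 3389, "RPC endpoint mapper": 135, "SMB": 445}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe(host, ports=PORTS, timeout=2.0):
    """Return {service name: reachable?} for each entry in `ports`."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `probe("your-vm-name")` once while the VM is unresponsive and again right after the console logon would show whether only RDP is affected or the whole network stack. If every port fails before the logon, a single service that has not started is a less likely cause.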
If this is happening to only a few VMs (not all on the same host/cluster), have you tried looking into vmware.log?
Also, you said the machines start responding after you log in to the console in VC. Do they start responding as soon as you log in to VC itself, or only once you open the VM's console?
Please attach the vmware.log file of the affected VMs.
MCSE 2003, VCP, CCNA, ITIL Foundation
Is there some kind of power management enabled? You should definitely review the vmware.log. Is VMware Tools installed? You should also check your vSwitch policies if all of the VMs are connected to the same vSwitch.
Guys, thanks for all your responses.
I have spoken with my colleagues and we believe all machines are on the same ESX host as well as on the same vSwitch and port group. I have examined the vmware.log of one of the machines that was causing issues.
The following errors are in the vmware.log for the machine I had issues with this morning:
May 30 10:26:55.680: mks| SOCKET 5 recv error 5: Input/output error
May 30 10:26:55.680: mks| SOCKET 5 destroying VNC backend on socket error: 5
The following post details almost the exact same issue; the solution from VMware, however, is very specific. I might just have to log a call with VMware to be completely safe on this one.
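To see whether the same errors line up with the times a VM stopped responding, the log could be scanned rather than read by eye. This is a minimal sketch that assumes every line follows the shape of the two lines quoted above (timestamp, thread name, `|`, message); other versions of vmware.log may use a different layout, and the function names are hypothetical.

```python
import re

# Matches lines shaped like the excerpt quoted above, e.g.
#   "May 30 10:26:55.680: mks| SOCKET 5 recv error 5: Input/output error"
# The layout is inferred from those two lines only (an assumption).
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"(?P<thread>[^|]+)\|\s*(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into timestamp, thread and message, or return None."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def find_errors(lines, keywords=("error",)):
    """Return parsed entries whose message contains any keyword (case-insensitive)."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and any(k in entry["message"].lower() for k in keywords):
            hits.append(entry)
    return hits


# The two lines from the post, used here as sample input.
SAMPLE = [
    "May 30 10:26:55.680: mks| SOCKET 5 recv error 5: Input/output error",
    "May 30 10:26:55.680: mks| SOCKET 5 destroying VNC backend on socket error: 5",
]

for entry in find_errors(SAMPLE):
    print(entry["timestamp"], entry["thread"], entry["message"])
```

For a real file, pass an open file object instead of `SAMPLE`, for example `find_errors(open("vmware.log"))`, and compare the timestamps with when the VM dropped off the network.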
I have not fully isolated the issue, but I had the same problem and was able to narrow it down to a specific host server in a cluster of three. VMs on just this one host would drop off the network. I made some changes to the host configuration and will let you know which options I find are causing this. I enabled a few features to see if this resolves the problem on this individual host, which is an identical server to another in the three-host cluster using the Intel EVC Sandy Bridge mode config. I will update with my findings for any others looking at this post.