VMware Cloud Community
dahvaio
Contributor

Virtual Server - Almost Freezes but isn't Frozen - Very Odd

I have a 5-node ESX 3.5 U5 DRS/HA cluster with about 28 virtual servers, all running Windows 2003 SP2. All 28 servers basically run a single application, with varying loads across the different servers. VMware Support has already reviewed the vCenter and ESX host logs and was unable to find any anomalies in them; therefore, the issue could be the application, the virtual server, or something else.

So far, in the past 2 weeks on a specific day, one of the servers has almost frozen for roughly 8 minutes. CPU and network activity on the server drop, but everything else stays the same.

Things I CAN do on the server:

1. Ping the server and get responses

2. Access the Server through VM Console

3. Click and Move around any open Dialog Boxes

4. Double Click on My Computer; however, no Drives are listed

Things I CANNOT do on the server:

1. Get an RDC connection

2. Get SNMP responses (the server stops responding to SNMP requests)

3. Access the Start menu or any of the icons of running programs, even though they are all there

The server is a basic Windows 2003 SP2 install with a 300 GB RDM to an EMC FC SAN. All of the other servers in the cluster remain fully functional, but since this one server is the central server (hub and spoke), CPU and network activity on the other servers do decrease.

I have been racking my brain over why one server, and only this server, would freeze intermittently. VMware Support does not believe it is an issue with the ESX hosts and/or the virtual server.

What does not make sense is that CPU drops to around 5% and network traffic drops to a very low level. If a process were hogging the server, the CPU should be spiking, but it is not. There is no memory paging because there is more than enough physical RAM to accommodate all of the RAM assigned to the virtual machines.

Anyone have any ideas as to why a Server would almost "FREEZE" but not necessarily freeze?

2 Replies
Cooldude09
Commander

This is happening because of underlying storage issues. You need to check with EMC and see whether any alarms are being generated on the storage box.

dahvaio
Contributor

EMC is currently reviewing a 48-hour scan that was performed on the SAN; however, no incident occurred during those 48 hours. Yesterday an incident did occur, and I was able to capture some of the performance graphs from vCenter and SCOM. The graphs that seemed very odd are "Disk Read Rate" and "Disk Write Rate".

The "Disk Read Rate" shows a huge spike while the "Disk Write Rate" drops off completely. Please look at the graph.
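For anyone who wants to line the incident window up against exported counter data rather than eyeballing the graph, here is a minimal sketch in Python. It assumes the disk samples have already been exported by hand as (timestamp, read rate, write rate) tuples. The function name find_stall_windows, the tuple layout, and the thresholds are all hypothetical choices for illustration, not anything vCenter or SCOM provides.

```python
from statistics import median


def find_stall_windows(samples, read_factor=3.0, write_factor=0.1):
    """Flag samples where reads spike while writes nearly stop.

    samples: list of (timestamp, read_rate, write_rate) tuples
             (hypothetical layout; any consistent unit works).
    A sample is flagged when its read rate is more than read_factor
    times the median read rate AND its write rate is below
    write_factor times the median write rate.
    """
    if not samples:
        return []

    # Use the median as a rough "normal" baseline so that the
    # spike itself does not drag the baseline upward much.
    read_base = median(s[1] for s in samples)
    write_base = median(s[2] for s in samples)

    flagged = []
    for timestamp, read_rate, write_rate in samples:
        read_spiking = read_rate > read_factor * read_base
        write_stalled = write_rate < write_factor * write_base
        if read_spiking and write_stalled:
            flagged.append(timestamp)
    return flagged


# Made-up sample values, only to show the call shape.
example = [
    ("10:00", 100, 200),
    ("10:01", 110, 210),
    ("10:02", 900, 5),
    ("10:03", 950, 2),
    ("10:04", 105, 190),
]
print(find_stall_windows(example))
```

The flagged timestamps can then be compared against the times when RDC and SNMP stopped responding, and handed to EMC alongside their SAN scan. The thresholds are guesses and would need tuning against what a normal day looks like on this server.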
