I am having a problem with VMs in my VMware ESX 3.0.1 environment. VMs freeze with a black screen, and when this happens CPU utilization is high. The only way I can bring the VMs back is by resetting them.
This is happening frequently in my production environment. VMware support has no clue why this is happening and has been working on it for the past month.
Environment: HP DL580 G4, guest VMs running Windows 2003 STD SP1 with Citrix Presentation Server
Please find the vmware.log below:
Apr 03 01:27:58.948: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:27:58.948: vcpu-0| Msg_Hint: msg.piix4pm.guestInS1 (not shown)
Apr 03 01:29:08.226: vcpu-0| PIIX4: PM Resuming... (from S1 (0x4))
Apr 03 01:29:08.346: vcpu-1| SVGA: Unregistering IOSpace at 0x1060 (0x1060)
Apr 03 01:29:08.346: vcpu-1| SVGA: Unregistering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:29:08.347: vcpu-1| SVGA: Registering IOSpace at 0x1060 (0x0)
Apr 03 01:29:08.347: vcpu-1| SVGA: Registering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:29:08.494: mks| SVGA: Using extended FIFO: Caps 0x00000007, Flags 0x00000000
Apr 03 01:29:08.495: mks| HostOps hideCursor before defineCursor!
Apr 03 01:29:08.543: mks| MKS remote display status changed, enabling remoteoptimizations
Apr 03 01:30:08.481: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:30:08.481: vcpu-0| Msg_Hint: msg.piix4pm.guestInS1 (not shown)
Apr 03 01:30:25.981: vcpu-0| PIIX4: PM Resuming... (from S1 (0x4))
Apr 03 01:30:26.056: vcpu-1| SVGA: Unregistering IOSpace at 0x1060 (0x1060)
Apr 03 01:30:26.056: vcpu-1| SVGA: Unregistering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:30:26.057: vcpu-1| SVGA: Registering IOSpace at 0x1060 (0x0)
Apr 03 01:30:26.057: vcpu-1| SVGA: Registering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:30:26.178: mks| SVGA: Using extended FIFO: Caps 0x00000007, Flags 0x00000000
Apr 03 01:30:26.179: mks| HostOps hideCursor before defineCursor!
Apr 03 01:30:26.244: mks| MKS remote display status changed, enabling remoteoptimizations
Apr 03 01:31:26.112: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:31:26.112: vcpu-0| Msg_Hint: msg.piix4pm.guestInS1 (not shown)
Apr 03 01:33:08.106: vcpu-1| PIIX4: PM Resuming... (from S1 (0x4))
Apr 03 01:33:08.271: vcpu-1| SVGA: Unregistering IOSpace at 0x1060 (0x1060)
Apr 03 01:33:08.271: vcpu-1| SVGA: Unregistering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:33:08.273: vcpu-1| SVGA: Registering IOSpace at 0x1060 (0x0)
Apr 03 01:33:08.273: vcpu-1| SVGA: Registering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:33:08.464: mks| SVGA: Using extended FIFO: Caps 0x00000007, Flags 0x00000000
Apr 03 01:33:08.464: mks| HostOps hideCursor before defineCursor!
Apr 03 01:33:08.494: mks| MKS remote display status changed, enabling remoteoptimizations
Apr 03 01:34:08.391: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:34:08.391: vcpu-0| Msg_Hint: msg.piix4pm.guestInS1 (not shown)
Apr 03 01:35:08.413: vcpu-0| PIIX4: PM Resuming... (from S1 (0x4))
Apr 03 01:35:08.642: vcpu-1| SVGA: Unregistering IOSpace at 0x1060 (0x1060)
Apr 03 01:35:08.642: vcpu-1| SVGA: Unregistering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:35:08.643: vcpu-1| SVGA: Registering IOSpace at 0x1060 (0x0)
Apr 03 01:35:08.643: vcpu-1| SVGA: Registering MemSpace at 0xf8000000(0xf8000000) and 0xf4000000(0xf4000000)
Apr 03 01:35:08.802: mks| SVGA: Using extended FIFO: Caps 0x00000007, Flags 0x00000000
Apr 03 01:35:08.803: mks| HostOps hideCursor before defineCursor!
Apr 03 01:35:08.812: mks| MKS remote display status changed, enabling remoteoptimizations
Apr 03 09:16:21.025: mks| Ignoring update request in VGA_Expose (mode change pending).
Apr 03 09:20:04.683: mks| SOCKET 7 recv error 5: Input/output error
Apr 03 09:20:04.683: mks| SOCKET 7 destroying VNC backend on socket error: 5
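For anyone scanning a similar vmware.log, here is a minimal Python sketch that pairs each "got ACPI S1 request" line with the next "PM Resuming" line and reports how long the guest sat in S1. The function name `s1_intervals` and the sample constant are made up for illustration, the line format is assumed to match the excerpt above, and since the log carries no year the timestamps are only good for differences within one log.

```python
import re
from datetime import datetime

# A few lines copied from the log excerpt above; in practice, read vmware.log instead.
SAMPLE = """\
Apr 03 01:27:58.948: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:29:08.226: vcpu-0| PIIX4: PM Resuming... (from S1 (0x4))
Apr 03 01:30:08.481: vcpu-0| PIIX4: PMAccessPM got ACPI S1 request
Apr 03 01:30:25.981: vcpu-0| PIIX4: PM Resuming... (from S1 (0x4))
"""

# Timestamp, thread name, then a PIIX4 message (format assumed from the excerpt).
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}): \S+\| PIIX4: (.*)$")


def parse_ts(text):
    # No year in the log, so strptime falls back to 1900; fine for differences.
    return datetime.strptime(text, "%b %d %H:%M:%S.%f")


def s1_intervals(lines):
    """Return (intervals, pending): completed S1 periods and any unmatched request."""
    intervals = []
    pending = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        ts = parse_ts(match.group(1))
        msg = match.group(2)
        if "got ACPI S1 request" in msg:
            pending = ts
        elif msg.startswith("PM Resuming") and pending is not None:
            intervals.append((pending, ts, (ts - pending).total_seconds()))
            pending = None
    return intervals, pending


if __name__ == "__main__":
    found, unmatched = s1_intervals(SAMPLE.splitlines())
    for start, end, seconds in found:
        print(f"S1 from {start.time()} to {end.time()}: {seconds:.3f} s")
    if unmatched is not None:
        print(f"S1 request at {unmatched.time()} has no matching resume")
```

An S1 request with no matching resume near the time of a freeze would be one hint that the guest went into standby and never came back.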
Is there anything going on in the service console, such as high CPU or memory usage? If the service console gets hosed, it could affect VMs.
Kind of hard to say. Is it all VMs and all OSes within those VMs? Are you using a lot of resources on the host?
I too have had this issue, but only on one specific guest OS. I have not yet updated to 3.0.1, though. I am running 4 servers, all Win2k3 SP2. The OS in question is 32-bit. The only resource-intensive service running on it is a SQL engine for my Veritas Backup Exec. I am looking into moving the Backup Exec database off that box and removing the SQL Server service. Aside from that, it is used as a file server. I cannot isolate a specific trigger process from within Windows, and the symptom is so rare that it's hard to catch and impossible to recreate. The support team at VMware suggested turning off Hyperthreading and reducing the number of vCPUs on that machine, but my other guest OSes are running with the full functionality of my processors and do not have a problem, so I hesitate to take such drastic measures. I hope that I can be of some assistance in finding the cause of this anomaly. I am only glad to see that this is not isolated to just my one VM.
CPU and memory utilization is low on both the service console and the host.
I have checked resource usage, and everything is normal. All the VMs are running Windows 2K3 SP1, and this is happening on most of them.
I can think about the following w.r.t. your problem right now:
There could be many reasons for this, such as LUN locking or VMware Tools not being updated after applying patches. How is your storage designed? Is zoning in place if you have a SAN switch? I hope you have not overcommitted resources. When you say CPU utilization is high, whose CPU utilization do you mean: the VM's or the ESX host's? Are the server and storage firmware up to date? Check the best practices for the BIOS/virtualization/64-bit settings for your CPU type.
Looks like power management... (I know, strange). Are you running a screensaver or anything like that? If so, turn it off. I'd even go as far as to check the BIOS of the VM itself and make sure it's not doing anything with APM or ACPI to shut down devices.
Paul
Hi,
Thanks for the reply.
I am not using any external storage; everything is local.
CPU behaviour when the system hangs with a black screen:
1. I have assigned 2 vCPUs per VM.
2. When the system hangs, vCPU0 is at 100% utilization and vCPU1 is at around 5 to 10%.
I have already tried the following without any success:
1. Disabled the memory balloon driver
2. Set memory and CPU reservations
3. Disabled the screensaver
4. Set standby to Never in power management
Sreenath
Hi,
I had this error twice a while ago. Both VMs had W2K3 SP1. Before this error happened to the first VM, I had added one CPU, and I wasn't able to get the VM running again: always a black screen and the first CPU at 100%. I found out that there was a problem with the HAL.
So I did the following:
- boot the VM with a W2K3 SP1 CD
- copy the HAL DLL file you need to your system32 folder
- edit the boot.ini with the name of your HAL DLL file, e.g. /HAL=halmps.dll
- reboot the VM
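To double-check which HAL a boot entry will actually load after an edit like this, a small Python sketch can scan a copy of boot.ini for the /HAL= switch. The sample boot.ini contents and the helper name `hal_overrides` are hypothetical and only for illustration; an entry with no /HAL= switch is reported as None, meaning the default hal.dll is used.

```python
import re

# Hypothetical boot.ini contents; a real file will differ.
SAMPLE_BOOT_INI = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Standard" /fastdetect /HAL=halmps.dll
"""


def hal_overrides(text):
    """Return the /HAL= value (or None) for each entry under [operating systems]."""
    in_os_section = False
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track which section we are in; only OS entries carry boot switches.
            in_os_section = stripped.lower() == "[operating systems]"
            continue
        if in_os_section and stripped:
            match = re.search(r"/HAL=(\S+)", stripped, re.IGNORECASE)
            result.append(match.group(1) if match else None)
    return result


if __name__ == "__main__":
    print(hal_overrides(SAMPLE_BOOT_INI))
```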
Hi,
In my case I have not upgraded from uniprocessor to multiprocessor.
All the VMs were built with 2 vCPUs.
Regards
Sreenath
Use 2 vCPUs only if required, that is, if the application you are running on that VM really makes use of them.
Otherwise, 1 should be sufficient.
-Dheeraj.
There must be a special reason to give a VM 2 vCPUs; otherwise 1 should be selected.
I have the same problem. A VM freezes and I need to reset it to get it working.
ESX 3.0.1, fully patched, all local HDs. HP DL360 G5.
2 VMs running on it (W2k3 R2 32-bit). Only 1 is experiencing this issue. The one experiencing the issue is a DC, file server, and backup server (Backup Exec 11D, with the tape drive on the ESX host mapped to that VM), and also runs the Symantec antivirus server.
If you are using 2 vCPUs and a uniprocessor kernel, that could be the problem. A single vCPU and multiprocessor kernel is supported, but not the other way around. And why not either take away a vCPU, or upgrade to multi?
The only black screen lockups I get are due to virtual memory running out on the guest due to a leak in McAfee Virusscan.
Scott
Hi,
The VM freeze in my case was due to standby.
After making the changes in the registry to disable standby, my VM is running without any issues.
The script below disables standby.
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ACPI\Parameters]
"AMLIMaxCTObjs"=hex:04,00,00,00
"Attributes"=dword:00000070
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ACPI\Parameters\WakeUp]
"FixedEventMask"=hex:20,05
"FixedEventStatus"=hex:00,84
"GenericEventMask"=hex:18,50,00,10
"GenericEventStatus"=hex:10,00,ff,00
Let me know if this resolves your issue....
Regards
Sreenath