VMware Cloud Community
bpfx
Contributor

VMs powering down unexpectedly

Several of my VMs on an ESXi 4 machine have been powering down unexpectedly. The only problem I can see is this message:

Message from virt01.hangar.bpfx.org:

*** VMware ESX internal monitor error ***

vcpu-0:VMM64 fault 2: src=MONITOR

rip=0xfffffffffc23e824 regs=0xfffffffffc008f40

Please report this problem by selecting menu item Help > VMware on the Web > Request Support, or by going to the Web page

"http://www.vmware.com/info?id=8&logFile=/vms/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57/node%2da/vmware%2elog&coreLocation=/vmfs/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57/node%2da/vmware%2dcore%2egz%2c%20/vmfs/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57/node%2da/vmware64%2dcore%2egz". Please provide us with the log file(/vmfs/volumes/4a42ce99-13e375c1-3a95-001b21106b57/node-a/vmware.log) and the core file(s) (/vmfs/volumes/4a42ce99-13e375c1-3a95-001b21106b57/node-a/vmware-core.gz, /vmfs/volumes/4a42ce99-13e375c1-3a95-001b21106b57/node-a/vmware64-core.gz, /vmfs/volumes/4a42ce99-13e375c1-3a95-001b21106b57/node-a/vmx-zdump.000). If the problem is repeatable, please set 'Use Debug Monitor' to 'Yes' in the 'Misc' section of the Configure Virtual Machine Web page. Then reproduce the incident and file it according to the instructions. To collect data to submit to VMware support, run "vm-support". We will respond on the basis of your support entitlement. We appreciate your feedback, -- the VMware ESX team.

info 6/27/2009 11:59:21 AM

User

I think this is an old error, but I don't know what I need to do to fix it. The patch I found for this issue is http://www.vmware.com/support/vi3/doc/esx-2066306-patch.html

I haven't applied the patch because it's a tgz of RPMs, not the zip that vihostupdate expects.

I'm also looking for some help with:

"If the problem is repeatable, please set 'Use Debug Monitor' to 'Yes' in the 'Misc' section of the Configure Virtual Machine Web page"

I'm not sure where to do that.

5 Replies
DSTAVERT
Immortal

What server are you running? Is it on the hardware compatibility list (http://www.vmware.com/resources/compatibility/search.php)?

How much RAM does it have, and what CPU?

Which OS are the affected VMs running, and how much RAM and CPU are assigned to them? Are all of them failing?

The patch you list is for another product.

-- David -- VMware Communities Moderator
bpfx
Contributor

It's a white box, but all the components are on the HCL: Athlon X2, 8 GB RAM, Adaptec 2405 SAS RAID with 2 volumes.

The VMs are a mix of Win 2k3, OpenBSD, FreeBSD, CentOS, Solaris 10 and OpenSolaris.

I've set the CPU/memory reservations and limits for each VM, and it seems to be helping, though it's not 100% yet.

The patch is for ESX and I'm running ESXi, but I can't find any more information. I don't have a support contract with VMware.

DSTAVERT
Immortal

The patch is for ESX 3.0.

Have you overcommitted RAM and CPU to the point that the machines fail? Setting reservations and limits can lead to problems as well. Assign sensible amounts of RAM (what the OS actually needs). Linux, BSD and Solaris will consume whatever they are given, even if it isn't necessary. Don't assign multiple processors unless it is absolutely required based on actual testing.

-- David -- VMware Communities Moderator
Scissor
Virtuoso

The patch you list is for ESX 3.0.1, so it doesn't apply to your ESXi 4 installation.

Please attach the vmware.log file referenced in the error message and I'll see if I can see anything strange.

According to the error message you posted, the vmware.log file is located at (/vmfs/volumes/4a42ce99-13e375c1-3a95-001b21106b57/node-a/vmware.log).
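The same paths are also embedded, percent-encoded, in the support URL from the error message. As a rough sketch (the helper name `extract_paths` is made up for illustration, and this is plain Python run on a workstation, not anything VMware-specific), they can be decoded like this:

```python
from urllib.parse import urlsplit, parse_qs

# Support URL copied verbatim from the error message in the first post.
SUPPORT_URL = (
    "http://www.vmware.com/info?id=8"
    "&logFile=/vms/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57"
    "/node%2da/vmware%2elog"
    "&coreLocation=/vmfs/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57"
    "/node%2da/vmware%2dcore%2egz%2c%20"
    "/vmfs/volumes/4a42ce99%2d13e375c1%2d3a95%2d001b21106b57"
    "/node%2da/vmware64%2dcore%2egz"
)


def extract_paths(url):
    """Hypothetical helper: return (log_file, core_files) from the support URL."""
    # parse_qs percent-decodes the values (%2d -> '-', %2e -> '.', %2c -> ',').
    query = parse_qs(urlsplit(url).query)
    log_file = query["logFile"][0]
    # coreLocation holds a comma-separated list of core dump paths.
    core_files = [p.strip() for p in query["coreLocation"][0].split(",")]
    return log_file, core_files


log_file, core_files = extract_paths(SUPPORT_URL)
print(log_file)
for path in core_files:
    print(path)
```

Note that the `logFile` parameter in the URL decodes to a path starting with /vms/volumes, while the text of the message says /vmfs/volumes. The /vmfs form is presumably the one that exists on the host.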

I see that the Guest referenced in the error message is named "node-a". Does that mean that there is also a "node-a" system set up in some sort of cluster?

bpfx
Contributor

Fortunately (or not), I'm also getting some core dumps created when the VMs power down.

I have all the VMs set to 256 MB of RAM and 1000 MHz, with 250 MHz reserved for each machine (all the current VMs are OpenBSD or FreeBSD). I have tried both unlimited and tightly controlled settings, and it doesn't seem to matter.

Yes, this machine will be a part of a cluster, but that won't be set up until after I can get this issue fixed.
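As a back-of-the-envelope check on whether those settings overcommit the host, here is a minimal sketch. The per-VM figures (256 MB, 250 MHz reserved) and the 8 GB of host RAM come from this thread. The VM count of six and the host CPU capacity of 2 x 2000 MHz are assumptions for illustration only, and the calculation ignores hypervisor and per-VM memory overhead.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    ram_mb: int
    cpu_reservation_mhz: int


def overcommit_report(host_ram_mb, host_cpu_mhz, vms):
    """Sum assigned RAM and CPU reservations and compare them with the host."""
    total_ram = sum(vm.ram_mb for vm in vms)
    total_reserved = sum(vm.cpu_reservation_mhz for vm in vms)
    return {
        "ram_assigned_mb": total_ram,
        "ram_overcommitted": total_ram > host_ram_mb,
        "cpu_reserved_mhz": total_reserved,
        "cpu_reservations_exceed_host": total_reserved > host_cpu_mhz,
    }


# Assumed: six VMs (the real count is not stated in the thread).
vms = [VM(f"vm-{i}", ram_mb=256, cpu_reservation_mhz=250) for i in range(6)]

# 8 GB host RAM is from the thread; 2 cores x 2000 MHz is an assumed clock speed.
report = overcommit_report(host_ram_mb=8 * 1024, host_cpu_mhz=2 * 2000, vms=vms)
print(report)
```

With these assumed numbers neither RAM nor the CPU reservations exceed what the host has, which would be consistent with the resource settings not making a difference.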
