I'm about to open a support case on this, but I'd like to give a rundown of our issues to see if anyone has experienced anything similar. We can't be the only ones who have seen this behavior.
Until recently, all our VMs were created with 1 processor. However, a few VMs have had higher CPU demands, so we decided to move to dual processors on a dozen new VMs.
These VMs began experiencing unexpected reboots right away. About once a week or so, each VM reboots at a random time of the day or night, and no two reboot on the same day or at the same time. I tried creating a test VM that sat idle on a newly patched ESX host with updated VMware Tools, and it rebooted unexpectedly as well. The stop errors seem to point to driver issues and IRQ conflicts, but we've tried both the out-of-the-box Microsoft drivers and the AMD drivers available on Windows Update.
The hardware these VMs are running on is 3 Sun Fire X4600 M2 boxes with 4 dual-core 64-bit AMD processors and 128GB of RAM.
Has anyone experienced anything similar in the past? Should we be looking at our hardware as the cause perhaps?
Are you able to put one host at a time into maintenance mode and boot it from a memtest CD? Leave the test running for a good 24 to 48 hours to make sure your RAM is tip-top.
Have the dual-vCPU VMs got the multiprocessor HAL installed?
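If you have a lot of VMs to go through, the HAL check can be sketched as a small script. This is only a rough sketch under my own assumptions: the function name hal_matches_vcpus is made up, it does not query Windows or ESX itself, and it simply assumes that the entry shown under Computer in Device Manager contains the word "Multiprocessor" when a multiprocessor HAL is installed. You would have to collect the Device Manager text and the vCPU count yourself.

```python
def hal_matches_vcpus(computer_entry: str, vcpu_count: int) -> bool:
    """Return True if the HAL name looks consistent with the vCPU count.

    computer_entry: the text shown under "Computer" in Device Manager.
    vcpu_count:     the number of virtual CPUs assigned to the VM.

    Assumption (not verified against any official list): a multiprocessor
    HAL has "Multiprocessor" in its name; anything else is treated as a
    uniprocessor HAL.
    """
    if vcpu_count < 1:
        raise ValueError("vcpu_count must be at least 1")

    is_mp_hal = "multiprocessor" in computer_entry.lower()

    if vcpu_count > 1:
        # More than one vCPU needs a multiprocessor HAL.
        return is_mp_hal
    # A single vCPU is treated as matching only a non-multiprocessor HAL.
    return not is_mp_hal


if __name__ == "__main__":
    # Hypothetical inventory: (VM name, Device Manager entry, vCPU count).
    inventory = [
        ("vm-a", "ACPI Multiprocessor X64-based PC", 2),
        ("vm-b", "ACPI Uniprocessor PC", 2),
    ]
    for name, entry, vcpus in inventory:
        status = "ok" if hal_matches_vcpus(entry, vcpus) else "check HAL"
        print(f"{name}: {status}")
```

Anything flagged "check HAL" is worth a manual look in Device Manager before blaming the hardware.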
We have checked the memory with a Sun technician and found no errors. I guess I never even thought to check the HAL... The processors show up in Device Manager as Dual-Core AMD Opteron processors, and under Computer I see "ACPI Multiprocessor X64-based PC". Would that be enough to verify that the multiprocessor HAL is installed?
Thanks for the help