Hi guys, I'm in the process of evaluating ESX 3.5 Update 1 on a Dell 1850 server. This server has been running openSUSE in a production environment for a number of years without any lockups or crashes. I have 3 VMs installed on it (local SCSI RAID storage): openSUSE 10.3, openSUSE 11.0, and a novelty Windows XP guest that is used primarily for accessing the VI client, the DRAC, etc.
At the time of the crash, I was using WinSCP on the Windows XP guest, to copy a couple of ISO images off the 10.3 guest.
I might add that this crash is quite disturbing. Although it could be a hardware problem (a memory or chipset issue), it probably isn't, given that this server has never crashed before.
bumping for sheer fascination
Update:
System has passed a full memtest using memtest86 3.4a. The server is in the list of officially supported hardware.
I tried copying the same ISO file from the openSUSE 10.3 guest to the Windows XP guest, again with WinSCP, and again got the purple screen of death. I will attach this crash dump as well for comparison.
ESX stresses machines very differently, and a PSOD is almost always a hardware issue.
Make sure you have updated the BIOS on that 1850, as well as the BMC and SAS backplane firmware.
--Matt
Hello,
As mcowger states, VMware ESX stresses systems more than any OS out there. I would do the following:
Ensure the BIOS and firmware are updated to the latest versions (per Dell instructions) for the system, PCI cards, etc.
Ensure that all components within the box are on the HCL (specifically RAID and other PCI/PCIe/PCI-X cards)
Ensure the BIOS is configured for ESX per Dell instructions.
Run the hardware diagnostics from Dell for at least 24-48 hours
Run memtest86 for at least 24-48 hours
Generally I see these issues when the BIOS has not been updated.
Best regards,
Edward L. Haletky
VMware Communities User Moderator
====
Author of the book 'VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.
CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354
As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization
Both exceptions were page faults with the VM. Have you checked the service console memory? Are you seeing high memory utilization there?
-KjB
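For anyone wanting to put a number on the console memory question above, here is a rough sketch in Python. It assumes the service console exposes a Linux-style /proc/meminfo and that you copy its contents to a workstation to inspect them; the function names and the sample figures are made up for illustration and are not from the poster's system. It only looks at service console memory, not memory managed by the VMkernel.

```python
# Minimal sketch: estimate memory utilization from /proc/meminfo-style text.
# Assumption: the text was captured with something like `cat /proc/meminfo`.
# parse_meminfo and used_percent are hypothetical helper names.

def parse_meminfo(text):
    """Return a dict mapping field name -> value (kB) for 'Name: value kB' lines."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        parts = rest.split()
        # Keep only lines whose first token after the colon is a plain number.
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent(fields):
    """Percent of memory in use, treating buffers and page cache as reclaimable."""
    total = fields["MemTotal"]
    reclaimable = (
        fields["MemFree"] + fields.get("Buffers", 0) + fields.get("Cached", 0)
    )
    return 100.0 * (total - reclaimable) / total


# Illustrative values only, not taken from the system in this thread.
SAMPLE = """\
MemTotal:       800000 kB
MemFree:         80000 kB
Buffers:         40000 kB
Cached:         120000 kB
"""

fields = parse_meminfo(SAMPLE)
print("used: %.1f%%" % used_percent(fields))  # prints "used: 70.0%"
```

If the figure stays high while the file copy is running, that would be worth mentioning alongside the crash dumps.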