dbvuetec
Contributor

ESX 3.5 Update 1 - PSOD (Crash)

Hi guys, I'm evaluating ESX 3.5 Update 1 on a Dell 1850 server. This server ran openSUSE in a production environment for a number of years without any lockups or crashes. I have 3 VMs installed on it (local SCSI RAID storage): openSUSE 10.3, openSUSE 11.0, and a novelty Windows XP guest that is used primarily for accessing the VI Client, the DRAC, etc.

At the time of the crash, I was using WinSCP on the Windows XP guest, to copy a couple of ISO images off the 10.3 guest.

I might add that this crash is quite disturbing. Although it could be a hardware problem (memory or chipset issue), that seems unlikely, given that the server has never crashed before.

5 Replies
RussellCorey
Hot Shot

bumping for sheer fascination

dbvuetec
Contributor

Update:

System has passed a full memtest using memtest86 3.4a. The server is in the list of officially supported hardware.

I tried copying the same ISO file from the openSUSE 10.3 guest to the Windows XP guest, again with WinSCP, and again got the purple screen of death. I will attach this crash dump as well for comparison.

mcowger
Immortal

ESX stresses machines very differently, and a PSOD is almost always a hardware issue.

Make sure you have updated the BIOS on that 1850, as well as the BMC and SAS backplane firmware.

--Matt

--Matt VCDX #52 blog.cowger.us
Texiwill
Leadership

Hello,

As mcowger states, VMware ESX stresses systems more than any OS out there. I would do the following:

Ensure the BIOS is updated to the latest version (per Dell instructions) for the system, PCI cards, etc.

Ensure that all components within the box are on the HCL (specifically RAID and other PCI/PCIe/PCI-X cards)

Ensure the BIOS is configured for ESX per Dell instructions.

Run Dell's hardware diagnostics for at least 24-48 hours

Run memtest86 for at least 24-48 hours

Generally I see these issues when the BIOS has not been updated.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

--
Edward L. Haletky
vExpert XIII: 2009-2021,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
kjb007
Immortal

Both exceptions were page faults with the VM. Have you checked the console memory? Are you seeing high memory utilization?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB