I had the same issue, although I could reproduce it by attempting to load Fedora 9. It would consistently PSOD right after formatting the volume. I deleted my VM and haven't worked up the courage to try it again on that server. I was previously using that VM with CentOS 5.2 with no problem. Fedora 9 installed fine on my test ESXi box, though.
My problematic server is a white box/custom build with an Asus DSVB-D motherboard and two 2.0 GHz Xeon E5335 CPUs. BIOS is current. Everything else is stable and, fortunately, I don't need to use Fedora 9.
I had the same issue the other day with a Windows 2003 32-bit guest running on a Dell M600 blade.
This time it's PCPU7.
Do you know why this is happening? I've seen it one other time during setup, but the occurrence seems to be erratic, and not on the same blade.
A lot of these things change with the version of ESX you are using and the kind of hardware you have, so not every "PCPU locked up" PSOD is the same.
Post the screenshot if you can and I will try to figure out what caused the issue. Most PSODs are hardware-related.
We too have come across this issue. In our case, we did a new installation of ESX 3i onto a Dell PowerEdge M600 blade and everything was working fine until we came in the next morning and found that the ESX host had pink-screened.
Attached is the screenshot of the error messages displayed.
This has happened to TWO blades in the same chassis so far. These are brand new blades, so I'm leaning towards this being an ESX 3i server issue, but I can't rule out hardware yet.
ESX3i-error.jpg 50.1 K
Did you collect the vm-support dumps from the server? Can you post /var/log/vmkernel for the crash time?
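For anyone gathering this, a minimal sketch of skimming the log for crash-time messages (the path is the ESX 3.x service-console default; the `LOG` variable and the grep keywords are my assumptions, not an official procedure — `vm-support`, run separately on the host, bundles the full logs for VMware support):

```shell
#!/bin/sh
# Sketch: skim a copy of /var/log/vmkernel for PSOD/lockup-related lines.
# LOG is an assumption: point it at wherever you copied the log; it
# defaults to the ESX 3.x service-console path.
LOG=${LOG:-/var/log/vmkernel}
if [ -r "$LOG" ]; then
  # Lines mentioning PCPU lockups, NMIs, or panics near the crash time
  grep -iE 'pcpu|lint1|nmi|panic' "$LOG" | tail -n 50
else
  echo "log not found: $LOG (copy it off the host first)" >&2
fi
```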
After a lot of work and headaches, we found that the problem was that the server's BIOS was not updated. After a BIOS update we have not experienced any PSOD (till now...).
Here is the IBM reference for the problem, which is caused by an issue in Intel quad-core processors.
Hope this helps
Thank you for your reply.
We resolved our issue by removing the additional memory that had been installed in the server.
The Dell Blade server had a strict requirement that if all 8 slots were filled up with memory, then ALL the installed memory must be the same size + speed + type.
We had tried to mix different sized memory in order to maximise the amount of memory installed in the box and make use of the spare dimms we had available.
It's odd that the memory testing we had done did not show up the issue, so we had looked at other causes.
I haven't seen any more issues with our setup since that one time a few weeks back. I suspect it was because our cluster and resources weren't completely set up; we were adding network resources, NICs, VMkernel networks, VLANs, firewall configs, etc. Once we got it all squared away, we've not seen any issues.
I also had the same issue yesterday on a Dell M600 blade with the following error. This happened for the second time in two months. I logged a call with Dell, who requested I generate a report using a tool. In the report there is no error with regard to hardware. After rebooting the server everything is fine.
Any update on why it is happening?
OS: ESX 3.5 Update 3
Thanks in Advance
error_dvm5.JPG 90.5 K
Same issue here.
Latest version of ESXi, all BIOS updates applied. ESXi has locked up with a PSOD on both of the M600 blades we have running in this blade chassis.
Both blades run through Dell diagnostics just fine, memtest86 et al.
Any idea what may be causing this?
Same problem as above, happening every week or so.
Anyone got a solution? Could it be a faulty CPU?
Not likely a faulty CPU or hardware. More likely a software bug, or something that requires a microcode update for the CPU.
It has happened across 4 different M600 blade servers, all running either E5410 or L5420 CPUs.
A similar problem used to occur on Supermicro machines running 5400-series CPUs, but it was fixed by the vendor through BIOS updates.
Norman, thanks for your answer!
I'm actually running a Supermicro X7DWU mobo with E5410 CPUs. I'm also on the latest BIOS for that mobo (11/4/2008).
I did read about the microcode issue and sent an email to Supermicro to get their feedback. Maybe the mobo I'm using has not been updated with the new microcode yet. I'll let you guys know.
When you say software bug, you mean that one of the VMs on the server might be causing this?
Software bug = a combination of VMware being too picky and not working around the problem, and the "hardware" feature causing the issue. The microcode update disables that hardware feature.
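For what it's worth, a quick way to see what the CPUs report before and after a BIOS update, sketched from any Linux shell (the ESX 3.x service console is Linux-based; whether a `microcode` field appears in `/proc/cpuinfo` depends on the kernel, so this is a best-effort check, not a guaranteed one):

```shell
#!/bin/sh
# Sketch: list the CPU model and, where the kernel exposes it, the running
# microcode revision; compare against the revision your vendor's BIOS
# release notes claim to ship.
grep -E 'model name|microcode' /proc/cpuinfo | sort -u
# The microcode driver also logs applied updates (may need root):
dmesg 2>/dev/null | grep -i microcode || true
```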