Hi all,
I have an issue with an ESXi host recently upgraded from 5.1 to 5.5. It ran without problems right after the upgrade, but now it gets a purple screen from time to time (about twice a month).
It's running on an HP ML350 G6 server. The upgrade was done using the HP sources.
Here is what I get in dump file:
2014-03-16T07:35:52.451Z cpu2:2544604)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
2014-03-16T07:35:52.451Z cpu2:2544604)<4>hpsa 0000:0e:00.0: cmd_special_alloc returned NULL!
2014-03-16T07:35:52.451Z cpu2:2544604)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
2014-03-16T07:35:52.451Z cpu2:2544604)<3>hpsa 0000:0e:00.0: cmd_special_alloc returned NULL!
2014-03-16T07:35:52.451Z cpu2:2544604)<3>hpsa1: set_sas_ids: report extended physical LUNs failed.
2014-03-16T07:35:52.455Z cpu5:2544603)<4>hpsa 0000:04:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
2014-03-16T07:36:11.210Z cpu3:37604)<4>hpsa 0000:0e:00.0: cp 0x410970db3280 has status 0x2 Sense: 0x2, ASC: 0x3a, ASCQ: 0x0, Returning result: 0x2
2014-03-16T07:36:11.212Z cpu4:32793)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e807be880, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2014-03-16T07:36:11.212Z cpu4:32793)ScsiDeviceIO: 2337: Cmd(0x412e807be880) 0x1a, CmdSN 0x5d60 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2014-03-16T07:36:11.219Z cpu5:33299)<4>hpsa 0000:04:00.0: cp 0x410970d91500 has status 0x2 Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2
2014-03-16T07:36:11.219Z cpu5:33299)<4>hpsa 0000:04:00.0: cp 0x410970d91000 has status 0x2 Sense: 0x5, ASC: 0x24, ASCQ: 0x0, Returning result: 0x2
2014-03-16T07:36:22.453Z cpu3:2544693)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
2014-03-16T07:36:22.453Z cpu3:2544693)<4>hpsa 0000:0e:00.0: cmd_special_alloc returned NULL!
2014-03-16T07:36:22.453Z cpu3:2544693)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
2014-03-16T07:36:22.453Z cpu3:2544693)<3>hpsa 0000:0e:00.0: cmd_special_alloc returned NULL!
2014-03-16T07:36:22.453Z cpu3:2544693)<3>hpsa1: set_sas_ids: report extended physical LUNs failed.
2014-03-16T07:36:22.458Z cpu3:2544692)<4>hpsa 0000:04:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
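For anyone reading a log like the one above, here is a minimal Python sketch that turns the SCSI sense values into readable names. The lookup tables only cover the codes that appear in this excerpt (names as defined in the SCSI SPC standard), and the function name `decode_sense` is just an illustrative choice, not part of any VMware or HP tool.

```python
import re

# Lookup tables limited to the codes seen in the log excerpt above;
# anything else is reported as unknown rather than guessed.
SENSE_KEYS = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
}

HEX = r"(0x[0-9a-fA-F]+)"
# hpsa driver format: "Sense: 0x5, ASC: 0x20, ASCQ: 0x0"
HPSA_RE = re.compile(rf"Sense: {HEX}, ASC: {HEX}, ASCQ: {HEX}")
# VMkernel format: "Valid sense data: 0x5 0x20 0x0"
VMK_RE = re.compile(rf"Valid sense data: {HEX} {HEX} {HEX}")


def decode_sense(line):
    """Return a readable description of the sense data in a log line, or None."""
    match = HPSA_RE.search(line) or VMK_RE.search(line)
    if match is None:
        return None
    key, asc, ascq = (int(value, 16) for value in match.groups())
    key_name = SENSE_KEYS.get(key, "unknown sense key")
    detail = ASC_ASCQ.get((asc, ascq), "unknown ASC/ASCQ")
    return f"{key_name} / {detail}"
```

Feeding it the lines for `mpx.vmhba32:C0:T0:L0` shows that those entries are check conditions returned by the device, which is a different kind of message from the hpsa "Out of memory" warnings.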
The internal SCSI controller firmware is up to date. There is no cluster and no vCenter installed or configured; the host is running an Essentials license in a standalone configuration.
Does anyone have any idea, please?
Sincerely
Martin
Same here, exactly the same setup on an HP ML350 G6. I've already had a purple screen twice in the last 11 days.
PSCPU 1 Locked up. Failed to ack TLB invalidate.
All this started after upgrading from 4.3 to 5.5 and starting to work with iSCSI-connected storage.
Please advise!
Could you please extract the log from the PSOD core dump available in /var/core using the command esxcfg-dumppart -L <vmkernel-zdump filename> and attach the resulting vmkernel log file?
Detailed steps are available in VMware KB http://kb.vmware.com/kb/1006796
Is the BIOS of the server up to date?
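If several dumps have piled up, a small Python sketch like the one below can pick the newest one and build the extraction command quoted above. The file name pattern `vmkernel-zdump*` and the helper names are assumptions for illustration; check what is actually in /var/core on your host.

```python
import glob
import os


def newest_zdump(core_dir="/var/core"):
    """Return the most recently modified vmkernel-zdump file in core_dir, or None."""
    # The "vmkernel-zdump*" pattern is an assumption about the dump file names.
    candidates = glob.glob(os.path.join(core_dir, "vmkernel-zdump*"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def extract_command(zdump_path):
    """Build the argument list for the log-extraction command from the post above."""
    return ["esxcfg-dumppart", "-L", zdump_path]
```

The returned list can be printed for copy and paste, or passed to `subprocess.run` if the script is executed on the host itself.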
Hi, thanks for the reply. I failed to extract the log, but I captured the purple screen with my camera, so I'll upload it here.
This is the last one:
And this is from 12 days ago:
Hi,
After a discussion and an offline diagnosis with HP, my HP contact gave me these links.
Update the HP firmware:
HP Service Pack for ProLiant:
http://h18004.www1.hp.com/products/servers/management/spp/index.html
Update the bundles:
* RECOMMENDED * HP ESXi Offline Bundle for VMware vSphere 5.5
* RECOMMENDED * HP ESXi Utilities Offline Bundle for VMware vSphere 5.5
It seems that BIOS version D22 exists in multiple builds that are all named exactly the same, D22. I will try installing these bundles and updates and give feedback in two or three weeks on whether it was successful.
Thanks for your replies
Martin Pasquier
Hi,
Same here with the ML350 G6. I have 3 of them and 2 of them have exactly that error.
Did the HP management software update solve your problem?
If yes, did your HP support contact say anything about other hardware that is affected by this error?
E.g. the DL380 G8?
Thanks for your feedback.
Cheers
Florian
BTW: In my case it was a new installation with the HP customized ISO and a completely up-to-date system (install date around March).
Hi Sanktuary,
Yes, those bundles and firmware upgrades solved my problem! It has been running like a champ since my last message, without any problem \o/ !
HP didn't inform me about issues on other hardware.
What I found strange is that I was already running the D22 BIOS, but the D22 file from my HP contact had a different size, so I installed it, and it seems to be a newer build of D22.
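To check whether two firmware files that carry the same version label are really the same image, comparing size and checksum is enough. A minimal Python sketch, with hypothetical helper names and SHA-256 chosen only as a convenient hash:

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large firmware images are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def same_image(path_a, path_b):
    """True only if both files have identical size and SHA-256 digest."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

If `same_image` returns False for two files both labelled D22, they are different builds regardless of the name.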
I also upgraded the SAS backbone firmware and installed the new ESXi bundles that drive the SAS hardware. It's just running fine now. So maybe try to find bundles and/or firmware upgrades for your DL380G8.
Hope this helps
Regards
Martin
Hi mpasquer
Thanks for your feedback.
Hm, I saw that the link to the SPP is not up to date, but I downloaded it just now.
I will update (maybe downgrade) my ESX early tomorrow, and after that I will install the offline bundles.
I'm hoping that solves the problem.
Thanks again
Florian
Hello,
There is an issue with the hpsa driver that causes PSODs and out-of-memory errors; you might want to have a look at:
br/S
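To see whether a host shows the same hpsa out-of-memory pattern as in the first post, here is a rough Python sketch that counts the matching messages per controller PCI address. The regular expression is written against the two message texts quoted in this thread and is an assumption about their general format, not an official log specification.

```python
import re
from collections import Counter

# Matches hpsa driver messages such as:
#   hpsa 0000:0e:00.0: cmd_special_alloc returned NULL!
#   hpsa 0000:04:00.0: out of memory at vmkdrivers/...
HPSA_OOM_RE = re.compile(
    r"hpsa ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?:cmd_special_alloc returned NULL|out of memory)"
)


def count_hpsa_oom(lines):
    """Count hpsa out-of-memory messages per PCI address in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = HPSA_OOM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It can be run over an open log file, for example `count_hpsa_oom(open(path))`, where `path` points at a copy of the vmkernel log.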
Hm, in the end I still had my problem even after the BIOS upgrade.
In my case it was a power setting in the BIOS, because with 5.5 VMware enables by default an option that corresponds to Collaborative Power Control.
On the ML350 G6 this option is on, and it has problems with the default setting in VMware.
So HP told me to perform the steps below (boot into the BIOS):
> Press F9 to enter setup.
> Select Power Management Options.
> Select Advanced Power Management Options.
> Select Collaborative Power Control.
> Select disabled.
Hello,
I will be away until 14.07.2014. For any urgent request, you can contact the service desk at 0800 111 100.
Best regards
Martin Pasquier