VMware Cloud Community
Systemx3650M3LC
Contributor

Random PSODs (Purple Screens) on IBM-Hardware

Hi community,

We are experiencing random PSODs (every 3-6 weeks) on our systems and are at a loss as to what is causing them.

The hardware:

Two IBM System x3650 M3 servers (two 6Gb SAS HBAs each) connected to a DS3524 storage system (dual controller). The servers are running ESXi 5. All systems and ESXi itself are up to date. According to IBM, the hardware is working fine.

The PSODs happen randomly about every 3 to 6 weeks on both machines, but not simultaneously. Here is one of the PSODs:

psod_Server1_14.08_2012.jpg

The important lines seem to be:

Failed at vmkdrivers/src_9/vmklinux_9/vmware/linux_scsi.c:2221 -- NOT REACHED

cpu10:4106)LinScsi: SCSILinuxCmdDone:2220:Attempted double completion
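For anyone who wants to check whether the same signature shows up in text they have exported from a host (a saved log, or the text pulled from a core dump), here is a minimal Python sketch. The two patterns are taken from the PSOD lines quoted above; the function names and the sample list are made up for illustration, and whether these messages appear in any particular log file is an assumption, not something confirmed in this thread.

```python
import re
from collections import Counter

# Patterns derived from the two PSOD lines quoted in this thread.
# The source line numbers (2220/2221) are matched loosely with \d+,
# since they may differ between driver builds (assumption).
SIGNATURES = {
    "double_completion": re.compile(
        r"SCSILinuxCmdDone:\d+:\s*Attempted double completion"
    ),
    "not_reached": re.compile(r"linux_scsi\.c:\d+\s*--\s*NOT REACHED"),
}


def find_signatures(lines):
    """Return (line_number, signature_name, text) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((number, name, line.strip()))
    return hits


def count_by_signature(hits):
    """Count how often each signature was seen."""
    return Counter(name for _, name, _ in hits)


# Hypothetical sample input: the two lines from the screenshot plus filler.
SAMPLE = [
    "Failed at vmkdrivers/src_9/vmklinux_9/vmware/linux_scsi.c:2221 -- NOT REACHED",
    "cpu10:4106)LinScsi: SCSILinuxCmdDone:2220:Attempted double completion",
    "an unrelated line",
]

if __name__ == "__main__":
    for number, name, text in find_signatures(SAMPLE):
        print(f"{number}: [{name}] {text}")
```

To scan a real file, pass an open file handle instead of SAMPLE, for example `find_signatures(open(path, errors="replace"))`, where `path` is whatever file you exported.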

Maybe someone has some insight on this. Thanks in advance for any hints!

3 Replies
sparrowangelste
Virtuoso

Is all the firmware at the latest level?

Not a solution, but maybe try making it single path?

It seems like it might be connecting over both paths during load. Have you been monitoring the I/O on the HBAs?

--------------------- Sparrowangelstechnology : Vmware lover http://sparrowangelstechnology.blogspot.com
riker82
Enthusiast

Did you check this IBM KB article?

http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5086606

and also the VMware one:

http://kb.vmware.com/selfservice/microsites/microsite.do?cmd=displayKC&docType=kc&externalId=1030265...

It applies to the HBA, but in my personal experience (IBM BladeCenter HS22 and HS22V) it also applies to all the other PCI devices.

Systemx3650M3LC
Contributor

Thanks for the replies so far.

@sparrowangels:

Yes, hardware and software are completely up to date. We actually had the same thought about the dual-controller setup. If no other solution comes up, we might try making it single path just to see if that cures the problem.

@riker82:

We've also stumbled across those articles, but have not applied the fixes they describe since we are not seeing the mentioned errors in the logs.
