VMware Cloud Community
Nouha
Contributor

Exception 14 @ pink screen

Hi,

I have two IBM X3650 M4 servers running ESXi 5.5. Both are connected to an HP NAS through iSCSI.

One of these servers crashes periodically and displays a pink screen with exception 14.

IBM told me the following:

______________________________________________________________

You must add the following lines to the file /etc/multipath.conf:

****************************************
blacklist {
       device {
               vendor  "IBM"
               product "ServeRAID M5110e"
       }
}
****************************************

________________________________________________________

This did not help; the problem persists. On top of that, the problem exists on only one of the two servers.

[Attached image: pastedImage_0.png]

I have these lines in the vmkernel log:

2016-06-20T07:42:07.808Z cpu17:33481)VMK_PCI: 395: Device 0000:16:00.0 name: vmhba0

2016-06-20T07:42:07.808Z cpu17:33481)DMA: 612: DMA Engine 'vmhba0' created using mapper 'DMANull'.

2016-06-20T07:42:07.808Z cpu17:33481)ScsiScan: 976: Path 'vmhba0:C0:T0:L0': Vendor: 'IBM     '  Model: 'ServeRAID M5110e'  Rev: '3.19'

2016-06-20T07:42:07.808Z cpu17:33481)ScsiScan: 979: Path 'vmhba0:C0:T0:L0': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)

2016-06-20T07:42:07.808Z cpu17:33481)megasas_slave_configure: do not export physical disk devices to upper layer.

2016-06-20T07:42:07.808Z cpu17:33481)WARNING: ScsiScan: 1408: Failed to add path vmhba0:C0:T0:L0 : Not found

2016-06-20T07:42:07.821Z cpu17:33481)ScsiScan: 976: Path 'vmhba0:C2:T0:L0': Vendor: 'IBM     '  Model: 'ServeRAID M5110e'  Rev: '3.19'

2016-06-20T07:42:07.821Z cpu17:33481)ScsiScan: 979: Path 'vmhba0:C2:T0:L0': Type: 0x0, ANSI rev: 5, TPGS: 0 (none)

2016-06-20T07:42:07.821Z cpu17:33481)ScsiScan: 1503: Add path: vmhba0:C2:T0:L0

2016-06-20T07:42:07.864Z cpu7:33421)<6>igb: vmnic3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

2016-06-20T07:42:07.888Z cpu18:33422)<6>igb: vmnic2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

2016-06-20T07:42:08.008Z cpu17:33481)PCI: driver megaraid_sas claimed device 0000:16:00.0

2016-06-20T07:42:08.008Z cpu17:33481)PCI: driver megaraid_sas claimed 1 device

2016-06-20T07:42:08.008Z cpu17:33481)ScsiNpiv: 1510: GetInfo for adapter vmhba0, [0x4108fa854bc0], max_vports=0, vports_inuse=0, linktype=0, state=0, failreason=0, sts=bad0020

2016-06-20T07:42:08.008Z cpu17:33481)Mod: 4780: Initialization of megaraid_sas succeeded with module ID 4146.

2016-06-20T07:42:08.008Z cpu17:33481)megaraid_sas loaded successfully.

2016-06-20T07:42:08.039Z cpu22:33482)Loading module ahci ...

2016-06-20T07:42:08.039Z cpu22:33482)Elf: 1861: module ahci has license GPL

2016-06-20T07:42:08.040Z cpu22:33482)module heap: Initial heap size: 1048576, max heap size: 9756672

2016-06-20T07:42:08.040Z cpu22:33482)vmklnx_module_mempool_init: Mempool max 9756672 being used for module: 4147

2016-06-20T07:42:08.040Z cpu22:33482)vmk_MemPoolCreate passed for 256 pages

2016-06-20T07:42:08.040Z cpu22:33482)module heap: using memType 2

2016-06-20T07:42:08.040Z cpu22:33482)module heap vmklnx_ahci: creation succeeded. id = 0x4109f87e1000

2016-06-20T07:42:08.040Z cpu22:33482)PCI: driver ahci is looking for devices

2016-06-20T07:42:08.040Z cpu22:33482)<7>ahci 0000:00:1f.2: version 3.0-17vmw

2016-06-20T07:42:08.040Z cpu22:33482)DMA: 612: DMA Engine 'vmklnxpci-0:0:31.2' created using mapper 'DMANull'.

2016-06-20T07:42:08.040Z cpu22:33482)DMA: 612: DMA Engine 'vmklnxpci-0:0:31.2' created using mapper 'DMANull'.

2016-06-20T07:42:08.040Z cpu22:33482)DMA: 612: DMA Engine 'vmklnxpci-0:0:31.2' created using mapper 'DMANull'.

2016-06-20T07:42:08.040Z cpu22:33482)DMA: 657: DMA Engine 'vmklnxpci-0:0:31.2' destroyed.

2016-06-20T07:42:08.047Z cpu16:32852)NetPort: 1589: disabled port 0x6

2016-06-20T07:42:08.047Z cpu16:32852)Uplink: 6530: enabled port 0x6 with mac 6e:ae:8b:3b:9d:19

2016-06-20T07:42:08.047Z cpu16:32852)NetPort: 1589: disabled port 0x3

2016-06-20T07:42:08.047Z cpu16:32852)Uplink: 6530: enabled port 0x3 with mac 6c:ae:8b:3b:9d:1b

2016-06-20T07:42:09.047Z cpu16:32852)NetPort: 1589: disabled port 0x5

2016-06-20T07:42:09.047Z cpu16:32852)Uplink: 6530: enabled port 0x5 with mac 6c:ae:8b:3b:9d:1d

2016-06-20T07:42:09.047Z cpu16:32852)NetPort: 1589: disabled port 0x4

2016-06-20T07:42:09.047Z cpu16:32852)Uplink: 6530: enabled port 0x4 with mac 6c:ae:8b:3b:9d:1c

2016-06-20T07:42:09.047Z cpu16:32852)NetPort: 1589: disabled port 0x2

2016-06-20T07:42:09.048Z cpu16:32852)Uplink: 6530: enabled port 0x2 with mac 6c:ae:8b:3b:9d:1a

2016-06-20T07:42:09.051Z cpu22:33482)<6>ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 1.5 Gbps 0x2 impl SATA mode

2016-06-20T07:42:09.051Z cpu22:33482)<6>ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio slum part

2016-06-20T07:42:09.051Z cpu22:33482)IRQ: 540: 0x39 <ahci> sharable, flags 0x10

2016-06-20T07:42:09.051Z cpu22:33482)VMK_VECTOR: 218: Registered handler for shared interrupt 0xff39, flags 0x10

2016-06-20T07:42:09.053Z cpu22:33482)LinPCI: LinuxPCI_DeviceIsPAECapable:602: PAE capable device at 0000:00:1f.2

2016-06-20T07:42:09.053Z cpu22:33482)VMK_PCI: 395: Device 0000:00:1f.2 name: vmhba1

2016-06-20T07:42:09.053Z cpu22:33482)DMA: 612: DMA Engine 'vmhba1' created using mapper 'DMANull'.

2016-06-20T07:42:09.054Z cpu22:33482)LinPCI: LinuxPCI_DeviceIsPAECapable:602: PAE capable device at 0000:00:1f.2

2016-06-20T07:42:09.054Z cpu22:33482)VMK_PCI: 395: Device 0000:00:1f.2 name: vmhba1

2016-06-20T07:42:09.054Z cpu22:33482)DMA: 612: DMA Engine 'vmhba1' created using mapper 'DMANull'.

2016-06-20T07:42:09.054Z cpu22:33482)DMA: 657: DMA Engine 'vmhba1' destroyed.

2016-06-20T07:42:09.054Z cpu22:33482)DMA: 612: DMA Engine 'vmhba32' created using mapper 'DMANull'.

Do these lines mean that I have a problem with the RAID controller?
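To make an excerpt like the one above easier to scan, here is a minimal Python sketch that splits each vmkernel line into its parts and keeps only the entries flagged WARNING. The line layout (timestamp, `cpuN:ID)`, message) is inferred from the lines posted here, and treating the number after the colon as a world ID is an assumption; `parse_line`, `find_warnings` and `SAMPLE` are names made up for this illustration.

```python
import re

# Layout inferred from the posted excerpt, e.g.:
# 2016-06-20T07:42:07.808Z cpu17:33481)WARNING: ScsiScan: 1408: Failed to add path ...
LINE_RE = re.compile(r"^(?P<ts>\S+Z) cpu(?P<cpu>\d+):(?P<world>\d+)\)(?P<msg>.*)$")

WARNING_PREFIX = "WARNING: "


def parse_line(line):
    """Split one vmkernel log line into fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    message = match.group("msg")
    is_warning = message.startswith(WARNING_PREFIX)
    if is_warning:
        message = message[len(WARNING_PREFIX):]
    return {
        "timestamp": match.group("ts"),
        "cpu": int(match.group("cpu")),
        "world": int(match.group("world")),  # assumed to be the world ID
        "warning": is_warning,
        "message": message,
    }


def find_warnings(lines):
    """Return the parsed entries that carry the WARNING prefix."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry["warning"]]


# Three lines copied from the excerpt above.
SAMPLE = [
    "2016-06-20T07:42:07.808Z cpu17:33481)megasas_slave_configure: do not export physical disk devices to upper layer.",
    "2016-06-20T07:42:07.808Z cpu17:33481)WARNING: ScsiScan: 1408: Failed to add path vmhba0:C0:T0:L0 : Not found",
    "2016-06-20T07:42:07.821Z cpu17:33481)ScsiScan: 1503: Add path: vmhba0:C2:T0:L0",
]

if __name__ == "__main__":
    for entry in find_warnings(SAMPLE):
        print(entry["timestamp"], entry["message"])
```

In the lines posted, the only entry with the WARNING prefix is the failed path add for vmhba0:C0:T0:L0. The same functions could be pointed at a full log file by passing the open file object to `find_warnings`.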

Best Regards,

5 Replies
rcporto
Leadership

Have you already applied the latest firmware version to your server? Another option is to upgrade the MegaRAID SAS driver, as described here: ESXi 5.x host fails with a purple diagnostic screen when using LSI MegaRAID SAS Driver (2052368) | V...

---

Richardson Porto
Senior Infrastructure Specialist
LinkedIn: http://linkedin.com/in/richardsonporto
Nouha
Contributor

The firmware is up to date, but I'm not sure about the driver. I will check, try the upgrade, and come back to you. Thank you very much.

Nouha
Contributor

Hi,

I've upgraded the MegaRAID SAS driver, but I've received a pink screen again :(

Sorry for the delayed feedback; the client is not always available. The puzzling part is that this client has two servers with the same characteristics, but the second one has no problem.

Best Regards,
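Since the second server is described as identical but stable, one thing that might be worth ruling out is a difference in installed driver packages between the two hosts. Below is a minimal Python sketch, assuming the output of `esxcli software vib list` has been saved as text from each host and that its first two columns are the package name and version. The sample rows and version strings are placeholders, not values from these servers, and `parse_vib_list` and `diff_vibs` are hypothetical helper names.

```python
def parse_vib_list(text):
    """Return {name: version} from text shaped like `esxcli software vib list` output."""
    vibs = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) < 2 or parts[0] == "Name" or set(parts[0]) == {"-"}:
            continue
        vibs[parts[0]] = parts[1]
    return vibs


def diff_vibs(host_a, host_b):
    """Return {name: (version_a, version_b)} for packages that differ or exist on one host only."""
    names = sorted(set(host_a) | set(host_b))
    return {
        name: (host_a.get(name), host_b.get(name))
        for name in names
        if host_a.get(name) != host_b.get(name)
    }


# Placeholder data for illustration only; the versions are made up.
HOST_A_TEXT = """Name               Version      Vendor   Acceptance Level  Install Date
-----------------  -----------  -------  ----------------  ------------
scsi-megaraid-sas  1.0-example  VendorX  VMwareCertified   2016-01-01
net-igb            2.0-example  VendorX  VMwareCertified   2016-01-01
"""

HOST_B_TEXT = """Name               Version      Vendor   Acceptance Level  Install Date
-----------------  -----------  -------  ----------------  ------------
scsi-megaraid-sas  1.1-example  VendorX  VMwareCertified   2016-01-01
net-igb            2.0-example  VendorX  VMwareCertified   2016-01-01
"""

if __name__ == "__main__":
    differences = diff_vibs(parse_vib_list(HOST_A_TEXT), parse_vib_list(HOST_B_TEXT))
    for name, (version_a, version_b) in differences.items():
        print(f"{name}: host A = {version_a}, host B = {version_b}")
```

If the two lists turn out to be identical, that would point away from a software difference between the hosts, though it would not confirm a hardware fault on its own.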

virtualg_uk
Leadership

I would also patch the ESXi host to the latest version or the latest patch level for 5.5.

Generally, these exception 14 errors are hardware related, so it would also be worth running hardware diagnostics using the tools available from the vendor.

I hope this helps.


Graham | User Moderator | https://virtualg.uk
Nouha
Contributor
Contributor

Thank you for the response. It does look like a hardware problem. Fortunately the server is still under warranty, so we've contacted IBM, and it looks like a RAID card problem. However, IBM still hasn't confirmed this; they are still upgrading firmware and so on, and the problem continues to this day.

Thanks again for your help.

Best Regards,