VMware Cloud Community
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

EFI VMs are unable to boot

I have a strange issue.  On a new vSphere 6 environment, EFI VMs are unable to boot.  They get stuck at the Windows splash screen and never get past that point.  BIOS VMs on the same LUNs and hosts boot fine.

2016-01-06_11-44-14.png

I have a support ticket open with VMware support that's at the engineering level now, but thought I'd throw it out to the community as well since I've seen VMware developers on here in the past.

The ESXi build we're running is 3247720 (ESXi 6.0 Express Patch 4).  The hosts are connected via FC to an HP 3PAR array.

When the VM is powered on, the vmkernel spews FCP drop errors until the VM is powered off. We have worked with the storage vendor to ensure there are no errors occurring on the array or the FC switches, and as mentioned, other (BIOS) VMs are running and boot fine from the same LUNs.

Any help would be greatly appreciated!

Except from the vmkernel.log file during the VM boot:

2016-01-06T19:43:38.531Z cpu30:39135 opID=6060da5b)World: 15448: VC opID 68438541-00000303-1b-b8-9d-db-d9-01-f0-bf-83-e601 maps to vmkernel opID 6060da5b

2016-01-06T19:43:38.531Z cpu30:39135 opID=6060da5b)Config: 680: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)

2016-01-06T19:43:38.651Z cpu40:958701)World: vm 958702: 1647: Starting world vmm0:VM of type 8

2016-01-06T19:43:38.651Z cpu40:958701)Sched: vm 958702: 6499: Adding world 'vmm0:VM', group 'host/user', cpu: shares=-3 min=0 minLimit=-1 max=-1, mem: shares=-3 min=0 minLimit=-1 max=-1

2016-01-06T19:43:38.651Z cpu40:958701)Sched: vm 958702: 6514: renamed group 6355289 to vm.958701

2016-01-06T19:43:38.651Z cpu40:958701)Sched: vm 958702: 6531: group 6355289 is located under group 4

2016-01-06T19:43:38.653Z cpu40:958701)MemSched: vm 958701: 8109: extended swap to 57374 pgs

2016-01-06T19:43:38.726Z cpu40:958701)VSCSI: 4038: handle 8214(vscsi0:0):Creating Virtual Device for world 958702 (FSS handle 15563773) numBlocks=209715200 (bs=512)

2016-01-06T19:43:38.726Z cpu40:958701)VSCSI: 273: handle 8214(vscsi0:0):Input values: res=0 limit=-1 bw=-1 Shares=-1

2016-01-06T19:43:38.738Z cpu3:32842)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x439e419aec80, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-01-06T19:43:38.743Z cpu3:32842)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x439e419aec80, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-01-06T19:43:38.743Z cpu3:32842)ScsiDeviceIO: 2645: Cmd(0x439e419aec80) 0x1a, CmdSN 0x4140aa from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2016-01-06T19:43:38.755Z cpu11:958702)VMMVMKCall: 235: Received INIT from world 958702

2016-01-06T19:43:38.756Z cpu11:958702)LSI: 1755: LSI: Initialized rings for scsi0 async=1, record=0 replay=0

2016-01-06T19:43:38.758Z cpu13:34906)Config: 680: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)

2016-01-06T19:43:39.335Z cpu26:958702)VSCSI: 2590: handle 8214(vscsi0:0):Reset request on FSS handle 15563773 (0 outstanding commands) from (vmm0:VM)

2016-01-06T19:43:39.335Z cpu43:33140)VSCSI: 2868: handle 8214(vscsi0:0):Reset [Retries: 0/0] from (vmm0:VM)

2016-01-06T19:43:39.336Z cpu43:33140)VSCSI: 2661: handle 8214(vscsi0:0):Completing reset (0 outstanding commands)

2016-01-06T19:43:39.336Z cpu26:958702)VSCSI: 2590: handle 8214(vscsi0:0):Reset request on FSS handle 15563773 (0 outstanding commands) from (vmm0:VM)

2016-01-06T19:43:39.336Z cpu43:33140)VSCSI: 2868: handle 8214(vscsi0:0):Reset [Retries: 0/0] from (vmm0:VM)

2016-01-06T19:43:39.336Z cpu43:33140)VSCSI: 2661: handle 8214(vscsi0:0):Completing reset (0 outstanding commands)

2016-01-06T19:43:40.389Z cpu5:33642)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x28 (0x439e58ce6340, 958702) to dev "naa.60002ac00000000000001cfb000153f2" on path "vmhba2:C0:T2:L7" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:EVAL

2016-01-06T19:43:40.389Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:40.389Z cpu5:33642)ScsiDeviceIO: 2607: Cmd(0x439e58ce6340) 0x28, CmdSN 0x8d65f24c from world 958702 to dev "naa.60002ac00000000000001cfb000153f2" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0.

2016-01-06T19:43:40.578Z cpu5:33642)WARNING: iodm: vmk_IodmEvent:193: vmhba2: FRAME DROP event has been observed 6 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:43:40.596Z cpu5:33642)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 10 times

2016-01-06T19:43:40.640Z cpu14:858673)WARNING: iodm: vmk_IodmEvent:193: vmhba1: FRAME DROP event has been observed 6 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:43:40.810Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:41.022Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 20 times

2016-01-06T19:43:41.806Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:41.872Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 40 times

2016-01-06T19:43:42.796Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:43.562Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 80 times

2016-01-06T19:43:43.795Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:44.795Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:45.810Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:46.805Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:46.951Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 160 times

2016-01-06T19:43:47.795Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:48.811Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:49.797Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:50.812Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:51.005Z cpu14:33638)WARNING: iodm: vmk_IodmEvent:193: vmhba1: FRAME DROP event has been observed 252 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:43:51.028Z cpu5:32827)WARNING: iodm: vmk_IodmEvent:193: vmhba2: FRAME DROP event has been observed 252 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:43:51.807Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:52.798Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:53.712Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 320 times

2016-01-06T19:43:53.816Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:54.807Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:55.803Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:56.810Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:57.807Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:58.340Z cpu28:32867)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x43a65b99bf80, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-01-06T19:43:58.345Z cpu28:32867)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x43a65b99bf80, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-01-06T19:43:58.345Z cpu28:32867)ScsiDeviceIO: 2645: Cmd(0x43a65b99bf80) 0x1a, CmdSN 0x41411e from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2016-01-06T19:43:58.367Z cpu19:32858)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x439e612ac300, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-01-06T19:43:58.372Z cpu19:32858)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x439e612ac300, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-01-06T19:43:58.372Z cpu19:32858)ScsiDeviceIO: 2645: Cmd(0x439e612ac300) 0x1a, CmdSN 0x414124 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2016-01-06T19:43:58.621Z cpu22:32861)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x439e585a6100, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-01-06T19:43:58.626Z cpu22:32861)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x439e585a6100, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-01-06T19:43:58.626Z cpu22:32861)ScsiDeviceIO: 2645: Cmd(0x439e585a6100) 0x1a, CmdSN 0x41412a from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2016-01-06T19:43:58.797Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:43:59.813Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:00.804Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:01.800Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:02.015Z cpu5:32829)WARNING: iodm: vmk_IodmEvent:193: vmhba2: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:02.095Z cpu14:32922)WARNING: iodm: vmk_IodmEvent:193: vmhba1: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:02.815Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:03.809Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:04.798Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:05.809Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:06.803Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:07.232Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3130: last error status from device naa.60002ac00000000000001cfb000153f2 repeated 640 times

2016-01-06T19:44:07.801Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:08.798Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:09.812Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:10.805Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:11.795Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:12.814Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:13.007Z cpu5:33642)WARNING: iodm: vmk_IodmEvent:193: vmhba2: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:13.088Z cpu14:33638)WARNING: iodm: vmk_IodmEvent:193: vmhba1: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:13.812Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:14.810Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:15.802Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:16.796Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:17.795Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:18.805Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:19.799Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:20.815Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:21.806Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:22.815Z cpu5:33642)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:23.811Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:24.002Z cpu5:33642)WARNING: iodm: vmk_IodmEvent:193: vmhba2: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:24.080Z cpu14:40215)WARNING: iodm: vmk_IodmEvent:193: vmhba1: FRAME DROP event has been observed 400 times in the last one minute. This suggests a problem with Fibre Channel link/switch!.

2016-01-06T19:44:24.795Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:25.807Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:26.801Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:27.796Z cpu14:33638)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60002ac00000000000001cfb000153f2" state in doubt; requested fast path state update...

2016-01-06T19:44:27.797Z cpu20:32859)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x439e5c5d8200, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-01-06T19:44:27.803Z cpu20:32859)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x439e5c5d8200, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-01-06T19:44:27.803Z cpu20:32859)ScsiDeviceIO: 2645: Cmd(0x439e5c5d8200) 0x1a, CmdSN 0x4141ee from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2016-01-06T19:44:27.837Z cpu0:958701)VSCSI: 6726: handle 8214(vscsi0:0):Destroying Device for world 958702 (pendCom 1)

2016-01-06T19:44:27.837Z cpu0:958701)WARNING: VSCSI: vm 958702: 6730: closing handle 8214 with 1 pending cmds, 1 ref count

2016-01-06T19:44:27.837Z cpu0:958701)VSCSI: 2590: handle 8214(vscsi0:0):Reset request on FSS handle 15563773 (1 outstanding commands) from (vmm0:VM)

2016-01-06T19:44:27.837Z cpu43:33140)VSCSI: 2868: handle 8214(vscsi0:0):Reset [Retries: 0/0] from (vmm0:VM)

2016-01-06T19:44:27.838Z cpu14:32922)lpfc: lpfc_scsi_cmd_iocb_cmpl:2154: 0:(0):3271: FCP cmd x28 failed <0/7> sid x011500, did x010000, oxid x252 iotag x558 Abort Requested Host Abort Req

2016-01-06T19:44:27.838Z cpu14:33638)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x28 (0x439e58ce6340, 958702) to dev "naa.60002ac00000000000001cfb000153f2" on path "vmhba1:C0:T0:L7" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:EVAL

2016-01-06T19:44:27.838Z cpu14:33638)ScsiDeviceIO: 2589: Cmd(0x439e58ce6340) 0x28, CmdSN 0x8d65f24c from world 958702 to dev "naa.60002ac00000000000001cfb000153f2" failed H:0x8 D:0x0 P:0x0

2016-01-06T19:44:27.838Z cpu43:33140)VSCSI: 2661: handle 8214(vscsi0:0):Completing reset (0 outstanding commands)

2016-01-06T19:44:30.073Z cpu2:33459)lpfc: lpfc_rportStats:4205: 0:(0) Compression log for fcp target 0, path is ok, TMGMT: abts=1, tgt_rst=0, lun_rst=6

2016-01-06T19:44:30.073Z cpu2:33459)lpfc: lpfc_rportStats:4241: 0:(0) Compression log for fcp target 0, path is ok, FRAME: drops=1275, under=0, over=0

2016-01-06T19:44:30.073Z cpu2:33459)lpfc: lpfc_rportStats:4241: 0:(0) Compression log for fcp target 1, path is ok, FRAME: drops=1274, under=0, over=0

2016-01-06T19:44:30.650Z cpu2:33459)lpfc: lpfc_rportStats:4241: 1:(0) Compression log for fcp target 0, path is ok, FRAME: drops=1278, under=0, over=0

2016-01-06T19:44:30.650Z cpu2:33459)lpfc: lpfc_rportStats:4241: 1:(0) Compression log for fcp target 2, path is ok, FRAME: drops=1277, under=0, over=0

Reply
0 Kudos
1 Solution

Accepted Solutions
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

Just to officially close the loop on this, it turned out to be a bug with 16 Gb HBA's on 3PAR arrays.  HPE fixed the bug with 3.2.2 EMU2, which updates the firmware of the HBA's.

In the meantime, we were able to use the disk.maxIOsize workaround.

View solution in original post

Reply
0 Kudos
16 Replies
dariusd
VMware Employee
VMware Employee
Jump to solution

My guess is that something in your FC storage stack is failing to handle large I/Os.

Most BIOS code will only issue small (usually 64 kByte or less) requests for each read/write operation, but EFI can issue large requests (easily 16 MBytes, sometimes larger), and we've seen some storage devices or drivers which do not handle such a large request correctly.  In my experience, this is often the cause of EFI VMs failing to boot where "equivalent" BIOS VMs boot correctly – and the storage log messages you're seeing on the host make it very likely to be the cause in this case.

Hopefully your support ticket is currently with our storage engineers; They'd be in the best position to figure out the root cause of the underlying storage issue and determine whether there is anything else going wrong.

Cheers,

--

Darius

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

The ticket is with the storage engineers, just waiting on a response from them at this point.  Their initial thought was very much along the same lines as yours, and pointed to this KB article: VMware KB: Booting Windows virtual machine using EFI from EMC XtremIO storage fails with error: Boot...

Unfortunately the same workaround did not work in our environment. 

Now that you mention it, this is the first environment of ours to take advantage of Emulex 16 Gb FC HBAs; everywhere else up to this point had been using older 8 Gb.  Beyond that everything else is pretty standard. I wonder if there's a difference with the driver that's causing the issue.  Very curious to find out what the root cause is, guess I'll just have to be patient Smiley Happy  Thanks for the additional info!

Steve

Reply
0 Kudos
dariusd
VMware Employee
VMware Employee
Jump to solution

I'm also curious to find out what the root cause is.  Software, drivers and hardware can interact in surprising ways...

Did you try setting an even-lower value for DiskMaxIOSize?  Setting it to lower values than given in that Knowledge Base article (such as 1024 or even 256) can reduce storage performance, but might help with compatibility – and might also provide a great data point for troubleshooting.

My focus is on EFI firmware, though, and I don't have much knowledge of the ESXi storage system, so our storage engineers are much more likely to know what they are doing there.  If you're making good progress through your support ticket, you can safely ignore me.  :smileylaugh:

Cheers,

--

Darius

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

I only tried the 4096 value for DiskMaxIOSize, I'll try lower values tomorrow just to see what happens.

So far the ticket is still with engineering, they've started an evaluation now to determine the cause.  I'll post any findings here.

Steve

Reply
0 Kudos
dariusd
VMware Employee
VMware Employee
Jump to solution

OK, cool.  One important request: Please make sure that you only change one thing at a time... For instance, if you change DiskMaxIOSize as an experiment (because I asked), and then support asks you to try something else, make sure you change DiskMaxIOSize back again before proceeding with the support engineer's instructions.  I want to make sure that my little experiment doesn't create additional work or confusion on your support ticket!

Cheers,

--

Darius

Reply
0 Kudos
dstamen
Hot Shot
Hot Shot
Jump to solution

I also have a case currently open on this. However my stuff is a little different. When i boot an EFI Bios VM via FC it goes straight to automatic repair. If i move the VM to local, iSCSI or NFS it boots fine...my case is also still with engineering and log review.

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

Yeah, your issue sounds more like the KB article where the VM can't read the boot information.  For me, the host just reports FC frame drops during boot so it's very similar, but slightly different.  Glad to hear the Disk.MaxIOSize worked for you!

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

Understood, thanks for the advice Smiley Happy

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

FYI, this is still an issue.  We just stood up a new datacenter with exact same hardware specs and we're seeing the same behavior.  So far VMware Engineering's response has been "we're not seeing it, do more tests and send more logs."  Hoping to get to a resolution soon.

Reply
0 Kudos
Finikiez
Champion
Champion
Jump to solution

Can you try set disk.maxiosize to 2048 value?

stephenrbarry
Enthusiast
Enthusiast
Jump to solution

Funny, that is exactly what VMware support told me this morning.  I'm going to attempt it right now.

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

So far I am having 100% success rate with setting the dismaxIOsize to 2048.  I'd still love to know core too cause and I will continue working with VMware support, but this will at least allow us to proceed with our datacenter deployments.  Thank you!

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

After more testing, I found that in one of our datacenters, we were able to configure the disk.maxIOsize setting to 4096 and VMs booted.  In the other datacenter, I did a lot of testing and found that 4080 is the magic number.  At 4080, VMs boot.  At 4081, they fail.

Reply
0 Kudos
Finikiez
Champion
Champion
Jump to solution

I guess you have different firmware versions on your 3PARs, this is why you have different results.

As it was mentioned earlier, EFI generates IO with big block size on booting and it's not correctly handled on storage side.

The workaround is to split IO on host size so storage array can work with it properly.

Reply
0 Kudos
stephenrbarry
Enthusiast
Enthusiast
Jump to solution

Just to officially close the loop on this, it turned out to be a bug with 16 Gb HBA's on 3PAR arrays.  HPE fixed the bug with 3.2.2 EMU2, which updates the firmware of the HBA's.

In the meantime, we were able to use the disk.maxIOsize workaround.

Reply
0 Kudos