VMware Cloud Community
ChrisGurley
Enthusiast

EFI firmware VMs hang on reboot on XtremIO storage

We had an issue a while back that was addressed in this thread (Re: EMC XtremIO Gen2 array, vSphere 5.5, and Windows Server 2012 R2 EFI), but we seem to be experiencing a mild recurrence of it, even though the workaround is still in place (Disk.DiskMaxIOSize = 4096). As before, our guests are Windows Server 2012 R2 with EFI as the firmware, our hosts are Dell PowerEdge R820s with QLogic 8262 CNAs, and our storage is XtremIO on 2.4.0 code.
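For anyone who wants to confirm the same workaround on their own hosts, here is a minimal sketch that only builds the esxcli command lines for inspecting and setting the advanced option. The helper names (`show_option_cmd`, `set_int_option_cmd`) are made up for illustration, and the printed commands are meant to be pasted into an ESXi shell rather than run by the script itself.

```python
# Sketch only: builds esxcli command strings, it does not execute them.
# The function names are hypothetical helpers, not part of any VMware tooling.

ESXCLI_ADVANCED = "esxcli system settings advanced"

def show_option_cmd(option):
    """Command that lists an advanced option, including its current value."""
    return "{} list -o {}".format(ESXCLI_ADVANCED, option)

def set_int_option_cmd(option, value):
    """Command that sets an integer advanced option (DiskMaxIOSize is in KB)."""
    return "{} set -o {} -i {:d}".format(ESXCLI_ADVANCED, option, value)

# The option and value from the workaround mentioned above.
print(show_option_cmd("/Disk/DiskMaxIOSize"))
print(set_int_option_cmd("/Disk/DiskMaxIOSize", 4096))
```

The setting is per host, so it is worth checking the value on every host in the cluster, not just the one where the hang was seen.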

What we've seen is that when we reboot EFI-based VMs (one by one via the vSphere Client, or from within the guest during automated Microsoft patch windows), the VM shuts down and remains "Powered On", but never boots again. The following syslog events from one VM stand out from the incident last night. I don't know how to interpret them, so they may not shed any light.

2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu32:37031)VSCSI: 2447: handle 13363(vscsi0:0):Reset request on FSS handle 385159156 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu32:37031)VSCSI: 2447: handle 13364(vscsi0:1):Reset request on FSS handle 314970103 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu32:37031)VSCSI: 2447: handle 13365(vscsi0:2):Reset request on FSS handle 264376314 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2728: handle 13363(vscsi0:0):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2521: handle 13363(vscsi0:0):Completing reset (0 outstanding commands)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2728: handle 13364(vscsi0:1):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2521: handle 13364(vscsi0:1):Completing reset (0 outstanding commands)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.384Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2728: handle 13365(vscsi0:2):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:02 Local6.Info esx03 2014-11-17T03:07:02.385Z esx03.domain.com vmkernel: cpu8:33084)VSCSI: 2521: handle 13365(vscsi0:2):Completing reset (0 outstanding commands)

And...

2014-11-16 21:07:04 Local4.Info esx03 2014-11-17T03:07:04.245Z esx03.domain.com Hostd: [2A580B70 info 'Vmsvc.vm:/vmfs/volumes/53598833-2666e83f-cf4f-b8ca3a6a9c64/SQL02/SQL02.vmx'] Turning off heartbeat checker
2014-11-16 21:07:04 Local4.Info esx03 2014-11-17T03:07:04.247Z esx03.domain.com Fdm: [FFF18B70 verbose 'Invt' opID=SWI-310ea49d] [VmHeartbeatStateChange::SaveToInventory] vm /vmfs/volumes/53598833-2666e83f-cf4f-b8ca3a6a9c64/SQL02/SQL02.vmx changed  guestHB=red
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13363(vscsi0:0):Reset request on FSS handle 385159156 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13364(vscsi0:1):Reset request on FSS handle 314970103 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13363(vscsi0:0):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13365(vscsi0:2):Reset request on FSS handle 264376314 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13363(vscsi0:0):Completing reset (0 outstanding commands)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13364(vscsi0:1):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13364(vscsi0:1):Completing reset (0 outstanding commands)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13365(vscsi0:2):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13365(vscsi0:2):Completing reset (0 outstanding commands)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13363(vscsi0:0):Reset request on FSS handle 385159156 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13364(vscsi0:1):Reset request on FSS handle 314970103 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13363(vscsi0:0):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu40:37029)VSCSI: 2447: handle 13365(vscsi0:2):Reset request on FSS handle 264376314 (0 outstanding commands) from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13363(vscsi0:0):Completing reset (0 outstanding commands)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13364(vscsi0:1):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13364(vscsi0:1):Completing reset (0 outstanding commands)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2728: handle 13365(vscsi0:2):Reset [Retries: 0/0] from (vmm0:SQL02)
2014-11-16 21:07:04 Local6.Info esx03 2014-11-17T03:07:04.448Z esx03.domain.com vmkernel: cpu0:33084)VSCSI: 2521: handle 13365(vscsi0:2):Completing reset (0 outstanding commands)
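To make excerpts like these easier to compare across VMs, here is a rough Python sketch that tallies, per virtual disk, how many reset requests and reset completions appear and how many outstanding commands they report. The regex is only an assumption based on the line layout visible above, and the names `VSCSI_RE` and `summarize_resets` are made up for illustration. It looks only at the vmkernel VSCSI portion of each line, so whatever syslog prefix precedes it is ignored.

```python
import re
from collections import Counter

# Assumed layout, taken from the lines above:
#   VSCSI: <n>: handle <id>(vscsiX:Y):<event> ... (<k> outstanding commands)
VSCSI_RE = re.compile(
    r"VSCSI: \d+: handle (?P<handle>\d+)\((?P<disk>vscsi\d+:\d+)\):"
    r"(?P<event>Reset request|Completing reset)"
    r".*?\((?P<outstanding>\d+) outstanding commands\)"
)

def summarize_resets(lines):
    """Return {disk: Counter} with 'requests', 'completions', 'outstanding'."""
    summary = {}
    for line in lines:
        match = VSCSI_RE.search(line)
        if match is None:
            continue  # skips "Reset [Retries: ...]" lines and Hostd/Fdm lines
        stats = summary.setdefault(match["disk"], Counter())
        key = "requests" if match["event"] == "Reset request" else "completions"
        stats[key] += 1
        stats["outstanding"] += int(match["outstanding"])
    return summary

# Three lines for vscsi0:0 from the first excerpt (syslog prefix trimmed).
SAMPLE = [
    "vmkernel: cpu32:37031)VSCSI: 2447: handle 13363(vscsi0:0):Reset request "
    "on FSS handle 385159156 (0 outstanding commands) from (vmm0:SQL02)",
    "vmkernel: cpu8:33084)VSCSI: 2728: handle 13363(vscsi0:0):Reset "
    "[Retries: 0/0] from (vmm0:SQL02)",
    "vmkernel: cpu8:33084)VSCSI: 2521: handle 13363(vscsi0:0):Completing reset "
    "(0 outstanding commands)",
]

for disk, stats in sorted(summarize_resets(SAMPLE).items()):
    print(disk, dict(stats))
```

In the excerpts above, every reset request on vscsi0:0 through vscsi0:2 is matched by a completion, and all of them report 0 outstanding commands.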

Maybe it isn't EFI related, but EFI VMs are the only ones that exhibit this issue, and only on our XtremIO array. Non-EFI VMs on XtremIO (re)booted fine, and all VMs, both EFI and BIOS, on our HP 3PAR array (re)booted without any issues.

I'll open a support case in parallel, but wanted to post these logs here in case the community has seen anything similar. Thanks!

--Chris
