VMware Cloud Community
muser12
Contributor

lost access to volume

I am running ESXi on 3 different machines at high load (CPU and disk), and I am encountering the following error event:

Lost access to volume GUID (XXX) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly. 3/21/2015 3:48:57 PM

Shortly thereafter, access is restored:

Successfully restored access to volume  GUID (XXX) following  connectivity issues. 3/21/2015 3:49:24 PM

Interestingly, these events occur exactly every 6 hours on each affected machine:

2015-03-21T13:39:01.303Z cpu30:32857)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992

2015-03-21T19:39:13.824Z cpu20:32855)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992

2015-03-22T01:39:09.569Z cpu0:32856)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992
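To put a number on the interval, here is a minimal Python sketch that computes the gaps between the three heartbeat timestamps quoted above. The function names are made up for illustration; this is not an ESXi tool, just arithmetic on the log timestamps.

```python
from datetime import datetime, timedelta

# Timestamps copied from the three "Reclaimed heartbeat" lines above (UTC).
HEARTBEAT_TIMESTAMPS = [
    "2015-03-21T13:39:01.303Z",
    "2015-03-21T19:39:13.824Z",
    "2015-03-22T01:39:09.569Z",
]


def parse_vmkernel_timestamp(stamp):
    """Parse the timestamp prefix of a vmkernel log line."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def heartbeat_gaps(stamps):
    """Return the time difference between each pair of consecutive events."""
    times = [parse_vmkernel_timestamp(s) for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in heartbeat_gaps(HEARTBEAT_TIMESTAMPS):
    drift = abs(gap - timedelta(hours=6))
    print(f"gap: {gap}  (differs from 6 hours by {drift})")
```

Both gaps come out within a few seconds of 6 hours, which is what makes a scheduled task look like a plausible trigger.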

Most of the search results (and the VMware KB) discuss issues related to FC/iSCSI/network connected datastores. However, these datastores are local disks connected to a MegaRAID SAS controller.

The fact that this occurs every 6 hours made me think there was some sort of cronjob or the like that was running and causing a whole bunch of disk churn, which, combined with an already high disk load, was clogging up the controller. However, I can't find any such cronjob in /var/spool/cron/crontabs/root. I've checked several logs in /var/log, and nothing is jumping out anywhere around those time frames. I've updated to the latest ESXi patches, but that didn't help. Any ideas?
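For anyone who wants to rule out cron more systematically, here is a rough Python sketch that scans crontab text for entries whose hour field repeats on a 6-hour cycle. The sample crontab in the code is hypothetical (it is not the actual contents of /var/spool/cron/crontabs/root on these hosts), and the parser only handles plain numeric hour fields with lists, ranges, and steps.

```python
def expand_hours(field):
    """Expand a cron hour field such as '*/6' or '1,7,13,19' into sorted hours."""
    hours = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = 0, 23
        elif "-" in rng:
            low, high = rng.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(rng)
        hours.update(range(start, end + 1, step))
    return sorted(hours)


def runs_every(hours, period=6):
    """True if the hours are evenly spaced `period` hours apart around the clock."""
    if len(hours) != 24 // period:
        return False
    following = hours[1:] + hours[:1]
    return all((b - a) % 24 == period for a, b in zip(hours, following))


def periodic_jobs(crontab_text, period=6):
    """Return crontab lines whose hour field fires every `period` hours."""
    matches = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not a 5-field schedule followed by a command
        try:
            if runs_every(expand_hours(fields[1]), period):
                matches.append(line)
        except ValueError:
            continue  # hour field uses syntax this sketch does not handle
    return matches


# Hypothetical crontab, for illustration only.
SAMPLE_CRONTAB = """\
#min hour day mon dow command
1    1    *   *   *   /sbin/example-daily.sh
39   1,7,13,19 * * * /sbin/example-six-hourly.sh
*/5  *    *   *   *   /sbin/example-frequent.sh
"""

for job in periodic_jobs(SAMPLE_CRONTAB):
    print("6-hourly candidate:", job)
```

An empty result only rules out cron. A periodic task does not have to be scheduled through cron at all, so this check narrows the search rather than ending it.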

FWIW, my hw/sw is:

ESXi 5.5.0, 2456374

Supermicro X10DRH-C

MegaRAID SAS Invader Controller (LSI 3108 SAS3)

3 consumer SSD drives in RAID0 (for the affected datastore)

24 Replies
cykVM
Expert

Updating the BIOS of your motherboard might help. Following the "Update your BIOS" link on Supermicro | Products | Motherboards | Xeon® Boards | X10DRH-C … there is a BIOS version dated Feb 2015 (X10DRH5_212.zip - R 1.0c).

muser12
Contributor

I don't think that's going to do it, but I will update the BIOS on one of the machines to test this.

muser12
Contributor

The BIOS update had no effect. Any other ideas?

vickey0rana
Enthusiast

Based on the logging pattern, and since this issue occurs on all 3 hosts where the datastore is attached, you should also check storage performance to isolate the issue.

On the next occurrence, check the CPU usage on the storage processor (SP) connected to the LUN. If the storage CPU spikes at that time, then we need to think differently.

BR, Ravinder S Rana
muser12
Contributor

As I said in my first post, this is not a network-connected datastore. Each host has its own datastore consisting of 3 SATA disks connected to the onboard MegaRAID controller.

brunofernandez1

Can you post the logs from:

/var/log/syslog.log

/var/log/vmkernel.log

Regards from Switzerland, B. Fernandez http://vpxa.info/
virtualworld199
Contributor

Check the UUIDs of these 3 consumer SSD drives in RAID0. If they have the same UUID configured, change them accordingly; there will be issues when multiple datastores connected to the same host share a GUID. Can you post the logs from /var/log/vmkernel? That will give us a clearer and closer look at the issue. It could be a SCSI reservation conflict or a queue depth issue. Please let me know, and if you have any other questions, please post them.

muser12
Contributor

The datastore GUIDs are unique. Here is a snippet of vmkernel log right around when the issue occurs. You can see that around 5 minutes before the 'Reclaimed heartbeat' line, various SCSI commands start failing. The question is, why is this always happening on a 6 hour cycle?

2015-03-24T19:26:41.635Z cpu12:33565)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x85 (0x413682af5580, 34776) to dev "naa.6003048016ba11001c38266a07a2ea99" on path "vmhba2:C2:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2015-03-24T19:26:41.635Z cpu12:33565)ScsiDeviceIO: 2338: Cmd(0x413682af5580) 0x85, CmdSN 0x219 from world 34776 to dev "naa.6003048016ba11001c38266a07a2ea99" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

2015-03-24T19:26:41.635Z cpu12:33565)ScsiDeviceIO: 2338: Cmd(0x413682af5580) 0x4d, CmdSN 0x21a from world 34776 to dev "naa.6003048016ba11001c38266a07a2ea99" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

2015-03-24T19:26:41.635Z cpu12:33565)ScsiDeviceIO: 2338: Cmd(0x413682af5580) 0x1a, CmdSN 0x21b from world 34776 to dev "naa.6003048016ba11001c38266a07a2ea99" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2015-03-24T19:27:00.006Z cpu12:34156)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:27:20.005Z cpu24:34363)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:27:40.004Z cpu0:34156)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:28:00.005Z cpu8:34156)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:28:20.005Z cpu14:34156)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:28:40.004Z cpu0:34128)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:29:00.003Z cpu18:34128)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:29:18.979Z cpu9:34128)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:29:20.004Z cpu19:218925)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:29:40.007Z cpu8:34128)World: 14302: VC opID hostd-0f4f maps to vmkernel opID 5e9ce35c

2015-03-24T19:30:00.005Z cpu5:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:30:20.006Z cpu21:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:31:00.007Z cpu16:34363)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:31:20.005Z cpu16:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:31:40.005Z cpu21:34156)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:32:00.003Z cpu17:218928)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:32:20.006Z cpu3:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:32:34.478Z cpu7:34156)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:33:00.004Z cpu12:218925)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:33:20.007Z cpu18:218928)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:33:40.004Z cpu7:34363)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:34:00.004Z cpu19:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:34:20.005Z cpu6:34156)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:34:40.005Z cpu30:34128)World: 14302: VC opID hostd-d855 maps to vmkernel opID b06b129b

2015-03-24T19:35:00.005Z cpu5:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:35:02.388Z cpu31:34363)World: 14302: VC opID hostd-033b maps to vmkernel opID 79dbb45

2015-03-24T19:35:20.005Z cpu14:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:35:40.007Z cpu28:34156)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:35:40.194Z cpu4:34128)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:36:00.003Z cpu0:218925)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:36:20.003Z cpu2:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:36:40.004Z cpu16:34364)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:37:00.003Z cpu20:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:37:20.004Z cpu21:34364)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:37:40.004Z cpu11:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:38:00.004Z cpu21:218925)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:38:20.006Z cpu18:218925)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:38:40.006Z cpu16:34363)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:39:00.006Z cpu22:218925)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:39:20.005Z cpu23:34364)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:40:00.005Z cpu3:34364)World: 14302: VC opID hostd-9b4e maps to vmkernel opID 9a521a04

2015-03-24T19:40:20.005Z cpu18:218925)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:41:00.006Z cpu30:218925)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:41:19.011Z cpu26:34364)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:41:20.004Z cpu1:218923)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:41:40.006Z cpu22:34363)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:42:00.005Z cpu26:34156)World: 14302: VC opID hostd-f9c8 maps to vmkernel opID 692e4d20

2015-03-24T19:42:20.006Z cpu4:218923)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:43:00.006Z cpu14:218923)World: 14302: VC opID hostd-f9c8 maps to vmkernel opID 692e4d20

2015-03-24T19:43:20.006Z cpu15:218925)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:43:40.005Z cpu1:34156)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:44:00.006Z cpu18:34364)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:44:20.005Z cpu13:34363)World: 14302: VC opID hostd-5999 maps to vmkernel opID 3d3bf791

2015-03-24T19:45:00.006Z cpu17:34364)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:45:20.005Z cpu18:34364)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:46:00.005Z cpu24:218928)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:46:20.007Z cpu16:218925)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:46:40.006Z cpu6:34363)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:47:00.008Z cpu20:218928)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:47:20.006Z cpu12:218925)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:48:00.006Z cpu0:218928)World: 14302: VC opID hostd-1f1e maps to vmkernel opID ce56efd

2015-03-24T19:48:04.041Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:04.041Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78550070

2015-03-24T19:48:04.041Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:05.043Z cpu1:841376)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 29 commands to complete

2015-03-24T19:48:06.117Z cpu7:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:06.117Z cpu7:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78550955

2015-03-24T19:48:06.117Z cpu7:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:06.789Z cpu23:34156)World: 14302: VC opID hostd-c84b maps to vmkernel opID e7b0638e

2015-03-24T19:48:07.120Z cpu7:841376)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 20 commands to complete

2015-03-24T19:48:08.198Z cpu15:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:08.198Z cpu15:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78552090

2015-03-24T19:48:08.198Z cpu15:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:09.200Z cpu15:841376)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 22 commands to complete

2015-03-24T19:48:10.275Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:10.275Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78553838

2015-03-24T19:48:10.275Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:11.277Z cpu1:841376)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 3 commands to complete

2015-03-24T19:48:12.279Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:12.279Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78553838

2015-03-24T19:48:12.279Z cpu1:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:12.601Z cpu27:33021)VSCSI: 2854: Retry 0 on handle 8215 still in progress after 2 seconds

2015-03-24T19:48:12.601Z cpu27:33021)WARNING: VSCSI: 2842: handle 8228(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:12.601Z cpu15:898802)VSCSI: 2606: Starting reset handler world 898802/2

2015-03-24T19:48:12.601Z cpu15:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:12.601Z cpu15:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78553838

2015-03-24T19:48:12.601Z cpu15:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:13.338Z cpu31:230525)WARNING: VSCSI: 3565: handle 8228(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:13.338Z cpu31:230525)WARNING: VSCSI: 2487: handle 8228(vscsi0:0):Ignoring double reset

2015-03-24T19:48:14.352Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:14.352Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78555602

2015-03-24T19:48:14.352Z cpu2:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:15.354Z cpu2:841376)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 10 commands to complete

2015-03-24T19:48:16.555Z cpu2:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:16.555Z cpu2:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78560602

2015-03-24T19:48:16.555Z cpu2:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:16.557Z cpu10:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:16.557Z cpu10:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78560602

2015-03-24T19:48:16.557Z cpu10:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:17.557Z cpu2:898802)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 21 commands to complete

2015-03-24T19:48:18.558Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:18.558Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78560602

2015-03-24T19:48:18.558Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:18.615Z cpu27:33021)WARNING: VSCSI: 2842: handle 8214(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:18.615Z cpu27:33021)VSCSI: 2854: Retry 0 on handle 8222 still in progress after 2 seconds

2015-03-24T19:48:18.615Z cpu27:33021)WARNING: VSCSI: 2842: handle 8224(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:18.615Z cpu27:33021)VSCSI: 2854: Retry 0 on handle 8226 still in progress after 2 seconds

2015-03-24T19:48:18.615Z cpu19:898815)VSCSI: 2606: Starting reset handler world 898815/3

2015-03-24T19:48:18.615Z cpu19:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:18.615Z cpu19:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78560602

2015-03-24T19:48:18.615Z cpu19:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:19.560Z cpu6:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:19.560Z cpu6:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78560602

2015-03-24T19:48:19.560Z cpu6:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:19.611Z cpu29:230439)WARNING: VSCSI: 3565: handle 8224(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:19.611Z cpu29:230439)WARNING: VSCSI: 2487: handle 8224(vscsi0:0):Ignoring double reset

2015-03-24T19:48:19.669Z cpu20:230066)WARNING: VSCSI: 3565: handle 8214(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:19.669Z cpu20:230066)WARNING: VSCSI: 2487: handle 8214(vscsi0:0):Ignoring double reset

2015-03-24T19:48:19.701Z cpu18:230481)WARNING: VSCSI: 3565: handle 8226(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:19.701Z cpu18:230481)WARNING: VSCSI: 2487: handle 8226(vscsi0:0):Ignoring double reset

2015-03-24T19:48:20.006Z cpu28:218928)World: 14302: VC opID hostd-d55b maps to vmkernel opID 66b8825c

2015-03-24T19:48:22.665Z cpu6:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:22.665Z cpu6:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78562065

2015-03-24T19:48:22.665Z cpu6:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:22.665Z cpu26:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:22.665Z cpu26:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78562065

2015-03-24T19:48:22.665Z cpu26:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:22.681Z cpu28:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:22.681Z cpu28:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78562065

2015-03-24T19:48:22.681Z cpu28:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:23.667Z cpu10:898802)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 9 commands to complete

2015-03-24T19:48:25.459Z cpu24:34364)World: 14302: VC opID hostd-bd58 maps to vmkernel opID e6e5b784

2015-03-24T19:48:25.629Z cpu25:33021)VSCSI: 2854: Retry 0 on handle 8216 still in progress after 2 seconds

2015-03-24T19:48:25.629Z cpu25:33021)VSCSI: 2854: Retry 0 on handle 8220 still in progress after 2 seconds

2015-03-24T19:48:25.716Z cpu24:230333)WARNING: VSCSI: 3565: handle 8220(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:25.716Z cpu24:230333)WARNING: VSCSI: 2487: handle 8220(vscsi0:0):Ignoring double reset

2015-03-24T19:48:26.759Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:26.759Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78564349

2015-03-24T19:48:26.759Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:26.776Z cpu17:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:26.776Z cpu17:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78564349

2015-03-24T19:48:26.776Z cpu17:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:26.791Z cpu11:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:26.791Z cpu11:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78564349

2015-03-24T19:48:26.791Z cpu11:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:27.482Z cpu23:218923)World: 14302: VC opID hostd-70f5 maps to vmkernel opID 4bf46441

2015-03-24T19:48:27.761Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:27.761Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78564349

2015-03-24T19:48:27.761Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:29.637Z cpu27:33021)VSCSI: 2854: Retry 0 on handle 8218 still in progress after 2 seconds

2015-03-24T19:48:29.637Z cpu27:33021)VSCSI: 2854: Retry 0 on handle 8221 still in progress after 2 seconds

2015-03-24T19:48:29.853Z cpu26:230246)WARNING: VSCSI: 3565: handle 8218(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:29.853Z cpu26:230246)WARNING: VSCSI: 2487: handle 8218(vscsi0:0):Ignoring double reset

2015-03-24T19:48:30.854Z cpu1:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:30.854Z cpu1:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78568636

2015-03-24T19:48:30.854Z cpu1:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:31.856Z cpu1:898815)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 2 commands to complete

2015-03-24T19:48:35.768Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:35.768Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:35.768Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:35.775Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:35.775Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:35.775Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:35.820Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:35.820Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:35.820Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:36.771Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:36.771Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:36.771Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:37.773Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:37.773Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:37.773Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:38.652Z cpu17:33021)WARNING: VSCSI: 2842: handle 8217(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:38.652Z cpu17:33021)WARNING: VSCSI: 2842: handle 8219(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:38.652Z cpu17:33021)WARNING: VSCSI: 2842: handle 8223(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:38.652Z cpu17:33021)WARNING: VSCSI: 2842: handle 8225(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:38.652Z cpu17:33021)WARNING: VSCSI: 2842: handle 8229(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:38.652Z cpu17:33021)VSCSI: 2854: Retry 0 on handle 8231 still in progress after 2 seconds

2015-03-24T19:48:38.652Z cpu26:898854)VSCSI: 2606: Starting reset handler world 898854/4

2015-03-24T19:48:38.652Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:38.652Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:38.652Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:38.773Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:38.773Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:38.773Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:39.027Z cpu2:230219)WARNING: VSCSI: 3565: handle 8217(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.027Z cpu2:230219)WARNING: VSCSI: 2487: handle 8217(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.254Z cpu19:230066)WARNING: VSCSI: 3565: handle 8214(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.254Z cpu19:230066)WARNING: VSCSI: 2487: handle 8214(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.276Z cpu20:230481)WARNING: VSCSI: 3565: handle 8226(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.276Z cpu20:230481)WARNING: VSCSI: 2487: handle 8226(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.286Z cpu24:230399)WARNING: VSCSI: 3565: handle 8222(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.286Z cpu24:230399)WARNING: VSCSI: 2487: handle 8222(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.289Z cpu6:230461)WARNING: VSCSI: 3565: handle 8225(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.289Z cpu6:230461)WARNING: VSCSI: 2487: handle 8225(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.314Z cpu31:230439)WARNING: VSCSI: 3565: handle 8224(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.314Z cpu31:230439)WARNING: VSCSI: 2487: handle 8224(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.386Z cpu13:230419)WARNING: VSCSI: 3565: handle 8223(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.386Z cpu13:230419)WARNING: VSCSI: 2487: handle 8223(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.390Z cpu18:230192)WARNING: VSCSI: 3565: handle 8216(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.390Z cpu18:230192)WARNING: VSCSI: 2487: handle 8216(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.407Z cpu0:230311)WARNING: VSCSI: 3565: handle 8219(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.407Z cpu0:230311)WARNING: VSCSI: 2487: handle 8219(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.462Z cpu3:230357)WARNING: VSCSI: 3565: handle 8221(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.462Z cpu3:230357)WARNING: VSCSI: 2487: handle 8221(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.592Z cpu22:230525)WARNING: VSCSI: 3565: handle 8228(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.592Z cpu22:230525)WARNING: VSCSI: 2487: handle 8228(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.654Z cpu17:33021)VSCSI: 2854: Retry 0 on handle 8213 still in progress after 2 seconds

2015-03-24T19:48:39.654Z cpu17:33021)WARNING: VSCSI: 2842: handle 8227(vscsi0:0):Retry 0 overdue by 2 seconds

2015-03-24T19:48:39.654Z cpu16:898857)VSCSI: 2606: Starting reset handler world 898857/5

2015-03-24T19:48:39.654Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:39.654Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:39.654Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:39.728Z cpu10:230616)WARNING: VSCSI: 3565: handle 8231(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.728Z cpu10:230616)WARNING: VSCSI: 2487: handle 8231(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.754Z cpu8:230545)WARNING: VSCSI: 3565: handle 8229(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.754Z cpu8:230545)WARNING: VSCSI: 2487: handle 8229(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.775Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:39.775Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:39.775Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:39.802Z cpu23:230333)WARNING: VSCSI: 3565: handle 8220(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.802Z cpu23:230333)WARNING: VSCSI: 2487: handle 8220(vscsi0:0):Ignoring double reset

2015-03-24T19:48:39.848Z cpu14:230013)WARNING: VSCSI: 3565: handle 8213(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:39.848Z cpu14:230013)WARNING: VSCSI: 2487: handle 8213(vscsi0:0):Ignoring double reset

2015-03-24T19:48:40.004Z cpu2:218928)World: 14302: VC opID hostd-3d61 maps to vmkernel opID 3fa348a8

2015-03-24T19:48:40.110Z cpu28:230246)WARNING: VSCSI: 3565: handle 8218(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:40.110Z cpu28:230246)WARNING: VSCSI: 2487: handle 8218(vscsi0:0):Ignoring double reset

2015-03-24T19:48:40.546Z cpu22:230501)WARNING: VSCSI: 3565: handle 8227(vscsi0:0):WaitForCIF: Issuing reset;  number of CIF:1

2015-03-24T19:48:40.546Z cpu22:230501)WARNING: VSCSI: 2487: handle 8227(vscsi0:0):Ignoring double reset

2015-03-24T19:48:40.657Z cpu6:898861)VSCSI: 2606: Starting reset handler world 898861/6

2015-03-24T19:48:40.657Z cpu6:898861)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:40.657Z cpu6:898861)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:40.657Z cpu6:898861)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:40.777Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:40.777Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:40.777Z cpu8:898815)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:41.660Z cpu3:898864)VSCSI: 2606: Starting reset handler world 898864/7

2015-03-24T19:48:41.660Z cpu3:898864)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:41.660Z cpu3:898864)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:41.660Z cpu3:898864)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:41.779Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:41.779Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:41.779Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:42.664Z cpu16:898867)VSCSI: 2606: Starting reset handler world 898867/8

2015-03-24T19:48:42.664Z cpu16:898867)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:42.664Z cpu16:898867)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:42.664Z cpu16:898867)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:42.782Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:42.782Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:42.782Z cpu10:898802)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:43.667Z cpu6:898870)VSCSI: 2606: Starting reset handler world 898870/9

2015-03-24T19:48:43.667Z cpu6:898870)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:43.667Z cpu6:898870)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:43.667Z cpu6:898870)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:43.784Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:43.784Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:43.784Z cpu16:898857)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:44.670Z cpu16:898873)VSCSI: 2606: Starting reset handler world 898873/10

2015-03-24T19:48:44.670Z cpu16:898873)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:44.670Z cpu16:898873)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:44.670Z cpu16:898873)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:44.786Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:44.786Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:44.786Z cpu18:841376)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:45.673Z cpu29:898876)VSCSI: 2606: Starting reset handler world 898876/11

2015-03-24T19:48:45.673Z cpu29:898876)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:45.673Z cpu29:898876)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589094

2015-03-24T19:48:45.673Z cpu29:898876)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:46.640Z cpu27:841399)NMP: nmp_ThrottleLogForDevice:2322: Cmd 0x2a (0x4136825f0d40, 32805) to dev "naa.6003048016ba11001c38266a07a2ea99" on path "vmhba2:C2:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL

2015-03-24T19:48:46.640Z cpu27:841399)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.6003048016ba11001c38266a07a2ea99" state in doubt; requested fast path state update...

2015-03-24T19:48:46.640Z cpu27:841399)ScsiDeviceIO: 2338: Cmd(0x4136825f0d40) 0x2a, CmdSN 0x10206ec from world 32805 to dev "naa.6003048016ba11001c38266a07a2ea99" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

2015-03-24T19:48:46.640Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt abort for device: vmhba2:C2:T0:L0

2015-03-24T19:48:46.640Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:56.878Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:56.878Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589135

2015-03-24T19:48:56.878Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:56.878Z cpu25:32951)HBX: 2958: Waiting for timed out [HB state abcdef02 offset 3620864 gen 45 stampUS 323625231344 uuid 550ccfa3-ea39196c-170c-002590fd2554 jrnl <FB 285200> drv 14.60] on vol 'vms1'

2015-03-24T19:48:57.879Z cpu8:32856)lsi_mr3: fusionWaitForOutstanding:2326: megasas: [ 0]waiting for 6 commands to complete

2015-03-24T19:48:58.198Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt abort for device: vmhba2:C2:T0:L0

2015-03-24T19:48:58.198Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:58.879Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:48:58.879Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589135

2015-03-24T19:48:58.879Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:48:59.881Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt abort for device: vmhba2:C2:T0:L0

2015-03-24T19:48:59.881Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:49:00.006Z cpu26:34364)World: 14302: VC opID hostd-ee2b maps to vmkernel opID c3f76b19

2015-03-24T19:49:00.201Z cpu30:898908)ScsiCore: 63: Starting taskmgmt handler world 898908/2

2015-03-24T19:49:00.202Z cpu30:898908)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt abort for device: vmhba2:C2:T0:L0

2015-03-24T19:49:00.202Z cpu30:898908)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:49:00.884Z cpu27:841399)ScsiCore: 98: Stopping taskMgmt handler world 8413991

2015-03-24T19:49:01.886Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:49:01.886Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589135

2015-03-24T19:49:01.886Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:49:03.890Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0

2015-03-24T19:49:03.890Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 78589135

2015-03-24T19:49:03.890Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT

2015-03-24T19:49:04.894Z cpu8:32856)HBX: 270: Reclaimed heartbeat for volume 54a56af2-3607b1b0-852d-002590fd2554 (vms1): [Timeout] Offset 3620864

2015-03-24T19:49:04.894Z cpu8:32856)[HB state abcdef02 offset 3620864 gen 45 stampUS 323633247085 uuid 550ccfa3-ea39196c-170c-002590fd2554 jrnl <FB 285200> drv 14.60]

2015-03-24T19:49:20.005Z cpu0:34364)World: 14302: VC opID hostd-8fa6 maps to vmkernel opID afc1903d

2015-03-24T19:49:40.006Z cpu1:218928)World: 14302: VC opID hostd-8d65 maps to vmkernel opID 5ff34d07

2015-03-24T19:49:40.239Z cpu30:218923)World: 14302: VC opID hostd-3944 maps to vmkernel opID 210f7a36

2015-03-24T19:50:00.003Z cpu0:218925)World: 14302: VC opID hostd-8b55 maps to vmkernel opID 4147aca5

2015-03-24T19:50:20.003Z cpu0:218925)World: 14302: VC opID hostd-f991 maps to vmkernel opID f86b39d8
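
For anyone reading along, the `H:0x5 D:0x0 P:0x0` triple in the NMP line above can be decoded mechanically. Below is a minimal Python sketch. The host-status table is an assumption transcribed from VMware's KB on SCSI host-side NMP errors and should be checked against that KB, the opcode table is only a small subset, and `decode_nmp_failure` is an illustrative name rather than part of any VMware tool.

```python
import re

# Host status codes (the "H:" field). Assumption: values transcribed from
# VMware's KB on SCSI host-side NMP errors; verify before relying on them.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x4: "BAD_TARGET",
    0x5: "ABORT",
    0x7: "ERROR",
    0x8: "RESET",
}

# A few SCSI opcodes from the block command set; deliberately not exhaustive.
OPCODES = {0x00: "TEST UNIT READY", 0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Matches the "Cmd ... to dev ... on path ... Failed: H:.. D:.. P:.." part
# of an nmp_ThrottleLogForDevice entry as it appears in this thread.
NMP_FAILURE = re.compile(
    r'Cmd (0x[0-9a-fA-F]+) .*?to dev "([^"]+)" on path "([^"]+)" '
    r'Failed: H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)'
)


def decode_nmp_failure(line):
    """Return a dict describing a failed command, or None if the line does not match."""
    m = NMP_FAILURE.search(line)
    if m is None:
        return None
    opcode, device, path, host, dev_status, plugin = m.groups()
    opcode, host = int(opcode, 16), int(host, 16)
    return {
        "op": OPCODES.get(opcode, hex(opcode)),        # unknown opcodes stay as hex
        "device": device,
        "path": path,
        "host": HOST_STATUS.get(host, hex(host)),      # unknown codes stay as hex
        "device_status": int(dev_status, 16),
        "plugin_status": int(plugin, 16),
    }


# The 19:48:46 entry from the log above.
SAMPLE = (
    '2015-03-24T19:48:46.640Z cpu27:841399)NMP: nmp_ThrottleLogForDevice:2322: '
    'Cmd 0x2a (0x4136825f0d40, 32805) to dev "naa.6003048016ba11001c38266a07a2ea99" '
    'on path "vmhba2:C2:T0:L0" Failed: H:0x5 D:0x0 P:0x0 '
    'Possible sense data: 0x0 0x0 0x0. Act:EVAL'
)

if __name__ == "__main__":
    print(decode_nmp_failure(SAMPLE))
```

Under those assumptions, the sample entry decodes to a WRITE(10) that was aborted on the host side with no device or plugin error, which fits the run of lsi_mr3 ABORT messages just before it.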

brunofernandez1

maybe this can help you:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=103038...

SCSI Commands:

http://t10.t10.org/ftp/t10/document.04/04-262r8.pdf

I'm on my way to a customer right now, so I don't have time to look into this further...

------------------------------------------------------------------------------- If you found this or any other answer helpful, please consider to award points. (use Correct or Helpful buttons) Regards from Switzerland, B. Fernandez http://vpxa.info/
cykVM
Expert

Is there perhaps some sort of power management active in either the BIOS or the VMware host configuration?

You mentioned using "consumer SSD drives". Quite a few of these shipped with faulty or buggy firmware. Have you checked the vendor's support site for firmware updates for the drives?

I would still check the driver version for the LSI 3108 controller in use. Sometimes even a driver downgrade helps. The latest VMware-certified driver version for your LSI controller seems to be 6.605.00.00.1 for VMware 5.5, see https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI55-LSI-SCSI-MEGARAID-SAS-660500001VMW&...

muser12
Contributor

There is no power management set up in either VMware or the host config that I'm aware of. I set up the machines myself from scratch, and I didn't configure anything related to power management.

I can try looking for a drive firmware update and checking the LSI driver version.

The question that hasn't been addressed yet is the 6 hour cycle. This has been going on for days, and on all 3 hosts it still happens almost exactly every 6 hours. That tells me there is some job or service running that plays a role in all of this. There are only a few cronjobs set (whatever default ones were there), and they don't run on 6 hour cycles. I'm unsure where else to check for such things. Any ideas on that?
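
To check how regular the cycle really is, the spacing between the "Reclaimed heartbeat" entries can be computed directly from the log. Here is a minimal Python sketch, assuming the vmkernel log line format quoted in this thread; `reclaim_intervals` is a made-up helper name, and the sample lines are the three reclaim events from my first post.

```python
import re
from datetime import datetime

# Timestamp layout used by the vmkernel log lines quoted in this thread.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Captures the leading timestamp of an HBX "Reclaimed heartbeat" entry.
RECLAIM = re.compile(r"^(\S+Z) \S+\)HBX: \d+: Reclaimed heartbeat for volume")


def reclaim_intervals(lines):
    """Return the time deltas between consecutive heartbeat-reclaim events."""
    stamps = []
    for line in lines:
        m = RECLAIM.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), TS_FORMAT))
    stamps.sort()
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


SAMPLE = [
    "2015-03-21T13:39:01.303Z cpu30:32857)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992",
    "2015-03-21T19:39:13.824Z cpu20:32855)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992",
    "2015-03-22T01:39:09.569Z cpu0:32856)HBX: 270: Reclaimed heartbeat for volume GUID (XXX): [Timeout] Offset 3796992",
]

if __name__ == "__main__":
    for delta in reclaim_intervals(SAMPLE):
        print(delta)
```

For those three events, both intervals come out within about 15 seconds of six hours. A spacing that tight points at a timer rather than at load alone, though it does not say which component owns the timer.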

brunofernandez1

I don't have much experience with cronjobs, but as far as I know you can add jobs for the system and for individual users...

Maybe the vpxa user has a cronjob? But I think this user is only created if the ESXi server is connected to a vCenter Server...

cykVM
Expert

The power management suggestion comes from experience with some newer HP ProLiants (Gen8/Gen9). They usually have the BIOS configured for "balanced" mode, and with VMware you have to switch to "static - high performance" (which turns power management off). I'm not sure if it's the same with Supermicro motherboards. The same applies to the host configuration, which then needs to be switched to "High performance". With some sort of power management turned on, it could even happen that the PCIe port the LSI sits in gets turned off or throttled down to save power.

As for the driver up-/downgrade: I had massive problems with the driver for an HP Smart Array B320i controller after I upgraded from VMware 5.5 Update 1 to Update 2. I could not fix it, so I went back to Update 1 and downgraded the driver to the previous version. Now everything is back to normal.

Did you use some kind of customized VMware installation ISO for your hosts? Maybe it's a monitoring tool for the LSI controller running wild?

Maybe this is another place to take a closer look: Creating Scheduled Tasks in the vSphere Web Client | Tech Communications Video Blog - VMware Blogs

muser12
Contributor

Nope, no vCenter server, and I've checked the existing cronjobs. There are only a few, and none would appear to be related to this as they don't run on a 6 hour cycle or anything similar.

muser12
Contributor

I'm not aware of any such power management for my SuperMicro board, and it does not appear in the documentation. However, I emailed SM support to see if they have any ideas.

No customized ISO, just the ESXi 5.5 Update 2 download from VMware, with the latest patches installed after.

I looked into scheduled tasks, but those are only supported with vCenter, which I don't have. Are you aware of any task scheduling mechanism outside of cron?

nui2011nui2011
Contributor

This looks like a performance issue. I suspect some job runs every 6 hours, such as a Symantec definition file update or a data backup. You can try installing vCOps (vCenter Operations Manager) to monitor the IOPS of each datastore and each VM, then check which one generates the most requests.

srwsol
Hot Shot

Did you ever get a resolution to this? I'm having some issues with lost connectivity as well, and I have some of the same hardware as you. However, in my case the problem seems to occur when I start or stop certain VMs, rather than on a timed schedule. I am also using an LSI MegaRAID controller with SSD drives. In my case the whole thing crashed on me at least once, and the BIOS event log showed hardware errors some of the time. Here's a link to the thread:

Lost access to local datastore

I'm curious whether there are any more similarities between our two issues.

Cannoli
Contributor

We're seeing the same issue with the LSI 3108 controller and Intel enterprise SSDs. We first saw console messages in a FreeBSD VM telling us it could no longer talk to its disk, which prompted us to investigate. Working with Supermicro, LSI and VMware, we determined that the LSI 3108 controller was "timing out", with all I/O coming to a complete stop on the controller. While it was the FreeBSD VM console that alerted us to the issue during the initial build-out of the server cluster, the vmkernel log file confirmed that the LSI 3108 controller backing an 8-disk SSD RAID is timing out and then resetting. We've been able to make it "time out" at will by powering up or resetting 5 VMs that live on the SSD RAID at the same time. Not only does the vmkernel log show the loss of communication with the controller, there is also no LED activity on the drives for ~30-40 seconds.
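
To put a number on a stall from the vmkernel log rather than from watching drive LEDs, one option is to group the lsi_mr3 task-management messages into bursts and measure each burst. Here is a rough Python sketch, assuming the log format shown earlier in this thread. The 30-second grouping gap and the name `reset_episodes` are arbitrary choices for illustration, and the sample lines are taken from the log posted above plus one invented later entry to show a second burst.

```python
import re
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Any lsi_mr3 task-management entry (virt reset, abort, ...).
TASKMGMT = re.compile(r"^(\S+Z) \S+\)lsi_mr3: mfi_TaskMgmt:")


def reset_episodes(lines, max_gap=timedelta(seconds=30)):
    """Group task-management entries into bursts.

    Returns a list of (first_seen, last_seen, message_count) tuples; a new
    burst starts whenever the gap to the previous entry exceeds max_gap.
    Lines are assumed to be in chronological order, as in a log file.
    """
    episodes = []
    for line in lines:
        m = TASKMGMT.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), TS_FORMAT)
        if episodes and ts - episodes[-1][1] <= max_gap:
            first, _, count = episodes[-1]
            episodes[-1] = (first, ts, count + 1)
        else:
            episodes.append((ts, ts, 1))
    return episodes


SAMPLE = [
    "2015-03-24T19:48:41.779Z cpu26:898854)lsi_mr3: mfi_TaskMgmt:262: ABORT",
    "2015-03-24T19:48:46.640Z cpu27:841399)lsi_mr3: mfi_TaskMgmt:262: ABORT",
    "2015-03-24T19:49:03.890Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT",
    # Invented entry, not from the real log, to demonstrate a second burst.
    "2015-03-25T01:48:50.000Z cpu8:32856)lsi_mr3: mfi_TaskMgmt:262: ABORT",
]

if __name__ == "__main__":
    for first, last, count in reset_episodes(SAMPLE):
        print(first, (last - first).total_seconds(), count)
```

The burst length is only a lower bound on the stall, since I/O may already be stuck before the first abort is logged and may stay stuck briefly after the last one.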

We've tried multiple LSI controller firmware packages (even beta firmware from LSI), various versions of the VMware drivers for the controller, hardware BIOS settings for the system and the controller, including power management and C-states.  You name it, we've tried it all without success.

I have an open ticket with Supermicro and VMware to solve this issue. I'll post more as I have information.

kashifkarar01
Enthusiast

Please check the following KB:

kb.vmware.com/kb/1021187

Regards,

Kashif
