Currently we have 2 datastores in our lab that we cannot bring online; they were working yesterday. I thought their partition info might have been overwritten by a Windows machine, but I've checked the partitions and they report OK after following this article: http://kb.vmware.com/kb/2046610
Both partitions come from the same NetApp aggregate and controller as the working datastores and are mapped to the same iSCSI initiators. When you attempt to mount them in the vSphere Client you get the following error:
Call "HostStorageSystem.MountVmfsVolume" for object "storageSystem" on ESXi "172.16.210.19" failed.
Operation failed, diagnostics report: Sysinfo error on operation returned status: Timeout. Please see the VMKernel.log for detailed information.
In the VMkernel log you just get this:
2015-01-16T10:40:36.584Z cpu16:8208)ScsiDeviceIO: 2331: Cmd(0x4124003b2880) 0x16, CmdSN 0x1cf0 from world 0 to dev "naa.60a9800043346c626b4a793543343973" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2015-01-16T10:40:36.584Z cpu6:10287)LVM: 11710: Failed to open device naa.60a9800043346c626b4a793543343973:1
2015-01-16T10:43:33.396Z cpu0:8224)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x41244237c740, 0) to dev "mpx.vmhba35:C0:T0:L0" on path "vmhba35:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2015-01-16T10:43:33.396Z cpu0:8224)ScsiDeviceIO: 2331: Cmd(0x41244237c740) 0x1a, CmdSN 0x1cfd from world 0 to dev "mpx.vmhba35:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2015-01-16T10:48:33.401Z cpu28:8220)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x412441596e00, 0) to dev "mpx.vmhba35:C0:T0:L0" on path "vmhba35:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2015-01-16T10:48:33.401Z cpu28:8220)ScsiDeviceIO: 2331: Cmd(0x412441596e00) 0x1a, CmdSN 0x1d10 from world 0 to dev "mpx.vmhba35:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
Device naa.60a9800043346c626b4a793543343973:1 is VMstore2
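As an aside, when you have pages of these lines it can help to tabulate the repeated failures. This is just a throwaway illustration (not a VMware tool); it assumes the log lines keep the exact `H:`/`D:`/`P:` layout shown above:

```python
import re

# Matches ScsiDeviceIO/NMP failure lines like the ones pasted above and pulls
# out the device name, the host/device/plugin status bytes, and the three
# sense-data bytes when present.
LINE_RE = re.compile(
    r'dev "(?P<dev>[^"]+)" ?.*?'
    r'[Ff]ailed:? H:(?P<h>0x[0-9a-fA-F]+) D:(?P<d>0x[0-9a-fA-F]+) P:(?P<p>0x[0-9a-fA-F]+)'
    r'(?:.*?sense data: (?P<sense>0x[0-9a-fA-F]+ 0x[0-9a-fA-F]+ 0x[0-9a-fA-F]+))?'
)

def parse_scsi_failures(log_text):
    """Return one dict per failure line: device, status bytes, sense bytes."""
    rows = []
    for m in LINE_RE.finditer(log_text):
        rows.append({
            "dev": m.group("dev"),
            "host": int(m.group("h"), 16),
            "device": int(m.group("d"), 16),
            "plugin": int(m.group("p"), 16),
            "sense": tuple(int(b, 16) for b in m.group("sense").split())
                     if m.group("sense") else None,
        })
    return rows
```

Feeding it the lines above makes it obvious that the naa device fails with a host-side status (H:0x5) and empty sense data, while the mpx device returns a device-side check condition (D:0x2, sense 0x5 0x20 0x0).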
/var/log # vsish
/> cd vmkModules/lvmdriver/unresolved/devices/
/vmkModules/lvmdriver/unresolved/devices/> ls
0#naa.60a9800043346c626b4a793543343973:1/
0#naa.60a9800043346c626b4b315142424361:1/
/vmkModules/lvmdriver/unresolved/devices/> cat 0#naa.60a9800043346c626b4a793543343973:1/properties
Unresolved device information {
VMK name:naa.60a9800043346c626b4a793543343973:1
LV name:536215f8-a64160c8-06e1-d4ae52901623
LV State:1
VMFS3 UUID (First extent only):536215fb-a58b7b2c-17af-d4ae52901623
VMFS3 label (First extent only):NetApp_VMstore2
Reason: Native unmounted volume
Extent address start (MB):0
Extent address end (MB):2096895
Volume total size (MB):2096896
}
/vmkModules/lvmdriver/unresolved/devices/
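For anyone following along, the interesting fields in that `properties` output are the VMFS label, the UUID, and the `Reason`, which feed into the decision to mount or resignature (typically via `esxcli storage vmfs snapshot`). A quick throwaway parser, purely illustrative and assuming the `key:value` layout vsish prints above, to collect them:

```python
# Hypothetical helper, not part of any VMware tooling: turn the vsish
# "properties" text into a dict so the label/UUID/Reason of each
# unresolved extent can be collected programmatically.
def parse_unresolved_props(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the "Unresolved device information {" header, the closing
        # brace, and anything without a key:value separator.
        if line.endswith("{") or line == "}" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        props[key.strip()] = value.strip()
    return props
```

Run against the output above it would report the label `NetApp_VMstore2` with reason `Native unmounted volume`, i.e. the LVM driver still sees the volume but considers it unresolved rather than a snapshot copy.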
Here are three important KBs for checking and diagnosing SCSI problems via SCSI sense codes:
From what I checked, there is a problem on your SCSI device.
Yup, I see that now:
VMK_SCSI_HOST_ABORT = 0x05 or 0x5 = This status is returned if the driver has to abort commands in-flight to the target. This can occur due to a command timeout or parity error in the frame.
So if I am reading this right, the SCSI command was aborted because of something the NetApp did?
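For quick triage of the `H:` byte in these logs, a small lookup table helps. The values below assume the VMkernel host-status codes follow the standard Linux `DID_*` numbering (consistent with the 0x5 = ABORT reading above); treat this as a sketch, not an official table:

```python
# Assumed host-status (H:) values, mirroring the Linux DID_* codes that the
# VMkernel host status appears to follow. 0x5 is the ABORT case discussed
# above: the driver aborted the in-flight command, e.g. after a timeout.
VMK_SCSI_HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x4: "BAD_TARGET",
    0x5: "ABORT",
    0x6: "PARITY",
    0x7: "ERROR",
    0x8: "RESET",
}

def decode_host_status(code):
    """Map an H: status byte to a readable name."""
    return VMK_SCSI_HOST_STATUS.get(code, "UNKNOWN(0x%x)" % code)
```

So `decode_host_status(0x5)` gives `ABORT` for the naa device, while the mpx lines have `H:0x0` (host-side OK) and the failure is reported by the device itself in the sense data instead.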
