Hi, I'm receiving the following error when attempting to create a disk group in vSAN: "Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1". The disk in question is my SSD. All the disks show as eligible according to vSAN.
Let me guess - you are running nested ESXi on top of VSAN?
If so, you need fake SCSI reservations - check out this William Lam post: http://www.virtuallyghetto.com/2013/11/how-to-run-nested-esxi-on-top-of-vsan.html
Nope, it's a physical server with physical disks.
Dunno the cause then, I'm afraid - I've only ever seen this with nested environments.
yes, same here
I've only seen it in virtual environments.
It should not be the case here, but can you check whether any stale partitions exist on this SSD and clean them up before re-attempting to create the disk group:
# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>
Another thing to look for is:
# esxcli vsan storage list --device=<naa.xxxxxx>
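To run both checks across every candidate disk in one pass, a minimal ESXi-shell sketch could look like this (`ptbl_label` is a made-up helper name; the loop itself is commented out since it only works on a live host):

```shell
# Hypothetical helper: pull the partition-table label ("gpt", "msdos",
# or "unknown") from the first line of `partedUtil getptbl` output.
ptbl_label() { awk 'NR==1 {print $1}'; }

# On an ESXi host, loop over the candidate disks:
# for DEV in /vmfs/devices/disks/naa.*; do
#     echo "== $DEV: $(partedUtil getptbl "$DEV" | ptbl_label)"
#     esxcli vsan storage list --device="$(basename "$DEV")"
# done

# Sample getptbl output format as seen later in this thread:
printf 'msdos\n1305 255 63 20971520\n' | ptbl_label   # -> msdos
```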
Thanks,
partedUtil reports no partitions on the disk.
The output of # esxcli vsan storage list --device=<naa.xxxxxx> was this -
[root@localhost:~] esxcli vsan storage list --device=/vmfs/devices/disks/naa.55cd2e404b66fd08
Unable to find device with name: /vmfs/devices/disks/naa.55cd2e404b66fd08
Thanks,
Don't see any issue in there... sorry I couldn't be of much help.
One other thing I would check is:
# esxcfg-scsidevs -a (controller/disk drive info)
# esxcfg-scsidevs -c (device info)
# esxcfg-scsidevs -ld <device> (device parameters)
And if the disks/controller are on the VMware HCL, I would suggest filing a defect with support logs attached, so we can analyze what's going on.
We had a similar issue and it was because the drive was MBR and we had to change it to GPT. My apologies, I cannot find the link for instructions. Thank you, Zach.
hmm
OK, if that is the case then the instructions for changing it are easy.
First check:
# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>
to see what it reports, i.e. "gpt" or "msdos".
If it's the latter, then you have to clear the partition table entries from the device. You can do this by dd'ing zeros over the **first** and **last** 34 sectors of the device.
This will give you an unused disk and you should be good to go.
PS: you have to clear both the first and last 34 sectors if you have a GPT header, since the backup partition table is stored in the last 34 sectors.
eg:
partedUtil getptbl /vmfs/devices/disks/naa.60060160153020006c952ce0dbcde011
msdos <====
1305 255 63 20971520
dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc
34+0 records in
34+0 records out
dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc seek=20972126
34+0 records in
34+0 records out
partedUtil getptbl /vmfs/devices/disks/naa.60000970000194900650533030453034
unknown <====
1305 255 63 20972160
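Putting the steps above together, here is a small sketch (assuming the ESXi shell; `last_gpt_seek` is a made-up helper name) that derives the dd seek offset for the backup GPT from the sector count partedUtil reports:

```shell
# The backup GPT occupies the last 34 sectors, so the dd seek offset is
# total_sectors - 34 (e.g. 20972160 - 34 = 20972126, as in the example above).
last_gpt_seek() { echo $(( $1 - 34 )); }

# On the host itself (the total sector count is the last field of the
# geometry line that partedUtil getptbl prints):
# DEV=/vmfs/devices/disks/naa.xxxxxxxx
# TOTAL=$(partedUtil getptbl "$DEV" | awk 'NR==2 {print $NF}')
# dd if=/dev/zero of="$DEV" count=34 conv=notrunc
# dd if=/dev/zero of="$DEV" count=34 conv=notrunc seek=$(last_gpt_seek "$TOTAL")

last_gpt_seek 20972160   # -> 20972126
```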
Thanks,
We have zeroed the first and last 34 sectors, and all three disks (2 x magnetic + 1 x SSD) now show as unknown.
We receive the same "failed to reserve" error when trying to claim the disks.
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062e1f8ef
unknown
121533 255 63 1952428032
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062b52a23
unknown
121533 255 63 1952428032
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.55cd2e404b66fd08
unknown
24296 255 63 390328320
A general system error occurred: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1
Hmm, can you post the output of:
# esxcfg-scsidevs -a
# esxcfg-scsidevs -ld naa.5000c50062b52a23
# esxcfg-mpath -ld naa.5000c50062b52a23
The type of controller/disks could give some insight into what's going on. The build is 5.5U2/U3, right?
Thanks,
[root@localhost:~] esxcfg-scsidevs -a
vmhba0 ata_piix link-n/a sata.vmhba0 (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller
vmhba1 ata_piix link-n/a sata.vmhba1 (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller
vmhba2 aacraid link-n/a pscsi.vmhba2 (0000:05:00.0) Adaptec AACRAID
vmhba32 usb-storage link-n/a usb.vmhba32 () USB
vmhba33 ata_piix link-n/a sata.vmhba33 (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller
vmhba34 ata_piix link-n/a sata.vmhba34 (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller
[root@localhost:~] esxcfg-scsidevs -ld naa.55cd2e404b6625ce
naa.55cd2e404b6625ce
Device Type: Direct-Access
Size: 190590 MB
Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)
Multipath Plugin: NMP
Console Device: /vmfs/devices/disks/naa.55cd2e404b6625ce
Devfs Path: /vmfs/devices/disks/naa.55cd2e404b6625ce
Vendor: INTEL Model: SSDSC2BA200G3 Revis: 5DV1
SCSI Level: 5 Is Pseudo: false Status: on
Is RDM Capable: false Is Removable: false
Is Local: true Is SSD: true
Other Names:
vml.020000000055cd2e404b6625ce535344534332
VAAI Status: unsupported
[root@localhost:~] esxcfg-mpath -ld naa.55cd2e404b6625ce
pscsi.vmhba2-pscsi.1:3-naa.55cd2e404b6625ce
Runtime Name: vmhba2:C1:T3:L0
Device: naa.55cd2e404b6625ce
Device Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)
Adapter: vmhba2 Channel: 1 Target: 3 LUN: 0
Adapter Identifier: pscsi.vmhba2
Target Identifier: pscsi.1:3
Plugin: NMP
State: active
Transport: parallel
The build is 6.0.0
We can't even create a normal datastore on these SSDs.
Call "HostDatastoreSystem.CreateVmfsDatastore" for object "datastoreSystem-54" on vCenter Server "vcenter" failed.
Operation failed, diagnostics report: Unable to create Filesystem, please see VMkernel log for more details: ATS on device /dev/disks/naa.55cd2e404b6625ce:1: not supported
This happens across three hosts. All three hosts were fine before upgrading to ESXi 6.0, but since the upgrade the SSDs have started to display this behaviour. If we mark an HDD as SSD we can bring up vSAN; the HDDs are connected to the same RAID card as the SSDs, and we see this same behaviour across all three hosts.
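For reference, marking a disk as SSD on ESXi 5.5/6.0 is typically done with a PSA claim rule (per VMware KB 2013188), run on the host itself. The helper below is just a hypothetical wrapper that prints the commands for a given device:

```shell
# Hypothetical helper: print the esxcli commands that tag a device as SSD
# via a SATP claim rule (commands per VMware KB 2013188; run them on the
# ESXi host against the real device name).
mark_ssd_cmds() {
    echo "esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device $1 --option enable_ssd"
    echo "esxcli storage core claiming reclaim --device $1"
}

mark_ssd_cmds naa.55cd2e404b66fd08
```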
See this in dmesg:
2015-04-01T21:58:49.867Z cpu14:32821)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.55cd2e404b6625ce" state in doubt; requested fast path state update...
2015-04-01T21:58:49.867Z cpu14:32821)ScsiDeviceIO: 2646: Cmd(0x43a5804eb940) 0x16, CmdSN 0x3854 from world 0 to dev "naa.55cd2e404b6625ce" failed H:0x7 D:0x2 P:0x0 Possible sense data: 0x5 0x20 0x0.
2015-04-01T21:58:49.987Z cpu18:35109)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.
2015-04-01T21:58:50.105Z cpu15:35110)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.
2015-04-01T21:58:50.140Z cpu17:35072 opID=517541e1)LVM: 9274: LVMProbeDevice failed on (3468846912, naa.55cd2e404b6625ce:1): Device does not contain a logical volume
2015-04-01T21:58:56.499Z cpu19:35902)FSS: 5327: No FS driver claimed device 'control': No filesystem on the device
2015-04-01T21:58:56.829Z cpu19:35902)FSS: 5327: No FS driver claimed device 'naa.55cd2e404b6625ce:1': No filesystem on the device
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3551: Device rescan time 206 msec (total number of devices 5)
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3554: Filesystem probe time 352 msec (devices probed 5 of 5)
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3556: Refresh open volume time 0 msec
2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device naa.55cd2e404b6625ce repeated 2 times
2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 29 times
2015-04-01T22:01:01.646Z cpu14:32821)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x43a58046cbc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2015-04-01T22:01:46.596Z cpu4:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 1 times
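As an aside, the reservation-related entries can be filtered out of the live log with something like this (the pattern list is just a guess at the interesting messages):

```shell
# Filter reservation-related NMP messages from a vmkernel log stream.
reserve_errors() { grep -E 'RESERVE failed|reservation state|state in doubt'; }

# On the host:  reserve_errors < /var/log/vmkernel.log

# Sample line from the dmesg output above:
echo 'NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0' | reserve_errors
```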
Thanks for the command output - it explains what's happening:
> vmhba2 aacraid link-n/a pscsi.vmhba2 (0000:05:00.0) Adaptec AACRAID
> Adapter Identifier: pscsi.vmhba2
> Target Identifier: pscsi.1:3 <=====
> Plugin: NMP
> State: active
> Transport: parallel <=====
Also, if I remember correctly, this is not supported in vSphere either for PSCSI-RAID
( http://www.vmware.com/resources/compatibility/search.php?deviceCategory=io )
aacraid is a parallel SCSI driver and it does not have SAS transport capability. That's why its transport type is listed as PSCSI.
For VSAN we require a SAS or SAS/SATA RAID controller (you can see the complete list in the VSAN HCL).
Do not use this RAID controller in JBOD mode. Create a RAID-0 array for each disk and vSAN will work.
Noted on the RAID advice... but it is a JBOD controller.
But what I did find is that if I create a VMFS filesystem and copy over 10 GB or so (maybe just creating the partition was sufficient, but I did the copy to test the health of the disk), I could then delete the datastore... and then vSAN would allow consumption of the disks.
vSAN is back up... well, next step is to set it up again.