Hi, I'm receiving the following error when attempting to create a disk group in vSAN: "Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1". The disk in question is my SSD. All the disks show as eligible according to vSAN.
Let me guess - you are running nested ESXi on top of VSAN?
If so, you need fake SCSI reservations - check out this William Lam post: http://www.virtuallyghetto.com/2013/11/how-to-run-nested-esxi-on-top-of-vsan.html
Nope, it's a physical server with physical disks.
Dunno the cause then, I'm afraid - I've only ever seen this with nested environments.
yes, same here
I've only seen it in virtual environments.
It should not be the case here, but can you check whether any stale partitions exist on this SSD and clean them up before re-attempting to create the disk group:
# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>
Another thing to look for is:
# esxcli vsan storage list --device=<naa.xxxxxx>
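To run both checks across every candidate disk in one pass, a minimal ESXi-shell sketch could look like this (`ptbl_label` is a made-up helper name; the loop itself is commented out since it only works on a live host):

```shell
# Hypothetical helper: pull the partition-table label ("gpt", "msdos",
# or "unknown") from the first line of `partedUtil getptbl` output.
ptbl_label() { awk 'NR==1 {print $1}'; }

# On an ESXi host, loop over the candidate disks:
# for DEV in /vmfs/devices/disks/naa.*; do
#     echo "== $DEV: $(partedUtil getptbl "$DEV" | ptbl_label)"
#     esxcli vsan storage list --device="$(basename "$DEV")"
# done

# Sample getptbl output format as seen later in this thread:
printf 'msdos\n1305 255 63 20971520\n' | ptbl_label   # -> msdos
```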
Thanks,
partedUtil reports no partitions on the disk.
The output of # esxcli vsan storage list --device=<naa.xxxxxx> was this -
[root@localhost:~] esxcli vsan storage list --device=/vmfs/devices/disks/naa.55cd2e404b66fd08
Unable to find device with name: /vmfs/devices/disks/naa.55cd2e404b66fd08
Thanks,
Don't see any issue in there... sorry I couldn't be of much help.
One other thing I would check is:
# esxcfg-scsidevs -a (controller/disk drive info)
# esxcfg-scsidevs -c (device info)
# esxcfg-scsidevs -ld <device> (device parameters)
And if the disks/controller are on the VMware HCL, I would suggest filing a defect with support logs attached, so we can analyze what's going on.
We had a similar issue and it was because the drive was MBR and we had to change it to GPT. My apologies, I cannot find the link for instructions. Thank you, Zach.
hmm
OK, if that is the case then the instructions for changing it are easy.
First check:
# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>
to see what it reports, i.e. "gpt" or "msdos".
If it's the latter, then you have to clear the partition table entries from the device. You can do this by dd'ing zeros over the **first** and **last** 34 sectors of the device.
This will give you an unused disk and you should be good to go.
PS: you have to clear both the first and last 34 sectors if you have a GPT header, since the backup partition table is stored in the last 34 sectors.
eg:
partedUtil getptbl /vmfs/devices/disks/naa.60060160153020006c952ce0dbcde011
msdos <====
1305 255 63 20971520
dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc
34+0 records in
34+0 records out
dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc seek=20972126
34+0 records in
34+0 records out
partedUtil getptbl /vmfs/devices/disks/naa.60000970000194900650533030453034
unknown <====
1305 255 63 20972160
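Putting the steps above together, here is a small sketch (assuming the ESXi shell; `last_gpt_seek` is a made-up helper name) that derives the dd seek offset for the backup GPT from the sector count partedUtil reports:

```shell
# The backup GPT occupies the last 34 sectors, so the dd seek offset is
# total_sectors - 34 (e.g. 20972160 - 34 = 20972126, as in the example above).
last_gpt_seek() { echo $(( $1 - 34 )); }

# On the host itself (the total sector count is the last field of the
# geometry line that partedUtil getptbl prints):
# DEV=/vmfs/devices/disks/naa.xxxxxxxx
# TOTAL=$(partedUtil getptbl "$DEV" | awk 'NR==2 {print $NF}')
# dd if=/dev/zero of="$DEV" count=34 conv=notrunc
# dd if=/dev/zero of="$DEV" count=34 conv=notrunc seek=$(last_gpt_seek "$TOTAL")

last_gpt_seek 20972160   # -> 20972126
```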
Thanks,
We have zeroed the first and last 34 sectors, and all three disks (2 x magnetic + 1 x SSD) now show as unknown.
We receive the same "failed to reserve" error when trying to claim the disks.
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062e1f8ef
unknown
121533 255 63 1952428032
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062b52a23
unknown
121533 255 63 1952428032
[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.55cd2e404b66fd08
unknown
24296 255 63 390328320
A general system error occurred: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1
Hmm, can you post the output of:
# esxcfg-scsidevs -a
# esxcfg-scsidevs -ld naa.5000c50062b52a23
# esxcfg-mpath -ld naa.5000c50062b52a23
The type of controller/disks could give some insight into what's going on. The build is 5.5U2/U3, right?
Thanks,
[root@localhost:~] esxcfg-scsidevs -a
vmhba0 ata_piix link-n/a sata.vmhba0 (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller
vmhba1 ata_piix link-n/a sata.vmhba1 (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller
vmhba2 aacraid link-n/a pscsi.vmhba2 (0000:05:00.0) Adaptec AACRAID
vmhba32 usb-storage link-n/a usb.vmhba32 () USB
vmhba33 ata_piix link-n/a sata.vmhba33 (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller
vmhba34 ata_piix link-n/a sata.vmhba34 (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller
[root@localhost:~] esxcfg-scsidevs -ld naa.55cd2e404b6625ce
naa.55cd2e404b6625ce
Device Type: Direct-Access
Size: 190590 MB
Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)
Multipath Plugin: NMP
Console Device: /vmfs/devices/disks/naa.55cd2e404b6625ce
Devfs Path: /vmfs/devices/disks/naa.55cd2e404b6625ce
Vendor: INTEL Model: SSDSC2BA200G3 Revis: 5DV1
SCSI Level: 5 Is Pseudo: false Status: on
Is RDM Capable: false Is Removable: false
Is Local: true Is SSD: true
Other Names:
vml.020000000055cd2e404b6625ce535344534332
VAAI Status: unsupported
[root@localhost:~] esxcfg-mpath -ld naa.55cd2e404b6625ce
pscsi.vmhba2-pscsi.1:3-naa.55cd2e404b6625ce
Runtime Name: vmhba2:C1:T3:L0
Device: naa.55cd2e404b6625ce
Device Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)
Adapter: vmhba2 Channel: 1 Target: 3 LUN: 0
Adapter Identifier: pscsi.vmhba2
Target Identifier: pscsi.1:3
Plugin: NMP
State: active
Transport: parallel
The build is 6.0.0
We can't even create a normal datastore on these SSDs.
Call "HostDatastoreSystem.CreateVmfsDatastore" for object "datastoreSystem-54" on vCenter Server "vcenter" failed.
Operation failed, diagnostics report: Unable to create Filesystem, please see VMkernel log for more details: ATS on device /dev/disks/naa.55cd2e404b6625ce:1: not supported
This happens across three hosts. All three hosts were fine before upgrading to ESXi 6.0, but since the upgrade the SSDs have started to display this behaviour. If we mark an HDD as SSD we can bring up vSAN; the HDDs are connected to the same RAID card as the SSDs, and we see this same behaviour across all three hosts.
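For reference, marking a disk as SSD on ESXi 5.5/6.0 is typically done with a PSA claim rule (per VMware KB 2013188), run on the host itself. The helper below is just a hypothetical wrapper that prints the commands for a given device:

```shell
# Hypothetical helper: print the esxcli commands that tag a device as SSD
# via a SATP claim rule (commands per VMware KB 2013188; run them on the
# ESXi host against the real device name).
mark_ssd_cmds() {
    echo "esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device $1 --option enable_ssd"
    echo "esxcli storage core claiming reclaim --device $1"
}

mark_ssd_cmds naa.55cd2e404b66fd08
```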
See this in dmesg:
2015-04-01T21:58:49.867Z cpu14:32821)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.55cd2e404b6625ce" state in doubt; requested fast path state update...
2015-04-01T21:58:49.867Z cpu14:32821)ScsiDeviceIO: 2646: Cmd(0x43a5804eb940) 0x16, CmdSN 0x3854 from world 0 to dev "naa.55cd2e404b6625ce" failed H:0x7 D:0x2 P:0x0 Possible sense data: 0x5 0x20 0x0.
2015-04-01T21:58:49.987Z cpu18:35109)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.
2015-04-01T21:58:50.105Z cpu15:35110)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.
2015-04-01T21:58:50.140Z cpu17:35072 opID=517541e1)LVM: 9274: LVMProbeDevice failed on (3468846912, naa.55cd2e404b6625ce:1): Device does not contain a logical volume
2015-04-01T21:58:56.499Z cpu19:35902)FSS: 5327: No FS driver claimed device 'control': No filesystem on the device
2015-04-01T21:58:56.829Z cpu19:35902)FSS: 5327: No FS driver claimed device 'naa.55cd2e404b6625ce:1': No filesystem on the device
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3551: Device rescan time 206 msec (total number of devices 5)
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3554: Filesystem probe time 352 msec (devices probed 5 of 5)
2015-04-01T21:58:56.829Z cpu19:35902)VC: 3556: Refresh open volume time 0 msec
2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device naa.55cd2e404b6625ce repeated 2 times
2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 29 times
2015-04-01T22:01:01.646Z cpu14:32821)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x43a58046cbc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2015-04-01T22:01:46.596Z cpu4:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 1 times
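As an aside, the reservation-related entries can be filtered out of the live log with something like this (the pattern list is just a guess at the interesting messages):

```shell
# Filter reservation-related NMP messages from a vmkernel log stream.
reserve_errors() { grep -E 'RESERVE failed|reservation state|state in doubt'; }

# On the host:  reserve_errors < /var/log/vmkernel.log

# Sample line from the dmesg output above:
echo 'NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0' | reserve_errors
```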
Thanks for the command output - it explains what's happening:
> vmhba2 aacraid link-n/a pscsi.vmhba2 (0000:05:00.0) Adaptec AACRAID
> Adapter Identifier: pscsi.vmhba2
> Target Identifier: pscsi.1:3 <=====
> Plugin: NMP
> State: active
> Transport: parallel <=====
Also, if I remember correctly, this is not supported in vSphere either for PSCSI-RAID
( http://www.vmware.com/resources/compatibility/search.php?deviceCategory=io )
aacraid is a parallel SCSI driver and it does not have SAS transport capability. That's why its transport type is listed as PSCSI.
For VSAN we require a SAS or SAS/SATA RAID controller (you can see the complete list in the VSAN HCL).
Do not use this RAID controller in JBOD mode. Create a RAID-0 array for each disk and vSAN will work.
Noted on the RAID advice... but it is a JBOD controller.
But what I did find is that if I create a VMFS filesystem and copy over 10 GB or so (maybe just creating the partition was sufficient, but I did the copy to test the health of the disk), I could then delete the datastore... and then vSAN would allow consumption of the disks.
vSAN is back up... well, next step is to set it up again.