itinferv
Contributor
Contributor

Failure to create disk group - Failed to reserve disk with exception: Failed to reserve disk with exception: Reserve failed with error code: -1

Hi, I'm receiving the following error when attempting to create a disk group in vSAN: "Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1." The disk in question is my SSD. All disks show as eligible according to vSAN.

0 Kudos
17 Replies
CHogan
VMware Employee
VMware Employee

Let me guess - you are running nested VSAN on top of VSAN?

If so, you need fake SCSI reservations - check this William Lam post out - http://www.virtuallyghetto.com/2013/11/how-to-run-nested-esxi-on-top-of-vsan.html
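
If I recall correctly, the key step from that post is enabling an advanced setting on the nested ESXi host; a sketch (verify against William's post before relying on it):

```shell
# Allow the nested ESXi host to fake SCSI reservations so vSAN can
# claim disks that are really VMDKs on an underlying VSAN datastore.
esxcli system settings advanced set -o /VSAN/FakeSCSIReservations -i 1
```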

http://cormachogan.com
0 Kudos
itinferv
Contributor
Contributor

Nope, it's a physical server with physical disks.

0 Kudos
CHogan
VMware Employee
VMware Employee

Not sure of the cause then, I'm afraid. I've only ever seen this in nested environments.

http://cormachogan.com
0 Kudos
ramakrishnak
VMware Employee
VMware Employee

Yes, same here.

I've only seen this in virtual environments.

0 Kudos
ramakrishnak
VMware Employee
VMware Employee

That shouldn't be the case here, but can you check whether any stale partitions exist on this SSD and clean them up before re-attempting to create the disk group?


# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>

The other thing to look for is:

# esxcli vsan storage list --device=<naa.xxxxxx>
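
If getptbl does report stale partitions, they can be removed one at a time with partedUtil delete before retrying the disk group creation. A sketch, using the NAA ID from this thread:

```shell
# Device path uses the NAA ID reported in the error message.
DISK=/vmfs/devices/disks/naa.55cd2e404b66fd08

# List the partition table; each numbered line is a partition entry.
partedUtil getptbl "$DISK"

# Delete stale partition 1; repeat for any other partition numbers listed.
partedUtil delete "$DISK" 1
```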

Thanks,

0 Kudos
itinferv
Contributor
Contributor

partedUtil reports no partitions on the disk.

The output of # esxcli vsan storage list --device=<naa.xxxxxx> was this:

[root@localhost:~] esxcli vsan storage list --device+ /vmfs/devices/disks/naa.55cd2e404b66fd08

Unable to find device with name: /vmfs/devices/disks/naa.55cd2e404b66fd08

0 Kudos
ramakrishnak
VMware Employee
VMware Employee

Thanks.

I don't see any issue there... sorry I couldn't be of much help.

One other thing I would check:

# esxcfg-scsidevs -a   (controller/disk drive info)

# esxcfg-scsidevs -c   (device info)

# esxcfg-scsidevs -ld <device>   (device parameters)

If the disks/controller are on the VMware HCL, I'd suggest filing a defect with support logs, so we could analyze what's going on.

0 Kudos
zdickinson
Expert
Expert

We had a similar issue and it was because the drive was MBR and we had to change it to GPT.  My apologies, I cannot find the link for instructions.  Thank you, Zach.

0 Kudos
ramakrishnak
VMware Employee
VMware Employee

Hmm, OK. If that's the case, changing it is straightforward.

First check:

# partedUtil getptbl /vmfs/devices/disks/<naa.xxxxxx>

to see what it reports, i.e. "gpt" or "msdos".

If it's the latter, you have to clear the partition table entries from the device. You can do this by dd'ing zeros over the **first** and **last** 34 sectors of the device.

This will give you an unused disk and you should be good to go.

PS: you have to clear both the first and last 34 sectors if you have a GPT header, since the backup GPT is stored in the last 34 sectors.


eg:

partedUtil getptbl /vmfs/devices/disks/naa.60060160153020006c952ce0dbcde011

msdos  <====

1305 255 63 20971520


dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc

34+0 records in

34+0 records out

dd if=/dev/zero of=/vmfs/devices/disks/naa.60000970000194900650533030453034 count=34 conv=notrunc seek=20972126

34+0 records in

34+0 records out

partedUtil getptbl /vmfs/devices/disks/naa.60000970000194900650533030453034

unknown <====

1305 255 63 20972160
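
The seek value in the second dd command is just the disk size in sectors (the last field of the getptbl geometry line) minus 34, so the zeros land on the backup GPT at the end of the disk. A quick shell check, using the numbers from the example above:

```shell
# Total sectors from "partedUtil getptbl" (last field of the geometry line).
TOTAL_SECTORS=20972160

# Zero the last 34 sectors: seek past everything except the final 34.
SEEK=$((TOTAL_SECTORS - 34))
echo "$SEEK"   # matches the seek value used in the dd command above
```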


Thanks,




0 Kudos
sengland
Contributor
Contributor

We have zeroed the first and last 34 sectors, and all the disks (2 x magnetic + 1 x SSD) now show as unknown.

We receive the same "failed to reserve" error when trying to claim the disks.

[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062e1f8ef

unknown

121533 255 63 1952428032

[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.5000c50062b52a23

unknown

121533 255 63 1952428032

[root@localhost:~] partedUtil getptbl /vmfs/devices/disks/naa.55cd2e404b66fd08

unknown

24296 255 63 390328320

A general system error occurred: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Failed to reserve disk naa.55cd2e404b66fd08 with exception: Reserve failed with error code: -1

0 Kudos
ramakrishnak
VMware Employee
VMware Employee

Hmm, can you post the output of:

# esxcfg-scsidevs -a

# esxcfg-scsidevs -ld naa.5000c50062b52a23

# esxcfg-mpath -ld naa.5000c50062b52a23

The type of controller/disks could give some insight into what's going on. The build is 5.5 U2/U3, right?

Thanks,

0 Kudos
itinferv
Contributor
Contributor

[root@localhost:~] esxcfg-scsidevs -a

vmhba0  ata_piix          link-n/a  sata.vmhba0                             (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller

vmhba1  ata_piix          link-n/a  sata.vmhba1                             (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller

vmhba2  aacraid           link-n/a  pscsi.vmhba2                            (0000:05:00.0) Adaptec AACRAID

vmhba32 usb-storage       link-n/a  usb.vmhba32                             () USB

vmhba33 ata_piix          link-n/a  sata.vmhba33                            (0000:00:1f.2) Intel Corporation ICH10 4 port SATA IDE Controller

vmhba34 ata_piix          link-n/a  sata.vmhba34                            (0000:00:1f.5) Intel Corporation ICH10 2 port SATA IDE Controller

[root@localhost:~] esxcfg-scsidevs -ld naa.55cd2e404b6625ce

naa.55cd2e404b6625ce

   Device Type: Direct-Access

   Size: 190590 MB

   Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)

   Multipath Plugin: NMP

   Console Device: /vmfs/devices/disks/naa.55cd2e404b6625ce

   Devfs Path: /vmfs/devices/disks/naa.55cd2e404b6625ce

   Vendor: INTEL     Model: SSDSC2BA200G3     Revis: 5DV1

   SCSI Level: 5  Is Pseudo: false Status: on

   Is RDM Capable: false Is Removable: false

   Is Local: true  Is SSD: true

   Other Names:

      vml.020000000055cd2e404b6625ce535344534332

   VAAI Status: unsupported

[root@localhost:~] esxcfg-mpath -ld naa.55cd2e404b6625ce

pscsi.vmhba2-pscsi.1:3-naa.55cd2e404b6625ce

   Runtime Name: vmhba2:C1:T3:L0

   Device: naa.55cd2e404b6625ce

   Device Display Name: Local INTEL Disk (naa.55cd2e404b6625ce)

   Adapter: vmhba2 Channel: 1 Target: 3 LUN: 0

   Adapter Identifier: pscsi.vmhba2

   Target Identifier: pscsi.1:3

   Plugin: NMP

   State: active

   Transport: parallel

0 Kudos
sengland
Contributor
Contributor

The build is 6.0.0

0 Kudos
sengland
Contributor
Contributor

We can't even create a normal datastore on these SSDs.

Call "HostDatastoreSystem.CreateVmfsDatastore" for object "datastoreSystem-54" on vCenter Server "vcenter" failed.

Operation failed, diagnostics report:  Unable to create Filesystem, please see VMkernel log for more details: ATS on device /dev/disks/naa.55cd2e404b6625ce:1: not supported

This happens across three hosts. All three were fine before upgrading to ESX 6.0, but since the upgrade the SSDs have started to display this behaviour. If we mark an HDD as SSD we can bring up vSAN; the HDDs are connected to the same RAID card as the SSDs, and we see the same behaviour across all three hosts.
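
For reference, we tagged the HDD as SSD with a SATP claim rule along these lines (a sketch; the device ID is one of our magnetic disks):

```shell
# Add a claim rule that marks the local HDD as SSD, then reclaim the
# device so the flag takes effect without a reboot.
esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL \
    --device=naa.5000c50062e1f8ef --option="enable_ssd"
esxcli storage core claiming reclaim -d naa.5000c50062e1f8ef
```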

See this in dmesg:

2015-04-01T21:58:49.867Z cpu14:32821)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.55cd2e404b6625ce" state in doubt; requested fast path state update...

2015-04-01T21:58:49.867Z cpu14:32821)ScsiDeviceIO: 2646: Cmd(0x43a5804eb940) 0x16, CmdSN 0x3854 from world 0 to dev "naa.55cd2e404b6625ce" failed H:0x7 D:0x2 P:0x0 Possible sense data: 0x5 0x20 0x0.

2015-04-01T21:58:49.987Z cpu18:35109)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.

2015-04-01T21:58:50.105Z cpu15:35110)NMP: nmp_PathDetermineFailure:2901: SCSI cmd RESERVE failed on path vmhba2:C1:T3:L0, reservation state on device naa.55cd2e404b6625ce is unknown.

2015-04-01T21:58:50.140Z cpu17:35072 opID=517541e1)LVM: 9274: LVMProbeDevice failed on (3468846912, naa.55cd2e404b6625ce:1): Device does not contain a logical volume

2015-04-01T21:58:56.499Z cpu19:35902)FSS: 5327: No FS driver claimed device 'control': No filesystem on the device

2015-04-01T21:58:56.829Z cpu19:35902)FSS: 5327: No FS driver claimed device 'naa.55cd2e404b6625ce:1': No filesystem on the device

2015-04-01T21:58:56.829Z cpu19:35902)VC: 3551: Device rescan time 206 msec (total number of devices 5)

2015-04-01T21:58:56.829Z cpu19:35902)VC: 3554: Filesystem probe time 352 msec (devices probed 5 of 5)

2015-04-01T21:58:56.829Z cpu19:35902)VC: 3556: Refresh open volume time 0 msec

2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device naa.55cd2e404b6625ce repeated 2 times

2015-04-01T21:59:46.596Z cpu2:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 29 times

2015-04-01T22:01:01.646Z cpu14:32821)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x43a58046cbc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2015-04-01T22:01:46.596Z cpu4:33324)NMP: nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:C0:T0:L0 repeated 1 times

0 Kudos
ramakrishnak
VMware Employee
VMware Employee

Thanks for the command output - this explains what's happening.

> vmhba2  aacraid           link-n/a  pscsi.vmhba2                            (0000:05:00.0) Adaptec AACRAID

>   Adapter Identifier: pscsi.vmhba2

>  Target Identifier: pscsi.1:3  <=====

>   Plugin: NMP

>   State: active

>   Transport: parallel  <=====

Also, if I remember correctly, this isn't supported in vSphere either for parallel SCSI RAID

( http://www.vmware.com/resources/compatibility/search.php?deviceCategory=io )

aacraid is a parallel SCSI driver and does not have SAS transport capability; that's why its transport type is listed as PSCSI.

For VSAN we require a SAS or SAS/SATA RAID controller (you can see the complete list in the VSAN HCL).



0 Kudos
malefik
Enthusiast
Enthusiast

Don't use JBOD mode on this RAID controller. Create a RAID 0 array for each disk and vSAN will work.

0 Kudos
JeremeyWise
Enthusiast
Enthusiast

I noted the RAID suggestion, but this is a JBOD controller.

What I did find is that if I create a VMFS file system and copy over 10 GB or so (maybe just creating the partition was sufficient, but I did the copy to test the health of the disk), I could then delete the datastore, and vSAN would then allow the disk to be consumed.

vSAN is back up... well, on to the next step of setup.

JeremeyWise_0-1644869136663.png
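
The same workaround can be sketched from the CLI (device name borrowed from earlier in the thread; partition geometry is disk-specific, so the datastore step itself is easiest from the host client UI):

```shell
# Hypothetical example using the SSD NAA ID from earlier in the thread.
DISK=/vmfs/devices/disks/naa.55cd2e404b66fd08

# Write a fresh GPT label, wiping any stale partition table.
partedUtil mklabel "$DISK" gpt

# Then create a temporary VMFS datastore on the disk (host client UI,
# or partedUtil setptbl + vmkfstools -C with disk-specific geometry),
# delete that datastore, and retry claiming the disk for vSAN.
```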

 

 

 


Nerd needing coffee
0 Kudos