continuum
Immortal

NVMe disks in Workstation 15.5.1 are unstable in nested ESXi VMs

Since ESXi is a more or less supported guest OS, I have never had any problems with unstable virtual SCSI disks.

I can't say the same about virtual NVMe disks.

They look less stable than SCSI disks.

So if you are considering testing NVMe disks in nested ESXi VMs, better use SCSI disks instead if you want stable datastores.

I have noticed a couple of times recently that NVMe-based datastores come and go.

Neither the nested ESXis (6.0, 6.7 and higher) nor Workstation reacted with error messages.

The datastores became unavailable and the t10............. device node disappeared from /dev/disks.

Reactivating them after the datastores disappear did not work from the ESXi side.

But they usually come back once the ESXi VM has been completely powered off.

A reboot alone seems to work less often.
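
For anyone who wants to catch the moment a device node goes away, here is a minimal Python sketch that snapshots /dev/disks and compares two snapshots. The function names `list_disk_nodes` and `vanished` and the default prefix "t10." are my own assumptions for illustration, not part of any VMware tool, and I have not verified this on every ESXi build.

```python
import os


def list_disk_nodes(dev_dir="/dev/disks", prefix="t10."):
    """Return the sorted whole-device node names in dev_dir that start with prefix."""
    try:
        names = os.listdir(dev_dir)
    except FileNotFoundError:
        # Directory does not exist (for example when run outside ESXi).
        return []
    # Partition nodes carry a ":<number>" suffix; keep only whole-device nodes.
    return sorted(n for n in names if n.startswith(prefix) and ":" not in n)


def vanished(before, after):
    """Return the names present in the earlier snapshot but absent from the later one."""
    return sorted(set(before) - set(after))
```

Take one snapshot while the datastores are healthy, take another when they have dropped out, and pass both lists to `vanished` to see which nodes are gone.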


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

2 Replies
dariusd
Leadership

Have you noticed any particular sequence of operations which triggers the problem?

Is there nothing relevant at all logged in the ESXi VM's /var/run/log/vmkwarning.log or any of the other logfiles alongside it at the time that the NVMe datastore disappears?

Thanks,

--

Darius

continuum
Immortal

This is something I just found ...

2020-02-07T04:18:17.191Z cpu1:66042)ScsiDeviceIO: 9671: Could not detect setting of sitpua for device t10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000002. Error Not supported.

2020-02-07T04:18:17.191Z cpu0:66041)ScsiDeviceIO: 9174: QErr is correctly set to 0x0 for device t10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000001.

2020-02-07T04:18:17.191Z cpu0:66041)WARNING: NvmeScsi: 149: SCSI opcode 0x1a (0x451e429fe1c0) on path vmhba0:C0:T0:L0 to namespace t10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000001 failed with NVMe error

2020-02-07T04:18:17.191Z cpu0:66041)WARNING: status: 0x2 translating to SCSI error H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0

2020-02-07T04:18:17.191Z cpu0:66041)ScsiDeviceIO: 9671: Could not detect setting of sitpua for device t10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000001. Error Not supported.

2020-02-07T04:18:17.192Z cpu0:66039)ScsiUid: 321: Path 'vmhba64:C0:T0:L0' does not support VPD Device Id page.

2020-02-07T04:18:17.192Z cpu0:66039)VMWARE SCSI Id: Could not get disk id for vmhba64:C0:T0:L0

2020-02-07T04:18:17.192Z cpu0:66039)StorageApdHandler: 976: APD Handle  Created with lock[StorageApd-0x4302d70adef0]

2020-02-07T04:18:17.192Z cpu0:66039)ScsiEvents: 508: Event Subsystem: Device Events, Created!

2020-02-07T04:18:17.192Z cpu0:66039)ScsiEvents: 508: Event Subsystem: Device Events - Internal, Created!

2020-02-07T04:18:17.192Z cpu1:66042)NFS: 1358: Invalid volume UUID t10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000002:1

2020-02-07T04:18:17.192Z cpu1:66042)FSS: 6298: No FS driver claimed device 't10.NVMe____VMware_Virtual_NVMe_Disk________________VMWare_NVME_0000____00000002:1': No filesystem on the device

2020-02-07T04:18:17.192Z cpu1:66042)ScsiEvents: 301: EventSubsystem: Device Events, Event Mask: 40, Parameter: 0x4302d7097a00, Registered!

The red line is something I would find alarming if I came across it while doing recovery work ...

But so far I have never spent much time on this issue.

I will watch out when it happens next time.
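
To make watching easier, a rough Python sketch like the following could pull only the WARNING entries out of a log in the format pasted above. The regular expression is an assumption based solely on the lines quoted in this thread (timestamp, cpuN:world-id, message), and `warning_lines` is a hypothetical helper name.

```python
import re

# Assumed line layout, taken from the excerpt above:
# <timestamp ending in Z> cpu<N>:<world id>)<message>
LINE_RE = re.compile(r"^(?P<ts>\S+Z) cpu(?P<cpu>\d+):(?P<world>\d+)\)(?P<msg>.*)$")


def warning_lines(lines):
    """Return (timestamp, message) pairs for lines whose message starts with WARNING:."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("msg").startswith("WARNING:"):
            found.append((match.group("ts"), match.group("msg")))
    return found
```

It can be fed the open file object of a log such as /var/run/log/vmkwarning.log, for example `warning_lines(open(path))`, and it silently skips lines that do not match the assumed layout.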

Ulli

