Hello, we are testing NVMe-oF with ESXi 7, and I am having trouble getting the device recognized. I am using Mellanox ConnectX-4 cards and am attempting to access an NVMe device as a test. I am able to discover the controller in the VMware interface.
The Namespaces tab also shows the correct disk size and name (750 GB in this case).
On the Paths tab, the following shows up:
Runtime Name: vmhba67:C0:T1:L0
Target: Blank
Lun: 0
Status: Dead
Below is the test config from the Linux server. Does anyone have suggestions for next troubleshooting steps? /dev/nvme0n1 is a freshly erased NVMe drive.
sudo modprobe nvmet
sudo modprobe nvmet-rdma
sudo /bin/mount -t configfs none /sys/kernel/config/
sudo mkdir /sys/kernel/config/nvmet/subsystems/PSC
cd /sys/kernel/config/nvmet/subsystems/PSC
echo 1 | sudo tee -a attr_allow_any_host > /dev/null
sudo mkdir namespaces/1
cd namespaces/1/
echo -n /dev/nvme0n1 | sudo tee device_path > /dev/null
echo 1 | sudo tee -a enable > /dev/null
sudo mkdir /sys/kernel/config/nvmet/ports/1
cd /sys/kernel/config/nvmet/ports/1
echo 10.10.11.1 | sudo tee -a addr_traddr > /dev/null
echo rdma | sudo tee -a addr_trtype > /dev/null
echo 4420 | sudo tee -a addr_trsvcid > /dev/null
echo ipv4 | sudo tee -a addr_adrfam > /dev/null
sudo ln -s /sys/kernel/config/nvmet/subsystems/PSC/ /sys/kernel/config/nvmet/ports/1/subsystems/PSC
sudo mkdir /sys/kernel/config/nvmet/ports/2
cd /sys/kernel/config/nvmet/ports/2
echo 10.10.12.1 | sudo tee -a addr_traddr > /dev/null
echo rdma | sudo tee -a addr_trtype > /dev/null
echo 4420 | sudo tee -a addr_trsvcid > /dev/null
echo ipv4 | sudo tee -a addr_adrfam > /dev/null
sudo ln -s /sys/kernel/config/nvmet/subsystems/PSC/ /sys/kernel/config/nvmet/ports/2/subsystems/PSC
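One thing worth double-checking in the config above: the NVMe spec expects subsystem names in NQN form (nqn.&lt;yyyy-mm&gt;.&lt;reverse-domain&gt;:&lt;identifier&gt;), and "PSC" is not in that form. A minimal sketch of checking a candidate name before using it as the configfs directory name; the NQN below is a made-up example, not a value from this setup:

```shell
# Hypothetical spec-format subsystem NQN -- substitute your own date/domain.
SUBNQN="nqn.2020-04.org.example:psc-test"

# Sanity-check the nqn.<yyyy-mm>.<reverse-domain>:<id> shape before use.
if echo "$SUBNQN" | grep -Eq '^nqn\.[0-9]{4}-[0-9]{2}\.[a-z0-9.-]+:.+'; then
    echo "NQN format OK"
else
    echo "NQN format invalid"
fi
# prints "NQN format OK"

# The subsystem would then be created and linked under this name, e.g.:
#   sudo mkdir /sys/kernel/config/nvmet/subsystems/"$SUBNQN"
```

The Linux nvmet target accepts almost any directory name, so a non-conformant name only shows up as a problem on stricter initiators.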
Just a follow-up: I was able to attach it to another Linux system without issue. Is there somewhere on the ESXi host where I can check logs? Is there possibly something wrong with my subsystem NQN? Most vendor appliances use long-winded names; the Linux server accepted "PSC", but perhaps VMware can't.
Found this in the logs; it seems HPP doesn't support the device for some reason (it's a Mellanox ConnectX-4 adapter back to a Linux target). Perhaps HPP doesn't support the Linux target, or perhaps I simply need to do a better job of naming my Linux target's NQN.
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T2:L0': Not supported
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T2:L0
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiPath: 8397: Plugin 'HPP' rejected path 'vmhba67:C0:T2:L0'
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1568: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba67:C0:T2:L0: Not supported
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)WARNING: ScsiPath: 8327: NMP cannot claim a path to NVMeOF device vmhba67:C0:T2:L0
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1568: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba67:C0:T2:L0: Not supported
2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1872: Error claiming path vmhba67:C0:T2:L0. Not supported.
2020-04-24T04:01:52.809Z cpu9:2099749 opID=6441911)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T2:L0': Not supported
2020-04-24T04:01:52.809Z cpu9:2099749 opID=6441911)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T2:L0
Hello, we also have this issue. Did you ever resolve it?
Our storage target is mapped to ESXi over FC-NVMe. We can find the NVMe controller and namespace, but can't find the storage device.
1. Find the NVMe controller and namespace:
_______________
[root@localhost:~] esxcli nvme fabrics discover -a vmhba68 -W 0x56c92bf803002760 -w 0x56c92bf8033b2760
Transport Type Address Family Subsystem Type Controller ID Admin Queue Max Size Transport Address Transport Service ID Subsystem NQN Connected
-------------- -------------- -------------- ------------- -------------------- ------------------------------------------- -------------------- ----------------------------------- ---------
FC Fibre Channel NVM 65535 32 nn-0x56c92bf803002760:pn-0x56c92bf8033b2760 none nqn.2004-12.com.inspur:mcs.28827034 true
[root@localhost:~] esxcli nvme controller list
Name Controller Number Adapter Transport Type Is Online
----------------------------------------------------------------------------- ----------------- ------- -------------- ---------
nqn.2004-12.com.inspur:mcs.28827034#vmhba68#56c92bf803002760:56c92bf8033b2760 467 vmhba68 FC true
[root@localhost:~] esxcli nvme namespace list
Name Controller Number Namespace ID Block Size Capacity in MB
------------------------------------ ----------------- ------------ ---------- --------------
eui.d000000000000001005076000a209c06 467 2 512 10240
_______________
2. "esxcli storage core path list" shows the path as dead, and "esxcli storage core device list" can't find the storage device:
_______________
fc.200000109bc18a3f:100000109bc18a3f-fc.56c92bf803002760:56c92bf8033b2760-
UID: fc.200000109bc18a3f:100000109bc18a3f-fc.56c92bf803002760:56c92bf8033b2760-
Runtime Name: vmhba68:C0:T3:L1
Device: No associated device
Device Display Name: No associated device
Adapter: vmhba68
Channel: 0
Target: 3
LUN: 1
Plugin: (unclaimed)
State: dead
Transport: fc
Adapter Identifier: fc.200000109bc18a3f:100000109bc18a3f
Target Identifier: fc.56c92bf803002760:56c92bf8033b2760
Adapter Transport Details: Unavailable or path is unclaimed
Target Transport Details: Unavailable or path is unclaimed
Maximum IO Size: 2097152
_______________
3. Some error log entries:
WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba68:C0:T0:L1': Not supported
_______________
See attachment.
_______________
I have the same issue, with an Emulex LPe32000 PCIe Fibre Channel adapter and ESXi 7.0.1.
I think the issue is due to a bad claim rule.
I've tried adding a new one, but it's not working.
I have a feeling ESXi is not prepared for NVMe storage by default.
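Not a definitive fix, but the claim rules in play are visible with esxcli, and ESXi 7 can add an HPP claim rule keyed on the NVMe controller model. A sketch, run on the ESXi host; the rule number and the "Linux" model string are placeholders, not values from this thread:

```shell
# List the active claim rules; rules 65534 (HPP) and 65535 (NMP) seen in
# the logs above are the catch-all defaults that fail to claim the path.
esxcli storage core claimrule list

# Sketch: add an HPP claim rule matched on the NVMe controller model.
# "Linux" is a placeholder -- read the real model string from
# 'esxcli nvme controller list' on your host.
esxcli storage core claimrule add -r 102 -t vendor \
  --nvme-controller-model "Linux" -P HPP
esxcli storage core claimrule load
```

If HPP rejects the path with "Not supported" even under a matching rule, the problem is usually the device itself (e.g. block size), not the rule.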
We also use the LPe32000 PCIe FC adapter, with ESXi 7.0.0, and have this issue.
We tested with IBM storage, and it works fine.
The issue was solved by the storage vendor: they released new firmware that changed the volume block size from 4K down to the 512 bytes supported by VMware.
Even though vSphere 7U2 supports 4K devices, external storage like the NetApp EF600 with 4K volumes is not visible in vSphere; only 512-byte block size volumes are.
Controllers, namespaces, and paths are all OK, but the 4K device/volume is not shown in vSphere. Not supported?
Peter
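Since the fixes in this thread keep coming back to the 512-byte block size, it may help to verify what the target side will expose before pointing ESXi at it. A sketch for a Linux target, reusing /dev/nvme0n1 from earlier in the thread; the nvme-cli steps are left as comments because reformatting is destructive:

```shell
# The nvmet namespace inherits the logical block size of its backing
# device; ESXi here only claimed 512-byte namespaces.
cat /sys/block/nvme0n1/queue/logical_block_size    # 512 or 4096

# If the drive supports multiple LBA formats, nvme-cli can switch it
# to a 512-byte format (DESTRUCTIVE -- wipes the drive):
#   nvme id-ns /dev/nvme0n1 | grep "in use"    # current LBA format
#   nvme format /dev/nvme0n1 --lbaf=<index-of-512-byte-format>
```

For array-based targets (like the EF600 case above) the equivalent change has to happen on the array side, as the vendor firmware fix here did.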
I configured an NVMe block device using 'nvmetcli' with a 512-byte block size on a CentOS VM and then did an nvme connect (NVMe/TCP) to that target. ESXi is able to see the volume, and it is also listed in 'esxcli nvme namespace list', but the path to the target is shown as DEAD. Any pointers as to why the path is DEAD?
esxcli nvme namespace list
Name Controller Number Namespace ID Block Size Capacity in MB
------------------------------------- ----------------- ------------ ---------- --------------
eui.343337304d1007610025384500000001 256 1 512 915715
eui.343337304d1015200025384500000001 257 1 512 915715
uuid.b8bbea9b8b34471b97b13222a954e43e 328 1 512 20480 <<<<
esxcli nvme controller list
Name Controller Number Adapter Transport Type Is Online
-------------------------------------------------------------------------------------- ----------------- ------- -------------- ---------
nqn.2014-08.org.nvmexpress_144d_SAMSUNG_MZQLB960HAJR-00007______________S437NE0M100761 256 vmhba2 PCIe true
nqn.2014-08.org.nvmexpress_144d_SAMSUNG_MZQLB960HAJR-00007______________S437NE0M101520 257 vmhba3 PCIe true
testnqn#vmhba65#15.33.8.5:4420 328 vmhba65 TCP true
esxcli storage core path list -p vmhba65:C0:T0:L0
tcp.vmnic5:3c:fd:fe:c3:93:5d-tcp.unknown-
UID: tcp.vmnic5:3c:fd:fe:c3:93:5d-tcp.unknown-
Runtime Name: vmhba65:C0:T0:L0
Device: No associated device
Device Display Name: No associated device
Adapter: vmhba65
Channel: 0
Target: 0
LUN: 0
Plugin: (unclaimed)
State: dead <<<<<<<<<<<<<<<<<<<<<<
Transport: tcp
Adapter Identifier: tcp.vmnic5:3c:fd:fe:c3:93:5d
Target Identifier: tcp.unknown
Adapter Transport Details: Unavailable or path is unclaimed
Target Transport Details: Unavailable or path is unclaimed
Maximum IO Size: 1048576
VMkernel log:
===========
2022-01-21T02:44:37.120Z cpu22:1048893)HPP: HppCreateDevice:3071: Created logical device 'uuid.b8bbea9b8b34471b97b13222a954e43e'.
2022-01-21T02:44:37.120Z cpu22:1048893)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported
2022-01-21T02:44:37.120Z cpu22:1048893)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0
2022-01-21T02:44:37.120Z cpu22:1048893)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'
2022-01-21T02:44:37.120Z cpu22:1048893)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-01-21T02:44:37.121Z cpu22:1048893)WARNING: ScsiPath: 8496: NMP cannot claim a path to NVMeOF device vmhba65:C0:T0:L0
2022-01-21T02:44:37.121Z cpu22:1048893)ScsiClaimrule: 2039: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-01-21T02:44:37.121Z cpu22:1048893)ScsiClaimrule: 2518: Error claiming path vmhba65:C0:T0:L0. Not supported.
I hit this issue with a Linux nvmet target and an ESXi host.
Does anyone know how to dig into why HPP rejects the path with "failed to claim"?
I would like to know which document describes how to debug HPP.
If you know, please share some information about it.
Thanks~
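I don't know of a single HPP debugging document, but a couple of esxcli namespaces show what HPP has and hasn't claimed. A sketch, to be run on the ESXi host:

```shell
# Devices and paths as HPP sees them; an unclaimed path will be missing
# here even though 'esxcli storage core path list' still shows it.
esxcli storage hpp device list
esxcli storage hpp path list

# The "Not supported" claim failures themselves land in the vmkernel log:
#   tail -f /var/log/vmkernel.log | grep -i hpp
```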
See the link below. VMware expects some functionality that is not in Linux kernel 5, so it can't work.
We are also trying to get this working, and I found this link:
https://koutoupis.com/2022/04/22/vmware-lightbits-labs-and-nvme-over-tcp/