Hi all,
I ran into an issue after updating ESXi (HP Custom ESXi 5.5.0 U3, build 3568722) on an HP ProLiant DL360 G7 that had just been updated with the latest Service Pack for ProLiant (SPP). It works pretty well except for the boot time: it takes around 30 minutes to boot because it gets stuck on this kind of log for each device (plus multipathing), as if it were scanning all of them:
2016-11-30T06:51:17.793Z cpu4:33206)ScsiDeviceIO: 7499: Get VPD 86 Inquiry for device "naa.50002ac0de009893" from Plugin "NMP" failed. Not supported
2016-11-30T06:51:17.796Z cpu4:33206)ScsiDeviceIO: 6202: Could not detect setting of QErr for device naa.50002ac0de009893. Error SCSI reservation conflict.
2016-11-30T06:51:17.796Z cpu4:33206)ScsiDeviceIO: 6716: Could not detect setting of sitpua for device naa.50002ac0de009893. Error SCSI reservation conflict.
2016-11-30T06:51:17.798Z cpu4:33206)ScsiDevice: 3445: Successfully registered device "naa.50002ac0de009893" from plugin "NMP" of type 0
2016-11-30T06:51:17.798Z cpu4:33206)ScsiEvents: 301: EventSubsystem: Device Events, Event Mask: 180, Parameter: 0x41098399cf30, Registered!
2016-11-30T06:51:17.802Z cpu4:33206)StorageApdHandler: 698: APD Handle Created with lock[StorageApd0x4108a6]
2016-11-30T06:51:17.802Z cpu4:33206)ScsiEvents: 501: Event Subsystem: Device Events, Created!
2016-11-30T06:51:17.803Z cpu4:33206)VMWARE SCSI Id: Id for vmhba2:C0:T0:L27
0x50 0x00 0x2a 0xc0 0xdc 0x00 0x98 0x93 0x56 0x56 0x20 0x20 0x20 0x20
2016-11-30T06:51:17.805Z cpu4:33206)VMWARE SCSI Id: Id for vmhba1:C0:T1:L27
I know some people have had this problem in the past, and no real solution was ever posted. I also noticed something strange: in esx.conf, some devices are discovered with VMW_PSP_RR (thanks to a custom user rule), which is what I want, but most of them use VMW_PSP_MRU:
/storage/plugin/NMP/device[naa.60002ac0000000000000000d000098a4]/psp = "VMW_PSP_RR"
/storage/plugin/NMP/device[naa.50002ac0620098a4]/psp = "VMW_PSP_MRU"
/storage/plugin/NMP/device[naa.50002ac145009893]/psp = "VMW_PSP_MRU"
/storage/plugin/NMP/device[naa.50002ac145009893]/preferred = "fc.20000000c9691329:10000000c9691329-fc.2ff70002ac009893:20120002ac009893-naa.50002ac145009893"
/storage/plugin/NMP/device[naa.50002ac0e7009893]/psp = "VMW_PSP_MRU"
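To get the overall picture rather than reading entries one by one, the `psp` lines can be tallied. This is only a minimal sketch: it assumes the entries look exactly like the ones quoted above and that the file is the usual /etc/vmware/esx.conf, and the function name `psp_from_esxconf` is made up for illustration.

```shell
#!/bin/sh
# psp_from_esxconf: read esx.conf content on stdin and print "<count> <psp>"
# for every Path Selection Policy recorded under /storage/plugin/NMP/device.
# Assumption: entries have the form
#   /storage/plugin/NMP/device[<id>]/psp = "<policy>"
# Other keys of the same device (e.g. "preferred") are ignored.
psp_from_esxconf() {
    sed -n 's|^[[:space:]]*/storage/plugin/NMP/device\[.*\]/psp = "\(.*\)"[[:space:]]*$|\1|p' |
        sort |
        uniq -c |
        awk '{ print $1, $2 }'   # normalise the padding added by uniq -c
}
```

Possible usage on the host: `psp_from_esxconf < /etc/vmware/esx.conf`.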
What can I do to stop my devices from using VMW_PSP_MRU? And/or how can I deal with the 'from Plugin "NMP" failed' messages?
Does anyone have any advice?
Regards,
Gael
Please follow the steps below. They work; I just cross-checked them in my lab.
[root@ESXi-10:~] esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c
1 Path Selection Policy: VMW_PSP_FIXED
24 Path Selection Policy: VMW_PSP_MRU
2 Path Selection Policy: VMW_PSP_RR
[root@ESXi-10:~] esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -P "VMW_PSP_RR" -O iops=1 -c "tpgs_on" -V "3PARdata" -M "VV" -e "HP 3PAR custom SATP Claimrule"
[root@ESXi-10:~] esxcli storage nmp satp rule list -s VMW_SATP_ALUA
Name Device Vendor Model Driver Transport Options Rule Group Claim Options Default PSP PSP Options Description
------------- ------ -------- --------- ------ --------- -------------------------- ---------- ------------- ----------- ----------- ----------------------------------------
VMW_SATP_ALUA LSI INF-01-00 reset_on_attempted_reserve system tpgs_on VMW_PSP_MRU NetApp E-Series arrays with ALUA support
VMW_SATP_ALUA NETAPP reset_on_attempted_reserve system tpgs_on VMW_PSP_RR NetApp arrays with ALUA support
VMW_SATP_ALUA IBM 2810XIV system tpgs_on VMW_PSP_RR IBM 2810XIV arrays with ALUA support
VMW_SATP_ALUA IBM 2107900 reset_on_attempted_reserve system VMW_PSP_RR
VMW_SATP_ALUA IBM 2145 system VMW_PSP_RR
VMW_SATP_ALUA 3PARdata VV user tpgs_on VMW_PSP_RR iops=1 HP 3PAR custom SATP Claimrule
VMW_SATP_ALUA system tpgs_on Any array with ALUA support
#esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR
Reboot your ESXi host.
[root@ESXi-10:~] esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c
1 Path Selection Policy: VMW_PSP_FIXED
26 Path Selection Policy: VMW_PSP_RR
This sets the "RR" policy for all devices; then reboot the host.
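The counts above only give totals. To see which devices are still on another policy, the same listing can be filtered per device. This is a rough sketch, assuming each device block in the `esxcli storage nmp device list` output starts with the identifier on its own line; the function name `list_non_rr` is hypothetical.

```shell
#!/bin/sh
# list_non_rr: read `esxcli storage nmp device list` output on stdin and
# print "<device> <policy>" for every device whose PSP is not VMW_PSP_RR.
# Assumption: each device block starts with its identifier (naa., eui.,
# mpx. or t10.) on a line of its own, followed by its fields.
list_non_rr() {
    awk '
        # Remember the identifier of the device block we are in.
        /^[ \t]*(naa|eui|mpx|t10)\./ { dev = $1; next }
        # Matches only the policy line itself, not "... Policy Device Config:".
        /Path Selection Policy:/ {
            psp = $NF
            if (psp != "VMW_PSP_RR") print dev, psp
        }
    '
}
```

Possible usage: `esxcli storage nmp device list | list_non_rr`. The local disk on VMW_PSP_FIXED will show up here as well, which is expected.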
Refer to the one below.
Hi,
I did that, but nothing has changed. See for example:
naa.60002ac00000000000000100000098a4
Device Display Name: 3PARdata Fibre Channel Disk (naa.60002ac00000000000000100000098a4)
Storage Array Type: VMW_SATP_ALUA
Storage Array Type Device Config: {implicit_support=on;explicit_support=off; explicit_allow=on;alua_followover=on; action_OnRetryErrors=off; {TPG_id=256,TPG_state=AO}}
Path Selection Policy: VMW_PSP_RR
Path Selection Policy Device Config: {policy=rr,iops=1,bytes=10485760,useANO=0; lastPathIndex=1: NumIOsPending=0,numBytesPending=0}
Path Selection Policy Device Custom Config:
Working Paths: vmhba1:C0:T2:L7, vmhba1:C0:T0:L7
Is Local SAS Device: false
Is USB: false
Is Boot USB Device: false
The Path Selection Policy is fine, but I still get the same errors as before.
(By the way, I'm keeping in mind that I'll later set the Queue Full Sample Size and Threshold values to 32 and 4, as the HP best practices recommend.)
Which storage model do you have?
I have two HP 3PAR 7200 arrays.
For the first command:
# esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c
1 Path Selection Policy: VMW_PSP_FIXED
73 Path Selection Policy: VMW_PSP_RR
So I added the rule again as you wrote and applied it again, and I still get:
1 Path Selection Policy: VMW_PSP_FIXED
73 Path Selection Policy: VMW_PSP_RR
which looks OK, right?
But I get the same kind of errors at boot. (I took a screenshot because I can't paste the log: I have no access until ESXi has booted completely, which takes around 30 minutes or more!)
According to the output, all your devices use RR except the local device.
I can also see that the device registered successfully with NMP.
Refer to the one below; it may help you.
I've already checked those sites, but I can't find any information about the Thin Provisioning mode or how to change that bit.
I'm going to look into it and let you know what happens.
Hi,
As per your earlier post, you received the messages for device naa.50002ac0de009893.
Could it be that this is the array itself?
Some arrays require a device with LUN ID 0 to be presented to the host; if this is not configured properly, the host will see a pseudo LUN with LUN ID 0.
Those pseudo LUNs aren't really usable, but they show up when a SCSI scan is performed.
Regarding your long delays during reboots, are you running virtual MS clusters using pRDMs?
If so, did you properly configure the disks assigned to those clusters?
Happy troubleshooting
Ralf
Hi Ralf,
Before you answered, I tried a fresh installation without any configuration, just to check whether the issue shows up on the first boot. It does: ESXi still takes a very long time to boot, as in my first post.
I was thinking it could be the array itself, the physical FC controller, or a bad configuration somewhere.
About LUN 0: I haven't seen it mentioned in the HP best practices document, but I'll check.
About the long delays: I don't have any VMs or Microsoft clusters on this ESXi host right now.
Hi,
When testing with the new installation, how does it behave without the storage connected? Do the boot times then go back to normal?
And when you check vmkernel.log and vmkwarning.log with the storage attached, where does the system spend its time during boot?
Good luck,
Ralf
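One way to see where the time goes is to look for large gaps between consecutive timestamps in the log. This is a minimal sketch, assuming every relevant line starts with a timestamp in the format shown in the first post (for example 2016-11-30T06:51:17.793Z); the function name `log_gaps` and the default threshold of 5 seconds are made up for illustration.

```shell
#!/bin/sh
# log_gaps: print every pause of at least $1 seconds (default 5) between
# consecutive timestamped lines read on stdin, together with the line that
# ended the pause.
# Assumption: lines start with a timestamp such as 2016-11-30T06:51:17.793Z;
# lines without one (e.g. the hex dump of a SCSI Id) are skipped.
log_gaps() {
    awk -v min="${1:-5}" '
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ {
            # Seconds since midnight, taken from the HH:MM:SS.mmm part.
            t = substr($0, 12, 2) * 3600 + substr($0, 15, 2) * 60 + substr($0, 18, 6)
            if (seen) {
                gap = t - prev
                if (gap < 0) gap += 86400   # the log crossed midnight
                if (gap >= min) printf "%.3f s before: %s\n", gap, $0
            }
            prev = t
            seen = 1
        }
    '
}
```

Possible usage once the host is up: `log_gaps 10 < /var/log/vmkernel.log`. The lines printed just before and after each reported gap usually point at the device or driver the boot was waiting on.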
For anyone who runs into this kind of problem, here is how I resolved it, in addition to the previous answers in this topic:
- list/count all the naa. devices
- remember to set "perennially reserved" to true for devices that are RDMs
- set Queue Full Threshold and Queue Full Sample Size to the vendor-recommended values (4 and 32, following the HP 3PAR best practices)
- check their Path Selection Policy (in my case it was VMW_PSP_MRU, which wasn't the right one)
- add the custom SATP ALUA rule and set its PSP as the default PSP for VMW_SATP_ALUA
- reboot
- check the Path Selection Policy again.
In my case this works great. I still need to try it on my other ESXi hosts, and I'll update this post if needed.
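The steps above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is changed. The function names `print_device_plan` and `print_host_plan` are hypothetical, and the esxcli options are the ones I understand to exist on ESXi 5.5; please confirm them with `--help` on your build before running anything.

```shell
#!/bin/sh
# print_device_plan: print (do not run) the per-device commands.
#   $1 = device identifier, $2 = "rdm" if the LUN is used as an RDM.
print_device_plan() {
    dev="$1"
    kind="${2:-datastore}"
    # Only LUNs used as RDMs get the perennially reserved flag.
    if [ "$kind" = "rdm" ]; then
        echo "esxcli storage core device setconfig -d $dev --perennially-reserved=true"
    fi
    # Queue Full Sample Size 32 and Threshold 4, the values quoted in this thread.
    echo "esxcli storage core device set -d $dev --queue-full-sample-size 32 --queue-full-threshold 4"
    # Show the Path Selection Policy currently used by the device.
    echo "esxcli storage nmp device list -d $dev"
}

# print_host_plan: print the host-wide commands taken from the earlier answer.
print_host_plan() {
    echo 'esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -P "VMW_PSP_RR" -O iops=1 -c "tpgs_on" -V "3PARdata" -M "VV" -e "HP 3PAR custom SATP Claimrule"'
    echo 'esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR'
}
```

For example, `print_device_plan naa.60002ac00000000000000100000098a4` prints two commands for a datastore LUN, and adding `rdm` as second argument prints the perennially reserved command as well. The reboot and the final policy check are left as manual steps.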