VMware Cloud Community
GaelV
Enthusiast

NMP plugin sets device rule to VMW_PSP_MRU instead of VMW_PSP_RR

Hi fellows!

I ran into an issue after an ESXi update (HP Custom ESXi 5.5.0 U3, Build 3568722) on an HP ProLiant DL360 G7, freshly updated with the latest Service Pack for ProLiant (SPP). It works pretty well except for the boot time: the host takes around 30 minutes to boot because it gets stuck on log entries like these for each device (and each path), as if it were scanning all of them:

2016-11-30T06:51:17.793Z cpu4:33206)ScsiDeviceIO: 7499: Get VPD 86 Inquiry for device "naa.50002ac0de009893" from Plugin "NMP" failed. Not supported

2016-11-30T06:51:17.796Z cpu4:33206)ScsiDeviceIO: 6202: Could not detect setting of QErr for device naa.50002ac0de009893. Error SCSI reservation conflict.

2016-11-30T06:51:17.796Z cpu4:33206)ScsiDeviceIO: 6716: Could not detect setting of sitpua for device naa.50002ac0de009893. Error SCSI reservation conflict.

2016-11-30T06:51:17.798Z cpu4:33206)ScsiDevice: 3445: Successfully registered device "naa.50002ac0de009893" from plugin "NMP" of type 0

2016-11-30T06:51:17.798Z cpu4:33206)ScsiEvents: 301: EventSubsystem: Device Events, Event Mask: 180, Parameter: 0x41098399cf30, Registered!

2016-11-30T06:51:17.802Z cpu4:33206)StorageApdHandler: 698: APD Handle  Created with lock[StorageApd0x4108a6]

2016-11-30T06:51:17.802Z cpu4:33206)ScsiEvents: 501: Event Subsystem: Device Events, Created!

2016-11-30T06:51:17.803Z cpu4:33206)VMWARE SCSI Id: Id for vmhba2:C0:T0:L27

0x50 0x00 0x2a 0xc0 0xdc 0x00 0x98 0x93 0x56 0x56 0x20 0x20 0x20 0x20

2016-11-30T06:51:17.805Z cpu4:33206)VMWARE SCSI Id: Id for vmhba1:C0:T1:L27

I know some people have had this problem in the past, and no real solution has been posted. I also noticed something strange: in esx.conf, some devices are discovered with VMW_PSP_RR (thanks to a custom user rule), which is what I want, but most of them use VMW_PSP_MRU:

/storage/plugin/NMP/device[naa.60002ac0000000000000000d000098a4]/psp = "VMW_PSP_RR"

/storage/plugin/NMP/device[naa.50002ac0620098a4]/psp = "VMW_PSP_MRU"

/storage/plugin/NMP/device[naa.50002ac145009893]/psp = "VMW_PSP_MRU"

/storage/plugin/NMP/device[naa.50002ac145009893]/preferred = "fc.20000000c9691329:10000000c9691329-fc.2ff70002ac009893:20120002ac009893-naa.50002ac145009893"

/storage/plugin/NMP/device[naa.50002ac0e7009893]/psp = "VMW_PSP_MRU"
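Editorial sketch: the psp entries quoted above can be tallied per policy to see how many devices fall under each one. This assumes the file keeps the line shape shown in the excerpt. The names count_psp and sample_conf are illustrative, not part of ESXi, and on a host the input would be the real esx.conf rather than the inline sample.

```shell
#!/bin/sh
# Tally the path selection policy (PSP) recorded per device in
# esx.conf-style lines read from standard input.
# count_psp is an illustrative helper name, not an ESXi tool.
count_psp() {
  # Split each line on double quotes; on psp lines, field 2 is the policy.
  awk -F'"' '/\/psp = / { n[$2]++ } END { for (p in n) print n[p], p }' | sort -k2
}

# Sample input: the lines quoted in the post above.
sample_conf() {
  cat <<'EOF'
/storage/plugin/NMP/device[naa.60002ac0000000000000000d000098a4]/psp = "VMW_PSP_RR"
/storage/plugin/NMP/device[naa.50002ac0620098a4]/psp = "VMW_PSP_MRU"
/storage/plugin/NMP/device[naa.50002ac145009893]/psp = "VMW_PSP_MRU"
/storage/plugin/NMP/device[naa.50002ac145009893]/preferred = "fc.20000000c9691329:10000000c9691329-fc.2ff70002ac009893:20120002ac009893-naa.50002ac145009893"
/storage/plugin/NMP/device[naa.50002ac0e7009893]/psp = "VMW_PSP_MRU"
EOF
}

sample_conf | count_psp
```

With the sample lines this prints 3 VMW_PSP_MRU and 1 VMW_PSP_RR. The preferred line is skipped because it is not a psp entry.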

What can I do to stop VMW_PSP_MRU from being used on my devices? And/or how can I handle the 'Plugin "NMP" failed' messages?

Does anyone have any advice?

Regards,

Gael

12 Replies
RAJ_RAJ
Expert

You have to set the "RR" policy for all devices and then reboot the host.

Refer to the KB article below:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=20005...

RAJESH RADHAKRISHNAN VCA -DCV/WM/Cloud,VCP 5 - DCV/DT/CLOUD, ,VCP6-DCV, EMCISA,EMCSA,MCTS,MCPS,BCFA https://ae.linkedin.com/in/rajesh-radhakrishnan-76269335 Mark my post as "helpful" or "correct" if I've helped resolve or answered your query!
GaelV
Enthusiast

Hi,

I did it, but nothing has changed. For example:

naa.60002ac00000000000000100000098a4

   Device Display Name: 3PARdata Fibre Channel Disk (naa.60002ac00000000000000100000098a4)

   Storage Array Type: VMW_SATP_ALUA

   Storage Array Type Device Config: {implicit_support=on;explicit_support=off; explicit_allow=on;alua_followover=on; action_OnRetryErrors=off; {TPG_id=256,TPG_state=AO}}

   Path Selection Policy: VMW_PSP_RR

   Path Selection Policy Device Config: {policy=rr,iops=1,bytes=10485760,useANO=0; lastPathIndex=1: NumIOsPending=0,numBytesPending=0}

   Path Selection Policy Device Custom Config:

   Working Paths: vmhba1:C0:T2:L7, vmhba1:C0:T0:L7

   Is Local SAS Device: false

   Is USB: false

   Is Boot USB Device: false

The Path Selection Policy is fine, but the same errors as before are still there.

(By the way, note that I will later set the Queue Full Sample Size and Queue Full Threshold values to 32 and 4, as the HP best practices recommend.)

RAJ_RAJ
Expert

Which storage model do you have?

GaelV
Enthusiast

I have two HP 3PAR 7200 arrays.

RAJ_RAJ
Expert

Please follow the steps below. They work; I just cross-checked them in my lab.

[root@ESXi-10:~] esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c

      1    Path Selection Policy: VMW_PSP_FIXED

     24    Path Selection Policy: VMW_PSP_MRU

      2    Path Selection Policy: VMW_PSP_RR

[root@ESXi-10:~] esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -P "VMW_PSP_RR" -O iops=1 -c "tpgs_on" -V "3PARdata" -M "VV" -e "HP 3PAR custom SATP Claimrule"

[root@ESXi-10:~] esxcli storage nmp satp rule list -s VMW_SATP_ALUA

Name           Device  Vendor    Model      Driver  Transport  Options                     Rule Group  Claim Options  Default PSP  PSP Options  Description            

-------------  ------  --------  ---------  ------  ---------  --------------------------  ----------  -------------  -----------  -----------  ----------------------------------------

VMW_SATP_ALUA          LSI       INF-01-00                     reset_on_attempted_reserve  system      tpgs_on        VMW_PSP_MRU               NetApp E-Series arrays with ALUA support

VMW_SATP_ALUA          NETAPP                                  reset_on_attempted_reserve  system      tpgs_on        VMW_PSP_RR                NetApp arrays with ALUA support

VMW_SATP_ALUA          IBM       2810XIV                                                   system      tpgs_on        VMW_PSP_RR                IBM 2810XIV arrays with ALUA support

VMW_SATP_ALUA          IBM       2107900                       reset_on_attempted_reserve  system                     VMW_PSP_RR                                       

VMW_SATP_ALUA          IBM       2145                                                      system                     VMW_PSP_RR                                       

VMW_SATP_ALUA          3PARdata  VV                                                        user        tpgs_on        VMW_PSP_RR   iops=1       HP 3PAR custom SATP Claimrule

VMW_SATP_ALUA                                                                              system      tpgs_on                                  Any array with ALUA support

[root@ESXi-10:~] esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR

Reboot your ESXi host.

[root@ESXi-10:~] esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c

      1    Path Selection Policy: VMW_PSP_FIXED

     26    Path Selection Policy: VMW_PSP_RR
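Editorial sketch: the tally above shows how many devices use each policy but not which ones. A per-device variant could look like the following, assuming the device list output keeps the layout shown earlier in the thread (an unindented device ID followed by indented fields). The function names and the two device IDs in the sample are hypothetical and only there to make the example self-contained.

```shell
#!/bin/sh
# Print every device whose path selection policy is not VMW_PSP_RR.
# Input is expected in the layout of "esxcli storage nmp device list".
# non_rr_devices is an illustrative helper name, not an ESXi tool.
non_rr_devices() {
  awk '
    # An unindented, non-empty line starts a new device block.
    /^[^ ]/ { dev = $1 }
    # The colon keeps "Path Selection Policy Device Config" lines out.
    /Path Selection Policy:/ && $NF != "VMW_PSP_RR" { print dev, $NF }
  '
}

# Synthetic input with hypothetical device IDs, trimmed to the
# fields this sketch reads.
sample_list() {
  cat <<'EOF'
naa.example-rr
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_RR
   Path Selection Policy Device Config: {policy=rr,iops=1}

naa.example-mru
   Storage Array Type: VMW_SATP_ALUA
   Path Selection Policy: VMW_PSP_MRU
EOF
}

sample_list | non_rr_devices
```

On a host, the output of esxcli storage nmp device list would be piped into non_rr_devices instead of the sample. A local disk on VMW_PSP_FIXED would be listed too, which is expected.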

GaelV
Enthusiast

For the first command, I get:

# esxcli storage nmp device list |grep "Path Selection Policy:" |sort |uniq -c

      1    Path Selection Policy: VMW_PSP_FIXED

     73    Path Selection Policy: VMW_PSP_RR

So I added the rule again as you wrote and applied it again, and I still get:

      1    Path Selection Policy: VMW_PSP_FIXED

     73    Path Selection Policy: VMW_PSP_RR

which seems OK, right?

But I get the same kind of errors at boot. (I took a screenshot because I can't paste the log: I have no access until ESXi has booted completely, which takes around 30 minutes or more!)

pastedImage_4.png

RAJ_RAJ
Expert

As per the output, all your devices are on RR except the local device.

I can also see that the device registered successfully with NMP.

The links below may help you:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=20452...

http://cormachogan.com/2012/12/17/could-not-detect-setting-of-sitpua-for-device-naa-xxx-error-not-su...

GaelV
Enthusiast

I've already checked these sites, but I can't find any information about the Thin Provisioning mode or how to change that bit.

I'm going to look into it and let you know what happens.

kastlr
Expert

Hi,

As per your earlier post, you received the messages for device naa.50002ac0de009893.

Could it be that this is the array itself?

Some arrays require a device with LUN ID 0 to be presented to the host. If this is not configured properly, the host will see a pseudo LUN with LUN ID 0.

Those pseudo LUNs aren't really usable, but they show up when performing a SCSI scan.

Regarding the long delays during reboots: are you running virtual MS clusters using pRDMs?

If so, did you properly configure the disks assigned to those clusters?

ESXi/ESX hosts with visibility to RDM LUNs being used by MSCS nodes with RDMs may take a long time t...

Happy troubleshooting

Ralf


Hope this helps a bit.
Greetings from Germany. (CEST)
GaelV
Enthusiast

Hi Ralf,

Before you answered, I tried a fresh installation without any configuration, just to check whether the issue shows up on the first boot. It does: ESXi still takes a very long time to boot, as described in my first post.

I was thinking the cause could be the array itself, the physical FC controller, or a bad configuration somewhere.

About LUN 0: I haven't seen it mentioned in the HP best practices document, but I'm going to check.

About the long delays: I don't have any VMs or Microsoft clusters on this ESXi host right now.

kastlr
Expert

Hi,

When testing with the new installation, how does it behave without the storage connected? Are the boot times then back to normal?

And when you check vmkernel.log and vmkwarning.log with the storage attached, where does the system spend its time during boot?
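Editorial sketch: one way to see where the time goes is to look for large jumps between consecutive log timestamps. The following is a minimal sketch, assuming every entry starts with a timestamp in the format shown in the first post. The name log_gaps is illustrative, the sample lines are synthetic, and timestamps that cross midnight are not handled.

```shell
#!/bin/sh
# Report gaps of at least $1 seconds between consecutive timestamped
# lines read from standard input. Lines without a timestamp (for
# example continuation lines) are ignored.
# log_gaps is an illustrative helper name, not an ESXi tool.
log_gaps() {
  awk -v min="$1" '
    match($0, /T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+Z/) {
      # Drop the leading T and trailing Z, keeping HH:MM:SS.mmm.
      ts = substr($0, RSTART + 1, RLENGTH - 2)
      split(ts, p, ":")
      now = p[1] * 3600 + p[2] * 60 + p[3]
      if (seen && now - prev >= min)
        printf "%.3f s gap before: %s\n", now - prev, $0
      prev = now
      seen = 1
    }
  '
}

# Synthetic sample lines, not real ESXi messages.
sample_log() {
  cat <<'EOF'
2016-11-30T06:00:00.000Z synthetic line A
   continuation line without a timestamp
2016-11-30T06:00:45.500Z synthetic line B
2016-11-30T06:00:46.000Z synthetic line C
EOF
}

sample_log | log_gaps 10
```

Against a real log, the file would be fed in instead of the sample, for example log_gaps 10 < vmkernel.log, with the threshold chosen to suit the host. The lines just before each reported gap are the ones worth reading.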

Good luck,

Ralf


GaelV
Enthusiast

For anyone who runs into this kind of problem: in addition to the previous answers in this topic, here is how I resolved it:

- List/count all the naa. devices.

- Remember to set Perennially Reserved to true for devices that are RDMs.

- Set Queue Full Threshold and Queue Full Sample Size to the vendor-recommended values (4 and 32, following the HP 3PAR best practices).

- Check their Path Selection Policy (in my case it was VMW_PSP_MRU, which wasn't the right one).

- Add the custom SATP ALUA rule and set VMW_PSP_RR as the default PSP for VMW_SATP_ALUA.

- Reboot.

- Check the Path Selection Policy again.

In my case this works great. I still need to try it on my other ESXi hosts, and I'll update this post if needed.
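Editorial sketch: the checklist above can be written down as an ordered command plan. The script below only prints the commands so they can be reviewed first; it does not run them. The SATP rule and default-PSP commands are the ones from the accepted answer. The perennially-reserved and queue-full commands are the esxcli forms as I understand them for ESXi 5.5 and should be checked against the host's own esxcli help. The device IDs are hypothetical placeholders.

```shell
#!/bin/sh
# Print (not run) the commands behind the checklist, in order.
# RDM_DEVICES and ALL_DEVICES hold hypothetical placeholder IDs;
# replace them with the real naa. identifiers of the host.
RDM_DEVICES="naa.EXAMPLE_RDM"
ALL_DEVICES="naa.EXAMPLE_RDM naa.EXAMPLE_LUN"

print_plan() {
  # Steps 1 and 4: count devices per path selection policy.
  echo 'esxcli storage nmp device list | grep "Path Selection Policy:" | sort | uniq -c'
  # Step 2: mark RDM LUNs as perennially reserved.
  for dev in $RDM_DEVICES; do
    echo "esxcli storage core device setconfig -d $dev --perennially-reserved=true"
  done
  # Step 3: queue-full values from the thread (sample size 32, threshold 4).
  for dev in $ALL_DEVICES; do
    echo "esxcli storage core device set --device $dev --queue-full-sample-size 32 --queue-full-threshold 4"
  done
  # Step 5: custom 3PAR rule and default PSP, as in the accepted answer.
  echo 'esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -P "VMW_PSP_RR" -O iops=1 -c "tpgs_on" -V "3PARdata" -M "VV" -e "HP 3PAR custom SATP Claimrule"'
  echo 'esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR'
  # Steps 6 and 7 are manual.
  echo '# reboot the host, then rerun the first command to confirm the policies'
}

print_plan
```

Printing rather than executing is a deliberate choice here: the rule and the default PSP change how every ALUA device is claimed after the next reboot, so it seems safer to read the plan before applying it.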