
WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

  • 1.  WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 24, 2009 07:50 AM

    I am currently testing ESXi 4 by adding one ESXi 4 host to our VMware production cluster. The host is an HP BL460c G1 blade running ESXi 4 build 175625, connected to an HP EVA 6000 storage array. The ESXi 4 host seems to run fine, but I noticed the following kernel warnings in the system log:

    Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100021b8480) to NMP device "naa.600508b4000554df00007000034a0000" failed on physical path "vmhba1:C0:T0:L11" H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.600508b4000554df00007000034a0000" state in doubt; requested fast path state update...

    Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)ScsiDeviceIO: 747: Command 0x2a to device "naa.600508b4000554df00007000034a0000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    These warnings don't appear on our ESXi 3 hosts. They seem to have something to do with the multipath policies, but I don't understand the warning message. The warnings are reported frequently on multiple LUNs. Does anybody know what these warnings mean?



  • 2.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 08:40 AM

    I am experiencing the same problems here in my production environment after upgrading from VMware ESX 3.5-U2 to VMware ESX 4 (Build 164009). My hardware setup:

    - HP ProLiant DL360 G5 and DL360 G6 servers,

    - Emulex LPe11000 HBAs (dual channel, 4 Gbps FibreChannel),

    - transtec Provigo 550 FibreChannel SAN (two controllers, two 4 Gbps paths each),

    - Cisco MDS9124 switches.

    The SAN is connected to the two FC-switches, with two paths per switch (one path per controller per switch). Every server is connected with one path per switch to the switches. I.e., VMware ESX detects four paths to every LUN on the SAN. The path policy in ESX is set to "Fixed". (I have tested path failover -- and it works.)

    /var/log/vmkernel is full of messages like this one:

    Jul 27 10:21:44 vmware05 vmkernel: 5:00:22:01.094 cpu1:4262)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100021153c0) to NMP device "naa.60050cc000205a840000000000000023" failed on physical path "vmhba2:C0:T1:L0" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Jul 27 10:21:44 vmware05 vmkernel: 5:00:22:01.094 cpu1:4262)ScsiDeviceIO: 747: Command 0x2a to device "naa.60050cc000205a840000000000000023" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Any ideas?

    Thanks in advance,

    Josef.



  • 3.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 28, 2009 06:49 AM

    Some more details ...

    Log-entries in /var/log/vmkernel are as described in the previous posting:

    Jul 28 08:39:54 vmware05 vmkernel: 0:20:44:22.115 cpu1:4259)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100020c7f40) to NMP device "naa.60050cc000205a840000000000000023" failed on physical path "vmhba2:C0:T1:L0" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Jul 28 08:39:54 vmware05 vmkernel: 0:20:44:22.115 cpu1:4259)ScsiDeviceIO: 747: Command 0x2a to device "naa.60050cc000205a840000000000000023" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    In between these log entries there are also messages like these:

    Jul 28 08:40:34 vmware05 vmkernel: 0:20:45:02.471 cpu1:4275)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T0:L6' failed. Already exists

    Jul 28 08:40:34 vmware05 vmkernel: 0:20:45:02.471 cpu1:4275)VMWARE SCSI Id: Id for vmhba2:C0:T1:L6

    Jul 28 08:40:34 vmware05 vmkernel: 0x46 0x46 0x45 0x30 0x35 0x41 0x38 0x34 0x50 0x52 0x4f 0x56 0x49 0x47

    Jul 28 08:40:34 vmware05 vmkernel: 0:20:45:02.471 cpu1:4275)ScsiUid: 370: Existing device mpx.vmhba1:C0:T0:L6 already has uid vml.0103060000464645303541383450524f564947

    Jul 28 08:40:34 vmware05 vmkernel: 0:20:45:02.471 cpu1:4275)ScsiDevice: 1734: Failing registration of device 'mpx.vmhba2:C0:T1:L6': failed to add legacy uid vml.0103060000464645303541383450524f564947 on path vmhba2:C0:T1:L6: Already exists
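
    The byte dump in the "VMWARE SCSI Id" entry above can be read as ASCII. A minimal Python sketch of that decoding follows; the function name is made up for illustration, and the check that the same bytes form the tail of the legacy uid is an observation about these particular log lines, not a documented rule.

```python
def decode_id_bytes(dump):
    """Turn a '0x46 0x46 ...' byte dump into its ASCII text."""
    return bytes(int(token, 16) for token in dump.split()).decode("ascii")


# Byte dump copied from the vmkernel log entry above.
dump = "0x46 0x46 0x45 0x30 0x35 0x41 0x38 0x34 0x50 0x52 0x4f 0x56 0x49 0x47"
text = decode_id_bytes(dump)
print(text)

# The legacy uid from the same log ends with the hex form of those bytes.
legacy_uid = "vml.0103060000464645303541383450524f564947"
same_tail = legacy_uid.endswith(text.encode("ascii").hex())
print(same_tail)
```

    The decoded text ends in "PROVIG", which fits the PROVIGO model string that esxcfg-scsidevs reports for this array.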

    About every five seconds the following log entries (three lines per "record") are written to /var/log/vmkwarning:

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.465 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba1:C0:T1:L6' failed. Already exists

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T0:L6' failed. Already exists

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T1:L6' failed. Already exists

    Output of 'esxcfg-scsidevs -l' is for this device:

    naa.60050cc000205a840000000000000023

    Device Type: Direct-Access

    Size: 1361358 MB

    Display Name: transtec Fibre Channel Disk (naa.60050cc000205a840000000000000023)

    Plugin: NMP

    Console Device: /dev/sdf

    Devfs Path: /vmfs/devices/disks/naa.60050cc000205a840000000000000023

    Vendor: transtec Model: PROVIGO 550F Revis:

    SCSI Level: 5 Is Pseudo: false Status: on

    Is RDM Capable: true Is Removable: false

    Is Local: false

    Other Names:

    vml.020000000060050cc000205a84000000000000002350524f564947

    Josef.



  • 4.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 29, 2009 09:18 AM

    Hi all,

    I tried to investigate the issue, had a conversation with our SAN vendor, and I think that I do, in fact, have some answers.

    (1) nmp_CompleteCommandForPath ... Command 0x2a to NMP device failed on physical path ... Possible sense data 0x0 0x0 0x0:

    (1a) Analysis:

    Jul 28 08:39:54 vmware05 vmkernel: 0:20:44:22.115 cpu1:4259)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100020c7f40) to NMP device "naa.60050cc000205a840000000000000023" failed on physical path "vmhba2:C0:T1:L0" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Jul 28 08:39:54 vmware05 vmkernel: 0:20:44:22.115 cpu1:4259)ScsiDeviceIO: 747: Command 0x2a to device "naa.60050cc000205a840000000000000023" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.

    The sense codes logged by VMware stand for "TASK SET FULL". Our SAN vendor told us that, at least for them, this is a known "issue". In fact, it is not even a real issue. The explanation is: The SAN's controller has a write cache (for each array). When a single host, for example, writes a lot of data to a single array, the write cache might be full, and other hosts might not be able to write to the write cache. Our SAN offers a setting for "overload management". When overload management is enabled the hosts that have to wait until the write cache is free will be sent the message "TASK SET FULL" by the SAN's controller. I.e., these hosts cannot write to the SAN at the moment and will have to wait. VMware waits and logs this event with the corresponding sense data for "TASK SET FULL" to /var/log/vmkernel.

    (1b) Additional information:

    There is a VMware Knowledge Base article on SCSI sense codes:

    The log message above contains the following codes:

    - H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0

    The interesting section here is the code starting with "D" (D stands for "device status"). Device status 0x28 means "TASK SET FULL".
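
    A minimal Python sketch of pulling the H/D/P triple out of a vmkernel log line and naming the device status. The status names are the standard SCSI status byte values; the function names are made up for illustration, and reading P as the plugin status is an assumption, since this thread only spells out H and D.

```python
import re

# SCSI status byte values as defined by the SCSI Architecture Model (SAM).
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

# Matches the "H:0x0 D:0x28 P:0x0" triple found in vmkernel log lines.
STATUS_RE = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")


def parse_status(line):
    """Return (host, device, plugin) status codes as ints, or None if absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return tuple(int(code, 16) for code in match.groups())


def describe_device_status(code):
    """Name a device status code, falling back to the raw hex value."""
    return DEVICE_STATUS.get(code, "unknown (0x%x)" % code)


# Log line copied from this thread.
line = ('ScsiDeviceIO: 747: Command 0x2a to device '
        '"naa.60050cc000205a840000000000000023" failed '
        'H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.')
host, device, plugin = parse_status(line)  # plugin: assumed meaning of "P"
print(describe_device_status(device))
```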

    (1c) Solution:

    I re-configured our SAN. The write cache setting for each array was set to "maximum", and I reduced it to a fixed amount. Hence, the arrays really act independently from each other. (Otherwise a write cache congestion on one array can have an impact on other arrays). Moreover, I changed the "overload management setting" from:

    - Enabled: Commands that can not be accepted before the response timeout will fail with the status TASK SET FULL (0x28).

    to:

    - Disabled: No target queue full timeout will be enforced. Commands will wait until they can be processed or are timed out in the transport layer.

    Furthermore, I activated the option "Enable cache Writethrough operation when write cache is full." (I prefer slow write operations to the SAN to no write operations.)

    (1d) Note:

    The log messages do not appear any longer. (At least at the moment.) However, the log messages did not appear in ESX 3.5-U2 anyway -- they only started appearing in ESX 4.0. So either ESX 4.0 handles SCSI write commands in a different way (rather unlikely) or ESX 4.0 simply logs more or increasingly detailed messages.

    (2) nmp_RegisterDevice: Registration of NMP device failed:

    (2a) Analysis:

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.465 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba1:C0:T1:L6' failed. Already exists

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T0:L6' failed. Already exists

    Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T1:L6' failed. Already exists

    I have six LUNs on the SAN (LUN 0 through LUN 5). LUN 6 is the SAN's controller. So these error messages correspond to the SAN's controller and not to any of the datastores.
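
    The claim that every failed registration belongs to LUN 6 can be checked against a log file with a short script. This is a sketch in Python; the function names are made up for illustration, and the sample holds only the three warnings quoted above.

```python
import re
from collections import Counter

# Captures the uid from "... primary uid 'mpx.vmhba1:C0:T1:L6' failed."
UID_RE = re.compile(r"primary uid '([^']+)' failed")


def count_failed_registrations(lines):
    """Count nmp_RegisterDevice failures per primary uid."""
    counts = Counter()
    for line in lines:
        match = UID_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def lun_of(uid):
    """Extract the LUN number from a path-style uid such as 'mpx.vmhba2:C0:T1:L6'."""
    return int(uid.rsplit(":L", 1)[1])


sample = [
    "Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.465 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba1:C0:T1:L6' failed. Already exists",
    "Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T0:L6' failed. Already exists",
    "Jul 26 01:34:43 vmware05 vmkernel: 3:15:35:02.466 cpu2:4418)WARNING: NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'mpx.vmhba2:C0:T1:L6' failed. Already exists",
]
counts = count_failed_registrations(sample)
luns = {lun_of(uid) for uid in counts}
print(sorted(luns))
```

    On a real host the list would be built from the lines of /var/log/vmkwarning instead of the inline sample.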

    Unfortunately, I do not have an answer for this issue yet ... and /var/log/vmkernel is filling up rapidly -- at 26,000 lines or 4.5 MB per hour.

    What I'd like to see is an ESX setting that lets me disable these messages for a given LUN.

    Best regards,

    Josef.



  • 5.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 29, 2009 02:59 PM

    Hi all,

    After contacting VMware support I'd like to share the information with you. The problem I am experiencing is a little different from the one Josef is experiencing. I am seeing the following warning:

    Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)ScsiDeviceIO: 747: Command 0x2a to device "naa.600508b4000554df00007000034a0000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0

    The log message above contains the following codes:

    failed H:0x2 D:0x0 P:0x0

    The interesting section here is the code starting with "H" (H stands for "Host status"). Host status 0x2 means "HOST BUSY".

    VMware support gives the following explanation for this:

    -


    I checked with our bug database and as I had thought previously, H:0x2 D:0x0 P:0x0 translates to hba busy. The driver for whatever reason failed the i/o with a busy status. These can occur for any number of reasons. These failures are automatically retried by ESX.

    Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.600508b4000554df00007000034a0000" state in doubt; requested fast path state update..."

    This messaging will initially indicate that a NMP command was not responsive on a device, thus the NMP plugin 'doubted' the state of the LUN, i.e. was it busy, was it on a path, was it responsive. This could be a driver problem or spurious logging. A bug for this message has been logged, and as yet is not an issue, unless followed by failing I/O or VM failures.

    -


    So it looks like a bug, but as yet is not an issue. Hope this gives some clarification!
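
    The host status lookup described above can be sketched in Python. The names in the table follow the Linux SCSI host byte (DID_*) numbering, and it is an assumption that ESX uses the same values; the only mapping confirmed in this thread is 0x2 as a busy status. The function name is made up for illustration.

```python
import re

# Assumed mapping, borrowed from the Linux SCSI host byte (DID_*) codes.
# Only 0x2 (busy) is confirmed by VMware support in this thread.
HOST_STATUS = {
    0x0: "OK",
    0x2: "BUS BUSY",
    0x3: "TIME OUT",
    0x7: "ERROR",
    0x8: "RESET",
}

# Matches the host status field, e.g. "H:0x2".
HOST_RE = re.compile(r"\bH:(0x[0-9a-fA-F]+)")


def host_status(line):
    """Return (code, name) for the host status in a log line, or None."""
    match = HOST_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, HOST_STATUS.get(code, "unknown")


# Log line copied from this thread.
line = ('ScsiDeviceIO: 747: Command 0x2a to device '
        '"naa.600508b4000554df00007000034a0000" failed '
        'H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0')
print(host_status(line))
```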

    Regards,

    Ted



  • 6.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 31, 2009 08:27 AM

    I am also occasionally receiving the error you just mentioned supahted, specifically the 0x2 HOST BUSY error followed by the "state in doubt" warning on our EVA8000.

    My immediate guess is that the error differs depending on the type of storage array you have, so HP EVA's may be giving out HOST BUSY errors, while other arrays may be giving out the TASK SET FULL error.

    The error, at least for me, also seems to be load related, as it happens more often when VCB backup is running. Actually it almost only occurs when VCB is running.

    Could this be related to SCSI reservation errors, eg, the resolution could be to split LUNs up into smaller LUNs?

    Clarification from a VMware engineer/expert would be appreciated

    [edit]

    After reading the article mentioned above: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902 I am a bit surprised to see that the H:0x2 code actually suggests that the HBA is issuing the error. So is it the HBA which is "Host Busy", or is it simply timing out reads/writes to the storage array?



  • 7.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 31, 2009 08:30 AM

    I am out of the office from July 31, 2009 through August 14, 2009. In urgent cases, please contact technik@witcom.de or call 0611 7803003.

    Best regards,

    Carsten Buchberger



  • 8.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Aug 06, 2009 03:02 PM

    We stood up a 3 node ESX test cluster connected to a CX3-80. We added 4 LUNs and put VMs on 3 of the LUNs. We set up continuous pings to the VMs. Whenever we remove the one LUN without VMs, the other VMs will eventually "freeze" after a period of time. Pings will fail and the console of the VM will be a blank black screen.

    We tried adding a LUN 0 and that didn't fix it. We changed the pathing policy from round robin to MRU. Still broken. We disabled the auto rescan. Still broken. The time to VM failure could be anywhere from 5 minutes to 1 hour. This keeps recurring at different intervals. The only way to stop it is by manually rescanning on the ESX host.

    Has anyone removed a lun on an ESX4 cluster and experienced the same thing?

    Thanks.



  • 9.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 18, 2009 07:51 AM

    I'm experiencing the same error on ESX 4. On my 3.5 U4 I didn't get the error.

    Sep 18 09:59:06 esxbl03 vmkernel: 6:16:23:14.324 cpu7:4103)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000523fa00) to NMP device "naa.600508b4001076bc0000800000c60000" failed on physical path "vmhba2:C0:T1:L4" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Sep 18 09:59:06 esxbl03 vmkernel: 6:16:23:14.324 cpu7:4103)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.600508b4001076bc0000800000c60000" state in doubt; requested fast path state update...

    Sep 18 09:59:06 esxbl3 vmkernel: 6:16:23:14.324 cpu7:4103)ScsiDeviceIO: 747: Command 0x2a to device "naa.600508b4001076bc0000800000c60000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Is there any solution for this problem?



  • 10.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 18, 2009 09:16 AM

    I solved my problem by changing the MPIO policy to Fixed on the datastore.



  • 11.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 18, 2009 11:08 AM

    I just tried it - didn't help here...



  • 12.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Oct 07, 2009 12:32 PM

    Same problem here

    3 hosts ESX4 build 175625, storage Datacore SANmelody 3.01

    Oct 7 14:18:15 srvesx3 vmkernel: 6:00:55:54.256 cpu5:4268)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d9056566f6c5341533200000000" state in doubt; requested fast path state update...

    Oct 7 14:18:15 srvesx3 vmkernel: 6:00:55:54.256 cpu5:4268)ScsiDeviceIO: 747: Command 0x28 to device "naa.60030d9056566f6c5341533200000000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Oct 7 14:21:44 srvesx3 vmkernel: 6:00:59:23.076 cpu7:7497)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410006032080) to NMP device "naa.60030d9056566f6c5341533200000000" failed on physical path "vmhba1:C0:T1:L1" H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Oct 7 14:21:44 srvesx3 vmkernel: 6:00:59:23.076 cpu7:7497)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d9056566f6c5341533200000000" state in doubt; requested fast path state update...

    Oct 7 14:21:44 srvesx3 vmkernel: 6:00:59:23.076 cpu7:7497)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d9056566f6c5341533200000000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Same error on local storage (HP P400 controller)

    Oct 7 14:34:05 srvesx3 vmkernel: 6:01:11:43.667 cpu6:4102)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x41000601fd40) to NMP device "mpx.vmhba2:C0:T1:L0" failed on physical path "vmhba2:C0:T1:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

    Oct 7 14:34:05 srvesx3 vmkernel: 6:01:11:43.667 cpu6:4102)ScsiDeviceIO: 747: Command 0x12 to device "mpx.vmhba2:C0:T1:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

    Oct 7 14:34:05 srvesx3 vmkernel: 6:01:11:43.868 cpu6:4102)NMP: nmp_CompleteCommandForPath: Command 0x12 (0x41000613d000) to NMP device "mpx.vmhba2:C0:T0:L0" failed on physical path "vmhba2:C0:T0:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

    Oct 7 14:34:05 srvesx3 vmkernel: 6:01:11:43.868 cpu6:4102)ScsiDeviceIO: 747: Command 0x12 to device "mpx.vmhba2:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

    Has anyone found a solution to this warning?

    Regards,

    Boris



  • 13.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Oct 29, 2009 03:19 PM

    Just received the following email from VMware support:

    "I'm writing you in regards to the service request 1432691811 concerning unresponsive VMs after un-presenting a LUN.

    I'd like to inform you that our engineering has identified this issue and is working on a fix. I will inform you again with additional information as soon as it becomes available."



  • 14.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 03, 2009 09:58 AM

    We have 4 ESX 4.0 hosts (build 175625) and 3 ESXi 4.0 hosts (build 193498) connected to Datacore SANMelody 2.0.4 Update 1 storage, and we see the same problem.

    I have opened an SR with VMware Support. The answer was "Your configuration is not supported. You must update your Datacore SANMelody to version 3.0"

    But I cannot upgrade our Datacore SANMelody to version 3 right now.

    Boris, did you open an SR with VMware Support? What answer did you get?

    Regards,

    Dmitri



  • 15.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 03, 2009 10:07 AM

    Dear Sir or Madam,

    I am out of the office from November 2 to 10, 2009. I do receive your email but cannot process it. In urgent cases, please contact our technical hotline, which can be reached at 0611 780 3003.

    Best regards,

    Carsten Buchberger



  • 16.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 03, 2009 08:53 PM

    No, I didn't open an SR with VMware support.

    And you, did you open a support request with Datacore?

    Regards,

    Boris



  • 17.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 03, 2009 09:18 PM

    To solve our problem (see post from end of July), FalconStor had to fix its IPStor storage server. There was indeed a problem with a LUN 0 which was not presented in the correct way. After applying the patch we didn't see the error again.



  • 18.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 16, 2009 03:37 AM

    This post has gotten a bit muddled with people replying with different SCSI codes (although they all start with NMP). I'm seeing the same SCSI codes as Ted (H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0) and I've got a VM running Windows 2003 R2 / SQL Server 2005 SP3 (all fully patched) giving me messages in Event Viewer that are listed at the same time

    "SQL Server has encountered ** occurrence(s) of I/O requests taking longer than 15 seconds to complete on file in database (5). The OS file handle is 0x00000648. The offset of the latest long I/O is: 0x000000cb0f2000"

    So I'm not seeing anything as horrible as some customers are having (via Chad's blog post), but it is noticeable to guests that something's not quite right.

    Running ESX4u1 and have had some shoddy support dealings with tier 1 VMware support in the past. This post is mainly to link the SQL I/O message back to a VMware storage bug. I'm hesitant to make any advanced config changes on the ESX box because of a few "busy" messages, but I'd love to hear other opinions.



  • 19.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 19, 2010 01:09 PM

    Hi all!

    Is there a resolution or a hotfix from VMware available so far?

    We're getting the errors mentioned above when I try to save a full VM via VCB (VMs on a NetApp).

    Saving the same VM on a self-built Openfiler causes no problems (also connected via iSCSI).



  • 20.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 19, 2010 08:05 PM

    If you haven't already, patch with the most recent set of ESX4 patches. I'm still verifying if the below kb is the fix but so far I haven't seen the errors in my vmkernel logs this morning.

    It would be awesome if someone else could reply as well.

    -


    Scratch that. I'm still getting the same NMP messages. I've now learned that not every datastore available to the ESX host is setting off the error.



  • 21.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 20, 2010 07:42 AM

    @katrap:

    The hosts are all on the same patch level (latest patches applied), but we are still getting the error (only on the NetApp storage); on the Openfiler storage everything works fine.

    As Morten Dalgaard wrote:

    "The error, at least for me, also seems to be load related, as it happens more often when VCB backup is running. Actually it almost only occurs when VCB is running."

    Seems the same here: the higher the load, the more the errors. (For example while restarting VMs)

    Fascinating that open-source software works fine and a really expensive storage system doesn't...

    Help and answers from VMware would be really appreciated!

    Thanks!



  • 22.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 20, 2010 07:58 AM

    I also experienced severe issues with a NetApp FAS. Have you upgraded to ONTAP 7.3 or higher? That's required.

    Henrik



  • 23.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 20, 2010 08:04 AM

    Hi Henrik,

    Thanks for your answer, I will ask our storage admin about this.

    At the moment I am also trying the solution of removing the MPIO driver from the VCB...

    Edit: Unfortunately no improvement...



  • 24.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Mar 20, 2010 01:52 PM

    Has anyone found or gotten a solution to this issue from VMware yet? I am seeing this on my hosts too.

    state in doubt, requested fast path state update,

    A lot of these errors

    Mar 20 07:07:55 LIC-VM16 vmkernel: 0:18:41:26.931 cpu16:4312)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100bb1ccd40) to NMP device "naa.600601600de11a00fa6e3b607a38dd11" failed on physical path "vmhba2:C0:T0:L103" H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Mar 20 07:07:55 LIC-VM16 vmkernel: 0:18:41:26.931 cpu16:4312)ScsiDeviceIO: 747: Command 0x2a to device "naa.600601600de11a00fa6e3b607a38dd11" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    They mostly hit one LUN at a time, so it seems the whole storage bus is not out, even though the error is host busy.

    And last,

    The virtual machines sometimes become unresponsive for a brief period of time.



  • 25.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Mar 20, 2010 02:03 PM

    Dear Sir or Madam,

    I am out of the office from March 14 to 23, 2010. I do receive your email but cannot process it. In urgent cases, please contact our technical hotline, which can be reached at 0611 780 3003.

    Best regards,

    Carsten Buchberger



  • 26.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted May 17, 2010 06:47 PM

    Doing a fresh install of ESXi 4 with the latest ESXi firmware, and I am still experiencing this issue after completing an SRM test (so unable to operationally remove the datastore).

    Was this issue not patched / resolved with ESXi 4?



  • 27.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted May 18, 2010 04:14 PM

    Opened a ticket with VMware support. Evidently this issue is still open on certain arrays (IBM DS is evidently one of them, as the DS4700 is the array we are affected on). Setting the APD advanced flag (esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD), as listed on http://virtualgeek.typepad.com/virtual_geek/2009/12/an-important-vsphere-4-storage-bug-and-workaround.html and recommended by our support agent, did not resolve the issue on this array. According to VMware support, they are still working with certain array vendors to fix this issue.

    ugh.



  • 28.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted May 19, 2010 08:57 AM

    The only solution for us was to divide the ESX boxes into two clusters, one with the "old" 3.5, the other with 4.0 U1.

    I tried some test boxes after several updates, but the errors still exist.



  • 29.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Oct 26, 2010 05:37 PM

    We're having the same issue running 4.0 U2 on HP blades using Emulex cards attached to Hitachi storage. Did anyone ever come up with a proper solution to this issue? Anyone else on Hitachi storage experience the same problem?



  • 30.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 05, 2010 01:26 PM

    Same issue here, exactly the same errors in /var/log/vmkernel (0x28 errors)

    ESX 4.1 fresh install

    HP Blade 460 G6

    QLogic HBA

    EVA 4400 Controller

    No solutions yet?



  • 31.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 08, 2010 09:57 AM

    Hello, I have the same issue.

    ESX 4.0 U2 with HP blades and Emulex cards. For storage we have IBM SVC.

    Dec 8 05:24:24 BRUS220 vmkernel: 8:18:25:40.432 cpu6:4102)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410006028a80) to NMP device "naa.6005076801900303e800000000000018" failed on physical path "vmhba1:C0:T4:L68" H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Dec 8 05:24:24 BRUS220 vmkernel: 8:18:25:40.432 cpu6:4102)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.6005076801900303e800000000000018" state in doubt; requested fast path state update...

    Dec 8 05:24:24 BRUS220 vmkernel: 8:18:25:40.432 cpu6:4102)ScsiDeviceIO: 747: Command 0x2a to device "naa.6005076801900303e800000000000018" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    Is any solution available yet?



  • 32.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 08, 2010 04:33 PM

    Our issue ended up being a bad piece of hardware in the HP c7000 blade chassis specifically the Virtual Connect fibre channel module. It took us a lot of different troubleshooting steps to finally get down to where we could single-out the specific module. Module was replaced and the errors went away. We're seeing some more errors in some other chassis so we're starting the same process on them today.



  • 33.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 03, 2011 07:01 PM

    tvdh,

    I have been following this issue in our environment for some time.  Your error codes like mine are a little different than the others.  I'm getting: "NMP: nmp_CompleteCommandForPath: Command 0x2a".  This happens on different LUNs on different hosts.  Could you tell me a little more about your environment to help me troubleshoot...or tell me how you fixed your environment if you have already done so?

    We have an IBM SVC running 5.1.0.4 code.

    ESX hosts having the issue are vSphere 4.0 U1 and U2.

    We have noticed the problem happens on datastores running on DS4800/5300 storage behind the SVC...more so than XIV storage behind the SVC but we have noticed the errors on both.

    We use RR multipathing rather than fixed.  I have not changed back to fixed for testing yet.

    Our errors usually happen around 2 AM, a busy backup time, but we have not been able to correlate the issue with any particular VCB backup or any one system causing the problem.  It happens when no VCB backups run and it happens when some do.  Also, it was not until recently that any VM saw issues caused by this.  Obviously this issue has moved up in priority.
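
    One way to look for a time-of-day pattern like this is to bucket the "state in doubt" warnings by hour. A minimal Python sketch follows; the function name is made up for illustration, the timestamp format is assumed to be the syslog prefix seen in the log lines in this thread, and the sample lines are copied from earlier posts rather than from our hosts.

```python
import re
from collections import Counter

# Syslog prefix such as "Dec 8 05:24:24 " or "Jul 18 17:00:27 "; captures the hour.
STAMP_RE = re.compile(r"^\w{3}\s+\d{1,2}\s+(\d{2}):\d{2}:\d{2}\s")


def warnings_per_hour(lines, marker="state in doubt"):
    """Count log lines containing `marker`, keyed by hour of day (0-23)."""
    hours = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = STAMP_RE.match(line.lstrip())
        if match:
            hours[int(match.group(1))] += 1
    return hours


sample = [
    'Dec 8 05:24:24 BRUS220 vmkernel: 8:18:25:40.432 cpu6:4102)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.6005076801900303e800000000000018" state in doubt; requested fast path state update...',
    'Oct 7 14:18:15 srvesx3 vmkernel: 6:00:55:54.256 cpu5:4268)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d9056566f6c5341533200000000" state in doubt; requested fast path state update...',
    'Oct 7 14:21:44 srvesx3 vmkernel: 6:00:59:23.076 cpu7:7497)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d9056566f6c5341533200000000" state in doubt; requested fast path state update...',
    'Oct 7 14:21:44 srvesx3 vmkernel: 6:00:59:23.076 cpu7:7497)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d9056566f6c5341533200000000" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.',
]
hours = warnings_per_hour(sample)
for hour, count in sorted(hours.items()):
    print("%02d:00  %d" % (hour, count))
```

    Run against a full day of /var/log/vmkernel, the hourly counts can then be compared with the backup schedule.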

    We have a PMR open with IBM and a case open with VMware.  I will post the solution when one comes.

    We do have a 4.1 cluster running on the same SVC without these errors.  It is lightly loaded as our Desktop group is still going through their VDI build/testing phase.  I'm not sure 4.1 is the answer but that is the only thing that runs clean.



  • 34.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jan 04, 2011 03:22 PM

    I wanted to post a follow-up to my previous post, where I said that we were still seeing this problem on some hosts for which we had ruled out hardware issues. Our problem ended up being the use of LUSE (LUN Size Expansion) devices on a Hitachi USP-V array, specifically during replication of the LUN to the DR site. A few hours into replication, we would see all kinds of SCSI reservation warnings and the disk latency would go through the roof. We found a white paper from HDS that recommended not using LUSE devices for VMFS datastores. We followed that advice and have not seen any of these errors since.

    I know that not everyone having this problem is on Hitachi disk, but in our case it was the disk array that was the problem. So, hopefully this info at least helps one person still having issues.



  • 35.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Feb 09, 2011 07:50 PM

    mitchellm3 - I've had the same issue with a DS4800 and ESXi 4.1. We're also seeing this early in the morning and have an open PMR and VMware ticket. Were you able to make any progress?



  • 36.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Feb 09, 2011 08:36 PM

    Progress is slow on this issue.  We have been working daily with IBM and VMware on this issue...more so with IBM.  We still aren't sure what is causing the problem but we have significantly alleviated it.

    First and foremost - Anyone using HP Insight agents on their HP VM hosts must take a look at this KB article.  It states that if your storage isn't HP, disable two of the IMA services or you could have storage issues.  We have done this across the board and now we aren't seeing as many errors.

    The other thing to check on your DS4800s is what version of Storage Manager you have.  The newer versions of Storage Manager run a storage profiler every night at 2 AM.  This basically takes inventory of your config so that next time your DS4800 crashes and IBM support needs to recreate it, they'll have all the info they need.  This info is also found in your "collect all support data" dumps.  Anyway, we set that to run monthly and we haven't seen the big destage errors...correlated to the VMware errors...on our SAN Volume Controller.

    We're running much much better but I'm not sold on this problem being completely gone.  We are looking to upgrade the SVC to v5.1.0.8 and eventually to 6.1.



  • 37.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Feb 10, 2011 01:40 PM

    Thanks for the quick follow up. I'll have to check and make sure nobody else has the profiler installed and running. I am running ESXi 4.1 going directly to a DS4800 (no SVC). It also seems to happen early in the morning, between 2:00 AM - 5:00 AM.

    What type of SAN switches are you using? Are you doing any Metro/Global mirroring? We are running Brocade switches and do have asynchronous global mirroring enabled. Also, are you running Trend OfficeScan by any chance? I have yet to rule out Trend as the cause, although IBM continues to say that we aren't reaching any performance limits on the DS4800.
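
    For anyone trying to confirm that the warnings cluster around a nightly job (such as the 2 AM profiler run mentioned above), a minimal Python sketch along these lines could bucket them by hour of day. It assumes the syslog-style timestamp prefix seen in the ESX 4.x log excerpts in this thread; the function name and the two sample lines are only illustrative.

```python
import re
from collections import Counter

# Syslog-style prefix as seen in this thread, e.g. "Jul 27 14:01:18 host vmkernel: ..."
STAMP = re.compile(r"^\s*[A-Z][a-z]{2}\s+\d{1,2}\s+(\d{2}):\d{2}:\d{2}\s")
MARKER = "nmp_DeviceRequestFastDeviceProbe"


def warnings_per_hour(lines):
    """Count 'state in doubt' warnings per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = STAMP.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Two lines quoted from this thread; a real run would read the log file instead.
    sample = [
        'Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)WARNING: NMP: '
        'nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a0024bd2deeffa0dd11" '
        'state in doubt; requested fast path state update...',
        'Jul 18 17:00:27 vmkernel: 2:07:08:24.308 cpu7:40478)WARNING: NMP: '
        'nmp_DeviceRequestFastDeviceProbe: NMP device "naa.600508b4000554df00007000034a0000" '
        'state in doubt; requested fast path state update...',
    ]
    for hour, count in sorted(warnings_per_hour(sample).items()):
        print(f"{hour:02d}:00  {count}")
```

    A spike in one or two hourly buckets would point toward a scheduled task (backup, profiler, replication) rather than a steady hardware fault, though it would not prove it.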



  • 38.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 09:34 AM

    Hi,

    I have seen the same messages on my ESX 4 servers. In my case I use LeftHand iSCSI storage. Every time I start a VCB job on a new volume, these events come up (automount and scrub disabled). In my case, after I uninstalled the MPIO driver from the VCB proxy, everything works fine. But I had to recreate the VMFS volumes...

    Stefan



  • 39.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 10:47 AM

    I've just opened a VMware support case concerning this warning message. I will post the results to this discussion!



  • 40.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 11:31 AM

    Great -- thanks a lot!

    Josef.



  • 41.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 07:54 PM

    I'm seeing some similar events. CX3-80 using FC and round robin. When we removed one LUN, it caused the VMs on other LUNs to lose their network connectivity. I have an SR open now, but it looks like I'm going to have to call EMC.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000224a980) to NMP device "naa.60060160ed741a0024bd2deeffa0dd11" failed on physical path "vmhba2:C0:T1:L66" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a0024bd2deeffa0dd11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a0024bd2deeffa0dd11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100021a02c0) to NMP device "naa.60060160ed741a00c2135fabd30cde11" failed on physical path "vmhba2:C0:T1:L53" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a00c2135fabd30cde11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a00c2135fabd30cde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.
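
    The three bytes after "Valid sense data:" are the SCSI sense key, ASC and ASCQ. As a rough aid for reading lines like the ones above, here is a small Python sketch that decodes them. The lookup tables are deliberately partial and hold only standard SCSI values; the function name is made up for this example.

```python
import re

# Partial tables of standard SCSI sense keys and ASC/ASCQ pairs.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0xB: "ABORTED COMMAND",
}
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
    (0x3F, 0x0E): "REPORTED LUNS DATA HAS CHANGED",
}
SENSE = re.compile(
    r"(Valid|Possible) sense data: (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)


def decode_sense(line):
    """Return a readable description of the sense data in one log line, or None."""
    match = SENSE.search(line)
    if match is None:
        return None
    if match.group(1) != "Valid":
        # "Possible sense data" means the bytes were not marked valid.
        return "no valid sense data"
    key, asc, ascq = (int(value, 16) for value in match.group(2, 3, 4))
    key_name = SENSE_KEYS.get(key, f"sense key 0x{key:x}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ 0x{asc:02x}/0x{ascq:02x}")
    return f"{key_name}: {detail}"


if __name__ == "__main__":
    line = ('ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a0024bd2deeffa0dd11" '
            'failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.')
    print(decode_sense(line))
```

    For the lines above this yields UNIT ATTENTION with "REPORTED LUNS DATA HAS CHANGED", which would be consistent with a LUN having just been removed on the array side.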



  • 42.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 08:16 PM

    After upgrading one of our ESX Hosts to vSphere we experienced the same problems. /var/log/vmkernel got filled up rapidly with this warning message. The system also got very slow, and it took ages to boot up.

    I found out that it has to do with the so-called LUN 0 (LUNZ on EMC systems) and how it is handled by vSphere on the one hand and presented by the storage system on the other.

    Our scenario

    Storage System 1 : Clariion CX

    Storage System 2 : FalconStor IPStor

    2 ESX 3.5 Hosts and one ESX 4 Host.

    The Clariion presents a couple of LUNs to all ESX hosts. The "old" hosts only see the LUNs which are assigned to them. The 4.0 host sees an additional LUN 0 (LUNZ). Usually this LUN is presented only if there is no LUN with the host ID 0 (I guess for SCSI compatibility reasons). If you run "esxcfg-scsidevs -l" you see this LUNZ LUN, but it has the flag "Is Pseudo: true".

    The Clariion also had a real LUN 0 with VMFS on it, so we had no problems with it.

    But on FalconStor the first LUN with VMFS on it had the host ID 1, so FalconStor also presented a dummy LUN 0, and that was the one which caused the trouble. ESX 4.0 did not recognize that it was only a fake (the flag "Is Pseudo" was false) and tried to get it under its control...

    So what we did was create a small LUN of only 10 MB on FalconStor and present it with the host ID 0 to the ESX hosts. No problems since.

    As soon as Falconstor presents a Lun with HostID 0 it does not use its internal dummy Lun 0 any longer - in my opinion that is the correct method.

    Maybe you have a similar issue.

    Best Regards

    Carsten
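
    To check the "Is Pseudo" flag across many devices at once, the output of "esxcfg-scsidevs -l" could be saved to a file and scanned with something like the Python sketch below. Only the "Is Pseudo" label is taken from the post above; the rest of the layout (device ID on an unindented line, attributes indented beneath it) and the second device ID in the sample are assumptions, so adjust the parsing to whatever your host actually prints.

```python
import re

PSEUDO = re.compile(r"Is Pseudo:\s*(true|false)")


def pseudo_flags(listing):
    """Map each device ID to its 'Is Pseudo' flag.

    Assumes each device starts on an unindented line and its
    attributes follow on indented lines (an assumed layout).
    """
    flags = {}
    device = None
    for line in listing.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            device = line.strip()
            continue
        match = PSEUDO.search(line)
        if match and device is not None:
            flags[device] = match.group(1) == "true"
    return flags


if __name__ == "__main__":
    # Hypothetical excerpt; the second device ID is invented for illustration.
    sample = """\
naa.60060160ed741a0024bd2deeffa0dd11
   Device Type: Direct-Access
   SCSI Level: 4  Is Pseudo: false Status: on
naa.0000000000000000000000000000beef
   Device Type: Direct-Access
   SCSI Level: 4  Is Pseudo: true Status: on
"""
    for name, is_pseudo in pseudo_flags(sample).items():
        print(name, "pseudo" if is_pseudo else "real")
```

    A LUN 0 that is reported as not pseudo but carries no datastore would be the candidate to compare against the dummy-LUN situation described above.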



  • 43.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 27, 2009 09:41 PM

    VMware support is saying it is a storage issue, so I have opened a ticket with EMC. Waiting on a call back. VMware is telling me that vSphere is aware that LUN 0 is not an actual LUN. The commands are reads and writes and are failing at the storage array.

    It happened to us last week when we removed a few LUNs, but only a few VMs went off the network, once, for a few minutes. The same events showed up in the vmkernel logs for every ESX server and all LUNs. Today we removed one LUN, and I didn't force a rescan. The VMs on other LUNs kept going off the network for a couple of minutes, over and over again. It finally stopped when I did a manual rescan. More events:

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000224a980) to NMP device "naa.60060160ed741a0024bd2deeffa0dd11" failed on physical path "vmhba2:C0:T1:L66" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a0024bd2deeffa0dd11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.259 cpu6:13355)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a0024bd2deeffa0dd11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100021a02c0) to NMP device "naa.60060160ed741a00c2135fabd30cde11" failed on physical path "vmhba2:C0:T1:L53" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a00c2135fabd30cde11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.371 cpu6:14106)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a00c2135fabd30cde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.425 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410002266f40) to NMP device "naa.60060160ed741a004c71b05d5662de11" failed on physical path "vmhba2:C0:T1:L11" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.425 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a004c71b05d5662de11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.425 cpu6:13355)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a004c71b05d5662de11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.831 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000207e580) to NMP device "naa.60060160ed741a00caeba6dd155cde11" failed on physical path "vmhba2:C0:T0:L10" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.831 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a00caeba6dd155cde11" state in doubt; requested fast path state update...

    Jul 27 14:01:18 fohapesx13 vmkernel: 31:23:37:10.831 cpu6:13355)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a00caeba6dd155cde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:19 fohapesx13 vmkernel: 31:23:37:12.128 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410002257480) to NMP device "naa.60060160ed741a007811dced5c5bde11" failed on physical path "vmhba2:C0:T1:L9" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:19 fohapesx13 vmkernel: 31:23:37:12.128 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a007811dced5c5bde11" state in doubt; requested fast path state update...

    Jul 27 14:01:19 fohapesx13 vmkernel: 31:23:37:12.128 cpu6:13355)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a007811dced5c5bde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:21 fohapesx13 vmkernel: 31:23:37:13.362 cpu6:13203)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410002253780) to NMP device "naa.60060160ed741a0024bd2deeffa0dd11" failed on physical path "vmhba1:C0:T0:L66" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:21 fohapesx13 vmkernel: 31:23:37:13.362 cpu6:13203)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a0024bd2deeffa0dd11" state in doubt; requested fast path state update...

    Jul 27 14:01:21 fohapesx13 vmkernel: 31:23:37:13.362 cpu6:13203)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a0024bd2deeffa0dd11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:33 fohapesx13 vmkernel: 31:23:37:25.461 cpu6:14123)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000219fac0) to NMP device "naa.60060160ed741a004c71b05d5662de11" failed on physical path "vmhba1:C0:T0:L11" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:33 fohapesx13 vmkernel: 31:23:37:25.461 cpu6:14123)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a004c71b05d5662de11" state in doubt; requested fast path state update...

    Jul 27 14:01:33 fohapesx13 vmkernel: 31:23:37:25.461 cpu6:14123)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a004c71b05d5662de11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:35 fohapesx13 vmkernel: 31:23:37:27.674 cpu6:13355)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x41000202f600) to NMP device "naa.60060160ed741a00c2135fabd30cde11" failed on physical path "vmhba1:C0:T0:L53" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:01:35 fohapesx13 vmkernel: 31:23:37:27.674 cpu6:13355)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a00c2135fabd30cde11" state in doubt; requested fast path state update...

    Jul 27 14:01:35 fohapesx13 vmkernel: 31:23:37:27.674 cpu6:13355)ScsiDeviceIO: 747: Command 0x28 to device "naa.60060160ed741a00c2135fabd30cde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:02:57 fohapesx13 vmkernel: 31:23:38:49.246 cpu7:11283)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000211dc40) to NMP device "naa.60060160ed741a00caeba6dd155cde11" failed on physical path "vmhba1:C0:T1:L10" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:02:57 fohapesx13 vmkernel: 31:23:38:49.246 cpu7:11283)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a00caeba6dd155cde11" state in doubt; requested fast path state update...

    Jul 27 14:02:57 fohapesx13 vmkernel: 31:23:38:49.246 cpu7:11283)ScsiDeviceIO: 747: Command 0x2a to device "naa.60060160ed741a00caeba6dd155cde11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:03:05 fohapesx13 vmkernel: 31:23:38:57.970 cpu7:4103)NMP: nmp_CompleteCommandForPath: Command 0x25 (0x4100021e0540) to NMP device "naa.60060160ed741a005281c721ef28de11" failed on physical path "vmhba1:C0:T1:L56" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Jul 27 14:03:05 fohapesx13 vmkernel: 31:23:38:57.970 cpu7:4103)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160ed741a005281c721ef28de11" state in doubt; requested fast path state update...

    Is anyone else booting ESX locally, attached to a CLARiiON, not using LUN 0, and round robin?

    Thanks.



  • 44.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 28, 2009 10:31 AM

    I have the same issue here in my production environment after upgrading from VMware ESX 3.5-U4 to VMware ESX 4 Build 175625.

    Our hardware setup:

    - 2 Dell PowerEdge 2850 servers upgraded to ESX 4.0

    - 2 QLA2340 2Gb FC HBA

    - 2 Dell PowerEdge 2950 servers upgraded to ESX 4.0

    - 2 ISP2432 4GB FC HBA

    - 3 Dell PowerEdge 2950 III servers with ESXi 3.5U4 Embedded

    - 2 ISP2432 4GB FC HBA

    connected to the 2 Storage systems:

    - Datacore Storage Server SANMelody

    - EMC Clariion CX300

    Each SAN storage system is connected to the two FC switches, with one path per switch. Every server is connected to the switches with one path per switch. ESX detects two paths to every LUN on each SAN storage system. The path policy in ESX is set to "Fixed".

    The /var/log/vmkernel logs on all ESX 4.0 hosts have warnings like this:

    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.638 cpu3:4188)<6>qla2xxx 0000:0e:00.0: scsi(6:3:3): Abort command issued -- 1 24b7d5e 2002.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410005198900) to NMP device "naa.60030d90564d2d4e564d340000000000" failed on physical path "vmhba2:C0:T3:L3" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d90564d2d4e564d340000000000" state in doubt; requested fast path state update...
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d90564d2d4e564d340000000000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4188)<6>qla2xxx 0000:0e:00.0: scsi(6:3:5): Abort command issued -- 1 24b7d59 2002.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x41000507c240) to NMP device "naa.60030d90564d2d4e564d360000000000" failed on physical path "vmhba2:C0:T3:L5" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d90564d2d4e564d360000000000" state in doubt; requested fast path state update...
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d90564d2d4e564d360000000000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4188)<6>qla2xxx 0000:0e:00.0: scsi(6:3:7): Abort command issued -- 1 24b7d5c 2002.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410005029bc0) to NMP device "naa.60030d90564d2d4e564d380000000000" failed on physical path "vmhba2:C0:T3:L7" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d90564d2d4e564d380000000000" state in doubt; requested fast path state update...
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d90564d2d4e564d380000000000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.640 cpu3:4188)<6>qla2xxx 0000:0e:00.0: scsi(6:3:4): Abort command issued -- 1 24b7d5d 2002.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.640 cpu3:4099)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410005196600) to NMP device "naa.60030d90564d2d4e564d350000000000" failed on physical path "vmhba2:C0:T3:L4" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
    Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.640 cpu3:4099)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60030d90564d2d4e564d350000000000" state in doubt; requested fast path state update...
    Jul 28 10:31:12 vmkd vmkernel: 6:23:58:54.640 cpu3:4099)ScsiDeviceIO: 747: Command 0x2a to device "naa.60030d90564d2d4e564d350000000000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    -


    These warnings don't appear on our ESXi 3 hosts. They are reported frequently on multiple LUNs, not only on LUN 0, but only LUNs on the Datacore Storage Server SANMelody are affected. The ESX 4.0 hosts also got slow.

    Any ideas how to resolve this issue?

    Best Regards,

    Dmitri
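
    When only some arrays or LUNs seem to be affected, it can help to tally the failures per device and path rather than eyeballing the log. A minimal Python sketch, assuming the ESX 4.x "nmp_CompleteCommandForPath" line format quoted in this thread and that each log entry sits on a single line (wrapped entries need to be rejoined first); the function name is illustrative only.

```python
import re
from collections import Counter

# Matches the ESX 4.x failure line format quoted in this thread.
FAILED = re.compile(
    r'nmp_CompleteCommandForPath: Command (0x[0-9a-f]+) .*?'
    r'to NMP device "([^"]+)" failed on physical path "([^"]+)" '
    r'H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+)'
)


def tally_failures(lines):
    """Count failed commands per (device, path, host status, device status)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            _opcode, device, path, host, dev, _plugin = match.groups()
            counts[(device, path, host, dev)] += 1
    return counts


if __name__ == "__main__":
    # One line quoted from the post above; a real run would iterate over the log file.
    sample = [
        'Jul 28 10:31:12 xxxx vmkernel: 6:23:58:54.639 cpu3:4099)NMP: nmp_CompleteCommandForPath: '
        'Command 0x2a (0x410005198900) to NMP device "naa.60030d90564d2d4e564d340000000000" '
        'failed on physical path "vmhba2:C0:T3:L3" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.',
    ]
    for (device, path, host, dev), count in tally_failures(sample).most_common():
        print(f"{count:5d}  {device}  {path}  H:{host} D:{dev}")
```

    If every entry shares one HBA or one target in the path string, that narrows the search to a single fabric leg or array port; failures spread evenly across paths would point more toward the array itself.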



  • 45.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 28, 2009 04:16 PM

    Opened a SR with EMC and they see no issues.

    We are going to set up a new cluster and see if we can reproduce our issue. We suspect it has something to do with LUN 0, but we cannot confirm.

    EMC CX3-80, local install, no LUN0, and using round robin. Anyone else with a similar config?



  • 46.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jul 28, 2009 06:06 PM

    My hardware setup is as following:

    ESX cluster based on HP Blade Infrastructure:

    • HP BL460c G1 servers and HP BL460c G6 servers

    • 2 QLA 4Gb FC HBA per server

    Connected to HP Storage system:

    • HP EVA 6000 XP Array

    The SAN storage is connected to the two FC switches, with one path per switch. Every server is connected to the switches with one path per switch. ESX detects two paths to every LUN on the SAN storage. The path policy in ESX is set to "Fixed".

    On the ESXi 4 servers the storage array controller is detected as LUN 0 and recognized by ESXi 4 as type 'Array'. I am currently building a new VMware cluster to see if the warnings still appear.



  • 47.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Aug 12, 2009 12:44 AM

    Hi

    Our system has the same problem:

    - HP BL680

    - Storage Hitachi

    Rescanning the adapters and resolving VMFS takes a long time.



  • 48.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 14, 2009 09:39 PM

    We have the same issues:

    Sep 11 14:49:48 esx5 vmkernel: 0:15:30:14.830 cpu15:4111)NMP: nmp_CompleteCommandForPath: Command 0x28 (0x41000b04cd80) to NMP device "naa.60a9800050334c646c4a50744f547334" failed on physical path "vmhba5:C0:T3:L0" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    Sep 11 14:49:48 esx5 vmkernel: 0:15:30:14.830 cpu15:4111)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60a9800050334c646c4a50744f547334" state in doubt; requested fast path state update...

    Sep 11 14:49:48 esx5 vmkernel: 0:15:30:14.830 cpu15:4111)ScsiDeviceIO: 747: Command 0x28 to device "naa.60a9800050334c646c4a50744f547334" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.

    We also had a VMFS failure on 1 directory (1 VM) that can no longer be accessed. The error says the VMFS lock is corrupt. This could have been triggered by the above errors. We had no issues before upgrading to ESX 4.

    Can anyone help here? What should I do?

    Henrik



  • 49.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 14, 2009 09:39 PM

    Thank you for your e-mail. I am out of the office until September 21. For urgent matters, please contact our helpdesk or Mr. Haselau, tel. 0541 9493-157.

    Thank you for your e-mail. I am currently out of the office and will reply to your message when I return on September 21, 2009. If you require immediate assistance, please e-mail our helpdesk or call +49(0)41-9493-157.

    Kind regards

    i.A. Stefan Ohlmeyer

    Deputy Head of Engineering

    SIEVERS-SNC

    Computer & Software GmbH & Co. KG

    A SIEVERS-GROUP company

    Hans-Wunderlich-Straße 8

    49078 Osnabrück

    Phone: +49 (541) 9493-160

    Fax: +49 (541) 9493-260

    E-mail: sohlmeyer@sievers-group.com

    Web: www.sievers-group.com

    General partner:

    SIEVERS-SNC Beteiligungs GmbH

    Osnabrück District Court, HRB 19289

    Managing directors:

    Dipl.-Kfm. Klaus Gerdes-Röben

    Software-Ing. Marco Naber

    Dipl.-Wirtschaftsing. Rüdiger Sievers

    Osnabrück District Court, HRA 6465



  • 50.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Sep 30, 2009 09:27 AM

    Hello,

    I have the same problem with ESXi 4. Our system:

    DELL PowerEdge 1800 with SATA CERC RAID controller, 6 x 160 GB SATA, RAID 10.

    1 x Machine LSI Logic, 2 Drives

    1 x Machine BusLogic, 1 Drive

    1 x Machine BusLogic, 1 Drive

    1 x Machine IDE, 1 Drive

    When starting another image with LSI Logic SCSI, the system hangs. I am now trying another machine with BusLogic SCSI. Could someone test or check whether this could be a problem with the SCSI controllers used?

    Regards

    Rainer Raebiger



  • 51.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Nov 03, 2009 10:35 AM

    Hi all,

    we changed all SCSI-Controllers from LSI to BusLogic. Since then we have no more problems.

    Regards

    Rainer Raebiger



  • 52.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 01, 2009 06:17 PM

    Wrote Virtualgeek blog posts about a vSphere 4 (and vSphere 4u1) condition that can create this state, and two workarounds.

    Not saying it is the root cause of the above noted cases, but to me it looks like it.

    VMs (obviously those NOT on the lost datastore) becoming intermittently inaccessible if an APD (all paths dead) state is detected is a known issue. Commonly this can be triggered by yanking LUNs before removing datastores and ESX devices, or storage or FC/FCoE/iSCSI network issues.

    You can see the post and workarounds here:

    http://virtualgeek.typepad.com/virtual_geek/2009/12/an-important-vsphere-4-storage-bug-and-workaround.html

    Chad Sakac, P.Eng. vExpert

    EMC Corp

    VP, VMware Technology Alliance



  • 53.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 01, 2009 10:14 PM

    This issue has been driving me crazy. How do you cleanly 'remove' a datastore? I would do it the 'correct' way if I knew how!



  • 54.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jun 15, 2011 11:56 AM

    Same problem here; the strange thing is that it's only on one LUN (out of 15).

    I'm connecting from 4+2 ESX hosts (2 clusters)

    4x DL585 G5

    2x DL380 G5

    connecting to McDATA 4700

    Hitachi (HDS) AMS500

    Hitachi (HDS) AMS2100

    I'm also only seeing the problem on 1 host (DL585 G5).

    I've checked my SAN paths and found no problem on that end.

    Jun 15 13:54:16 bumblebee vmkernel: 36:03:58:27.856 cpu4:8927)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:54:26 bumblebee vmkernel: 36:03:58:37.544 cpu4:10255)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:54:36 bumblebee vmkernel: 36:03:58:48.310 cpu15:5255)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:54:47 bumblebee vmkernel: 36:03:58:58.429 cpu3:8022)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:54:56 bumblebee vmkernel: 36:03:59:07.834 cpu3:8020)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:55:06 bumblebee vmkernel: 36:03:59:17.601 cpu3:4099)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:55:17 bumblebee vmkernel: 36:03:59:28.618 cpu3:4099)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:55:28 bumblebee vmkernel: 36:03:59:39.702 cpu3:4099)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:55:36 bumblebee vmkernel: 36:03:59:48.209 cpu3:8746)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    Jun 15 13:55:45 bumblebee vmkernel: 36:03:59:57.114 cpu3:8020)ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
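
    The "Command 0x.." value in these lines is the SCSI operation code. A small Python sketch for naming the opcodes that show up in this thread, assuming both the ESX 4.x ("Command 0x2a ...") and ESXi 5 ("Cmd(0x...) 0x9e, ...") line formats quoted here. The table is partial and uses standard SCSI opcode names; opcodes 0xC0 and above are vendor-specific in the SCSI command set, so 0xfe is reported as such rather than guessed at.

```python
import re

# Partial table of standard SCSI operation codes.
OPCODES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
    0x16: "RESERVE(6)",
    0x17: "RELEASE(6)",
    0x25: "READ CAPACITY(10)",
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x9E: "SERVICE ACTION IN(16)",
    0xA0: "REPORT LUNS",
}
# ESX 4.x: 'Command 0x2a ...'   ESXi 5: 'Cmd(0x412440da3800) 0x9e, ...'
COMMAND = re.compile(r"(?:Command |Cmd\(0x[0-9a-fA-F]+\) )(0x[0-9a-fA-F]+)")


def opcode_name(code):
    """Name a SCSI opcode; 0xC0-0xFF is the vendor-specific range."""
    if code in OPCODES:
        return OPCODES[code]
    if code >= 0xC0:
        return "vendor-specific"
    return "unknown"


def command_in_line(line):
    """Return (opcode, name) for the command in a log line, or None."""
    match = COMMAND.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, opcode_name(code)


if __name__ == "__main__":
    line = ('ScsiDeviceIO: 1672: Command 0xfe to device "t10.HITACHI_999999990004" '
            'failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.')
    print(command_in_line(line))
```

    Read together with the sense data 0x5 0x20 0x0 (ILLEGAL REQUEST, invalid command operation code), the lines above look like the array rejecting an opcode it does not implement rather than a path failure, although that reading is an interpretation and not something confirmed in this thread.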


  • 55.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jun 15, 2011 01:59 PM

    I would make certain you aren't using LUSE devices on that HDS array. Are you replicating that LUN by any chance? You could very well have a bad HBA and not really know it without doing some deep troubleshooting.



  • 56.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jun 17, 2011 07:19 AM

    No I don't use LUSE, but the problem went away after my running clone was complete.

    I think it was a problem with my cache filling up due to slow SATA disks.



  • 57.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jun 21, 2011 07:52 AM

    We're seeing yet another hex code for this log message:

    Jun 21 00:56:22 pn003 vmkernel: 152:07:45:21.178 cpu4:4100)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410008036180) to NMP device "naa.600c0ff000da7b197d2d794b01000000" failed on physical path "vmhba1:C0:T6:L0" H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    so the code would be:

    H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    the host = 8 message translates to :

    SG_ERR_DID_RESET [0x08] The SCSI bus (or this device) has been reset. Any SCSI device on a SCSI bus is capable of instigating a reset.
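
    Following the same mapping, a short Python sketch can translate the H: value of any line in this thread. The table uses the Linux SCSI mid-layer host byte names (the DID_* codes that the SG_ERR_* names above correspond to); that ESX uses the same numbering is an assumption carried over from the lookup above, not something confirmed here.

```python
import re

# Linux SCSI host byte codes (DID_*); assumed to apply to the H: field as above.
HOST_STATUS = {
    0x0: "DID_OK",
    0x1: "DID_NO_CONNECT",
    0x2: "DID_BUS_BUSY",
    0x3: "DID_TIME_OUT",
    0x4: "DID_BAD_TARGET",
    0x5: "DID_ABORT",
    0x6: "DID_PARITY",
    0x7: "DID_ERROR",
    0x8: "DID_RESET",
    0x9: "DID_BAD_INTR",
}
STATUS = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")


def host_status(line):
    """Return (code, name) for the H: field of a log line, or None."""
    match = STATUS.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, HOST_STATUS.get(code, "unknown")


if __name__ == "__main__":
    print(host_status("failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0."))
    print(host_status("failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0."))
```

    Under that assumption, the H:0x2 from the first post in this thread would read as a busy bus and the H:0x5 seen on ESXi 5 as an aborted command.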

    Our setup is:

    HP DL380 (4x) G5, (3x) G6 and (2x) G7.

    Storage is MSA2312fc (with SAS disks)

    Switches are Cisco MDS 9124

    ESX 4.0.0 332073 (reason why we're not on ESX4.1 is because of the 64bit requirement of vSphere, still working on that one).

    This message is repeated over multiple hosts, but not on all hosts.

    We also have other MSA datastores of which none of them are showing these messages.

    I am having a bit of trouble finding usable counters on the MSA. The web GUI isn't helpful at all, and the command line is not really self-explanatory, I'm afraid. Is there a simple command for checking the performance counters on the MDS CLI?

    The Fibre Channel switches are showing no errors or congestion.

    We do however perform nightly incremental backups of our vmware guests using the legacy method with Tivoli.



  • 58.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Jun 21, 2011 11:23 AM

    I have seen this on several EVA8100 arrays and on ESX4.0/4.1 HP G4/G5/G6 blades.

    sense codes: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0

    What we did was set the following values:

    Disk.QFullSampleSize => 32

    Disk.QFullThreshold => 8

    Disk.DiskMaxIOSize => 128 KB

    We also changed the path selection policy from MRU to RR.

    Source : http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=3664583&prodTypeId=12169&objectID=c02697105

    We also upgraded the VC Fibre Channel modules and the blades' HBAs to the latest firmware versions. Since then the sense codes haven't been reported anymore.
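
    For context, the D: field is the SCSI status byte, and 0x28 is TASK SET FULL (queue full) in the SCSI standard, which fits with queue-full throttling settings making a difference. Here is a minimal Python sketch for counting such completions in a log. The table is partial and holds only standard status values; the function names are made up for this example.

```python
import re

# Partial table of standard SCSI status byte values.
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
    0x40: "TASK ABORTED",
}
STATUS = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")


def device_status(line):
    """Return the name of the SCSI status in the D: field, or None."""
    match = STATUS.search(line)
    if match is None:
        return None
    return DEVICE_STATUS.get(int(match.group(2), 16), "unknown")


def count_queue_full(lines):
    """Count log lines whose device status is TASK SET FULL (0x28)."""
    return sum(1 for line in lines if device_status(line) == "TASK SET FULL")


if __name__ == "__main__":
    sample = [
        "failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0",
        "failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.",
    ]
    print(count_queue_full(sample), "queue-full completions")
```

    A high count for one device would suggest its queue depth is being exhausted, which is worth raising with the array vendor before changing any host settings.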



  • 59.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 10, 2012 02:34 PM

    Hi,

    It's been a long time since this thread was updated, but we recently had the same problem with ESX hosts and an IBM storage array. We use ESXi 5 Update 1 in our environment. Multiple hosts access datastores shared between 2 sites.

    All VMs in a cluster became unresponsive, the ESX hosts became disconnected, and we saw errors in the vmkernel and other logs:


    2012-11-14T13:21:58.148Z cpu12:4619516)ScsiDeviceIO: 2322: Cmd(0x412440da3800) 0x9e, CmdSN 0x55510f from world 0 to dev "" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.

    2012-12-05T16:40:15.020Z cpu3:8195)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba2:C0:T0:L22, reservation state on device XXX is unknown.
    2012-12-05T16:40:15.023Z cpu12:8204)NMP: nmp_PathDetermineFailure:2084: SCSI cmd RESERVE failed on path vmhba6:C0:T0:L12, reservation state on device XXX is unknown.

    To resolve this problem we had to reboot the ESXi host; we tried restarting the management agents, without success.

    Has anyone experienced this problem? Or solved it?

    We also opened a SR with vmware.

    Thanks for your help !



  • 60.  RE: WARNING: NMP: nmp_DeviceRequestFastDeviceProbe

    Posted Dec 27, 2012 07:04 PM

    I might have some helpful information here. I have a lab setup where I have completely broken various things a lot during the learning curve and through carelessness. I ran into this particular issue today after I moved my top-level openfiler VM's IP Storage vNIC (using VT-d to present pass-through volume sets from my areca controller to hosts)  onto another vSwitch Port Group in the same VLAN on its host. I was getting all kinds of errors related to this thread. The Hypervisor for the host was locking up, the iSCSI devices and datastores were flapping, the VMs were in unknown status, and when I could get info from some of the datastores, or when I tried to re-add them, the wizard said they were empty. Multiple reboots of the host and filer did nothing.

    I realized that the switch I moved the vNIC into did not have jumbo frames enabled, but I'm gathering that if jumbo frames are suddenly disabled anywhere in the network loop that this might happen. I have no clue whether an update would affect the jumbo frames setting on vSwitches. In any case, it seems feasible that an upgrade/update might do something to muck up the VM Kernel ports or Port Groups related to the initiator or IP storage virtual network. Here is what I did to fix my scenario...

    =========================

    First, stabilize the host(s):

    1. Stop iSCSI target service on filer.

    2. Remove "unknown" guests from the affected host(s).  The logs should stop going nuts, but for me the vSphere client  was still very slow, so...
    3. Reboot ESX host(s).

    4. Unmap the LUNs from the target(s) on the filer. (I had to create entirely new targets as part of the process)
    5. Make sure jumbo frames are turned on in the vSS/vDS at the switch level, port group and/or VM Kernel port for the initiator or filer. Of course this is only relevant if you have jumbo frames enabled on the filer and physical switch(es), which is what I assume.

    6. Create a NEW target on the filer and map a LUN to  it, allow one ESX host in the ACL, and start the iSCSI target service.
    7. Rescan  the HBA on the host. If this ultimately doesn't work then I would start over and  nuke/pave the switch, PG, VMK, etc. if not done already.

    ###I've performed several types of screw-ups with the iSCSI HBAs where the entire HBA/switch setup needed to be nuked and paved. If this process doesn't work, try removing the VM Kernel port(s) from the initiator(s) and removing the switches and creating them again with the relevant port group(s)/VM Kernel Port(s). Make sure jumbo frames are enabled everywhere relevant. Switch level, PG level, VMK level. Then add the new VM Kernel port(s) back to the initiator(s). All I can gather is that when something goes really bad the OS doesn't know how to deal with the existing devices or targets anymore.###

    8. If the HBA devices show up normally again, check the datastores. All but one of my 6 were not present and had to be re-added. That one showed up as "unknown (unmounted)". I tried to mount it and got an error, but then it mounted. It was probably already mounting, I guess. For the ones that I added back, I chose "Keep existing signature" in the wizard. I don't know what creating a new signature could ultimately affect, but it didn't seem like the right choice because I think you only need to resignature a copied datastore.

    I added one LUN at a time to the target and brought all 6 datastores back online successfully without any data loss, ending my streak of a half-dozen irreparable catastrophes. I hope this helps.
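
    One way to verify that jumbo frames really pass end to end is a don't-fragment ping with the largest payload that fits the MTU. The arithmetic is simple: payload = MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header. The Python sketch below builds the matching vmkping command line; an MTU of 9000 and the target address 192.0.2.10 are assumptions for the example, so substitute your own filer address and MTU.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload <= 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo")
    return payload


def vmkping_command(target, mtu=9000):
    """Build a vmkping call with the DF bit set (-d) and the payload size (-s)."""
    return f"vmkping -d -s {max_ping_payload(mtu)} {target}"


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address, not one from this thread.
    print(vmkping_command("192.0.2.10"))
    print(vmkping_command("192.0.2.10", mtu=1500))
```

    If the large ping fails while the 1500-byte variant succeeds, some hop between the VMkernel port and the filer is most likely not configured for jumbo frames.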