9 Replies Latest reply on Feb 4, 2013 12:15 PM by GFK

    VMs very slow on IBM x3650 M4 with ESXi 5.1

    Gabrie Master
    vExpert

      Hi

      For a customer I just installed a vSphere 5.1 Essentials environment: two IBM x3650 M4 hosts connected over ONE path to an IBM DS3512 storage array. A second HBA will be added to the configuration soon, which will give us two paths.

       

      Before installing I checked the BIOS versions of the ESXi hosts. According to the VMware HCL, the BIOS version should be IBM-[VVE112H]. According to the IMM, the hosts are running:

      UEFI (Active)    1.21    VVE120EUS    11 Oct 2012

       

      The problem we're experiencing is that all disk actions are very slow. In the vmkernel.log I can see a lot of path failovers that would explain why the system is slow. (See logs at end of this post).

       

      When searching for these errors, I found this community post: http://communities.vmware.com/thread/341512

      They refer to this VMware KB http://kb.vmware.com/kb/1030265

       

      In that KB there is the following note:

       

      "Note: This issue only applies if you see this specific alert in the vmkernel/messages log files: ALERT: APIC: 1823: APICID 0x00000000 - ESR = 0x40. If you do not see this message, you are not experiencing this issue."

       

       

      Questions:

      - We can't find that message (ALERT: APIC: 1823: APICID 0x00000000 - ESR = 0x40) in the vmkernel logs. Is this enough to state this KB does not apply?

      - What is the impact if we DO apply the recommended solution on the ESXi 5.1 hosts?
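A quick way to answer the first question is to scan the log for the alert quoted in the KB note. The following is a minimal sketch, not an official VMware tool: the function name `find_apic_alerts` is made up for illustration, and the pattern matches any APIC alert of that shape rather than only the exact numbers quoted, on the assumption that a near-miss would also be worth looking at.

```python
import re

# Pattern modelled on the alert quoted in the KB note. The numeric fields are
# matched loosely (an assumption), so variants of the same alert also show up.
APIC_ALERT = re.compile(
    r"ALERT: APIC: \d+: APICID 0x[0-9a-fA-F]+ - ESR = 0x[0-9a-fA-F]+"
)


def find_apic_alerts(lines):
    """Return every log line that contains an APIC alert of the KB's shape."""
    return [line.rstrip("\n") for line in lines if APIC_ALERT.search(line)]


# Two sample lines: one NMP message from the log in this post and the alert
# text exactly as quoted from the KB.
SAMPLE = [
    'NMP: nmpCompleteRetryForPath:321: Retry world recovered device '
    '"naa.60080e50002fb706000002635107aa63"',
    "ALERT: APIC: 1823: APICID 0x00000000 - ESR = 0x40",
]

if __name__ == "__main__":
    hits = find_apic_alerts(SAMPLE)
    print(len(hits), "APIC alert line(s) found")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example a copy of vmkernel.log taken from the host. An empty result only shows that the alert is absent from the files that were scanned, so rotated logs should be checked as well.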

       

      Regards

      Gabrie

       

       

       

      Below is the vmkernel.log:

       

      2013-02-01T17:05:30.373Z cpu22:8214)NMP: nmpCompleteRetryForPath:321: Retry world recovered device "naa.60080e50002fb706000002635107aa63"
      2013-02-01T17:05:30.796Z cpu15:8374)NMP: nmp_DeviceUpdatePathStates:615: Activated path "vmhba2:C0:T0:L1" for NMP device "naa.60080e50002f7b320000028c5107a9ff".
      2013-02-01T17:05:30.797Z cpu0:10554)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.60080e50002f7b320000028c5107a9ff" - issuing command 0x412401f91300
      2013-02-01T17:05:30.797Z cpu10:13686)NMP: nmpCompleteRetryForPath:321: Retry world recovered device "naa.60080e50002f7b320000028c5107a9ff"
      2013-02-01T17:05:32.520Z cpu20:217782)VMW_SATP_LSI: satp_lsi_pathFailure:1120: Command 0x8a to naa.60080e50002fb706000002635107aa63 (fcf 0) failed with NOT_READY (0x2/0x4/0x1), on path vmhba2:C0:T0:L2 (pnr 1, iet 0xac848a2)
      2013-02-01T17:05:32.520Z cpu20:217782)ScsiDeviceIO: 2303: Cmd(0x4124403dcb00) 0x8a, CmdSN 0x8000002c from world 217778 to dev "naa.60080e50002fb706000002635107aa63" failed H:0x0 D:0x2 P:0x4 Possible sense data: 0x2 0x4 0x1.
      2013-02-01T17:05:32.520Z cpu22:8214)ScsiDeviceIO: 2303: Cmd(0x4124403f8200) 0x8a, CmdSN 0x80000041 from world 217778 to dev "naa.60080e50002fb706000002635107aa63" failed H:0x0 D:0x2 P:0x4 Possible sense data: 0x2 0x4 0x1.
      2013-02-01T17:05:33.039Z cpu22:8214)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x8a (0x4124403dcb00, 217778) to dev "naa.60080e50002fb706000002635107aa63" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER
      2013-02-01T17:05:33.039Z cpu22:8214)WARNING: NMP: nmp_DeviceRetryCommand:133:Device "naa.60080e50002fb706000002635107aa63": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
      2013-02-01T17:05:33.039Z cpu22:8214)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x8a (0x4124403f8200, 217778) to dev "naa.60080e50002fb706000002635107aa63" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER
      2013-02-01T17:05:33.039Z cpu22:8214)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x8a (0x4124425c4e40, 217778) to dev "naa.60080e50002fb706000002635107aa63" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER
      2013-02-01T17:05:33.306Z cpu3:8195)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x412402b82c00, 11035) to dev "naa.60080e50002f7b320000028c5107a9ff" on path "vmhba2:C0:T0:L1" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER
      2013-02-01T17:05:33.306Z cpu3:8195)WARNING: NMP: nmp_DeviceRetryCommand:133:Device "naa.60080e50002f7b320000028c5107a9ff": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
      2013-02-01T17:05:34.496Z cpu3:8375)NMP: nmp_DeviceUpdatePathStates:615: Activated path "vmhba2:C0:T0:L2" for NMP device "naa.60080e50002fb706000002635107aa63".
      2013-02-01T17:05:34.497Z cpu1:8786)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.60080e50002fb706000002635107aa63" - issuing command 0x4124403dcb00
      2013-02-01T17:05:34.506Z cpu22:8214)NMP: nmpCompleteRetryForPath:321: Retry world recovered device "naa.60080e50002fb706000002635107aa63"
      2013-02-01T17:05:34.795Z cpu2:8369)NMP: nmp_DeviceUpdatePathStates:615: Activated path "vmhba2:C0:T0:L1" for NMP device "naa.60080e50002f7b320000028c5107a9ff".
      2013-02-01T17:05:34.796Z cpu0:10554)WARNING: NMP: nmpDeviceAttemptFailover:599:Retry world failover device "naa.60080e50002f7b320000028c5107a9ff" - issuing command 0x412402b82c00
      2013-02-01T17:05:34.796Z cpu4:8634)NMP: nmpCompleteRetryForPath:321: Retry world recovered device "naa.60080e50002f7b320000028c5107a9ff"
      ~ #
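To see at a glance which devices and paths are failing over, the "Act:FAILOVER" lines above can be tallied. This is a rough sketch that assumes the line layout shown in this log excerpt; the names `FAILOVER`, `SENSE_KEYS`, `OPCODES` and `summarize_failovers` are illustrative. Only the two standard SCSI sense keys and the two write opcodes that occur in the excerpt are decoded. The additional sense code 0x94 lies in the vendor-specific range (0x80 to 0xFF), so its meaning has to come from the array vendor's documentation and is left as a raw value here.

```python
import re
from collections import Counter

# Matches the nmp_ThrottleLogForDevice lines that end in "Act:FAILOVER".
FAILOVER = re.compile(
    r'Cmd (0x[0-9a-f]+) \(.*?\) to dev "(naa\.[0-9a-f]+)" on path "([^"]+)" '
    r'Failed: .*?Valid sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)\. '
    r'Act:FAILOVER'
)

# Standard SCSI values; only the ones seen in the excerpt are listed.
SENSE_KEYS = {0x2: "NOT READY", 0x5: "ILLEGAL REQUEST"}
OPCODES = {0x2A: "WRITE(10)", 0x8A: "WRITE(16)"}


def summarize_failovers(lines):
    """Count failover events per (device, path, command, sense key, ASC, ASCQ)."""
    counts = Counter()
    for line in lines:
        match = FAILOVER.search(line)
        if not match:
            continue
        opcode, device, path, key, asc, ascq = match.groups()
        command = OPCODES.get(int(opcode, 16), opcode)
        sense_key = SENSE_KEYS.get(int(key, 16), key)
        counts[(device, path, command, sense_key, asc, ascq)] += 1
    return counts


# Three lines copied from the log above (timestamps and CPU prefixes dropped).
SAMPLE = [
    'NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x8a (0x4124403dcb00, 217778) to dev "naa.60080e50002fb706000002635107aa63" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER',
    'NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x8a (0x4124403f8200, 217778) to dev "naa.60080e50002fb706000002635107aa63" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER',
    'NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x2a (0x412402b82c00, 11035) to dev "naa.60080e50002f7b320000028c5107a9ff" on path "vmhba2:C0:T0:L1" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x94 0x1. Act:FAILOVER',
]

if __name__ == "__main__":
    for key, count in summarize_failovers(SAMPLE).items():
        print(count, key)
```

In this excerpt every failover is a write that the array rejected with sense key ILLEGAL REQUEST, and both LUNs are affected on the same HBA.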
        • 1. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
          ragmon Enthusiast

          Hi,

           

          What HBAs are you using? Are you using IBM provided drivers or VMware in-box drivers?

          • 2. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
            Gabrie Master
            vExpert

            vmhba2  mpt2sas           link-n/a  sas.500605b005665230                    (0:27:0.0) LSI Logic / Symbios Logic LSI2008

             

            # vmkload_mod -s mpt2sas

            vmkload_mod module information

            input file: /usr/lib/vmware/vmkmod/mpt2sas

            Version: Version 10.00.00.00.5vmw, Build: 799733, Interface: 9.2 Built on: Aug  1 2012

            License: GPL

            Required name-spaces:

              com.vmware.driverAPI#9.2.1.0

              com.vmware.vmkapi#v2_1_0_0

            Parameters:

              heap_max: int

                Maximum attainable heap size for the driver.

              heap_initial: int

                Initial heap size allocated for the driver.

              max_sectors: short

                max sectors, range 64 to 8192 default=8192

              max_lun: int

                 max lun, default=16895

              command_retry_count: int

                 Device discovery TUR command retry count: (default=144)

              logging_level: int

                 bits for enabling additional logging info (default=0)

              mpt2sas_raid_queue_depth: int

                 Max RAID Device Queue Depth (default=128)

              mpt2sas_sata_queue_depth: int

                 Max SATA Device Queue Depth (default=32)

              mpt2sas_sas_queue_depth: int

                 Max SAS Device Queue Depth (default=254)

              disable_discovery: int

                 disable discovery

              mpt2sas_fwfault_debug: int

                 enable detection of firmware fault and halt firmware - (default=0)

              diag_buffer_enable: int

                 post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0)

              missing_delay: array of int

                 device missing delay , io missing delay

              msix_disable: int

                 disable msix routed interrupts (default=0)

              max_sgl_entries: int

                 max sg entries

              max_queue_depth: int

                 max controller queue depth (default=600)

            • 3. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
              Gabrie Master
              vExpert

              IBM Support told us the single HBA connected directly to the DS3512 is not the correct configuration. There should be a switch in between them. We've decided to try the iSCSI route.

              • 4. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                gkn@ac Novice

                Hi, I've installed several of these boxes without problems (and with no switch when using SAS). Just be aware that IBM does not support the DS35xx with firmware 7.83 when using vSphere 5.1, only 7.84.xx or later. Also have a look at this VMware KB: kb.vmware.com/kb/2039608

                • 5. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                  GFK Enthusiast


                  Sorry for the double post; I was logged in with my old account.

                  • 6. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                    Gabrie Master
                    vExpert

                    Hi

                     

                    It is on firmware 7.84, I checked this with the VMware HCL.

                     

                    Can you explain how you connected the box? Currently each host has ONE HBA connected over SAS to the DS3512. After connecting, the SATP shows as LSI and the PSP as MRU. The VMware documentation suggests using ALUA, but the supplier of the DS3512 didn't know how to enable this on the LUNs.

                     

                    Gabrie


                    • 7. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                      GFK Enthusiast

                      I assume that you have created a host group on the DS3512? Then you can add a host to this group. This is also where you select the host OS type: it should be VMWARE, not ALUA. Physically, plug your HBA port 1 into DS3512 controller port 1 and so on. Do one at a time so you do not mix up the WWNs.

                      1 person found this helpful
                      • 8. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                        Gabrie Master
                        vExpert

                        I think we found the problem. Host1 was connected through SAS to Storage Processor A and Host2 to Storage Processor B. It seems to work like EMC storage, and the LUNs kept trespassing between the two processors. Once we connected Host2 to Storage Processor A as well, performance was back and there were no more messages in the vmkernel log.

                         

                        Stupid that we didn't think of this earlier, because I usually do check this with EMC storage. I'm just not used to IBM SAS storage.

                         

                        Thank you for your help.
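The trespassing described in this reply can be illustrated with a toy model. This is a deliberately simplified sketch, not a description of how the DS3512 firmware actually behaves: it assumes a LUN has exactly one owning controller at a time and that ownership moves whenever a host can only reach the other controller. All names (`count_ownership_transfers`, `cabling`, the host and controller labels) are made up for illustration.

```python
def count_ownership_transfers(cabling, io_sequence, initial_owner="A"):
    """Count how often LUN ownership has to move between controllers.

    cabling      maps each host to the set of controllers it is cabled to.
    io_sequence  is the order in which hosts issue I/O to the shared LUN.
    """
    owner = initial_owner
    transfers = 0
    for host in io_sequence:
        reachable = cabling[host]
        if owner not in reachable:
            # The host cannot reach the current owner, so ownership is
            # pulled over to a controller it can reach (a "trespass").
            owner = sorted(reachable)[0]
            transfers += 1
    return transfers


# Two hosts taking turns writing to the same LUN.
IO = ["host1", "host2"] * 4

SPLIT = {"host1": {"A"}, "host2": {"B"}}            # the original cabling
SAME = {"host1": {"A"}, "host2": {"A"}}             # after the fix
DUAL = {"host1": {"A", "B"}, "host2": {"A", "B"}}   # both hosts reach both

if __name__ == "__main__":
    for name, cabling in (("split", SPLIT), ("same", SAME), ("dual", DUAL)):
        print(name, count_ownership_transfers(cabling, IO))
```

With the split cabling almost every I/O forces an ownership change, while cabling both hosts to the same controller, or to both controllers, needs none in this model.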

                        • 9. Re: VMs very slow on IBM x3650 M4 with ESXi 5.1
                          GFK Enthusiast

                          Great that you got it working. Normally I set up host1 HBA1 to controller A port 1 and host1 HBA1 to controller B port1, and it works fine with no errors.