22 Replies · Latest reply on Mar 16, 2011 8:19 AM by DaIceMan

    MSA2312i and ESX4.1 slow performance

    Syl20m Novice

      Hi all,

       

      I have a serious performance problem with my customer's configuration.

      We are experiencing very slow transfer rates for large files copied from Windows computers to Windows server VMs: the maximum transfer rate for a 2GB file is 15MB/s. I would expect over 100MB/s.

       

      The setup is an MSA2312i with 7 x 450GB 15k disks in RAID5, and 2 HP DL360 G7 servers with 12 NICs each, running ESX 4.1. For fault tolerance, we use 2 dedicated HP ProCurve 1810G switches connecting the 2 iSCSI NICs of each ESX host and the 2 controllers of the MSA.

      On the MSA we have 2 vdisks (one of 2TB and one of 700GB).

      Our 6 VMs run on the 2TB VMFS datastore.

       

      I have run a lot of tests, and the results puzzle me:

      _ configuring software iSCSI: no change

      _ using another switch: dlink DGS-3100: no change

      _ using Write-back or Write-through caching on the MSA volume: no change

      _ changing multipathing to round robin: no change (see the command sketch after this list)

      _ changing NICS used for software iSCSI: no change

      _ installing another ESX server (different hardware) with VMware ESX 4.0 and migrating a VM to this host: no change

      _ migrating a VM to the local ESX datastore: WHAOOOOO!!! transfer rate between 50MB/s and 80MB/s

      _ configuring a physical Windows 2008 server with the software initiator against the MSA2312i, using the same switch as the ESX hosts. I reformatted the 700GB vdisk from VMFS to NTFS to present it to the server: WONDERFUL, I copied a 2GB file at 80MB/s, in less than a minute!!!
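
      For reference, the round-robin change above was done along these lines from the ESX 4.1 service console (a sketch, not our exact session; the naa identifier below is a placeholder for the real LUN ID):

      # list iSCSI devices and their current path selection policy
      esxcli nmp device list

      # switch one device to round robin (replace the naa ID with your own)
      esxcli nmp device setpolicy --device naa.600c0ff000000000000000000000 --psp VMW_PSP_RR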

       

      From those tests I know the MSA itself is fine, because in a Windows iSCSI environment I have no problem. But I can't explain why I get such poor performance in the VMware environment?!

       

      Any help or ideas on this case would be really appreciated!

      For information, my VMware case has been open since 24/12/2010!!! and is still not closed.

      An HP case was closed a month ago because of our test with the physical Windows Server 2008!!!

       

      I'll give you any further information if needed!

       

      Thanks in advance,

      Sylvain

        • 1. Re: MSA2312i and ESX4.1 slow performance
          AndreTheGiant Guru
          vExpert

          Welcome to the community.

           

          Have you enabled jumbo frames?

           Have you followed HP's recommended practices for configuring the environment?
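
           For ESX 4.x, a quick sanity check that jumbo frames are really active end to end looks something like this (the IP below stands in for one of the MSA iSCSI ports):

           esxcfg-vswitch -l                 # the MTU column of the iSCSI vSwitch should read 9000
           vmkping -s 8972 192.168.10.50     # an 8972-byte payload should get through if jumbo frames work end to end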

           

          Andre

          • 2. Re: MSA2312i and ESX4.1 slow performance
            Syl20m Novice

            Hello Andre,

             

            Thanks for your quick response,

             

             I forgot to mention this test in my post, but there is no change with or without jumbo frames. I've heard that jumbo frames can improve performance by 5-10%, but my performance is so poor that I can't see any improvement.

             

             To configure iSCSI, I used the VMware iSCSI SAN Configuration Guide: www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf

             And the HP StorageWorks MSA best practices document: http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA2-5019ENW.pdf

            • 3. Re: MSA2312i and ESX4.1 slow performance
              AndreTheGiant Guru
              vExpert

               Have you also tried using a guest initiator inside a VM, just to see if you reach performance similar to the physical Windows Server case?

               Is the MSA firmware up to date?
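
               If you try the guest initiator route, on Windows 2008 the quickest way from a command prompt is roughly this (the portal IP stands in for one of the MSA iSCSI ports):

               iscsicli QAddTargetPortal 192.168.10.50
               iscsicli ListTargets
               iscsicli QLoginTarget <IQN reported by ListTargets>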

               

              Andre

              • 4. Re: MSA2312i and ESX4.1 slow performance
                depping Champion
                 User Moderators, VMware Employees

                 Did you check with esxtop whether there are any latency issues? DAVG? KAVG? LAT/rd and LAT/wr? What about QUED?
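
                 (For anyone following along: those counters are in the esxtop disk views, roughly as follows.)

                 esxtop      # then press d (disk adapter), u (disk device) or v (VM disk) view
                             # DAVG/cmd = latency at the device/array, KAVG/cmd = latency in the vmkernel,
                             # GAVG/cmd = DAVG + KAVG as seen by the guest, QUED = commands queued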

                 

                Duncan (VCDX)

                Available now on Amazon: vSphere 4.1 HA and DRS technical deepdive

                • 5. Re: MSA2312i and ESX4.1 slow performance
                  Syl20m Novice

                   No, I didn't try that! That's a good idea; I'll run that test this afternoon.

                   I have also added a new disk to the MSA to create another vdisk in RAID0; I'll see whether performance is better in RAID0 than in RAID5.

                   

                  Sylvain

                  • 6. Re: MSA2312i and ESX4.1 slow performance
                    Syl20m Novice

                     Yes, with esxtop I can see that DAVG is high (between 10 and 100ms), while GAVG and KAVG stay below 10ms at all times (average = 1ms). The other values seem fine, except MBWRTN/s, which is still very, very low!
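
                     (In case it helps anyone reproduce this, the counters can also be logged over time with esxtop batch mode; the 10-second interval and 60 samples below are arbitrary:)

                     esxtop -b -d 10 -n 60 > /tmp/esxtop_capture.csv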

                     

                    Sylvain

                    • 7. Re: MSA2312i and ESX4.1 slow performance
                      DaIceMan Enthusiast

                      Sylvain,

                       did you find any solution to this? We are having similar latency problems with our ESX 4.1 setup. We have an MSA2312i dual controller with 6 x 1TB SATA disks and one vdisk split into 4 x 16GB LUNs to boot our 4 diskless blades with QMH4062 iSCSI HBAs, with the rest split into 2 datastores/LUNs. We have a total of 8 VMs on one datastore and 4 on the other.

                       While Storage vMotioning a VM between the two datastores we can see 100-300 DAVG/cmd on one host, with about 60-100 GAVG, and similarly while copying any file. When this happens the whole VMware infrastructure becomes sluggish and sometimes unresponsive. It looks like a SAN issue, but like you, when we connected a physical client to a LUN the copy speed reached 80MB/s. We also followed the documentation for configuring the SAN with ESX. We have 2 separate subnets, one for each port group (A1B1 / A2B2), cross-connected to 2 ProCurve 2810 switches. We tried setting round robin, then going back to MRU, but noticed no change. Typically the copy can start at up to 40MB/s for the first few seconds (especially after a copy restart, due to caching), but then drops drastically to a few hundred KB/s.

                       We also tried enabling jumbo frames everywhere (switches, vSwitches, vmknics, VM vmxnet3 NICs, and the iSCSI HBAs, which are independent hardware) without any change. There are no particular warnings in the vmkwarning log (we initially saw some MTU problems because we had forgotten to raise the MTU of one vSwitch, but that was quickly resolved). The latency problems actually seemed to worsen after this, so we reverted our VMs to 1500, but left the iSCSI HBA MTU at 9000, with all the networking and the MSA host interfaces still at 9000.

                       

                       I noticed the documents state that to enable jumbo frames on a vmknic it has to be removed and recreated, but I found that issuing

                       

                      esxcfg-vmknic -m 9000 "VMkernel 2"

                       

                       the command is accepted successfully and the setting is retained across reboots. Also, our independent hardware iSCSI HBAs do not use VMware networking at all (jumbo frames are enabled via their BIOS, and esxcfg-hwiscsi -l vmhba0 / vmhba1 confirms this).
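
                       For completeness, the remove-and-recreate sequence the documents describe would be roughly the following (the IP, netmask and port group name are our own values, as examples):

                       esxcfg-vmknic -d "VMkernel 2"
                       esxcfg-vmknic -a -i 192.168.10.11 -n 255.255.255.0 -m 9000 "VMkernel 2"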

                       

                       Is there anything else we could analyse in the logs to see what is slowing SAN access so much?

                       

                      Thanks for any feedback.

                      • 8. Re: MSA2312i and ESX4.1 slow performance
                        PimMolenbrugge Lurker

                         Similar problem here with LeftHand iSCSI nodes. Any progress yet, Sylvain/DaIceMan?

                        • 9. Re: MSA2312i and ESX4.1 slow performance
                          depping Champion
                           User Moderators, VMware Employees

                           A high DAVG usually indicates that the delay is on the array side. I'm not sure whether you can monitor the array side to see what it is doing, but anything regularly above 20ms is suspicious in my opinion.

                           

                          Duncan (VCDX)

                          Available now on Amazon: vSphere 4.1 HA and DRS technical deepdive

                          • 10. Re: MSA2312i and ESX4.1 slow performance
                            DaIceMan Enthusiast

                             As a follow-up: after some further testing we noticed that the transfer/write speed is about 25MB/s IF the source and target VMs are on different hosts. If they are on the same host, the speed starts at 20 or even 40MB/s, but after just a couple of seconds drops to a couple of MB/s, sometimes to less than 1.

                             So performance is better between VMs on different hosts than on the same host.

                            All hosts are identical hardware (BL490c), with the same QMH4062 HBAs, CPUs and RAM.

                             Our switches are ProCurve 2810-24 units. I read that there can be an issue with jumbo frames and flow control enabled simultaneously, though we haven't enabled the latter. I will first try disabling jumbo frames on all iSCSI HBAs, the MSA2312 and the 2810s, then enable flow control on the relevant iSCSI ports and see what happens.
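
                             If I read the ProCurve CLI reference correctly, enabling flow control on the iSCSI ports would be something along these lines (ports 1-4 are an example, and this should be double-checked against the 2810 manual):

                             configure
                             interface 1-4 flow-control
                             write memory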

                            • 11. Re: MSA2312i and ESX4.1 slow performance
                              jordan57 Enthusiast

                               Have you looked into the network side of things? Check your duplex settings; maybe force the switch ports and your pNICs to 1000/full, or whatever you are using.
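
                               On the ESX side, checking and forcing the pNIC speed/duplex would look like this (vmnic2 is just an example):

                               esxcfg-nics -l                       # lists pNICs with their current speed/duplex
                               esxcfg-nics -s 1000 -d full vmnic2   # forces a pNIC to 1000/full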

                              • 12. Re: MSA2312i and ESX4.1 slow performance
                                DaIceMan Enthusiast

                                 To rule out the switches for testing, we connected the 4 hosts directly to the 4 MSA2312i ports, using one QMH4062 port per host (2 on one controller and 2 on the other), since the MSA presents all LUNs on all ports. The latency and bandwidth problems still persist (no more than a sustained 12MB/s). We also see many of these entries in vmkwarning when copying from 2 hosts to the storage:

                                 

                                Mar  4 16:45:28 vh5 vmkernel: 0:01:08:31.345 cpu1:4378)WARNING: LinScsi: SCSILinuxQueueCommand: queuecommand failed with status = 0x1055 Host Busy vmhba1:0:3:8 (driver name: qla4xxx) - Message repeated 625 times

                                 It can repeat 50, 100, 400, even over 600 times, which is definitely not a good thing. This appears simply by copying a file or two between 2 VMs, or just installing updates on a server, with nothing else going on.
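
                                 One thing we are considering, in case the array is simply being flooded with commands: lowering the qla4xxx queue depth. If I read the driver documentation correctly the module option is ql4xmaxqdepth, so something like the following, then a reboot (the value 16 is just a guess to test with):

                                 esxcfg-module -s ql4xmaxqdepth=16 qla4xxx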

                                 

                                 Could it be an interrupt-sharing problem? The BL490c has only 2 mezzanine slots, and there is no word about any preference regarding slots. We also have a quad-port NIC installed, though I must say these latency problems appear with or without it.

                                 

                                 Attached is a dump of the interrupts from cat /proc/vmware/interrupts.

                                 

                                Thank you for any additional help.

                                • 13. Re: MSA2312i and ESX4.1 slow performance
                                  ats0401 Enthusiast

                                   What kind of speed do you get if you transfer the file from one of the Windows machines directly to the datastore (bypassing the VM)?

                                   Is your VM using the VMXNET3 driver?

                                   *Never mind*, I see that when you move the VM to the host's local datastore the issue disappears. That seems to rule out any VM-level networking issues.

                                   

                                   This KB may help you decode the SCSI errors in your log:

                                  http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=289902

                                  • 14. Re: MSA2312i and ESX4.1 slow performance
                                    binoche Expert
                                    VMware Employees

                                     No idea what is wrong. Could you please upload /var/log/vmkernel from when you hit the problem again?
