    Very slow performance on EVA4400

    st3reo

      Hello guys,


      I need some help with an issue:


      I have a setup consisting of a few HP ProLiant DL380 G5 servers (2 x quad-core Xeon X5470 + 24GB RAM) with dual-port HP FC2242SR 4Gb/s HBAs, running vSphere 4.1, connected to an HP EVA4400 with 24 x 300GB 15K FC disks.

      Each server has 2 NICs connected to a Cisco 2960G switch and attached to a dvSwitch inside vSphere.


      The disks are grouped in a single disk group on the EVA, and I have created a 1040GB Vdisk (LUN) at RAID 5.


      Now, I have a VM running Windows Server 2008 R2 as a file server. Its virtual hardware consists of 4GB RAM and 4 CPU cores. The LUN mentioned above is mounted on the host as a VMFS datastore and divided into a 40GB vmdk for the operating system (on an LSI SAS controller) and another vmdk of ~1TB on a separate Paravirtual controller, intended for the actual file storage. Both disks are thin provisioned.

      It also has a single VMXNET3 NIC.


      The problem is that the file server performs very badly: about 10-12 MB/s for writes and 20-30 MB/s for reads.

      I tested this by transferring ~2, ~3, and ~4 GB .iso files to and from the server.

      I tried it from my workstation, from other VMs and from physical servers...the performance is the same.

      I also vMotioned the file server to other hosts...still the same thing.

      I figured maybe something was wrong with the VM itself, so since I have another Windows 2008 R2 VM residing on a smaller (150GB) LUN, I ran the same test by copying an .iso to that one, and the performance was the same. I also did this with a Linux VM...same thing.


      Any idea what the problem is here? This seems like extremely poor performance...it's a lot slower than even a normal PC.


      Basically there's almost no traffic on the Fibre Channel network most of the time, so it's not congestion or anything like that.
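
To put the numbers above in perspective, here is a rough sketch that expresses the measured rates as a share of the nominal link speeds. It assumes the HBAs run at their rated 4Gb/s (commonly quoted as about 400 MB/s of payload per direction) and that the host NICs are 1Gb/s; the function name is only illustrative.

```python
# Nominal one-direction rates; real-world throughput is always somewhat lower.
NOMINAL_MB_PER_S = {
    "4Gb/s Fibre Channel": 400.0,  # commonly quoted payload rate for 4GFC
    "1Gb/s Ethernet": 125.0,       # 1000 Mbit/s divided by 8 bits per byte
}


def percent_of_link(measured_mb_s: float, link: str) -> float:
    """Return the measured rate as a percentage of the link's nominal rate."""
    return 100.0 * measured_mb_s / NOMINAL_MB_PER_S[link]


if __name__ == "__main__":
    # Upper ends of the ranges reported in the post (12 MB/s writes, 30 MB/s reads).
    for label, rate in (("write", 12.0), ("read", 30.0)):
        for link in NOMINAL_MB_PER_S:
            share = percent_of_link(rate, link)
            print(f"{label} at {rate:.0f} MB/s is {share:.1f}% of {link}")
```

Even the best read figure uses only a small fraction of either link, which fits the observation that the links themselves are not saturated.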

      The EVA and FC-switches are running the latest firmware.


      Please help.


        Re: Very slow performance on EVA4400
          idle-jam

          What about the cache batteries on the EVA, are they fully charged? Or has there been a disk failure, so that a RAID rebuild is running in the background?

          Re: Very slow performance on EVA4400
            Josh26



            What version of ESXi are you running? 4.1 has some new options around this.


            If you select "manage paths", what does it tell you about the SATP and Path Selection Plugin?

            Re: Very slow performance on EVA4400
              st3reo

              Well, I'm not sure how to check the battery status, but there aren't any warning lights on it or anything.

              And there haven't been any disk failures. The Command View interface reports everything as "green".


              Oh and I'm running ESX (4.1).

              I don't have access to it right now, but as I recall it shows SATP_ALUA, with the path policy set to MRU by default. I tried setting it to Round Robin and that seemed to increase the speed, but only slightly...basically insignificant.

              Re: Very slow performance on EVA4400
                idle-jam

                Hmm, do you have a local VMFS datastore by any chance? I would run the test on it to see whether this is a VM-specific issue or an actual storage issue. In vCenter you can also look at the storage latency counters on the Performance tab, which will hopefully point you to the root cause.

                Re: Very slow performance on EVA4400
                  J1mbo

                  As an aside, I would reduce the vCPU count for the file server VM considerably (to 1, for example, and then monitor it). Ensure VMware Tools is installed in the guest, and possibly increase its RAM if it is 64-bit and will be busy (again, you can monitor it for now). Ensure that overall the RAM on the host is not oversubscribed.


                  As said already, move the VM to local disk to help with isolating the problem, but be sure the local disk has battery-backed write-cache and is set to write-back caching policy.



                  Re: Very slow performance on EVA4400
                    st3reo

                    Thanks for your answers,


                    I do also have local VMFS datastores on the hosts. Each host has 6 x 146GB 10K SAS disks configured as a single RAID 5 volume, of which 30GB is used for the ESX install and the rest (~680GB) is a datastore for VMs. The hosts have controllers with 512MB of battery-backed cache.


                    I can't actually move the file server VM to the local VMFS: for one thing it would take forever, and the block size only permits files of up to 256GB, while the VM is about 1TB.
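
For reference, the 256GB limit mentioned above corresponds to a VMFS-3 datastore formatted with a 1MB block size. A small sketch of the block size to file size relationship follows; the limits are the commonly documented approximate values (the exact maximum is slightly below each figure), and the function name is made up for illustration.

```python
# VMFS-3 block size (MB) -> approximate maximum size of a single file (GB).
VMFS3_MAX_FILE_GB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(vmdk_gb: float):
    """Return the smallest VMFS-3 block size (MB) that can hold a vmdk of
    vmdk_gb, or None if it is too large for any VMFS-3 block size."""
    for block_mb in sorted(VMFS3_MAX_FILE_GB):
        # Strict comparison because the real limit is slightly below the figure.
        if vmdk_gb < VMFS3_MAX_FILE_GB[block_mb]:
            return block_mb
    return None


if __name__ == "__main__":
    for size_gb in (40, 1000):
        print(f"{size_gb}GB vmdk needs a block size of at least "
              f"{smallest_block_size_mb(size_gb)}MB")
```

On VMFS-3 the block size is chosen when the datastore is formatted, so a ~1TB vmdk would only fit on the local datastore after recreating it with a larger block size.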


                    So I ran the same test on another Win 2008 R2 guest that is stored on a host's local VMFS, and the transfer rate was about 60 MB/s.


                    The thing is, as I said above, it's not only the file server VM that has problems. I also tested other VMs (Windows and Linux) that get their storage from the EVA SAN, and they show exactly the same slow speed.


                    On the file server VM I also tried copying a file from the C: drive to D:, where the storage is, and the speed was the same, about 10 MB/s...so I suppose that rules out any network issue.
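
A disk-to-disk copy like the one above can also be timed with a short script, so that every guest is measured the same way. This is only a minimal sketch: the function name is hypothetical, and the demo uses a small temporary file, whereas a real test should use a file larger than the guest's RAM so the file system cache does not inflate the result.

```python
import os
import shutil
import tempfile
import time


def timed_copy_mb_s(src: str, dst: str) -> float:
    """Copy src to dst and return the average rate in MB/s (1 MB = 10**6 bytes)."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return size_bytes / 1e6 / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Demo only: copy an 8 MiB temporary file within one directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.bin")
        dst = os.path.join(tmp, "copy.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(8 * 1024 * 1024))
        print(f"{timed_copy_mb_s(src, dst):.1f} MB/s")
```

For the case described here, src would be a large .iso on C: and dst a path on D:.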


                    The resources aren't oversubscribed. I only have 1-2 guests on the hosts I am testing, and inside the guests neither RAM nor CPU is heavily utilized.

                    Re: Very slow performance on EVA4400
                      st3reo

                      hm, I just noticed that the indicator light on one of the batteries on the controller enclosure keeps blinking, and the manual says that means:

                      Blinking green = Maintenance activity in progress, such as testing or charging


                      but the management interface does not report any kind of problem.

                      Could this be the source of that terrible performance?


                      Unfortunately I only recently created this setup, so I can't compare the performance to anything because I've never tested it before.


                      *** Edit: I removed the battery and re-inserted it, and it stopped blinking, but there was no performance improvement...and after a while the battery started blinking again.

                      Re: Very slow performance on EVA4400
                        idle-jam

                        Just as I suspected. Sometimes, if the battery is not charging, you will need to replace it. A new battery will also take a few hours to charge. I would advise getting it replaced ASAP.

                        Re: Very slow performance on EVA4400
                          st3reo

                          I see.


                          Well, I'll see what I can do, but I might have a hard time getting HP to replace it under warranty, since there isn't any actual warning or failure reported on the EVA.


                          Thanks a lot,

                          Re: Very slow performance on EVA4400
                            idle-jam

                            I'm not sure, but in my country, with slow performance and the blinking symptom, I would be able to get an engineer on site. You can also ask the HP rep who sold you the unit to help escalate the support case.


                            Good Luck

                            Re: Very slow performance on EVA4400
                              winetou

                              How about your resource allocation? Maybe there is a CPU or memory resource limit set for this VM. That could be the reason for poor VM (and also disk) performance.

                              Re: Very slow performance on EVA4400
                                st3reo



                                Interesting, seems similar to my problem...but the thing is I already have installed all the latest firmware on EVA and FC-switches.


                                XCS version: 09534000
                                XCS build: CR18CBlep
                                Management firmware: mmp-0001.4200-CR0670



                                This Monday, when I got to work, the first thing I did was check the controller enclosure, and the battery had stopped blinking. I checked again a few times during the day but didn't see it blinking again...which is probably going to make it even harder for me to get it replaced under warranty, and makes me wonder whether that is even the source of the problem.


                                Anyway, I rebooted the controllers, and I even powered the EVA off completely and started it again, and with just 1 VM guest using it the performance was still low.


                                And no, there is no resource limit or oversubscription. As I said, I tried from multiple VMs that have their storage on the EVA and it's exactly the same. On the other hand, for VMs hosted on the hosts' local datastores the transfer speed is around 60-70 MB/s.

                                Re: Very slow performance on EVA4400
                                  Paul1

                                  I don't think you have a problem with your cache batteries. The blinking is normal. The EVA checks the cache batteries daily, and while a battery is blinking you will see the status "Charging battery" instead of "Holding charge" when you look under the Enclosure tab of Controller A (or B) in Command View EVA. I think you will also see messages like this in the Controller Event Log: "The status of the battery assembly '1' has changed."

                                  That's normal and not a problem. I would use evaperf to check whether the EVA is the bottleneck. Try "evaperf vdg -cont" and "evaperf hps -cont" and look at "Average Read Hit Latency", "Average Read Miss Latency" and "Average Write Latency". All of these values should be below 10ms.
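
The 10ms rule of thumb above can be applied mechanically once the latency figures have been noted down. The sketch below does not parse evaperf output (no output format is assumed here); the sample values are hypothetical and would be typed in by hand from what evaperf reports.

```python
THRESHOLD_MS = 10.0  # rule of thumb from the post above


def over_threshold(latencies_ms: dict, threshold_ms: float = THRESHOLD_MS) -> dict:
    """Return the counters whose latency is not below threshold_ms."""
    return {name: value for name, value in latencies_ms.items()
            if value >= threshold_ms}


if __name__ == "__main__":
    # Hypothetical values, NOT taken from a real EVA.
    sample = {
        "Average Read Hit Latency": 1.2,
        "Average Read Miss Latency": 14.8,
        "Average Write Latency": 22.5,
    }
    for name, value in over_threshold(sample).items():
        print(f"{name}: {value} ms (at or above {THRESHOLD_MS} ms)")
```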

                                  Be sure you have Read cache "On" and "Write-back" cache enabled on your Vdisk. (This is the default.)

                                  You can also monitor the EVA with perfmon if you prefer a graphical display. If you see high latencies on your EVA, I would look at the SAN switch counters with the "porterrshow" command. Reset the counters with "statsclear" and see whether any error counters increase very quickly. Maybe you have a problem with one of the fiber-optic cables or a GBIC. Analyzing performance problems is always a time-consuming job. Good luck.
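
To judge whether a switch error counter is "increasing very fast", two readings taken some minutes apart can be compared. This is a rough sketch only: the counter names and the one-error-per-minute threshold are assumptions for illustration, and the readings are hypothetical values copied by hand from the switch output.

```python
def fast_growing(before: dict, after: dict, interval_s: float,
                 max_per_min: float = 1.0) -> dict:
    """Return counters whose increase per minute exceeds max_per_min."""
    rates = {}
    for name, new_value in after.items():
        delta = new_value - before.get(name, 0)
        per_min = delta * 60.0 / interval_s
        if per_min > max_per_min:
            rates[name] = per_min
    return rates


if __name__ == "__main__":
    # Hypothetical readings for one port, taken 10 minutes (600 s) apart.
    first = {"crc_err": 0, "enc_out": 5, "link_fail": 0}
    second = {"crc_err": 120, "enc_out": 5, "link_fail": 0}
    for name, rate in fast_growing(first, second, interval_s=600).items():
        print(f"{name}: about {rate:.1f} errors per minute")
```

A counter that keeps climbing on a single port would point at that port's cable or optic rather than at the array.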


