5 Replies Latest reply on Nov 6, 2010 9:26 AM by chadwickking

    Troubleshooting DISK Latency vscsiStats

    chadwickking Expert

       

      Hello forums.

       

       

      I could use some expert insight on a problem I am having with some VMs.  Resource utilization is nowhere near capacity and there is no CPU contention, but the VMs are running slow: typing is often laggy, or things are just slow in general.  The VMs didn't have this problem before, but we have been adding more and more VMs to our lab for a while now, so this could explain the latency.

       

      I would like someone to offer some assistance with using vscsiStats for troubleshooting virtual disk problems.  On the VM I was looking at, I noticed a somewhat long average disk queue length in perfmon.  I ran vscsiStats and got the following after 30 minutes.  Please let me know if I understand this correctly.

       

       

      I ran vscsiStats with the latency histogram only, so this is the latency of all IOs, not split by read/write.  Both disks are on the same datastore.

       

       

      VMDISK 1

      Histogram: latency of IOs in Microseconds (us)       virtual machine worldGroupID

      min      121
      max      14440236
      mean     327165
      count    15524

      Frequency    Histogram Bucket Limit
      0            1
      0            10
      0            100
      1488         500
      1929         1000
      1453         5000
      1618         15000
      1688         30000
      1070         50000
      1546         100000
      4732         > 100000

       

      VMDISK 2

      Histogram: latency of IOs in Microseconds (us)       virtual machine worldGroupID

      min      159
      max      14245418
      mean     478629
      count    3446

      Frequency    Histogram Bucket Limit
      0            1
      0            10
      0            100
      1722         500
      146          1000
      136          5000
      65           15000
      59           30000
      46           50000
      88           100000
      1184         > 100000

       

       

      I really don't understand our storage layout very well, and I apologize for that; I am trying to learn it.  We do use NetApp, but I don't think all of our storage is NetApp.  These disks are SATA disks attached through 4 Gbps HBAs.

       

      My understanding of the VMDISK 1 histogram is the following:

      4732 IOs > 0.1 second (100 ms)

      1546 IOs <= 0.1 second (100 ms)

      1070 IOs <= 0.05 second (50 ms)

       

      My math may be a bit off, but is this a lot of latency for disk I/O? Your help is appreciated.
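      As a rough check on the reading above, the histogram counts can be turned into a share of IOs slower than a given threshold. The sketch below is only illustrative: the bucket data is copied from the two histograms in this post, the last row of each is assumed to mean "more than 100000 us", and the names `VMDISK_1`, `VMDISK_2` and `share_above` are made up for this example, not part of vscsiStats.

```python
# (bucket upper limit in microseconds, IO count), copied from the histograms above.
# Assumption: the final row of each histogram is the overflow bucket (> 100000 us).
OVERFLOW = float("inf")

VMDISK_1 = [(1, 0), (10, 0), (100, 0), (500, 1488), (1000, 1929),
            (5000, 1453), (15000, 1618), (30000, 1688), (50000, 1070),
            (100000, 1546), (OVERFLOW, 4732)]

VMDISK_2 = [(1, 0), (10, 0), (100, 0), (500, 1722), (1000, 146),
            (5000, 136), (15000, 65), (30000, 59), (50000, 46),
            (100000, 88), (OVERFLOW, 1184)]


def share_above(histogram, threshold_us):
    """Fraction of IOs that fall in buckets whose upper limit exceeds threshold_us."""
    total = sum(count for _, count in histogram)
    slow = sum(count for limit, count in histogram if limit > threshold_us)
    return slow / total


for name, histogram in (("VMDISK 1", VMDISK_1), ("VMDISK 2", VMDISK_2)):
    pct = 100 * share_above(histogram, 100000)
    print(f"{name}: {pct:.1f}% of IOs took longer than 100 ms")
```

      With these counts, roughly 30% of the IOs on VMDISK 1 and 34% on VMDISK 2 took longer than 100 ms.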

       

       

      Cheers,

      Chad King

      VCP-410 | Server+

       

      Twitter: http://twitter.com/cwjking

       

      If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

       

       

        • 1. Re: Troubleshooting DISK Latency vscsiStats
          chadwickking Expert

          If anyone could help it would be appreciated! thanks.

           

           

           

           


          • 2. Re: Troubleshooting DISK Latency vscsiStats
            depping Champion
            VMware EmployeesUser Moderators

            You are correct, this is the latency per I/O. It means that a large portion of your IOs have latency higher than 100 milliseconds. This is fairly high, to be honest, and I would not be surprised if you noticed overall sluggishness. Have you checked, for instance, whether your disks are aligned? Misalignment could contribute to this level of latency. I also wonder how many links you have going back to your filer.
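            To put the summary lines of the histograms on the same scale as the 100 millisecond figure, the microsecond values can simply be converted. This is a small sketch using the min, max and mean values posted in the question; the names `SUMMARY_US` and `us_to_ms` are hypothetical helpers for illustration only.

```python
# Summary statistics in microseconds, copied from the vscsiStats output in the question.
SUMMARY_US = {
    "VMDISK 1": {"min": 121, "max": 14440236, "mean": 327165},
    "VMDISK 2": {"min": 159, "max": 14245418, "mean": 478629},
}


def us_to_ms(value_us):
    """Convert microseconds to milliseconds."""
    return value_us / 1000.0


for disk, stats in SUMMARY_US.items():
    parts = ", ".join(f"{name}={us_to_ms(value):.1f} ms" for name, value in stats.items())
    print(f"{disk}: {parts}")
```

            The mean latency comes out at about 327 ms for VMDISK 1 and 479 ms for VMDISK 2, and the slowest IO on each disk took more than 14 seconds.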

             

            Another thing worth investigating is the utilization of the processors of the filer.

             



            Duncan

            VMware Communities User Moderator | VCDX

            -


            Now available: Paper - vSphere 4.0 Quick Start Guide (via amazon.com) | PDF (via lulu.com)

            Blogging: http://www.yellow-bricks.com | Twitter: http://www.twitter.com/DuncanYB

            • 3. Re: Troubleshooting DISK Latency vscsiStats
              chadwickking Expert

              Thanks Duncan,

              Sorry for getting back to you so late.  I have been very busy at work and have had little time for VM troubleshooting. I suppose by links you mean the number of paths? I know we use two 4 Gbps cards per host connected back to the SAN.  This is also in our lab, but I am glad to see that I am reading the vscsiStats output correctly.

               

              Apparently they added more memory to the VM and are not having the problem anymore.  I don't think that is what fixed it, because I expect it to recur when they get busy again.  They also rebooted the VM.  Either way, I am going to do some further research in the lab and see what I can gather from other hosts.

               

              Thanks again Duncan!

               







              • 4. Re: Troubleshooting DISK Latency vscsiStats
                chadwickking Expert

                This is only one VM, though.  I suppose I could gather a larger sample and see what it looks like.  Do you have any other recommendations?

                 







                • 5. Re: Troubleshooting DISK Latency vscsiStats
                  chadwickking Expert

                  They have two different types of storage.  Both are connected by FC, but the one that is getting hammered is the SATA NetApp storage.  It appears they have turned off deduplication because of problems with high I/O, though I would like more detail on this and would love to read into it.  They had a lot of VMs hitting that storage, moved them over to SAS-FC storage, and have not had nearly as many issues since.  You would think that after going through this hell in their VDI environment they would know this.  Thanks again.

                   

                   

                   

                   
