Storage Performance Analysis and Monitoring

Version 11

    This document is a living, wiki version of the performance analysis methods whitepaper.  That document will ultimately be replaced with this one.




    Storage often bounds the performance of enterprise workloads.  More so than with CPU or memory, traditional means of analysis remain sound for storage performance in virtual deployments.  This section introduces the tools for identifying heavily used resources and VMs that place high demands on their storage system.  Traditional correction methods then apply.

    iSCSI storage using software initiators is not covered in this section.  Whether accessed through the hypervisor's iSCSI initiator or an in-guest initiator, the traffic will show up on the VMkernel network or the VM's network stack.  Check the Network section for more information.

    Navigating esxtop

    As before, esxtop is the best place to start when investigating  potential performance issues.  To view the disk adapter information in  esxtop, hit the ‘d' key once it is running.

    On ESX Server 3.5, the storage system can be displayed per VM (using ‘v') or per storage device (using ‘u'), but the same counters are displayed in each view.  Look at the following items:


    • For each of the three storage views: 
      • On the adapter view (‘d'), each physical HBA is displayed on a row  of its own with the appropriate adapter name.  This short name may be  checked against the more descriptive data provided through the Virtual  Infrastructure Client to identify the hardware type.
      • On ESX Server 3.5's VM disk view (‘v'), each row represents a group  of worlds on the ESX Server.  Each VM will have its own row and rows  will be displayed for the console, system, and other less-important  (from a storage perspective) worlds.  The groups' IDs (GID) match those  on the CPU screen and can be expanded by pressing ‘e'.
      • On ESX Server 3.5's disk device view (‘u'), each device is displayed on its own row.
    • As with the other system screens, the disk displays can have groups expanded for more detailed information: 
      • The HBAs listed on the adapter display can be expanded with the ‘E' key to show the worlds using those HBAs.  Once a VM's world ID is known, that world's activity can be read from the expanded line with the matching world ID (WID) column.
      • The worlds for each VM can be displayed by expanding the VM row on the VM disk view with the ‘e' key.
      • The disk devices on the device display can be expanded to show usage by each world on the host.

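Beyond the interactive views, esxtop's batch mode (esxtop -b) writes these same counters as CSV for offline analysis.  Below is a minimal sketch in Python of pulling a counter's samples out of such a capture.  The counter path and host name in the sample data are illustrative only; real captures use whatever column names appear in your own header row:

```python
import csv
import io

def extract_counter(batch_csv, counter_substring):
    """Return {column_header: [values...]} for every esxtop batch-mode
    column whose header contains counter_substring.

    Assumes the layout produced by `esxtop -b`: one header row of
    quoted counter paths, then one row of values per sample.
    """
    reader = csv.reader(io.StringIO(batch_csv))
    header = next(reader)
    wanted = {i: name for i, name in enumerate(header)
              if counter_substring in name}
    result = {name: [] for name in wanted.values()}
    for row in reader:
        for i, name in wanted.items():
            result[name].append(float(row[i]))
    return result

# Hypothetical two-sample capture with one device-latency column.
sample = (
    '"Time","\\\\esx1\\Disk(vmhba1)\\Average Driver MilliSec/Command"\n'
    '"10:00:00","4.2"\n'
    '"10:00:05","17.8"\n'
)
latencies = extract_counter(sample, "MilliSec/Command")
```

This keeps the per-sample history that the live esxtop screens discard, which is handy when a latency spike is intermittent.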

    Relevant Counters



    • Queued Disk Commands (VirtualCenter: disk.queueLatency.average; esxtop: QUED): Queued commands are held in the kernel queue, awaiting an open slot in the device driver queue.  A large number of queued commands means a heavily loaded storage system.  See Storage Queues and Performance for information on queues.
    • Queue Usage (VirtualCenter: not available; esxtop: %USD): This counter tracks the percentage of the device driver queue that is in use.  See Storage Queues and Performance for info on this queue.
    • Command Rate (VirtualCenter: disk.commands.summation; esxtop: ACTV): VirtualCenter reports the number of commands issued in the previous sample period.  esxtop provides a live look at the number of commands being processed at any one time.  Consider these counters a snapshot of activity, but don't consider any number here "too much" until large queues start developing.
    • HBA Load (VirtualCenter: not available; esxtop: LOAD): In esxtop the LOAD counter tracks how full the device queues are.  Once LOAD exceeds one, commands will start to queue in the kernel.  See Storage Queues and Performance for information on these queues.
    • Storage Device Latency (VirtualCenter: disk.deviceReadLatency; esxtop: DAVG/cmd): These counters track the latencies of the physical storage hardware.  This includes everything from the HBA to the platter.
    • Kernel Latency (VirtualCenter: disk.kernelReadLatency; esxtop: KAVG/cmd): These counters track the latencies due to the kernel's command processing.
    • Total Storage Latency (VirtualCenter: not available; esxtop: GAVG/cmd): This is the latency that the guest sees to the storage.  It is the sum of the DAVG and KAVG stats.
    • Aborts (VirtualCenter: disk.commandsAborted.summation; esxtop: ABRTS/s): These counters track SCSI aborts.  Aborts generally occur because the array is taking far too long to respond to commands.



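The relationship among the three latency counters is worth internalizing: GAVG/cmd is the sum of DAVG/cmd and KAVG/cmd, so the kernel's contribution can be derived from any two of them.  A small illustrative helper (the function name is mine, not esxtop's):

```python
def kernel_latency_share(davg_ms, gavg_ms):
    """Total guest-visible latency (GAVG) is the sum of device latency
    (DAVG) and kernel latency (KAVG), so KAVG = GAVG - DAVG.
    Returns the fraction of total latency spent in the VMkernel."""
    kavg_ms = gavg_ms - davg_ms
    return kavg_ms / gavg_ms
```

A large kernel share usually points at queueing inside ESX rather than a slow array, which changes where you look next.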

    Evaluate the Data

    It is important to have a solid understanding of the storage  architecture and equipment before attempting to analyze performance  data.  Consider the following questions:


    • Is the host or any of the guests swapping?  The guest's swap  activity must be checked with traditional OS tools and the host can be  checked with SWR/s and SWW/s counters detailed in Memory Performance Analysis and Monitoring.
    • Are commands being aborted?  This is a certain sign that the storage  hardware is overloaded and unable to handle the requests in a manner in  line with the host's expectations.  Corrective action could include  hardware upgrades, storage redesign (increasing spindles on the RAID),  or guest redesign.
    • Is there a large queue?  While less dangerous than aborts, queued commands are similarly a sign that hardware upgrades or storage system redesign is necessary.
    • Is the array responding at expected rates?  Storage vendors will provide latency statistics for their hardware that can be checked against the latency statistics in esxtop.  When the latency numbers are high, the hardware could be overworked by too many servers.  As examples: 2-5 ms latencies usually indicate a healthy storage system serving data from the array cache; 5-12 ms latencies reflect a healthy storage architecture where data is being randomly read across the disk; and latencies of 15 ms or greater may represent an over-utilized or misbehaving array.

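Those latency rules of thumb can be encoded as a quick triage function, for example when scanning a batch-mode capture.  This is a sketch; the thresholds are the guidelines from this section, not hard limits, and the "borderline" band between them is my own labeling:

```python
def classify_device_latency(davg_ms):
    """Rough triage of DAVG/cmd (physical device latency in ms) using
    the rules of thumb above.  Treat the output as a hint, not a verdict."""
    if davg_ms <= 5:
        return "healthy (likely served from the array cache)"
    if davg_ms <= 12:
        return "healthy (random reads hitting the spindles)"
    if davg_ms < 15:
        return "borderline; watch queue depths and aborts"
    return "possibly over-utilized or misbehaving array"
```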

    Identifying a Slow Array

    It's worth pausing at this moment to point out that 95% of all storage performance problems are not fixed in ESX.  Believe me, I (Scott) have been called into a dozen performance escalations where poor storage performance was blamed on the hypervisor, and not a single one was caused by ESX.  If you're seeing high latencies to the storage device in VirtualCenter or esxtop, it's worth treating the problem as an array configuration issue.  Check ESX's logs for obvious storage errors, check array statistics, and make sure that there are no fabric configuration problems.

    When storage latencies are high, you shouldn't need complex benchmarks to reproduce and solve the problem.  Go with Iometer and make certain you're doing an apples-to-apples comparison against a physical system (ideally dual-booted from the ESX Server under test) so that you know what your expected, non-virtual results are.  Check Storage System Performance Analysis with Iometer for information on using Iometer for problems like this.

    Correct the System

    Corrections for these problems can include the following:

    1. Reduce the guests' and the host's need for storage.
      1. Some applications such as databases can utilize system memory to  cache data and avoid disk access.  Check in the VMs to see if they may  benefit from increased caches and provide more memory to the VM if  resources permit.  This may reduce the burden on the storage system.
      2. Eliminate all possible swapping to reduce the burden on the storage  system.  First verify that the VMs have the memory they need by checking  swap statistics in the guest.  Provide memory if resources permit.   Next, as described in the "Memory" section of this paper, eliminate host  swapping.
    2. Configure the HBAs and RAID controllers for optimal use.  It may be worth reading Storage Queues and Performance for information on how disk queueing works.
      1. Increase the number of outstanding disk requests for the VM by adjusting the "Disk.SchedNumReqOutstanding" parameter.  For detailed instructions, check the "Equalizing Disk Access Between Virtual Machines" section in the "Fibre Channel SAN Configuration Guide".  This step and the following one must both be applied for either to work.
      2. Increase the queue depths for HBAs.  Check the section "Setting Maximum Queue Depth for HBAs" in the "Fibre Channel SAN Configuration Guide" for detailed instructions.  Note that you have to set two variables to correctly change queue depths.  This step and the previous one must both be applied for either to work.
      3. Make sure the appropriate caching is enabled for the disk controllers.  You will need to use the vendor-provided tools to verify this.
    3. If latencies are high, inspect array performance using the vendor's  array tools.  When too many servers simultaneously access common  elements on an array the disks may have trouble keeping up.  Consider  array-side improvements to increase throughput.
    4. Balance load across the physical resources that are available.
      1. Spread heavily used storage across LUNs being accessed by different  adapters.  The presence of separate queues for each adapter can yield  some efficiency improvements.
      2. Use multipathing or multiple links if the combined disk I/O exceeds the capacity of a single HBA.
      3. Using VMotion, migrate IO-intensive VMs across different ESX Servers, if possible.
    5. Upgrade hardware, if possible.  Storage system performance often bottlenecks storage-intensive applications, but for the very highest storage workloads (many tens of thousands of I/Os per second) CPU upgrades at the ESX Server will increase the host's ability to handle I/O.
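As a concrete illustration of steps 2.1 and 2.2 above, both settings can be changed from the ESX service console roughly as follows.  The module name and option here are for a QLogic qla2300-family HBA and are examples only; Emulex and other drivers use different names, so check the "Fibre Channel SAN Configuration Guide" for the values that match your hardware, and remember that both settings must be changed together:

```shell
# Raise the HBA queue depth (QLogic example only; other drivers use
# different module names and options -- consult the SAN guide).
esxcfg-module -s ql2xmaxqdepth=64 qla2300_707
esxcfg-boot -b    # rebuild the boot configuration; reboot to apply

# Match the per-VM outstanding-request limit to the new queue depth.
esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
```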


    Top-level performance analysis page: Performance Monitoring and Analysis

    VirtualCenter performance counters: Understanding VirtualCenter Performance Statistics

    esxtop performance counters:  esxtop Performance Counters

    Fibre Channel SAN Configuration Guide

    Storage Queues and Performance