Understanding Ready Time

Version 1

    Introduction

     

    Ready time is the amount of time a VM wants to run but has not been provided CPU resources on which to execute.

     

    Somewhat confusingly, esxtop and VirtualCenter report ready time in two different units. In esxtop it is reported as an easily consumed percentage. A value of 5% means the VM spent 5% of its last sample period waiting for available CPU resources. In VirtualCenter, ready time is reported as a time measurement. In VC's real-time data, which produces a sample every 20,000 ms, a 5% ready time is reported as 1,000 ms.
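
    The conversion between the two formats is simple arithmetic. The following is a minimal sketch in Python; the function names and the default 20,000 ms sample length are illustrative choices, and statistics collected at other intervals would need a different sample length.

```python
def ready_ms_to_percent(ready_ms, sample_interval_ms=20_000):
    """Convert a VirtualCenter ready-time value (ms) into a percentage
    of the sample period, the form esxtop uses."""
    if sample_interval_ms <= 0:
        raise ValueError("sample_interval_ms must be positive")
    return 100.0 * ready_ms / sample_interval_ms


def percent_to_ready_ms(percent, sample_interval_ms=20_000):
    """Inverse conversion: the ready time in ms behind a percentage."""
    return percent * sample_interval_ms / 100.0


print(ready_ms_to_percent(1_000))  # 5.0
print(percent_to_ready_ms(5))      # 1000.0
```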

     

    As part of their performance and capacity planning, ESX Server administrators have looked at the statistics for processes running on the host and used the ready time metric as one of the inputs. Ready time can be an indicator of saturation on a system. Users sometimes equate the ready time observed with run queues on Unix or Linux. The fact that run queues are reported on a per-processor basis while ready time is reported for each virtual machine (or each virtual CPU in the case of multiprocessor virtual machines) causes the metric to be slightly different. Some users have asked how a virtual machine can accumulate ready time while it appears that CPUs are also accumulating idle time. This document attempts to address some of these questions about ready time.

     

     

    This document was syndicated from Ready Time and ESX3 Ready Time (pdf).  Only highlights are taken from these source documents: please see those originals for all source content.

     

    Intended Audience

     

    VMware Certified Professionals (VCPs) and capacity management professionals who are seeking to understand ESX and virtual machine performance.

     

    Outline

     

    • Understanding Ready Time

    • Providing Access to Computing Resources

    • Test Description and Findings

    • Processor Utilization versus Ready Time

    • Ready Time on SMP and Uniprocessor Virtual Machines

    • Scheduler Trace

    • Interpreting Ready Time

     

     

    1. Understanding Ready Time

     

     

     

    The most common question we get on ready time is, "What ready time numbers constitute a problem?" While there is no easy answer, we can offer some guidance on acceptable values. Before laying that out, note that ready time should not be the ultimate measurement of system performance. As always, user experience and latency should be. There are situations where user experience is horrible on a system with no load and virtually zero ready time. This could happen with a misconfigured array, for example. And occasionally we see aggressively consolidated hosts with very high ready times that are still meeting user needs. There are no absolutes with ready time.

    Still, there are a few general regions into which ready time values can be binned. Note that these ready time values are per vCPU. esxtop reports ready time for a VM once it has been summed across all vCPUs. That means that 5% ready on each of four vCPUs will be reported as 20% ready at the VM level. Despite the 20% figure, 5% per vCPU is only the high end of a very light amount of ready time.

     

    Value (per vCPU): Description

    r == 0%: This doesn't happen. The very presence of a hypervisor between the operating system and the hardware means that there is a non-zero ready time on all operations. But on healthy systems this number is so small that end-users don't know their workload has been virtualized. See the next section.

    0% < r <= 5%: This is the "normal" region for ready time. Very small single-digit numbers result in a minimal impact to user experience. If performance problems exist on the system and ready time falls into this region, your problems lie elsewhere.

    5% < r <= 10%: In this region ready time is starting to be worth watching. Most systems function healthily with ready time in this region but highly sensitive measurements may be suffering.

    10% < r: While some systems continue to meet expectations, double-digit ready time percentages often mean some action is required to address performance issues. See the last section for guidance.



    Again, remember that VirtualCenter performance numbers must be converted to percentages to find the category in the table above. But since VC reports ready time per vCPU, no special arithmetic is needed to account for the number of vCPUs in the VM (as is needed with esxtop).
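
    The binning above can be sketched in a few lines of Python. This is only an illustration: the function names and the short region labels are hypothetical, and dividing the esxtop VM-level figure by the vCPU count yields an average that can hide an uneven split between vCPUs.

```python
def per_vcpu_ready(vm_ready_pct, num_vcpus):
    """esxtop sums ready time across a VM's vCPUs; divide by the vCPU
    count to get the per-vCPU average."""
    if num_vcpus < 1:
        raise ValueError("num_vcpus must be at least 1")
    return vm_ready_pct / num_vcpus


def classify_ready(r):
    """Map a per-vCPU ready percentage onto the regions in the table above."""
    if r <= 0:
        return "not expected on a virtualized system"
    if r <= 5:
        return "normal"
    if r <= 10:
        return "worth watching"
    return "action likely required"


# 20% reported by esxtop for a 4-vCPU VM is 5% per vCPU.
r = per_vcpu_ready(20, 4)
print(r, classify_ready(r))  # 5.0 normal
```

    For VirtualCenter data, only the conversion from milliseconds to a percentage is needed before calling the hypothetical classify_ready, because VC already reports per vCPU.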

     

    Causes and Correction

     

    There are two general areas that can cause unnecessarily high ready times:

     

    1. Overloaded hosts.

     

    2. Excessive use of SMP.

     

    Host Overloading

     

    The most common cause of high ready time is trying to get too much work out of too little hardware. Consider the following simple case: on a hypothetical system with only one physical CPU, if two 1-way VMs are fully loaded by their users then each wants to have an entire CPU. Because only one is available, ESX will time share that resource and give each of them only 50% of the CPU. As a result, each VM will spend 50% of its time waiting for the processor. This would be reported as 50% ready time.
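
    The one-CPU example generalizes to a simple toy model. The sketch below assumes every VM is a fully loaded 1-way VM and that the scheduler splits CPU time equally; it ignores shares, limits, reservations, and scheduler overhead, and the function name is illustrative.

```python
def ready_pct_fully_loaded(num_vms, num_pcpus):
    """Ready time (%) per VM when num_vms fully loaded 1-way VMs
    time-share num_pcpus physical CPUs equally (toy model)."""
    if num_vms < 1 or num_pcpus < 1:
        raise ValueError("need at least one VM and one physical CPU")
    # Each VM wants a whole CPU but receives at most its equal share.
    granted = min(1.0, num_pcpus / num_vms)
    return 100.0 * (1.0 - granted)


print(ready_pct_fully_loaded(2, 1))  # 50.0, the case described above
print(ready_pct_fully_loaded(2, 2))  # 0.0, one physical CPU per VM
```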

    Often this condition is observable when ready time is high and total host CPU utilization is also very high. The only fix for this is to back off the load on the system. VMs should be migrated off or processor resources should be increased.

     

    Excessive SMP

     

    In ESX Server 2.5, SMP guests had to be co-scheduled to start at the exact same moment. If a 2-way VM was ready to run but only one physical core was available, the VM would not be scheduled until a second core was freed up. This would increase its ready time. In ESX Server 3.0 and later versions, relaxed co-scheduling means that a subset of a VM's vCPUs can be scheduled ahead of the others. However, guest operating systems still require some degree of co-scheduling, so the relaxation isn't absolute. In short, adding vCPUs still puts some burden on the scheduler to co-schedule them, which can increase ready time. This is one reason why VMware advises allocating vCPUs only to VMs that are using them. Read Co-scheduling SMP VMs in VMware ESX Server for more information on co-scheduling.

    This condition is manifested by hosts that have sub-optimal CPU utilization and lots of SMP VMs. A host may have a dozen 4-way VMs with each showing high ready time but only be at an aggregate 40% CPU utilization. This is a clear sign that the scheduler is spending a great deal of time managing unneeded vCPUs.

     

    2. Providing Access to Computing Resources

     

     

    Whenever a resource is shared, there is a chance that an attempt to use the shared resource will not be fulfilled immediately because the resource is busy. When multiple processes are trying to use the same physical CPU, that CPU may not be immediately available and a process must wait before ESX Server can allocate a CPU to it. The ESX Server scheduler manages access to the physical CPUs on the host system. The time a virtual machine or other process waits in the queue in a ready-to-run state before it can be scheduled on a CPU is known as ready time.

     

    Several factors affect the amount of ready time seen.

     

    • Overall CPU utilization

     

    You are more likely to see ready time when utilization is high, because the CPU is more likely to be busy when another virtual machine becomes ready to run.

     

    • Number of resource consumers (in this case, guest operating systems)

     

    When a host is running a larger number of virtual machines, the scheduler is more likely to need to queue a virtual machine behind one or more that are already running or queued.

     

    • Load correlation

    If loads are correlated — for example, if one load wakes another one when the first load has completed its task — ready times are unlikely. If a single event wakes multiple loads, high ready times are likely.

     

    • Number of virtual CPUs in a virtual machine.

     

    When co-scheduling for n-way Virtual SMP is required, the virtual CPUs can be scheduled only when n physical CPUs are available to be preempted.

     

    In multiprocessor systems, an additional factor affects ready time. Virtual machines that have been scheduled on a particular CPU will be given a preference to run on the same CPU again.  This is because of performance advantages of finding data in the CPU cache. In a multiprocessor system, therefore, the ESX Server scheduler may choose to let a few cycles on a CPU stay idle rather than aggressively move a ready virtual machine to another CPU that may be idle. Virtual machines can and do move occasionally when deemed beneficial by the scheduler algorithm.

     

    Scheduler options can be used to make this migration more aggressive; however, our tests indicate that this results in a lower overall system throughput and may not be desirable.  This adds an additional complication in understanding ready time on multiprocessor systems. Even when a CPU is idle, a virtual machine may be ready and waiting — that is, accumulating ready time.

     

     

     

    3. Test Description and Findings

     

     

    The objective of these tests was to establish patterns correlating ready time, CPU utilization, and use of SMP virtual machines, and to look at scheduler event traces (available only in ESX Server 3.0) to see if CPUs were being used efficiently. For the most part the analysis was done on the beta 2 version of ESX Server 3.0, though test results were also collected on ESX Server 2.5.2.

     

    The tests were run on a Hewlett-Packard DL 585 with four AMD Opteron CPUs at 1.6GHz and 4GB of RAM. We set up 10 virtual machines on this server, each with 256MB of RAM. Six of the virtual machines were set up as uniprocessor virtual machines. The other four had two CPUs each. All the virtual machines were running Red Hat Enterprise Linux 3.

     

    The workload used in the tests was a custom CPU burner program, a single-threaded load generator. The program ran a fixed number of iterations of a CPU-intensive activity, then slept for some time. The time spent in the compute-intensive loop and in sleep was of the same order as the default scheduler quantum (the time a guest can be on a CPU before it is descheduled to give other guests a turn, 50 ms by default). On a system that is very busy, the behavior of the program becomes somewhat less predictable. The work element of the program is a fixed quantum. However, the sleep time depends on the guest operating system in the virtual machine delivering an interrupt to tell the CPU burner it is time to wake up. As the CPU the virtual machines are running on gets busier, the virtual machine may not be scheduled in time to deliver this interrupt, and the actual sleep periods thus get longer.
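
    The original CPU burner is not reproduced in this document. As a rough illustration of the work-then-sleep pattern described above, a stand-in could look like the following Python sketch, in which the function name, the arithmetic used as "work", and the parameter values are all assumptions.

```python
import time


def cpu_burner(cycles, iterations, sleep_s):
    """Run `cycles` rounds of fixed work followed by a sleep.

    Returns two lists with the measured duration of each work phase and
    each sleep phase, in seconds. On a busy host the measured sleeps tend
    to exceed sleep_s, because the wake-up is delivered late.
    """
    work_times, sleep_times = [], []
    for _ in range(cycles):
        t0 = time.perf_counter()
        x = 0
        for i in range(iterations):  # fixed quantum of CPU-bound work
            x += i * i
        t1 = time.perf_counter()
        time.sleep(sleep_s)          # requested sleep; actual may be longer
        t2 = time.perf_counter()
        work_times.append(t1 - t0)
        sleep_times.append(t2 - t1)
    return work_times, sleep_times


work, sleeps = cpu_burner(cycles=5, iterations=200_000, sleep_s=0.05)
print(f"mean work {sum(work) / len(work):.4f} s, "
      f"mean sleep {sum(sleeps) / len(sleeps):.4f} s")
```

    Comparing the mean measured sleep with the requested 50 ms gives a crude view of how late wake-ups are being delivered.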

     

    In a real-world workload, if a guest operating system is receiving a steady stream of work, it tends to accumulate more work, waiting in its queue, while the virtual machine is not active. In this simulation, on the other hand, work does not accumulate while the virtual machine is inactive. Thus in this simulation, the amount of work done falls as the system gets busier.

     

    For the purposes of this test, this decrease in the amount of work done is not a problem, because we want to compare actual utilization with accumulated ready time. However, utilization does not grow in a linear fashion as load is added to the system.

     

    Data for all tests was collected over a 10-minute period. The esxtop tool took periodic snapshots of the system. We used those snapshots to calculate how much ready time was being accumulated by virtual machines and other processes on the server. Utilization and ready time used here are aggregates for the virtual machines. Utilization and ready time from all other components of ESX Server are small and were not taken into consideration for this test.

     

    4. Processor Utilization versus Ready Time

     

     

    The tests illustrate that with ESX Server 2.5, ready time starts to increase dramatically between 55 and 60 percent utilization. With the scheduler changes made in ESX Server 3.0, this increase begins at around the same utilization but is much less sharp. At higher utilizations, the ESX Server 3.0 scheduler does a better job of servicing virtual machines efficiently, so overall ready time remains lower.

     

    Because ready time is time a process spends waiting when it could be running, it has a direct impact on response time. During the measurement period, because the virtual machines were all pinned to a single CPU, 600 seconds of CPU time were available. In the case in which four of the virtual machines have load, about 300 seconds of ready time were accumulated. An average 50 percent slowdown would therefore be expected. An event with a response time of 1 second may be expected to take 1.5 seconds under this level of load as a direct result of delays caused by ready time. This is not unexpected in observations of multiuser systems under load. The same factors apply to an ESX Server host running multiple virtual machines.
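
    The slowdown estimate above is a ratio followed by a scaling. A minimal sketch of that arithmetic, with an illustrative function name and the simplifying assumption that ready time is spread evenly over all work:

```python
def expected_response_time(base_s, ready_s, cpu_available_s):
    """Scale a response time by the average slowdown implied by
    accumulated ready time over the measurement period."""
    if cpu_available_s <= 0:
        raise ValueError("cpu_available_s must be positive")
    slowdown = ready_s / cpu_available_s  # 300 / 600 = 0.5, i.e. 50 percent
    return base_s * (1.0 + slowdown)


# 300 s of ready time against 600 s of available CPU time:
print(expected_response_time(1.0, 300, 600))  # 1.5
```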

     

    It is important to note that increasing amounts of ready time do not mean that the remaining CPU on the system is unusable. It just means that due to such factors as load synchronicity, there are periods when the CPU has no work to do and other times when it is running one virtual machine but has one to five others ready to run and waiting.

     

     

    5. Ready Time on SMP and Uniprocessor Virtual Machines

     

     

    The results indicate that in ESX Server 3.0 there is little difference in ready time, whether this load is running on uniprocessor or SMP virtual machines. The scheduler overall does a better job of ensuring all the virtual machines get scheduled frequently enough, resulting in an actual usage of 75 percent — in contrast to ESX Server 2.5, on which the actual usage stays much lower.

     

    Scheduler changes and other improvements in ESX Server 3.0 result in a difference in utilization, thus preventing a direct comparison between ready times for the two versions. It is clear, however, that co-scheduling does not have a large impact on ready time in ESX Server 3.0.

     

    6. Scheduler Trace

     

     

    ESX Server 3.0 provides a mechanism for looking at scheduler trace data to examine state information on processes and CPUs and to validate decisions made by the scheduler. The mechanism to collect this data is available in the released version of ESX Server 3.0. The data is collected by the vm-support script. However, the ability to post-process the script’s binary output file is not part of the released product.

     

    7. Interpreting Ready Time

     

     

    Ready time for a process in isolation cannot be identified as a problem. The best metrics for examining the health of a server continue to be CPU utilization, response time, and application queues.

     

    It is normal for a system to accumulate some ready time even when overall CPU utilization is low. Take an example of two processes (A and B) that each use 20 percent of a CPU, for an overall utilization of 40 percent. When process B is being scheduled, statistically 80 percent of the time the CPU is idle. The remaining 20 percent of the time process B must wait for process A to finish. The same is true for process A — 20 percent of the time it must wait for process B to finish.

     

    This demonstrates that even under low utilization there is a chance that a shared resource will be busy. Thus some ready time is to be expected and is not a problem. The behavior is no different in the case of an ESX Server host with multiple running virtual machines. It behaves essentially the same way that an operating system does when trying to run multiple tasks concurrently.
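
    The two-process example can be checked with a small simulation. The sketch below assumes the other process occupies the CPU at random, independent moments, which is a simplification of real workloads; the function name, trial count, and seed are illustrative.

```python
import random


def fraction_of_waits(other_utilization, trials=100_000, seed=0):
    """Estimate how often a process finds the CPU busy when it becomes
    ready, given the fraction of time the other process uses the CPU."""
    rng = random.Random(seed)
    # Each trial is one moment at which the process becomes ready to run.
    busy = sum(1 for _ in range(trials) if rng.random() < other_utilization)
    return busy / trials


# Process B becomes ready while process A uses 20 percent of the CPU.
# The estimate should land near 0.2, matching the 20 percent in the text.
print(round(fraction_of_waits(0.20), 3))
```

    If a single event wakes both processes at once, the independence assumption no longer holds and the fraction of waits would be higher.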

     

    The objective of server consolidation is to drive CPU utilization higher. For applications that are purely throughput driven it may be possible to drive the system to full utilization. For systems servicing applications that are somewhat interactive, attempting to drive the utilization beyond 60 to 70 percent may result in a perceptible lag in user activities. If response time thresholds are established for applications, they can give a clear picture of whether service levels demanded from an application are being met.

     

    Authors

    • VMware Inc.