VMware

drummonds

drummonds's Profile

  • Name: Scott Drummonds 
  • Email: (Private)
  • Member Since: May 31, 2007
  • Last Logged In: Nov 15, 2009 10:57 PM
  • Status Level: Hot Shot (464 points)
  • VMware Employee: Yes
  • Location: Palo Alto, CA
  • Occupation: Performance Marketing
  • Homepage: http://www.vpivot.com
  • Biography: Scott Drummonds has been working for VMware since January of 2007. He participates in a wide variety of performance issues including VDI, performance problem solving, field support, competitive analysis, and general marketing activities. More at: * YouTube: http://www.youtube.com/user/drummonds1974 * LinkedIn: http://www.linkedin.com/in/drummonds1974 * Twitter: http://twitter.com/drummonds
  • Signature: More information on my blog and on Twitter: http://vpivot.com http://twitter.com/drummonds

drummonds's Latest Content


I have moved my blog home to a new location. Come visit and read at vPivot.com.

Scott


A couple of days ago we finally got out one of my favorite papers from our ongoing vSphere launch activities. This paper on ESX memory management, written by Fei Guo in performance engineering, has three graphs that are absolute gems. These graphs show balloon driver memory savings next to throughput numbers for three common benchmarks. The conclusion is inescapable: the balloon driver reclaims memory from over-provisioned VMs with virtually no impact to performance. This is true on every workload save one: Java.

Example 1: Kernel Compile

Linux kernel compilation models a common developer environment involving a large number of code compiles. This process is CPU and IO intensive but uses very little memory.

Picture 1.png

Results of two experiments are shown on this graph: in one, memory is reclaimed only through ballooning; in the other, memory is reclaimed only through host swapping. The bars show the amount of memory reclaimed by ESX and the lines show the workload performance. The steadily falling green line reveals a predictable deterioration of performance due to host swapping. The red line demonstrates that as the balloon driver inflates, kernel compile performance is unchanged.

Kernel compilation performance remains high with ballooning because this workload needs very little memory and the guest OS can easily take unused pages from the application. Performance falls with swapping because ESX randomly selects virtual machine pages for swapping, whether those pages are in use by the application or not. The guest OS is better at selecting pages for reclamation than ESX is.
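The difference between the two selection policies can be illustrated with a toy model. This is only a sketch under stated assumptions, not ESX's or any guest OS's actual algorithm: the page counts are hypothetical, the function name reclaim is made up for illustration, "balloon" is modeled as the guest giving up idle pages first, and "host_swap" is modeled as a uniformly random choice.

```python
import random

def reclaim(pages_in_use, n_reclaim, policy, rng):
    """Return how many in-use pages are taken when n_reclaim pages are reclaimed.

    pages_in_use: list of booleans, True if the application is using the page.
    policy: "balloon" takes idle pages first (the guest OS knows which are idle);
            "host_swap" picks pages uniformly at random (the host cannot tell).
    """
    indices = list(range(len(pages_in_use)))
    if policy == "balloon":
        # False sorts before True, so idle pages come first.
        indices.sort(key=lambda i: pages_in_use[i])
        victims = indices[:n_reclaim]
    elif policy == "host_swap":
        victims = rng.sample(indices, n_reclaim)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return sum(pages_in_use[i] for i in victims)

rng = random.Random(0)
# Hypothetical VM: 1000 pages, only 200 of them in use by the workload.
pages = [True] * 200 + [False] * 800
for n in (200, 400, 800):
    balloon_hits = reclaim(pages, n, "balloon", rng)
    swap_hits = reclaim(pages, n, "host_swap", rng)
    print(f"reclaim {n}: balloon hits {balloon_hits} in-use pages, "
          f"host swap hits {swap_hits}")
```

In this model the balloon policy touches no in-use pages until the idle pages run out, while random selection starts hitting in-use pages at small reclaim amounts.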

Example 2: Oracle/Swingbench

Oracle's database is best tested against Swingbench, the OLTP load generation tool provided by Oracle. Database workloads utilize all system resources but show a non-linear dependence on memory. Memory can be safely reclaimed from OSes running databases until the cache becomes smaller than needed by the workload. The following figure shows this.

Picture 2.png

As before, the virtual machine using only ballooning maintains higher performance under memory pressure than the virtual machine whose memory is being swapped away by the host. Performance is constant and shows no negative impact due to ballooning until the balloon encroaches on the SGA. Again, ESX's host swapping randomly selects pages to send to disk which degrades performance even at small swap amounts.

As with kernel compile, the balloon driver safely reclaims memory from over-provisioned VMs with little impact to application performance.

Example 3: Java/SPECjbb

Java provides a special challenge in virtual environments due to the JVM's introduction of a third level of memory management. For most workloads, the balloon driver draws memory from the virtual machine without impacting throughput because the guest OS efficiently claims pages that its processes are not using. But in the case of Java, the guest OS is unaware of how the JVM is using memory and is forced to select memory pages as arbitrarily and inefficiently as ESX's swap routine.

Picture 3.png

Neither ESX nor the guest OS can efficiently take memory from the JVM without significantly degrading performance. Memory in Java is managed internally by the JVM, and efforts by either the host or the guest to remove pages will hurt Java application performance equally. In these environments it is wise to manually set the JVM's heap size and specify memory reservations for the virtual machine in ESX to account for the JVM, OS, and heap.
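As a rough illustration of that sizing advice, the reservation can be treated as the sum of the three pieces named above. This is a minimal sketch: the function names and all of the megabyte figures are hypothetical, and real values have to come from measuring your own JVM and guest OS. The -Xms and -Xmx options are the standard JVM flags for initial and maximum heap size; setting them to the same value fixes the heap.

```python
def jvm_heap_flags(heap_mb):
    """Build JVM options that pin the heap to a fixed size."""
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]

def suggested_reservation_mb(heap_mb, jvm_overhead_mb, guest_os_mb):
    """Sum the heap, the JVM's non-heap memory, and the guest OS footprint."""
    parts = {
        "heap_mb": heap_mb,
        "jvm_overhead_mb": jvm_overhead_mb,
        "guest_os_mb": guest_os_mb,
    }
    for name, value in parts.items():
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    return sum(parts.values())

# Hypothetical example values, in MB.
heap_mb = 2048          # fixed Java heap
jvm_overhead_mb = 512   # assumed non-heap JVM memory (code, threads, etc.)
guest_os_mb = 1024      # assumed guest OS and other processes

print("JVM options:", " ".join(jvm_heap_flags(heap_mb)))
print("Reservation (MB):",
      suggested_reservation_mb(heap_mb, jvm_overhead_mb, guest_os_mb))
```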

Conclusions and Scott's Special Recommendation

Love your balloon driver. Your application owners are always asking for more memory than they need. With great comfort you can over-provision memory some and rely on ESX and the balloon driver to reclaim what is not in use. Without the balloon driver, ESX will be forced to use its last technology for managing memory over-commit: host swapping. And host swapping always decreases performance.

So here is my special recommendation for you: never, ever disable the balloon driver. Disabling it forces the host to swap that virtual machine's memory, should that resource become scarce. And where ballooning usually will not hurt performance, swapping always will. If you must protect an application from memory reclamation due to memory over-commitment, use reservations. They make admission control more effective, they self-document the needs of the VM, and they are easily configured.


I spend a great deal of time answering customers' questions about the scheduler. Never have so many questions been asked about such an abstruse component over which so little user influence is possible. But CPU scheduling is central to system performance, so VMware strives to provide as much information on the subject as possible. In this blog entry, I want to point out a few nuggets of information on the CPU scheduler. These four items answer 95% of the questions I get asked.

Item 1: ESX 4's Scheduler Better Uses Caches Across Sockets

On UMA systems with low load levels, virtual machine performance improves when each virtual CPU (vCPU) is placed on its own socket. This is because providing each vCPU its own socket also gives it the entire cache on that socket. On page 18 of a recent paper on the scheduler written by Seongbeom Kim, a graph highlights the case where vCPU spreading improves performance.

Picture 2.png

The X-axis represents different combinations of VM and vCPU counts. SPECjbb is memory intensive and shows great gains with increases in CPU cache. The few cases that show dramatic benefit due to the ESX 4.0 scheduler are benefiting from the distribution of vCPUs across sockets. Very large gains are possible in this somewhat uncommon case.

Item 2: Overuse of SMP Only Slows Consolidated Environments At Saturation

For years customers have asked me how many vCPUs they should give to their VMs. The best guidance, "as few as possible", seems too vague to satisfy. It remains the only correct answer, unfortunately. But a recent experiment performed by Bruce Herndon's team sheds some light on this VM sizing question.

In this experiment we ran VMmark against VMs that were configured outside of VMmark specifications. In one case some of the virtual machines were given too few vCPUs and in another they were given too many. Because VMmark's workload is fixed, changing VM sizes does not alter the amount of work performed by the VMs. In other words, the system's score does not depend on the VMs' vCPU count. Until CPU saturation, that is.

Picture 3.png

Notice that the scores are similar between the undersized, right-sized, and over-sized VMs. Up until tile 10 (60 VMs) they are nearly identical. There is a slight difference in processor utilization that begins to impact throughput (score) as the system runs out of CPU. At that point wasted cycles dedicated to unneeded vCPUs negatively impact the system performance. Two points I will call out from this work:

  • Sloppy VI admins that provide too many vCPUs need not worry about performance when their servers are under low load. But performance will suffer when CPU utilization spikes.
  • The penalty of over-sizing VMs gets worse as VMs get larger. Using a 2-way VM is not that bad, but unneeded use of a 4-way VM when one or two processors suffice can cost up to 15% of your system throughput. I presume that unnecessarily using eight vCPUs would be criminal.

Item 3: ESX Has Not Strictly Co-scheduled Since ESX 2.5

I have documented ESX's relaxation of co-scheduling previously (Co-scheduling SMP VMs in VMware ESX Server). But this statement cannot be repeated too frequently: ESX has not strictly co-scheduled virtual machines since version 2.5. This means that ESX can place vCPUs from SMP VMs individually. It is not necessary to wait for physical cores to be available for every vCPU before starting the VM. However, as Item 2 pointed out, this does not give you free license to over-size your VMs. Be frugal with your SMP VMs and assign vCPUs only when you need them.

Item 4: The Cell Construct Has Been Eliminated in ESX 4.0

In the performance best practices deck that I give at conferences I talk about the benefits of creating small virtual machines over large ones. In versions of ESX up to ESX 3.5, the scheduler used a construct called a cell that would contain and lock CPU cores. The vCPUs from a single VM could never span a cell. With ESX 3.x's cell size of four, this meant that VMs never spanned multiple four-core sockets. Consider this figure:

http://communities.vmware.com/servlet/JiveServlet/downloadImage/38-4886-6688/Picture+1.png

What this figure shows is that a four-way VM on ESX 3.5 can only be placed in two locations on this hypothetical two-socket configuration. There are 12 combinations for a two-way VM and eight for a uniprocessor VM. The scheduler has more opportunities to optimize VM placement when you provide it with smaller VMs.
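The placement counts quoted above follow from simple combinatorics: each cell offers "cores choose vCPUs" ways to place a VM, and the VM must fit entirely inside one cell. The sketch below reproduces the numbers for the hypothetical two-socket, four-cores-per-socket host, assuming one cell per socket. The function name placements is made up for illustration, and this counts core sets only; it is not a model of how the scheduler actually chooses among them.

```python
from math import comb

def placements(vcpus, cells, cores_per_cell):
    """Count the distinct core sets available to a VM confined to one cell."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if vcpus > cores_per_cell:
        return 0  # the VM cannot fit inside any single cell
    return cells * comb(cores_per_cell, vcpus)

# Two four-core sockets, one cell per socket (cell size of four).
for vcpus in (4, 2, 1):
    print(f"{vcpus}-way VM: {placements(vcpus, cells=2, cores_per_cell=4)} placements")
```

For comparison, treating all eight cores as a single pool (cells=1, cores_per_cell=8) gives comb(8, 4) = 70 core sets for a four-way VM, which shows how much the cell boundary narrows the choices for large VMs.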

In ESX 4 we have eliminated the cell lock so VMs can span multiple sockets, as Item 1 states. Continue to think of this placement problem as a challenge to the scheduler that you can alleviate. By choosing multiple, smaller VMs you free the scheduler to pursue opportunities to optimize performance in consolidated environments.

