To address memory overcommit (no pun intended), I would like to offer a quick case study of memory overcommit working in my company of 160,000 employees. A few weeks ago I began posting preliminary analysis of running production Citrix Presentation Servers inside VMs on ESX 3.x (see http://communities.vmware.com/message/863920). My preliminary findings show that the production Citrix VMs are taking advantage of Content Based Page Sharing (for a detailed explanation of VMware's Content Based Page Sharing, see Carl Waldspurger's whitepaper on Memory Resource Management in VMware ESX Server at http://www.waldspurger.org/carl/papers/esx-mem-osdi02.pdf). One virtualized Citrix server is handling 50-85 sessions and it's not full yet. Each session is running one of three published applications that all share the same base PowerBuilder code and .DLLs (about an 80MB memory footprint per session). Because each of the 50-85 sessions shares the same code, VMware's Content Based Page Sharing consolidates many of the identical pages into single read-only reference points and discards the redundant copies. The net result is significant ESX host memory savings. As an example, what I'm seeing inside the Citrix VM is nearly 4GB of RAM being used, but from the ESX host perspective, 1GB or less of physical RAM is being utilized, leaving the other 3GB of physical RAM for another VM in the cluster to use. Now multiply this savings by the number of virtualized Citrix servers and it adds up quickly.
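To put rough numbers on that last point, here is a back-of-the-envelope sketch in Python. The 4GB guest figure and 1GB host figure come from the example above; the function name and the count of ten Citrix VMs are purely hypothetical, and real savings will vary per VM.

```python
def host_memory_saved_gb(guest_used_gb, host_backing_gb, vm_count=1):
    """Estimate physical RAM freed by page sharing across similar VMs.

    guest_used_gb:   RAM the guest OS reports as in use (about 4 in the post)
    host_backing_gb: physical RAM the ESX host actually backs it with (about 1)
    vm_count:        number of similar VMs (hypothetical, not from the post)
    """
    return (guest_used_gb - host_backing_gb) * vm_count


# One Citrix VM, using the figures quoted in the post.
print(host_memory_saved_gb(4, 1))

# Ten similar Citrix VMs (the count is an assumption for illustration only).
print(host_memory_saved_gb(4, 1, vm_count=10))
```

This treats every VM as saving the same amount, which is an optimistic simplification.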
In the blog banter, the opposition makes the point that it will be rare for a number of VMs on the same host to be identical enough to yield significant savings from memory page sharing. My example is proof that you don't even need multiple VMs to take advantage of VMware's page sharing and memory overcommit. The fact is that VMware's Content Based Page Sharing works both inter-VM (across VMs) and intra-VM (within a single VM). I verified this with Carl Waldspurger himself a few weeks ago. Summary: Citrix VMs running silo'd or partitioned application sets take advantage of intra-VM Content Based Page Sharing.
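For anyone who wants to see why the same mechanism covers both cases, here is a toy model in Python. It is not ESX code, only a sketch of the idea described in the Waldspurger paper: hash page contents to find candidates, confirm with a full comparison, and keep one backing copy per unique page. The function name, VM names, and page contents are all made up for illustration.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # bytes per page in this toy model


def share_pages(vm_pages):
    """Return (guest_pages, backing_pages) after content-based sharing.

    vm_pages maps a VM name to a list of page contents (bytes).
    The VM name is never consulted when matching, which is why sharing
    works the same way within one VM and across several VMs.
    """
    by_hash = defaultdict(list)  # digest -> distinct page contents seen
    guest_pages = 0
    for pages in vm_pages.values():
        for page in pages:
            guest_pages += 1
            bucket = by_hash[hashlib.sha1(page).digest()]
            # The hash only nominates candidates; a full compare confirms.
            if not any(existing == page for existing in bucket):
                bucket.append(page)
    backing_pages = sum(len(bucket) for bucket in by_hash.values())
    return guest_pages, backing_pages


def make_page(label):
    """Build one page of PAGE_SIZE bytes with recognizable content."""
    return label.encode().ljust(PAGE_SIZE, b"\0")


# Intra-VM: 60 sessions in ONE VM, each mapping the same 3 code pages
# plus 1 private data page.
code = [make_page(f"dll-page-{i}") for i in range(3)]
citrix = []
for session in range(60):
    citrix.extend(code)
    citrix.append(make_page(f"session-{session}"))
print(share_pages({"citrix-vm": citrix}))

# Inter-VM: two VMs built from the same template share their code pages.
print(share_pages({"xp-1": list(code), "xp-2": list(code)}))
```

In the real thing the shared pages are marked copy-on-write, so a write to one gives that writer a private copy again; the toy model leaves that part out.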
On the inter-VM Content Based Page Sharing front, the opposition assumes that VMs will rarely share enough of the same code to amount to any meaningful memory savings. I don't have hard numbers yet, but I will say that in the VDI deployment we are rolling out, all Windows XP images come from the same template, which contains all of the tools each VDI user needs to do business. While there is no guarantee that VDI users will be running the same applications concurrently, it's a fact that at a minimum they will each be running the same base operating system (Windows XP) and Microsoft Outlook, plus all of the MS Office code that loads in the background and at startup. Most will probably have at least one IE browser open as well, although the IE footprint is fairly minimal in the grand scheme of things. Depending on the number of virtual desktops we deploy on each host, I think we'll see generous memory savings. This is not to say we will intentionally overcommit memory. Ballooning and VMkernel swap allow memory overcommit, but I'm not desperate enough to take the performance hit that comes with them, although I'm not sure whether our VDI users would be able to tell the difference.
VMware Communities User Moderator: http://communities.vmware.com/docs/DOC-2444
I'm going to repeat myself from another thread... Expect some more updates from us, but it's not about how you can manipulate the numbers on a spreadsheet -- the point is that this stuff is useful in the real world.
Jason, thanks! Anybody else? I was talking with an SE last week, and he was telling me about a recent customer where the Windows group was happily using VMware, but the Linux group thought they should go with an open source solution, and besides they thought that memory overcommit would hurt performance. Well, some of their Linux boxes were already on VI3, and they quickly brought up VC and looked -- on some hosts it was as high as 20:1, and the lowest was 3:1, and all the workloads had great performance.
So the point is not that you can't misconfigure memory overcommit, but that in many everyday situations it can be incredibly useful for getting consolidation ratios unheard of with other virtualization platforms. Other vendors are going to badmouth it right up until they have it too.
So what's your overcommit ratio?
Not quite overcommitting (yet), as we don't yet have enough workloads to need to, but here is the situation on one of our production hosts:
As you can see this host is servicing workloads totaling 15,616 MB of committed memory on a host with 16,384 MB of memory. That said, only 9,260 MB of memory is in use!
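Working those three figures through in Python makes the headroom easier to see. This is only arithmetic on the numbers quoted above; the function and key names are my own, and the gap between committed and in-use memory is not necessarily all due to page sharing, since some of it may simply be memory the guests were granted but have not touched.

```python
def overcommit_summary(physical_mb, committed_mb, in_use_mb):
    """Summarize host memory from three figures, all in MB."""
    return {
        # Memory promised to VMs as a fraction of physical RAM.
        "committed_ratio": committed_mb / physical_mb,
        # Share of the promised memory that is actually backed right now.
        "in_use_of_committed": in_use_mb / committed_mb,
        # Promised memory that is not consuming physical RAM.
        "gap_mb": committed_mb - in_use_mb,
        # True only once more is promised than physically exists.
        "overcommitted": committed_mb > physical_mb,
    }


summary = overcommit_summary(physical_mb=16384, committed_mb=15616, in_use_mb=9260)
for name, value in summary.items():
    print(name, value)
```

The committed ratio stays just under 1.0, which matches the "not quite overcommitting (yet)" description, while the gap shows how much more could be committed before physical RAM is actually exhausted.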
For the disbelievers the workloads on this host consist of:
1 production IIS 6.0 Server
1 production BSD DNS Server
1 lab Microsoft Office SharePoint Services 2007 Enterprise Server
1 production high volume front end Exchange Server 2003
1 production Windows Server 2003 Terminal Server
1 production back end (mailstore) Exchange Server 2003
2 production RIM Blackberry Enterprise Servers
1 production Windows Server 2003 management/utility server
1 development Redhat EL4 server
1 production Windows Server 2003 file server
All of this on 1 host, and not really taking advantage of Page Sharing to its fullest extent. There is a mix of OSes and patch levels there, as well as dissimilar applications and roles. Most of these systems have performance and availability SLAs. Any problem we've ever had has been due to issues with firmware levels on our SAN fabric or, in one isolated case, complaints about FoxPro performance that were reproduced on a physical box and were obviously a problem with the app.
As the VMware subject matter expert on this team I can honestly say I have no complaints about the solutions VMware provides vs. what they claim.