At VMworld 2008 in Las Vegas several of us in our virtual performance team met with a variety of customers to talk about Microsoft SQL Server. We already had a large base of customers running very many SQL Server DBs on our products and we wanted to collect information on the challenges posed in the process of virtualizing this critical workload. We were pleased to see that ESX Server handled SQL VMs with excellent performance. But, for many customers, the first efforts at virtualizing SQL didn't yield high-performing SQL VM. After careful investigation and many, many discussions we've started to put together the puzzle as to where SQL Server performance problems come from. This page will document these common problems, borrowing slides from our presentations on the subject.
We've talked with dozens of customers in the past months to document the issues that resulted in poor SQL performance. Happily, none of the issues were due to underlying technologies. Here is a list of issues and an explanation of the impacts. These items are roughly listed in the order of decreasing likelihood of occurrence.
Storage configuration problems are the number one cause of SQL performance issues. Usually these problems arise because the DBA requests a virtual disk of the VI admin, the VI admin places the VMDK on a LUN that may or may not meet the DBA's performance needs. For instance:
This may be obvious to some, but this problem occurs again and again. The VI administrator should be aware of a few technical items that can help understand and avoid this problem:
Often companies that are dipping their metaphorical toes into
virtualization want to run proof-of-concept (POC) experiments to verify
that the virtual platform can meet their performance expectations. But
its surprising how many times these experiments are run on older,
poorly-performing hardware. Presumably the shiny, new systems were in
use for production applications so only the mothballed, cobweb-covered
servers from a previous generation were available for the POC. This
causes many problems. Check out this slide from a talk on SQL Server at VMworld Europe 2009:
The slide points out a couple of things. First, the larger caches and shorter pipelines on newer Intel processors results in a considerable drops in performance overheads. Second, the latency of the VMEXIT instruction, which determines the amount of time it takes to transition from the VM to the VMkernel, has shrunk by a large amount with subsequent generations of hardware. And don't forget the other additions from Intel and AMD such as hardware assisted memory management and IO virtualization.
Microsoft has kindly provided a web page of best practices for SQL Storage configuration. These be practices should still be followed when configuring your virtual SQL deployments!
For many SQL Server POCs the goal is to measure the VM's ability to perform, with respect to the virtual platform. If this comparison is to be performed, its critical that the VM be configured identically to the physical hardware. Obviously this means that the VM should be run on the same hardware using identically configured LUNs. Its also important to ensure that the VM has the same number of vCPUs and amount of memory as the physical baseline. This means restricting the number of pCPUs and amount of memory with NUMPROC and MAXMEM, respectively, in boot.ini.
It also means that the test being applied should be understood. If a benchmark is chosen that uses a very small database, the content will be cached and the storage system won't be used. This can skew the results and produce recommendations not consistent with production deployments. Here is another slide from the same VMworld Europe 2009 presentation detailing some of what we know about the SQL Server benchmarking alternatives:
We at VMware prefer DVD Store.
VMware's hosting products, VMware Server, VMware Workstation, and even VMware Fusion, are all capable of running SQL Server. But if the database is going to be run in production on enterprise-class hardware, use VMware's enterprise-class hypervisor: ESX Server. These products are not often confused by the initiated but rogue members of large companies often run off-the-books proof-of-concept experiments on VMware's hosted products. When they produce results they don't like, the results get spread throughout the company which can slow the virtual deployment.
Consider the following data, again from the VMworld Europe 2009 SQL Server presentation:
This information is getting a bit dated now, as it was performed years ago on ESX Server 3.0. But the point stands: before believing results claiming that "VMware cannot run SQL Server" its worth investigating the platform used to generate the results.
Database performance is heavily dependent on the amount of memory available. Almost without exception, providing more memory to SQL Server will improve performance. However, if that memory is coming from a host that is already over-committed or is being provided through workarounds to 32-bit limitations, performance may suffer. Here are a few keys for SQL Server memory management:
This item is really a special but very important case of item two, follow best practices. Partition alignment can impact storage performance which can be critical to some SQL Server VMs' performance. See VMware's paper on partition alignment for more information on this.