VMware

Virtual Performance

Scott Drummonds works in a variety of performance areas at VMware: VDI, application best practices, competitive analysis, customer performance investigations, and outward bound communications. This blog will detail some of my musings on these subjects.

4 Posts tagged with the benchmarking tag

It's been about 10 days since I posted the YouTube video showing Hyper-V's stability problems in consolidated environments. I immediately received a lot of questions about the configuration, which I answered to the best of my ability in my "Video on Hyper-V Crashes" blog entry. Many respondents were not surprised by stability problems in a first-generation product, and some people requested more detail on the issue for further discussion. But there were too many comments to address them all.

One of the more interesting emails I received pointed out that it is unreasonable to blame Hyper-V for the collapse of these very large and very busy websites. Hyper-V's stability issues would bring down individual VMs or small groups when the parent partition blue screened. I think that this is a reasonable observation, so it's worth including here. I can't say that Hyper-V was responsible for the MSDN and TechNet crashes. That would be for Microsoft to say, when and if they choose to expose the issue behind the outage.

Lastly, all comments come from people that fall into one of two categories: one camp thinks the video captures are bogus and the other believes they're based on a real, reasonable, repeatable workload. I'm not going to try and move you from one camp to the other.

It is clear that a small, vocal, and surprisingly profane number of you think that I made this whole thing up. The premise of this group appears to be that Microsoft wouldn't make a product that a customer could crash under normal conditions. If this is your reasoning, then no video, discussion, or demonstration is going to change your mind. I'll let everyone else make their decisions based on Microsoft's track record and their own experience with Microsoft products.

Update: 5/15/09

The team responsible for the research has decided to post details: Setting the Record Straight on the Hyper-V Video

Video on Hyper-V Crashes

Posted by drummonds VMware May 15, 2009

Since I posted the YouTube video showing Hyper-V blue screens last Friday I've received a lot of comments, questions, compliments and complaints. The video and descriptive text have raised more questions than answers, so here are a few details to help fill out the story.

  • The workload was not technically VMmark. There are two reasons for this:
    • VMmark's run rules specify that the VMs must be configured with a single virtual disk. Because this configuration can't make use of Hyper-V's paravirtualized SCSI driver, which requires a second virtual disk, the run rules were violated to make Hyper-V produce its best results.
    • The vendors that provided requirements for VMmark included use of SMP Linux guests. Hyper-V's lack of support for these configurations means that it is unable to run VMmark according to the rules. Those rules were ignored by the test team and the ESX and Hyper-V tests were run with uniprocessor Linux guests so that Hyper-V was able to produce some number.
  • The server ran 15 tiles* when ESX was installed. So, the hardware is good.
  • The server successfully ran 10 tiles* when Hyper-V was installed, although at a much higher CPU utilization and lower throughput than ESX. The server seems to run Hyper-V correctly.
  • The 11-tile* run was tried many, many times. Hyper-V was unable to run 11 tiles without guest blue screens or the parent partition crashing and bringing down the server.

(*) As detailed in the first bullet, these aren't real "tiles". They have been dumbed down (Linux SMP) and reconfigured (extra virtual disk) to work around Hyper-V limitations.

I'm hoping to convince the people responsible for the test to shed their anonymity and come out with an official paper. I'll provide those details as soon as I can get them.

Update: 5/15/09

The team responsible for the research has posted details of the experiment. Read more at Setting the Record Straight on the Hyper-V Video.

Microsoft SQL Server runs at roughly 80% of native on VI3 in most benchmarked environments. In production environments, and under loads that model those conditions, SQL Server runs at 90-95% of native on ESX 3.5. I can say this with confidence despite a large amount of the industry's skepticism because I've spent so much time on SQL Server in the past half year. I'd like to share some of my research on the subject and observations with you.

Two weeks ago my colleague Chethan Kumar and I presented on SQL Server in Cannes, France for VMworld Europe 2009. This presentation was the culmination of six months of investigation that was started at VMworld 2008 in Las Vegas. At that event I heard so many concerns about SQL Server performance that I was resolved to identify the problems. I talked with every customer I could find that claimed that SQL ran at anything less than 70% of native. So many of these contacts claimed that they had measured SQL at 25% of native or worse, that I knew that something was going wrong.

First, let me show you a slide that Chethan presented at the show in Cannes:

sql_tuning.png

Chethan spent three months investigating SQL Server to find out how much he could improve virtual performance from the "out of the box" experience. As this figure details, the sum total of performance improvements was 15%. Here's another break-down of these results:

sql_tuning_summary.png

The only option that we found in ESX to improve virtual performance was static transmit coalescing, which is documented on page four of one of our SPECweb papers. Large pages and SQL's priority boost, which are best practices provided by Microsoft for SQL Server configuration, provide the largest gains in performance.

The key message that we communicated to our audience was that a properly running SQL Server should run at 80% of native or better. In most production cases it can run at a performance indistinguishable from native speed. And if performance is lagging, there are few changes to ESX that can yield any performance gains at all.
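To make the "percent of native" figure concrete, here is a minimal sketch of the arithmetic. The function names and the sample throughput numbers are hypothetical and are not taken from our measurements; only the 80% expectation comes from the text above.

```python
EXPECTED_PERCENT_OF_NATIVE = 80.0  # the "80% of native or better" expectation


def percent_of_native(virtual_throughput: float, native_throughput: float) -> float:
    """Return virtual throughput as a percentage of native throughput.

    Both arguments must use the same unit (for example, transactions per
    second) and come from the same workload on comparable hardware.
    """
    if native_throughput <= 0:
        raise ValueError("native_throughput must be positive")
    return 100.0 * virtual_throughput / native_throughput


def meets_expectation(percent: float) -> bool:
    """True when the result is at or above the 80% expectation."""
    return percent >= EXPECTED_PERCENT_OF_NATIVE


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    healthy = percent_of_native(virtual_throughput=800.0, native_throughput=1000.0)
    suspect = percent_of_native(virtual_throughput=250.0, native_throughput=1000.0)
    print(f"healthy: {healthy:.1f}% of native, ok={meets_expectation(healthy)}")
    print(f"suspect: {suspect:.1f}% of native, ok={meets_expectation(suspect)}")
```

A result far below the threshold, like the second hypothetical case, is the kind of number that suggests a configuration problem rather than a limit of virtualization itself.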

This raises the question: "If ESX can't be tuned to double SQL performance, what is causing these reports of terrible SQL Server throughput?" The great majority of the problems come from misconfigured storage. But a variety of other items, such as poor hardware selection or use of the wrong virtualization software, contribute to the confusion as well. I've been documenting these issues in Best Practices for SQL Server on this community and will continue to update that document as more problems are discovered.

If you have a SQL Server running un-virtualized in your environment, I'd like you to try virtualizing it again. Follow our best practices document and pay close attention to your storage configuration during deployment. I feel confident that once you've set up your environment properly, you're going to like what you see.

For years now VMware has been providing products that enable a virtual desktop experience. Historically, this would occur in virtual desktops on our hosted products, but in some cases virtualization of Citrix XenApp (formerly Presentation Server) could provide a large number of desktops off a single virtual machine. More recently, VMware View offers a means of hosting a large number of desktops on a single server where each is granted its own operating system instance. As the number of virtual desktops and the alternatives for implementing them have grown, the need has arisen for a benchmark that can compare the performance of these alternatives.

Desktop benchmarking is not new to the industry, as people have been using PCs for decades. But standards in virtual desktop benchmarking are non-existent. Some might argue that traditional tools, common to PCs for years, should be used. But there are several reasons why this is not true:

  1. Pre-virtual desktop benchmarking is built to completely saturate all memory and CPU resources provided. Fully saturating CPU on a single multi-way VM, as an example, results in far fewer VMs per host than is common in VDI deployments. Fewer VMs means less work for the hypervisor's scheduler.
  2. Existing desktop benchmarks are often throughput-based, as opposed to latency-based. Because existing tools want to differentiate between powerful processors and large amounts of memory, they're designed to pack more and longer instructions in each run than is common in virtual desktop deployments. Most desktop deployments won't run massive video renders, but the response times of individual button clicks and window appearances are critical.
  3. No existing benchmarks are aware of the peculiarities of VM-based timing. VDI benchmarks need to be aware of this by either using host timing or invoking and measuring operations from remote, non-virtual locations.

At VMworld 2008 VMware presented a VDI workload that had been constructed from a collaborative effort between all VDI teams within VMware with review and qualification by several of our partners. The first measurements on this workload came from Dell and EqualLogic and we quickly made details on its characteristics available via white paper. Key features of this workload include:

  • A diverse set of applications (Word, PowerPoint, Excel, Acrobat, and Internet Explorer) common to business desktop deployments.
  • Load generation modeled after the most common VDI deployments.
  • Small (less than 500 ms) operation generation and measurement.
  • Host-based measurement and an architecture to support remote command invocation in the next release.
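As an illustration of latency-based measurement of small operations, here is a minimal sketch. It is not the workload's actual harness: the function names, the `operations` mapping, and the injectable `clock` are all assumptions made for this example. The sketch presumes the timer runs on the host or on a remote, non-virtual machine, since a clock read inside a guest may not be reliable.

```python
import statistics
import time
from typing import Callable, Dict, List

LATENCY_BUDGET_S = 0.5  # the "less than 500 ms" operation size listed above


def measure_operations(
    operations: Dict[str, Callable[[], None]],
    repeats: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, List[float]]:
    """Time each operation `repeats` times; return raw latencies in seconds.

    `clock` is injectable so the timing source can be swapped for a
    host-side or remote one (hypothetical hook, not a real product API).
    """
    samples: Dict[str, List[float]] = {name: [] for name in operations}
    for _ in range(repeats):
        for name, operation in operations.items():
            start = clock()
            operation()
            samples[name].append(clock() - start)
    return samples


def summarize(samples: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Report median and worst-case latency, plus how many samples ran over budget."""
    summary: Dict[str, Dict[str, float]] = {}
    for name, latencies in samples.items():
        summary[name] = {
            "median_s": statistics.median(latencies),
            "max_s": max(latencies),
            "over_budget": sum(1 for value in latencies if value > LATENCY_BUDGET_S),
        }
    return summary


if __name__ == "__main__":
    # Stand-in operation; a real run would invoke an application action instead.
    results = summarize(measure_operations({"sleep_10ms": lambda: time.sleep(0.01)}))
    for name, stats in results.items():
        print(name, stats)
```

Reporting the median and the worst case per operation, rather than a single throughput total, reflects the point made earlier that response time is what matters for desktop users.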

As an attempt at the world's first VDI benchmark, we're very pleased with our efforts. We found that it met the unique requirements of measuring virtual desktops of all kinds. And since it was generated with a large group of internal collaborators and multiple partners, it's an excellent beginning at what the industry needs to standardize this process.

But today we realize that it's just a beginning. I want to encourage everyone to bring your comments to VMware, via this blog or the performance forums, on what you think the characteristics of an industry-standard virtual desktop benchmark should be. We'll never make one benchmark that meets everyone's needs, and I suspect that there are even some common needs that will require significant development resources. But I expect that with your guidance and assistance in refining this workload we'll accelerate the process of getting this benchmark into a shape that the industry can embrace.
