Solved: Re: How do you performance test?

juchestyle · ‎05-17-2007

Hey guys,

This question might be multifaceted. The question is, how do you measure performance on an ESX host or a VM?

What kind of testing do you do when you get a new ESX host (on the hardware or the software: networking, CPU, disk, etc.) or a new VM?

We are trying to work out how to guarantee users that their VMs will perform at a certain level, consistently and over time.

We have thought about putting new servers and applications on a physical server first, testing everything, and then moving them to VMs. We are concerned that services (CPU, memory, disk, or NETWORKING) will degrade as the load on that ESX host changes. The other reason is that we want to be able to tell our users that their VMs will perform within a certain percentage of a physical server.

Right now we are just in the talking and brainstorming phase, and we wanted to get some feedback from the rest of you, because you are that good.

What are your thoughts, your ideas, your critiques? I have mad points to give away!

Respectfully,

Kaizen!

mreferre · ‎05-18-2007

With "multi faceted" do you mean "big mess" ?

There are two main facets here in my opinion:

1- The first one is how you rate a VM vs. physical with no contention on the ESX host. This is relatively easy to achieve, but the problem is the benchmark you use. Ideally you need a benchmark that reflects how your application behaves in reality. But it's not easy for an organization to run such sophisticated benchmarks, which require specialized software and/or a huge infrastructure (i.e. many clients etc.), so many default to toy-like utilities and patterns such as SiSoft Sandra or the xcopy of a big file, which have little or nothing to do with the real application's behaviour. The best thing would be to test your real application in a VM and compare the difference.

2- The second is even worse than that ... and has to do with the implications of running multiple workloads on the same server. So in addition to the ESX overhead you have to account for resource contention on the same box due to overcommitment. On top of the basic overhead and the resource contention you get additional overhead due to the %Ready thing.

Again this is a big mess and certainly the tools need to mature to give us a clear picture.

Massimo.

Massimo Re Ferre' VMware vCloud Architect twitter.com/mreferre www.it20.info

juchestyle · ‎05-21-2007

Hey Massimo,

Yup, big mess!

Anyone running baseline tests on their ESX before deployment?

What about base lining a physical before virtualizing it?

Respectfully,

Kaizen!

oreeh · ‎05-21-2007

>Anyone running baseline tests on their ESX before deployment?

IMHO this doesn't make much sense. You want the maximum performance out of your VMs and not the ESX itself.

As Massimo already said - the problem is the benchmark.

There are a few "application" benchmarks (like TPC) which could be useful.

But who runs a TPC benchmark against a dual-CPU system - nobody that I'm aware of.

juchestyle · ‎05-21-2007

Oreeh,

I thought I read in our other thread about severe network performance problems (I think it was the guy who talked about IRQ conflicts) that they tested their ESX host's networking in their test lab before rolling it out to the offsite production location. And when they ran it in production, they found that it had lost networking speed.

Cases like this make it seem like baselining the hardware does make sense?

What do you think? I have always assumed that the hardware when you buy it new works. But maybe it is time in my career to stop making that assumption.

Thoughts?

Respectfully,

Kaizen!

oreeh · ‎05-21-2007

The problem is the VMkernel in between.

There's no way (at least none I know of) to baseline the VMkernel too.

Baselining the HW in an ESX environment only makes sense when you are able to baseline the VMkernel too.

Nobody knows (except the engineers/developers) how much performance is lost in the kernel.

juchestyle · ‎05-21-2007

What about overall performance?

Steve and I were talking these points and had some interesting ideas:

Virtualizing servers when their hardware needs replacing probably means that your new VM will perform better (i.e., if the hardware is at end of life you are probably going to move from an older 1 GHz processor to a new VMware host that has a 2 or 3 GHz processor).

But what if you want to virtualize current applications for consolidation or DR or some other valid reason? Two likely scenarios here:

Scenario 1: You are still upgrading the application to faster hardware. The physical server ran a 2.8 GHz CPU and the host runs at 3.4 GHz.

Scenario 2: You are moving the server from a faster box to a slower box. The physical server ran a 3.4 GHz CPU and the host runs at 3 GHz.

Of course there are 4 components to think about: CPU, memory, network and disk.

From a CPU perspective you are losing ground, especially considering that the ESX host, whether faster, the same, or slower, is also running many other VMs. But what about overall performance? In many cases that ESX host is probably running the server on SAN storage, which may be faster than some older hard drives or RAID configurations. Maybe not. Since bottlenecks happen more often in the speed of the hard drive, could it be that moving an app to a host with SAN storage might still provide faster or better overall performance because the CPU is waiting less for the storage?

In this case I would like to be able to see the 4 components balanced out and get an overall picture. A slower CPU may not mean slower performance if the other components are improved.
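To make the "slower CPU, faster storage" idea concrete, here is a rough sketch of the arithmetic for Scenario 2. It is a toy model only: the function name, the cycle count and the I/O wait figures are all made-up assumptions, and it ignores contention from other VMs and any hypervisor overhead.

```python
def request_time_ms(cpu_cycles, clock_ghz, io_wait_ms):
    """Toy model: time spent on the CPU plus time spent waiting on storage."""
    cpu_ms = cpu_cycles / (clock_ghz * 1e9) * 1000.0
    return cpu_ms + io_wait_ms


# Hypothetical numbers for illustration, not measurements.
# Physical box: 3.4 GHz CPU, slower local disks.
physical_ms = request_time_ms(cpu_cycles=34e6, clock_ghz=3.4, io_wait_ms=12.0)
# ESX host: 3.0 GHz CPU, assumed faster SAN storage.
virtual_ms = request_time_ms(cpu_cycles=34e6, clock_ghz=3.0, io_wait_ms=8.0)

print(f"physical: {physical_ms:.2f} ms per request")
print(f"virtual:  {virtual_ms:.2f} ms per request")
```

With these assumed inputs the request on the slower CPU finishes sooner because the storage wait dominates. Flip the I/O numbers and the conclusion flips too, which is why the 4 components need to be looked at together.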

Respectfully,

PS. I feel a new icon coming up!

Kaizen!

oreeh · ‎05-21-2007

>In many cases that ESX is probably running the server on SAN storage which may be faster than some older hard drives or Raid configurations.

Maybe - but as our network performance thread revealed this isn't the case always.

You are right that a slower CPU isn't an issue in most cases, especially since most current servers are heavily oversized.

The problem still is how to measure the overall performance.

Physical benchmarks are almost unusable in a virtual environment and the virtual benchmarks aren't really usable in a physical environment.

I guess if I had a clue on how to do this I'd be rich.

You can of course run load testing stuff against the VM - but this is a big effort and you need a big environment to do it properly.

IMHO the best / easiest performance test is the end user :D

If he's whining then you know you did something wrong.

If he doesn't say anything you did it right.

If he's enthusiastic be careful.

Congrats on the crown!

Now you don't have to borrow mine for the weekends anymore :D

sbeaver · ‎05-21-2007

>If he's enthusiastic be careful

Or stop and get a drink

Steve Beaver
VMware Communities User Moderator
VMware vExpert 2009 - 2020
VMware NSX vExpert - 2019 - 2020
====
Co-Author of "VMware ESX Essentials in the Virtual Data Center"
(ISBN:1420070274) from Auerbach
Come check out my blog: http://www.virtualizationpractice.com/blog/
Come follow me on twitter http://www.twitter.com/sbeaver

**The Cloud is a journey, not a project.**

juchestyle · ‎05-21-2007

Thank you,

Instead of borrowing your crown and high heels on the weekend, could I borrow your users? I have never heard anything from end users but whining!

Why do you say baseline testing isn't relevant across physical / virtual?

Respectfully,

Kaizen!

oreeh · ‎05-21-2007

>Instead of borrowing your crown and high heels on the weekend, could I borrow your users? I have never heard anything from end users but whining!

It took me years to train them this way...

IMHO baselining isn't relevant / is too complicated with ESX - since the VMkernel can't be properly measured.

To get real numbers when benchmarking ESX you would need a server with ESX on top but WITHOUT VMs.

How would you be able to measure without running VMs?

You can of course do it with running VMs - but this will measure the VM too.

My guess is that the engineers have some tools to measure the VMkernel.

If we had these tools we could benchmark the whole stuff, have valid numbers AND wouldn't have the need of guessing.

The network performance thread is a good example for this.

We all expect a problem somewhere within ESX itself and what do we do?

We measure VM performance, compare it with physical and see a difference.

What we don't see is where (VM, NIC driver, scheduler,...) the performance is lost.

Because of this I combine the user benchmark with some guessing based on experience.

Some might call this baselining - I call it reading the crystal ball.

Among the blind the one-eyed is king

sbeaver · ‎05-21-2007

I ask my question, shake up my 8 ball and look for the answer it gives

mreferre · ‎05-21-2007

Oliver,

I think that it's even worse than that. The VMkernel doesn't have an absolute overhead, as it (the overhead) is a function of the workload.

So the engineers have no way to measure the overhead in absolute numbers. It's what you run on top of it that determines the overhead. In the final analysis there are only two numbers that count:

- # of users supported by a given service

- response time of the given service.

For each of these services you can create a baseline to compare physical vs. virtual, so that you can get an idea of the overhead for THAT SPECIFIC WORKLOAD (measured in # of users and response time). The problem starts when you mix workloads on a given server, and specifically when you oversubscribe resources.

So in the end the overhead is not only a function of the specific workload ... it's also a function of how you mix different workloads on a single system.
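As a minimal sketch of the per-workload comparison described above, the two numbers (# of users and response time) can be turned into an overhead figure like this. The class, the function and all the figures are hypothetical, and the result says nothing about what happens once workloads are mixed on one host.

```python
from dataclasses import dataclass


@dataclass
class ServiceResult:
    users_supported: int      # users the service sustained in the test
    response_time_ms: float   # response time observed at that load


def overhead_pct(physical, virtual):
    """Return (capacity loss %, response time increase %) of virtual vs. physical."""
    capacity_loss = (
        (physical.users_supported - virtual.users_supported)
        / physical.users_supported * 100.0
    )
    slowdown = (
        (virtual.response_time_ms - physical.response_time_ms)
        / physical.response_time_ms * 100.0
    )
    return capacity_loss, slowdown


# Made-up example values for one specific workload, not measured data.
physical = ServiceResult(users_supported=500, response_time_ms=200.0)
virtual = ServiceResult(users_supported=450, response_time_ms=230.0)

loss, slowdown = overhead_pct(physical, virtual)
print(f"capacity loss: {loss:.1f}%, response time increase: {slowdown:.1f}%")
```

The same comparison has to be repeated for every service, since a figure obtained for one workload does not carry over to another.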

Let me say it again ... a big mess.

Massimo.

sbeaver · ‎05-21-2007

Oops, wrong thread. Hey, it's Monday

oreeh · ‎05-21-2007

Massimo, a nice explanation why the mess is so big!

juchestyle · ‎05-21-2007

Steve, you are not knowing where you are posting?

Laughing

Kaizen!

SyverDude · ‎05-21-2007

I find iometer helpful but only in a limited way. . . .

It is useful for creating a baseline benchmark on hardware configuration A that is running an application with some kind of user performance expectations.

You can then take the same benchmark settings and run it on Configuration B, say where the storage changed. This will give you some kind of relative information as to the performance differences due to the hardware changes.

Of course real-life benchmarks are very difficult to set up. This is kind of a quick and dirty approach, but it can give useful information when comparing storage technologies.
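For what it's worth, the A vs. B comparison boils down to a relative change per metric. Below is a small sketch, assuming you have already copied the headline numbers out of the benchmark results by hand. The metric names and values are invented for illustration, and nothing here reads Iometer's own output files.

```python
def relative_change(baseline, candidate):
    """Percent change of each metric in candidate relative to baseline."""
    changes = {}
    for metric, base_value in baseline.items():
        if metric not in candidate or base_value == 0:
            continue  # skip metrics that cannot be compared
        changes[metric] = (candidate[metric] - base_value) / base_value * 100.0
    return changes


# Hypothetical numbers for illustration only, not measured results.
config_a = {"iops": 4000.0, "mb_per_s": 125.0, "avg_latency_ms": 8.0}
config_b = {"iops": 5000.0, "mb_per_s": 150.0, "avg_latency_ms": 6.0}

changes = relative_change(config_a, config_b)
for metric, pct in changes.items():
    print(f"{metric}: {pct:+.1f}%")
```

The comparison only means something if both runs use the same benchmark settings, so that the hardware change is the only difference between them.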

Regards,

Jon

Gabrie1 · ‎05-21-2007

Hi

I think the only way you can benchmark a VM to a physical system is to measure transaction times, which is quite difficult. Most systems deliver a service, like looking up a record in a database and presenting the results to another application.

The only real comparison would be to execute some transactions from a user's point of view on a physical server, time them, then run them on a VM and time them again.

Trouble is that to be able to do this the server normally has quite some connections to other systems and it is not always possible to copy your whole environment to a test lab. Should you run them in production environment, then you would have to watch out for keeping everything in sync. Also not an easy task.
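A rough sketch of what such a timing run could look like: the same script would be run once against the physical server and once against the VM, and the numbers compared. `lookup_record` is a hypothetical stand-in for a real user transaction (here it just burns a little CPU), so replace it with whatever your application actually does.

```python
import statistics
import time


def time_transaction(transaction, runs=20):
    """Run the transaction several times and summarise the elapsed times in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        transaction()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": len(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def lookup_record():
    # Placeholder workload; a real test would query the actual service.
    sum(range(10_000))


result = time_transaction(lookup_record, runs=5)
print(result)
```

Using the median rather than a single run smooths out one-off hiccups, and the maximum shows the worst case a user might have seen.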

Gabrie

http://www.GabesVirtualWorld.com