burdweiser
Enthusiast

Resource monitoring (add CPU?)

So I'm having a great debate about assigning another CPU to a server that seems slow to some users. The server in question is running an app called "change gear". It is Win 2K8 x64 with 2 GB of RAM and 1 CPU (Intel X7460 2.66 GHz from a Dell R900). Looking in vCenter, I see that the guest is using about 49 MHz of CPU, and the CPU ready time in esxtop is around 0.03%. Seems to be running fine, right? Well, in Spotlight (Quest Software), we see that the server has a warning about processor queue lengths and that there are about 800 threads running. Quest Software is stating there is a bottleneck with the CPU and that we should add a CPU to the server. Is this assumption correct?

I'm just not seeing the same results in vCenter.

23 Replies
weinstein5
Immortal

What is the ready time showing for that VM in the VI Client performance graph?

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

burdweiser
Enthusiast

The VI Client shows 3.84 average for ready time and 30 max. In esxtop it is around 500.
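
For anyone comparing the two tools: the VI Client charts report ready time as a summation in milliseconds per sample, while esxtop's %RDY is a percentage. Below is a minimal sketch of the usual conversion, assuming the figures above are milliseconds from the real-time chart and that the chart uses a 20-second sample interval (both are assumptions; historical charts use longer intervals). The function name is only for illustration.

```python
def ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms per sample) to a percent of the sample interval."""
    return ready_ms / (interval_s * 1000.0) * 100.0


# Figures quoted in the post above, assumed to be milliseconds per 20 s sample.
print(f"average ready: {ready_percent(3.84):.4f}%")
print(f"max ready:     {ready_percent(30):.4f}%")
```

Under those assumptions both values come out well below 1%, which is in line with the low ready time seen in esxtop in the first post.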

burdweiser
Enthusiast

I opened a case with VMware support. I'm noticing that across all 1-CPU servers in my environment, processor queue lengths average 4 or more, but the CPU utilization is low, around 3%. This is on W2K3 and W2K8, both x64 and x86 servers.

RParker
Immortal

Forget performance graphs, what does Windows show? Use perfmon, log it for a day or two, and track the CPU inside the guest. If it doesn't spike the CPU consistently, there is your answer.

Also ready times are very low, which means the CPU is not in contention.

Also, the responsiveness of a VM comes from the DISK, NOT the CPU. With Windows especially, everything is on the disk: DLLs, paging, executables, etc. So what is your backend storage like? You may have some queue time on the disk storage, but I am not blaming the storage; I am saying that VMs are less responsive overall than physical machines. Users may be used to physical.

I can pretty much tell you right now that the only thing adding another CPU will give you is more overhead on the ESX host; you won't gain any performance. But the first step is to see what Windows sees, using perfmon.
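
Once perfmon has logged for a day or two, the log can be summarized outside the GUI. Here is a rough sketch, assuming the counter log has been saved or converted to CSV with the timestamp in the first column. The server name `SRV1` and the sample values are made up for illustration; they are not from this thread.

```python
import csv
import io
from statistics import mean


def summarize_perfmon(csv_text: str) -> dict:
    """Average every counter column in a perfmon CSV export."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    result = {}
    for col, name in enumerate(header[1:], start=1):  # column 0 is the timestamp
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (ValueError, IndexError):
                continue  # skip blank or missing samples
        if values:
            result[name] = mean(values)
    return result


# Hypothetical sample data in the perfmon CSV layout.
SAMPLE = r'''"(PDH-CSV 4.0)","\\SRV1\System\Processor Queue Length","\\SRV1\Processor(_Total)\% Processor Time"
"01/01/2010 10:00:00","4","3.0"
"01/01/2010 10:00:15","5","2.5"
"01/01/2010 10:00:30","3","3.5"
'''

for counter, average in summarize_perfmon(SAMPLE).items():
    print(f"{counter}: {average:.2f}")
```

Averaging the queue length and % Processor Time over the same log makes it easy to see whether the queue ever coincides with real CPU load.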

burdweiser
Enthusiast

Parker,

The figures in my last post are from perfmon. The queue length is around 4 and the CPU utilization is around 3%. Our storage is on a CX340 SAN using Fibre Channel. We only have about 15 VMs on these production hosts, nothing too heavy. Mostly web front ends and license servers. I've got some VMs that are sitting around doing nothing but running an OS, yet the processor queue length is averaging around 4 with hardly any CPU cycles used.

mcowger
Immortal

I honestly wouldn't be surprised to see CPU queues. Don't forget that when the VM is idle, it's taken off the CPU and descheduled for some amount of time (sometimes 100 ms or more). When it then gets rescheduled onto the CPU, kernel threads wake up, realize 100 ms has gone by, and all queue up on the vCPU at once before being executed. Then the process repeats.

Small CPU queues like that would seem normal to me in a shared execution environment. I wouldn't worry about it until it started to get much larger (into the 30-40 range).
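
To make the wake-up burst concrete, here is a toy model. It is not how the ESX scheduler works internally, and the timer periods are hypothetical. It only counts how many periodically waking threads would have a timer expire while the vCPU is off the CPU, and would therefore all be runnable at the moment it is rescheduled.

```python
def ready_at_resume(periods_ms, gap_start_ms, gap_ms):
    """Count threads whose periodic timer expired at least once during the descheduled gap."""
    gap_end_ms = gap_start_ms + gap_ms
    return sum(
        1
        for period in periods_ms
        if gap_end_ms // period > gap_start_ms // period
    )


# Hypothetical timer periods (ms) for a handful of mostly idle kernel and service threads.
PERIODS_MS = [10, 15, 25, 40, 250, 1000]

# vCPU descheduled for 100 ms versus effectively always scheduled (1 ms gap).
print(ready_at_resume(PERIODS_MS, gap_start_ms=0, gap_ms=100))
print(ready_at_resume(PERIODS_MS, gap_start_ms=0, gap_ms=1))
```

In this model the long gap leaves several threads waiting at the same instant even though each one needs almost no CPU time, which is the pattern of a short queue with very low utilization.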






--Matt

VCP, vExpert, Unix Geek

--Matt VCDX #52 blog.cowger.us

burdweiser
Enthusiast

So I've got two separate systems running IIS in my production environment. One server has 1 CPU and the other has 2 CPUs. These are W2K3. The one with two CPUs has almost no processor queue, and the one with 1 CPU has 5 threads in the queue. I do not notice any performance impact on the VMs, but the user community will sure complain when they find out it only has 1 CPU.

How am I supposed to convince management not to add CPUs to 1-CPU VMs?

I'm still doing further testing. I'm going to migrate these to hosts with different processors and different LUNs to see how they do on a dedicated host.

burdweiser
Enthusiast

Here is an esxtop chart to show my further testing. The processor queue lengths are the same, and esxtop shows the same as on the Dell R900 cluster.

I moved these isolated VMs to a host with no other VMs running. The host is a Dell 2970 with 8 CPUs. As you can see, the %RDY time for server 1 is high because it has 2 CPUs. The VMs are idle with no users hitting them. But management is still looking at this processor queue length and not the %RDY like I'm trying to show.

mcowger
Immortal

Why does management or the community care what the vHardware config is? Does the application perform to requirements? That's the only thing that matters.






--Matt

VCP, vExpert, Unix Geek

--Matt VCDX #52 blog.cowger.us

burdweiser
Enthusiast

Well, I've got a supervisor who is passing on information to developers that I am "scaling back resources" on P2V'ed servers. Of course once the user community finds this out, they scream and yell that the server is slow. Then I get the CIO of the company saying in meetings "just give it more RAM and CPU". This company really sees the host servers as a "resource buffet". There are no performance bottlenecks that I can see. I've just got this supervisor now who found this processor queue length "problem" with his Spotlight on Windows app and is using it as ammo against my methods, and I have to explain why all of the 1-CPU servers behave this way.

mcowger
Immortal

Seems to me that the following needs to occur:

1) Management defines the response times for the app

2) You make the app perform to those requirements, regardless of how you do it.

You can add all the CPUs you want, but until you actually have defined metrics you need to achieve, your users will always call it 'slow'.
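
As a sketch of what a "defined metric" could look like in practice: pick a percentile and a target, then check measured response times against it. The sample timings, the 500 ms target, and the function names below are purely hypothetical.

```python
def percentile(samples_ms, pct):
    """Nearest-rank percentile of a list of response times in milliseconds."""
    ordered = sorted(samples_ms)
    rank = -(-pct * len(ordered) // 100)  # ceiling division using integers only
    return ordered[max(rank, 1) - 1]


def meets_target(samples_ms, target_ms, pct=95):
    """True if the chosen percentile of response times is within the target."""
    return percentile(samples_ms, pct) <= target_ms


# Hypothetical page response times (ms) measured from a client.
SAMPLES_MS = [120, 180, 200, 250, 900]

print(percentile(SAMPLES_MS, 95), meets_target(SAMPLES_MS, target_ms=500))
```

With a number like this agreed on up front, "slow" becomes something that can be tested before and after any change to vCPU or RAM.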






--Matt

VCP, vExpert, Unix Geek

--Matt VCDX #52 blog.cowger.us

nonu
Enthusiast

Referring to this technet article :- http://technet.microsoft.com/en-us/library/cc940376.aspx

Investigate the individual thread or threads of the process or processes running during a bottleneck to understand more about the activity consuming the processor. Monitor the following factors to understand how thread activity is contributing to the problem, whether the cause is a single process or multiple processes:

The number of threads in each process that is running during a bottleneck

The amount of processor time a thread is consuming

The priority level at which threads are scheduled to run

The amount of time the threads are using the processor in privileged mode

You can use performance counters to analyze thread activity and adjust thread scheduling to allow more processor time for bottlenecked processes.

Apart from adjusting the thread's scheduling priority, you cannot alter thread behavior without changing the program code of the associated application.

__________________________________________________________________________________________________________________________________________

Also, this kind of behavior can happen when requests for processor time arrive randomly and threads demand irregular amounts of time from the processor.

Plus, if it's a multithreaded application, then I don't see any reason why we can't allocate one more CPU.

burdweiser
Enthusiast

But if I add a second CPU and I see the %RDY time go up in esxtop, that is a sign that the VM is going to have poor performance right?

nonu
Enthusiast

How much does the ready time increase by? For a 2-vCPU machine, 10% ready time is acceptable.

Here's my understanding of acceptable ready time percentages (values listed below). Also, when using esxtop you need to look at %CSTP to see if there are any co-scheduling issues when using 2 vCPUs or more. The value for %CSTP should remain zero. If it does, then I guess you can very well grant another vCPU.

5% for 1VCPU

10% for 2VCPU

20% for 4VCPU
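
The values above work out to roughly 5% ready time per vCPU. A small sketch of that rule of thumb as a check follows. This encodes only the thresholds from this post, not official VMware guidance, and the function name and example readings are hypothetical.

```python
def vcpu_check(vcpus: int, rdy_pct: float, cstp_pct: float = 0.0) -> bool:
    """Rule of thumb from this post: about 5% ready per vCPU, and %CSTP at zero."""
    rdy_limit = 5.0 * vcpus  # 5% for 1 vCPU, 10% for 2, 20% for 4
    return rdy_pct <= rdy_limit and cstp_pct == 0.0


# Hypothetical esxtop readings for illustration.
print(vcpu_check(vcpus=1, rdy_pct=1.0))                # within the 5% limit
print(vcpu_check(vcpus=2, rdy_pct=12.0))               # over the 10% limit
print(vcpu_check(vcpus=2, rdy_pct=3.0, cstp_pct=0.5))  # co-stop is non-zero
```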

burdweiser
Enthusiast

I see the %RDY time go from 1% to about 3% when I add a second CPU. Granted, this is not too bad, but we only have about 15 VMs on the host servers. We will be adding many more as time goes on.

RParker
Immortal

That's the only thing that matters.

Hallelujah! And the TRUTH shall set you free! Bingo. That's it, the ONLY thing that matters is, performance of the APPS / OS.

RParker
Immortal

First off, what does a supervisor or CIO have to do with YOU managing the system? If they left you to do it, then they should LET you do it.

I hate thumb managers.

that I am "scaling back resources" on P2V'ed servers

By scaling, do you mean resource pools, or the slider for EACH VM? Still not a problem IF you can show they don't need it....

Of course once the user community finds this out, they scream and yell that the server is slow.

I got news for you, they do this ANYWAY. You could give them ALL they ask for plus some more and they STILL won't be happy. This isn't news... this is reality. People don't know WHAT they actually need, they only see that SOMETHING isn't right. Maybe the app just sucks. How about that?

Then I get the CIO of the company saying in meetings "just give it more RAM and CPU"

Your ready times prove otherwise: high ready times with 2 vCPUs mean 1 vCPU is needed, not 2.

There are no performance bottlenecks that I can see

There you go. Did you run the VM consolidation tool on this puppy before you did the conversion? It may well mean that this machine is not a good candidate for virtualization. Despite ALL the hype from MS, VMware, Xen et al., SOME things cannot be virtualized... wow, what a concept!

I've just got this supervisor now who found this processor queue length "problem"

Let me guess, he is a 'Linux' guru of sorts, right? Tell him ESX is NOT Linux, and if he wants to learn how it works, have him pick up a book about ESX, because apparently he has no idea what he is talking about.

Spotlight on Windows app as ammo against my methods and I have to explain why all of the 1CPU servers behave this way.

Wow, my only problem here is I wish this guy were MY supervisor; I would turn the spotlight on him. You got it right: ALL 1-CPU servers behave this way, enough said.... I think you did everything correctly. Just as a test, do you have a hypervisor? Do a benchmark and show them you have the BEST solution (including your resource restrictions), and most of all don't let ANYONE tell you how to do your job.

Either they hired the right man (YOU) for the job or they didn't. It's that simple. If they want to quarterback from their desk, then they will fall on their face, because they are grasping at straws and weighing in on technology they clearly have ZERO knowledge of.

If only I was in close proximity we could put a muzzle on this guy....

Do you put that VCB logo on your internal emails? Maybe you should JUST to remind them of who has the training on this..... So they want to ADD CPU / Memory so does that mean that 12K they spent on your training is for nothing?

RParker
Immortal

But if I add a second CPU and I see the %RDY time go up in esxtop, that is a sign that the VM is going to have poor performance right?

Maybe, maybe not. As a rule yes, but not always. There may be times when it WILL improve performance. So it will vary with mileage.
