Solved: Intel turbo boost not giving expected performance ...

Jonas_B · ‎10-25-2014

Hi,

We have a couple of Dell R730 hosts with 2x E5-2699 CPUs in them. Base frequency is 2.3 GHz and max turbo is 3.6 GHz according to the specs.

Intel Turbo Boost is enabled and we are currently running 1 VM with 4 vCPUs on these hosts; the VM is configured with latency sensitivity set to High. When the cores are maxing out on the VM we are seeing a max usage of 11250 MHz, which is 2812.5 MHz per core, or about 2.8 GHz. I was expecting to see 3.6 GHz, since the host is only running 1 VM with 4 cores and the host has 18 cores (per CPU).
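For reference, the per-core figure above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the variable and function names are made up for illustration, and it assumes the 11250 MHz reading is the sum across all 4 vCPUs.

```python
# Figures quoted in the post; names are illustrative, not from any tool.
TOTAL_MHZ = 11250      # peak usage reported for the VM (assumed to be summed over vCPUs)
VCPUS = 4              # vCPUs assigned to the VM
BASE_GHZ = 2.3         # base frequency from the spec sheet
MAX_TURBO_GHZ = 3.6    # max turbo frequency from the spec sheet


def per_core_ghz(total_mhz: float, vcpus: int) -> float:
    """Split a summed MHz reading evenly over the vCPUs and convert to GHz."""
    return total_mhz / vcpus / 1000


observed = per_core_ghz(TOTAL_MHZ, VCPUS)
print(f"observed per core:  {observed:.4f} GHz")
print(f"above base:         {observed - BASE_GHZ:.4f} GHz")
print(f"short of max turbo: {MAX_TURBO_GHZ - observed:.4f} GHz")
```

So the cores are running roughly 0.5 GHz above base, but about 0.8 GHz short of the advertised maximum.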

Has anyone had any experience with this CPU and Turbo Boost? Is there perhaps some configuration we are missing?

Regards

Jonas

JarryG · ‎10-25-2014

The E5-2699 (base frequency 2.3 GHz) does have a max turbo frequency of 3.6 GHz, but only when you are running a single intensive thread (single-core load). It has turbo multipliers 5/5/5/5/5/5/5/5/5/5/6/7/8/9/10/11/13/13. So with 4 cores loaded, the turbo multiplier you get is just 10...
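As a rough illustration, the multiplier list can be turned into a lookup from active core count to expected turbo frequency. This is a sketch under two assumptions that are not stated in this post: that the list is ordered from all 18 cores active down to a single core, and that each multiplier step is worth 100 MHz (0.1 GHz). The function name is hypothetical.

```python
BASE_GHZ = 2.3
BUS_CLOCK_GHZ = 0.1  # assumption: one multiplier step = 100 MHz

# Turbo multipliers as quoted above.
# Assumption: index 0 is "18 cores active", the last entry is "1 core active".
TURBO_BINS = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 11, 13, 13]


def turbo_ghz(active_cores: int) -> float:
    """Expected turbo frequency for a given number of loaded cores."""
    total = len(TURBO_BINS)
    if not 1 <= active_cores <= total:
        raise ValueError(f"active_cores must be between 1 and {total}")
    multiplier = TURBO_BINS[total - active_cores]
    return round(BASE_GHZ + multiplier * BUS_CLOCK_GHZ, 1)


for cores in (1, 2, 4, 8, 9, 18):
    print(f"{cores:2d} active cores -> {turbo_ghz(cores)} GHz")
```

Under those assumptions, 4 loaded cores map to multiplier 10, while 9 or more loaded cores all share the lowest bin.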

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉

Linjo · ‎10-26-2014

Why set latency sensitivity to High? That disables some optimizations and high-performance features, which usually reduces the performance of the workload.

// Linjo

Best regards, Linjo Please follow me on twitter: @viewgeek If you find this information useful, please award points for "correct" or "helpful".

Jonas_B · ‎10-26-2014

Ok,

Where did you get those turbo multipliers from? And how does one calculate the estimated turbo frequency for, e.g., a 2-core workload?

x * 13 = ??

Jonas_B · ‎10-26-2014

These are MS SQL VMs. We have done a lot of testing with both HammerDB and replays of our own workload, and setting the VMs to latency sensitivity High gave us the best performance.

We will not overcommit the hosts; every SQL VM will have "its own" cores.

JarryG · ‎10-26-2014

You can find those multipliers in the CPU datasheet. For a 2-core load you calculate the frequency as:

base_frequency_[GHz] + (multiplier * bus_clock_[GHz]) = 2.3 + (13 * 0.1) = 3.6 GHz

For a 1- or 2-core load, the multiplier is 13 (the number sequence starts with the load on all 18 cores, then 17, 16, 15, etc., down to a single core). Now in your case, the VM has 4 vCPUs. You'd expect multiplier 10 and 3.3 GHz, but nope! Multiplier 5 is used. Why? Because of a feature called "thermal load balancing": ESXi "rotates" that 4-vCPU load over all pCPUs/pCores, so that all of them are moderately loaded (which is why multiplier 5 is used in your case). Every modern OS does this (even Windows). Otherwise some portion of the CPU chip would be extremely hot while other parts stayed cold.
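To sanity-check the "multiplier 5" reading, the formula above can be run in reverse: start from the observed per-core frequency and see which multiplier, and which active core counts, it corresponds to. This is a small sketch with hypothetical helper names; it reuses the multiplier list quoted earlier in the thread and the 2812.5 MHz per-core figure from the original question.

```python
BASE_GHZ = 2.3
BUS_CLOCK_GHZ = 0.1

# Multipliers from the thread, ordered from 18 active cores down to 1.
TURBO_BINS = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 11, 13, 13]


def implied_multiplier(observed_ghz: float) -> int:
    """Nearest whole turbo multiplier that explains an observed frequency."""
    return round((observed_ghz - BASE_GHZ) / BUS_CLOCK_GHZ)


def active_cores_for(multiplier: int) -> list[int]:
    """Active core counts whose turbo bin equals the given multiplier."""
    total = len(TURBO_BINS)
    return [total - i for i, m in enumerate(TURBO_BINS) if m == multiplier]


observed_ghz = 11250 / 4 / 1000  # 2.8125 GHz per core, from the original post
mult = implied_multiplier(observed_ghz)
print(f"implied multiplier: {mult}")
print(f"matching active core counts: {sorted(active_cores_for(mult))}")
```

Going by the table alone, the observed frequency matches the bin shared by 9 to 18 active cores, not the 4-core bin.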

If you had Windows installed directly on the physical machine, you could prevent this using "CPU affinity". You can probably do it in ESXi too, but I DO NOT recommend it unless you are ready to burn/wreck your pCPU. If only 2 of 18 pCores (on each pCPU) are at 100% and the rest are close to 0%, thermal stress on the boundary between loaded and non-loaded cores is extremely high. This could over-stress the pCPU to the point of physical damage (even with very good cooling) if you run it for a long time...


Jonas_B · ‎10-26-2014

Ok!

Thanks a lot, JarryG, for clearing this up for me.

Intel turbo boost not giving expected performance - vSphere 5.5 - Intel E5-2699 CPU