Solved: Re: slow performance with CentOS 5.5 guest and multiple vCPUs

ajskalski · ‎07-09-2010

We installed a CentOS 5.5 guest running a PostgreSQL database server on vSphere 4.

This guest "verik" (2 vCPUs, 16 GB RAM, PAE kernel) is the only guest running on the vSphere host (2 x 6-core AMD Opteron 2439SE, 128 GB RAM).

We are seeing very poor performance (for example when running a pg_dumpall). The %CSTP seems quite high, especially compared to %RUN -- and surprising since there is no contention whatsoever for host resources.

9:38:02am up 17:54, 131 worlds; CPU load average: 0.01, 0.02, 0.01

PCPU USED(%): 11.2 0.6 0.4 0.3 0.3 0.2 0.4 2.1 0.3 0.3 2.9 0.1 AVG: 1.6

PCPU UTIL(%): 11.4 1.1 0.9 0.9 0.8 0.6 0.9 2.5 0.7 0.6 3.1 0.2 AVG: 2.0

CCPU(%): 7 us, 2 sy, 91 id, 0 wa ; cs/sec: 509

ID GID NAME            NWLD   %USED    %RUN %SYS   %WAIT  %RDY %IDLE %OVRLP %CSTP %MLMTD
1  1   idle              12 1181.67 1184.71 0.00    0.00 15.87  0.00   2.47  0.00   0.00
2  2   system             7    0.01    0.01 0.00  700.00  0.00  0.00   0.00  0.00   0.00
6  6   helper            75    0.16    0.16 0.00 7500.00  0.05  0.00   0.00  0.00   0.00
7  7   drivers            9    0.01    0.01 0.00  900.00  0.00  0.00   0.00  0.00   0.00
8  8   vmotion            4    0.00    0.00 0.00  400.00  0.00  0.00   0.00  0.00   0.00
10 10  console            2   10.59   10.83 0.02  189.27  0.00 89.19   0.35  0.00   0.00
15 15  vmkapimod          7    0.03    0.03 0.00  700.00  0.00  0.00   0.00  0.00   0.00
17 17  FT                 1    0.00    0.00 0.00  100.00  0.00  0.00   0.00  0.00   0.00
18 18  vobd.4261          8    0.00    0.00 0.00  800.00  0.00  0.00   0.00  0.00   0.00
19 19  net-cdp.4269       1    0.00    0.00 0.00  100.00  0.00  0.00   0.05  0.00   0.00
20 20  vmware-vmkauthd    1    0.00    0.00 0.00  100.00  0.00  0.00   0.00  0.00   0.00
26 26  verik              4    5.12    4.75 0.35  348.10  0.05 18.24   0.51 47.30   0.00
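To make the %CSTP versus %RUN comparison concrete, here is a minimal Python sketch that parses the "verik" row from the esxtop output above and computes the ratio. The column names are copied from the header in the post; the function names are made up for illustration and are not part of any VMware tool.

```python
# Column names copied from the esxtop header quoted in the post.
COLUMNS = ["ID", "GID", "NAME", "NWLD", "%USED", "%RUN", "%SYS", "%WAIT",
           "%RDY", "%IDLE", "%OVRLP", "%CSTP", "%MLMTD"]


def parse_esxtop_row(line):
    """Split one whitespace-separated esxtop row into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    for key in COLUMNS:
        if key != "NAME":
            row[key] = float(row[key])  # every column except NAME is numeric
    return row


def costop_to_run_ratio(row):
    """Return %CSTP divided by %RUN (hypothetical helper, not an esxtop metric)."""
    if row["%RUN"] == 0:
        return float("inf")
    return row["%CSTP"] / row["%RUN"]


# The "verik" row exactly as posted.
verik = parse_esxtop_row("26 26 verik 4 5.12 4.75 0.35 348.10 0.05 18.24 0.51 47.30 0.00")
ratio = costop_to_run_ratio(verik)
print(f"{verik['NAME']}: %CSTP is {ratio:.1f}x %RUN")
```

For the posted sample, co-stop time is roughly ten times the run time, which is the imbalance the post is describing.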

We have tried a variety of 2.6.18 PAE kernels (including attempts with CONFIG_HZ 100 and with the divider=10 option). Performance was more or less consistently poor.

Host BIOS Settings are:

HyperTransport Technology: HT 3

HT Assist: Enabled

Virtualization Technology: Enabled

DRAM Prefetcher: Enabled

Hardware Prefetch Training on Software Prefetch: Enabled

Hardware Prefetcher: Enabled

Demand-Based Power Management: Disabled

We have also tried running this guest with 1 vCPU (2.6.18-194.8.1.el5PAE):

The first time run after a reboot:

time pg_dumpall>/dev/null

real 8m1.764s

user 0m12.649s

sys 0m3.133s
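As a rough worked example, the Python sketch below converts the three `time` figures above to seconds and shows how little of the wall-clock time was CPU time charged to the command itself. One caveat: `time` only accounts for pg_dumpall and the child processes it waits for, so work done inside the PostgreSQL server processes is not included in "user" or "sys". The helper name is made up for illustration.

```python
import re


def parse_time_field(value):
    """Convert a shell `time` value such as '8m1.764s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Figures copied from the post.
real = parse_time_field("8m1.764s")
user = parse_time_field("0m12.649s")
sys_time = parse_time_field("0m3.133s")

# Share of wall-clock time that was CPU time charged to the command.
cpu_fraction = (user + sys_time) / real
print(f"CPU time: {user + sys_time:.3f}s of {real:.3f}s wall clock ({cpu_fraction:.1%})")
```

The command itself accounts for only a few percent of the elapsed time, so most of the eight minutes went to waiting on something else (the database backends, I/O, or the hypervisor).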

While this is running, the system is not responsive (e.g., a "top" that should update every second may update every 7 seconds). The guest is spending a lot of time in "system"; doing what, we're not sure.

Cpu(s): 17.3%us, 81.9%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.8%si, 0.0%st

The same dump takes 7 minutes on a multi-core physical server (4 x 2.8 GHz Xeon).

It takes 17 minutes at best (often much longer) on "verik" with 2 vCPUs.

We'd really like to migrate our physical postgres server into VMware, but we cannot justify doing so unless it can perform at least as well as on physical hardware. (We have tried storing the database on local disk, NFS mount, and host datastore; similar performance in all cases. We have also verified adequate bandwidth on our links to storage and front-end networks.)

We have reviewed many postings on running guests with more than 1 vCPU and think we understand how vSphere CPU co-scheduling works. But in our scenario, with no contention for CPU resources, we're not sure why we're having performance issues.

Any suggestions?

Thanks!

Craig

toha · ‎07-09-2010

First off, why a 32-bit OS for a VM with 16 GB of RAM?

My first and very strong suggestion is to go for the 64-bit version of CentOS; PAE is known to have poor performance. We run several 64-bit CentOS VMs and I can't say that we are having any unexplained performance problems.
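A quick way to confirm which kind of kernel a guest is actually running is to look at the `uname -m` and `uname -r` strings. The Python sketch below assumes the CentOS/RHEL 5 convention that PAE kernel releases end in "PAE", as in the 2.6.18-194.8.1.el5PAE release quoted earlier in the thread; the function name is made up for illustration.

```python
import platform


def classify_kernel(machine, release):
    """Classify a Linux kernel from `uname -m` and `uname -r` style strings."""
    if machine in ("x86_64", "amd64"):
        return "64-bit"
    # Assumption: PAE kernels carry a "PAE" suffix in the release string,
    # as in the CentOS 5 release quoted in this thread.
    if release.endswith("PAE"):
        return "32-bit PAE"
    return "32-bit or other"


# The kernel release mentioned in the original post.
print(classify_kernel("i686", "2.6.18-194.8.1.el5PAE"))

# The machine this script is running on.
print(classify_kernel(platform.machine(), platform.release()))
```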

You could try to tweak the existing 32-bit OS a bit, but I do not expect a big improvement. Keep divider=10 in your kernel parameters, and if you like, try again with a single vCPU and add the "nosmp noapic nolapic" kernel parameters. Those can drop CPU ready time (%RDY) quite a bit; at least I have seen that happen on a busy ESX host. Also reserve some huge memory pages in the Linux kernel and configure PostgreSQL to use them.
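For the huge pages part, the number of pages to reserve (the vm.nr_hugepages sysctl) follows from the size of the shared memory area. The Python sketch below is only an illustration of that arithmetic: the 2 MiB page size, the 5% headroom and the 4 GiB shared memory figure are all assumptions, not values from this thread. Check Hugepagesize in /proc/meminfo for the real page size, and note that how PostgreSQL is made to use huge pages depends on its version.

```python
import math

# Assumption: 2 MiB huge pages; check Hugepagesize in /proc/meminfo on the guest.
HUGE_PAGE_BYTES = 2 * 1024 * 1024


def huge_pages_needed(shared_memory_bytes, headroom=0.05):
    """Return how many huge pages cover the shared memory plus some headroom."""
    total = shared_memory_bytes * (1 + headroom)
    return math.ceil(total / HUGE_PAGE_BYTES)


# Hypothetical 4 GiB shared memory area, not a value from the thread.
shared_memory = 4 * 1024**3
pages = huge_pages_needed(shared_memory)
print(f"vm.nr_hugepages = {pages}")
```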

Your BIOS settings also seem to be non-optimal. The general recommendation is to disable CPU prefetching features on ESX hosts: when a CPU is executing multiple processes (VMs), prefetching results in a high number of misses, which is just a waste of CPU cycles.

Hardware Prefetch Training on Software Prefetch: Disabled

Hardware Prefetcher: Disabled

But still, go for 64-bit OS.

RParker · ‎07-09-2010

Any suggestions?

Yes, you can't compare physical with virtual.

Try reducing the number of vCPUs in that VM to 1. Just TRY it.

mike_sims · ‎07-09-2010

One thing to keep in mind when looking at multiple vCPUs is whether the app is truly multi-threaded or not. I assume Postgres is (my DB ignorance showing there, sorry). If so, that's likely not an issue.

Also, is this PAE kernel capable of SMP? I know that's like asking "is it plugged in", but gotta ask. I see that a lot when folks around here have issues with CPU, believe it or not.

Geez - I stand to learn a bit about Postgres and PAE from all of this...

ajskalski · ‎07-12-2010

Thanks for your reply. Your suggestions helped a lot.

We installed a fresh CentOS 5.5 64-bit guest (1 CPU) (and made the two BIOS changes you recommended), and a pg_dump that previously took hours finished in just over a minute. That's the kind of performance we were expecting. I would not have guessed a 32-bit PAE CentOS 5.5 would have performed so much worse...

We are not using the "divider=10" and "nosmp noapic nolapic" kernel boot parameters. I wasn't sure from your comments if you used these with just the 32-bit kernel or if you use them with your 64-bit kernel also...? According to the VMware software compatibility notes, no kernel options should be needed at all with CentOS 5.4 (and I would imagine 5.5).

Thanks again for your feedback!

thakala · ‎07-13-2010

(changed my Communities username)

Well, the VMware documentation on Linux kernel parameters is about timekeeping best practices, and for that purpose the documentation is correct: no additional parameters are required to achieve an accurate clock. The kernel parameters "nosmp noapic nolapic" also work with uniprocessor 64-bit Linux VMs, and setting them will improve performance to some degree. Just remember to remove them once you see a need to go to SMP.

Tomi http://v-reality.info
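As a reminder of that last point, a small Python sketch like the one below could flag the uniprocessor-only parameters before vCPUs are added to a guest. The parameter set is the one named in this thread; the example command line is hypothetical, and on a live guest the string would come from /proc/cmdline.

```python
# Parameters this thread says to remove before moving a guest to SMP.
UNIPROCESSOR_ONLY = {"nosmp", "noapic", "nolapic"}


def uniprocessor_params(cmdline):
    """Return the uniprocessor-only parameters present in a kernel command line."""
    return sorted(UNIPROCESSOR_ONLY.intersection(cmdline.split()))


# Hypothetical command line for illustration; on a live guest read /proc/cmdline.
example = "ro root=/dev/VolGroup00/LogVol00 divider=10 nosmp noapic nolapic"
found = uniprocessor_params(example)
if found:
    print("Remove before adding vCPUs:", " ".join(found))
```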
