Hi,
I have a very unusual problem. Has anyone seen this?
Adding more RAM to a 2008R2 vSphere VM server caused a performance decrease
Environment
In our case the servers were deployed from 2008R2 Ent templates
Template 1 – 4CPU, 8GB RAM, 50GB OS Disk, single NIC
Template 2 – 2CPU, 4GB RAM, 50GB OS disk, single NIC
VM Hardware 7
vSphere 5.0 U1
The vSphere environment is not overly stressed; there are free host CPU, RAM, network and SAN disk resources for the VMs to use
In house .NET 3 tier application
There are presentation (TEST01 and TEST02), middleware and DB servers
The presentation servers query a database via a middleware server and return a list (note: this application is running in 3 separate locations: AWS, private vSphere and another vSphere public cloud)
Both TEST servers use the same middleware and DB server
Each application tier resides in a separate subnet
Network latency is 1ms
No firewall rules between application tiers
Same DNS servers
Same route tables
Same trace routes to middleware servers
Initial Tests
The slowdown is across a greenfield virtual Citrix farm (scripted install)
To rule out multiple software, environment, and hardware changes within the CTX deployment process, we deployed 2 new servers and executed only the very first change of several: adding more RAM
TEST01 Deployed from Template1 @ 4 CPU, 8 GB RAM and 50GB HDD
Initial testing showed that the DB query returned results in 1 minute
TEST02 Deployed from Template2 @ 2 CPU and 4 GB RAM
Initial testing showed that the DB query returned results in 1 minute and 30 seconds
Adding more RAM
TEST01 was changed to 16GB RAM - DB query returned results in 1 minute and 10 seconds
TEST02 was first changed to 8GB of RAM - DB query returned results in 2 minutes and 30 seconds
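To put the RAM-change numbers above in relative terms, here is a quick sketch in Python (my choice of language, nothing in the environment depends on it). The timings are the ones quoted above converted to seconds; the function and variable names are just for illustration.

```python
def slowdown_percent(before_s: float, after_s: float) -> float:
    """Percentage increase in query time relative to the baseline."""
    return (after_s - before_s) / before_s * 100.0


# Query times in seconds, before and after adding RAM (from the tests above).
results = {
    "TEST01": (60, 70),    # 8 GB -> 16 GB
    "TEST02": (90, 150),   # 4 GB -> 8 GB
}

for name, (before_s, after_s) in results.items():
    pct = slowdown_percent(before_s, after_s)
    print(f"{name}: {before_s}s -> {after_s}s ({pct:.1f}% slower)")
```

So the 2 vCPU server took a much bigger relative hit than the 4 vCPU one.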
Resolution
We have come across this unusual way to resolve the problem (an OPS guy overheard the discussion and mentioned it had been experienced in the private vSphere environment, which the cloud team weren’t aware of)
Shut Down the Server
Change the Server from a multi vCPU server to a single vCPU server.
Boot and accept hardware changes
Shut down and change back to a multi CPU server
NOTE: RAM values are not changed
Actions and results
TEST01 server - was 4CPU, changed to 1CPU and then back to 4CPU, with reboots in between - Tested and DB query returned results in 1 minute, same as original results
TEST02 server - was 2CPU, changed to 1CPU and then back to 2CPU, with reboots in between - Tested and DB query returned results in 1 minute and 30 seconds, same as original results
Thanks
Mike
Interesting... but I wonder what would happen with a physical Windows box. Perhaps Windows is having issues with the changing memory, and changing to/from single CPU to multi CPU causes Windows to rescan something?
Based on those numbers, I wouldn't expect much of a difference, but try running the same tests with a single vCPU. Sometimes a single vCPU can actually be faster than multiple vCPUs due to the way ESX handles CPU scheduling.
Interesting indeed...
While the query is running, I would check ESXTOP on the ESXi host for CPU Ready values, and also the memory stats page for balloon driver inflation or memory being swapped to disk, for starters. I would also check the DB server performance counters and compare the same query results from an existing server to see whether they are comparable.
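A side note for anyone reading CPU Ready from the vSphere Client performance charts instead of ESXTOP: the charts report it as a summation in milliseconds rather than a percentage. As far as I know, the usual conversion divides the summation by the chart's sample interval (20 seconds for the realtime chart). A rough sketch in Python under that assumption; the function name and the sample value are made up for illustration, not measured from this environment.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU Ready summation (ms) to a percentage of the sample interval.

    interval_s defaults to 20 s, assumed here to be the realtime chart interval.
    """
    return ready_ms / (interval_s * 1000.0) * 100.0


# Hypothetical reading: 1000 ms of ready time within one 20 s realtime sample.
print(f"{cpu_ready_percent(1000):.1f}% ready")
```

Comparing that figure before and after the RAM change (and again after the 1 vCPU round trip) would show whether scheduling is involved at all.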