We have an SR open with support, but I thought a post to the community might also turn something up. A customer is currently running their LDAP server in a Red Hat Cluster setup. They are looking to migrate to our ESX environment and are seeing very poor write throughput in initial testing. Here are some numbers from the current environment:
Avg r=1004.60/thr (502.30/sec), total= 5023
Avg r= 677.00/thr (338.50/sec), total= 3385
Avg r= 595.00/thr (297.50/sec), total= 2975
Avg r= 726.60/thr (363.30/sec), total= 3633
Avg r= 954.80/thr (477.40/sec), total= 4774
Avg r= 888.00/thr (444.00/sec), total= 4440
Avg r= 887.20/thr (443.60/sec), total= 4436
and what the same test shows on their VM:
Avg r= 40.20/thr ( 20.10/sec), total= 201
Avg r= 19.60/thr ( 9.80/sec), total= 98
Avg r= 13.00/thr ( 6.50/sec), total= 65
Avg r= 14.00/thr ( 7.00/sec), total= 70
Avg r= 11.00/thr ( 5.50/sec), total= 55
Avg r= 33.60/thr ( 16.80/sec), total= 168
Avg r= 12.40/thr ( 6.20/sec), total= 62
Avg r= 58.00/thr ( 29.00/sec), total= 290
Avg r= 17.20/thr ( 8.60/sec), total= 86
The fibre-attached storage comes from the same tier on the same SAN, so there are no differences there. I have tried a normal VMFS datastore .vmdk, a virtual RDM, and a physical RDM; none shows any significant improvement over the others. Our SAN administrator tells me there is very little IOPS load on the LUN and parity group at the time of testing, and the switch ports show little activity, so I believe the bottleneck is upstream of the array, somewhere on the host or in the virtualization layer.
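For reference, this is roughly how the two RDM flavours were mapped; the device path and datastore layout below are illustrative, and the same mapping can be created from the VI Client's Add Hardware wizard:

# virtual compatibility RDM
vmkfstools -r /vmfs/devices/disks/vmhba1:0:12:0 /vmfs/volumes/ldap_ds/ldapvm/ldap-rdm.vmdk
# physical compatibility RDM
vmkfstools -z /vmfs/devices/disks/vmhba1:0:12:0 /vmfs/volumes/ldap_ds/ldapvm/ldap-rdmp.vmdk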
Read/search performance is fine on the VM; in fact, it is better than on the existing environment:
Avg r=10414.40/thr (5207.20/sec), total= 52072
Avg r=10835.60/thr (5417.80/sec), total= 54178
Avg r=10909.00/thr (5454.50/sec), total= 54545
Interestingly, running the suggested command "time dd if=/dev/zero of=test count=500000" appears to show better write throughput on the VM:
On the VM -
time dd if=/dev/zero of=test count=500000
500000+0 records in
500000+0 records out
256000000 bytes (256 MB) copied, 2.21985 seconds, 115 MB/s
real 0m2.258s
user 0m0.164s
sys 0m2.047s
On the existing environment -
time dd if=/dev/zero of=test count=500000
500000+0 records in
500000+0 records out
256000000 bytes (256 MB) copied, 7.06046 seconds, 36.3 MB/s
real 0m7.098s
user 0m0.150s
sys 0m6.458s
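One caveat with the dd numbers: without any direct/sync flags, dd is mostly writing into the guest page cache rather than to disk, so it may not reflect the synchronous writes the directory server issues on modifies. If it would help, I can rerun the test with the cache bypassed, along these lines (GNU dd flags; exact support may vary with the coreutils version shipped in RHEL 5):

# write 256 MB with O_DIRECT, bypassing the page cache
time dd if=/dev/zero of=test bs=1M count=256 oflag=direct
# write 256 MB with O_DSYNC, forcing each write to stable storage
time dd if=/dev/zero of=test bs=1M count=256 oflag=dsync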
Has anyone seen something like this before?
Our environment:
ESXi 3.5 U4
VirtualCenter 2.5 U4
Dell PowerEdge M600
8 CPUs (Intel Xeon E5440 @ 2.83GHz)
32GB RAM
ISP2432-based 4Gb Fibre Channel to PCI Express HBA
Broadcom NetXtreme II BCM5708 1000Base-SX
Sun 9990 SAN
Dual-ported fibre channel disks in a RAID5 (7+1) configuration
The customer's VM:
RHEL 5 (64-bit)
VMware Tools installed
4 vCPU
~5.5GB vRAM
LDAP installation:
Sun-Java(tm)-System-Directory/6.3.1 B2008.1121.0522 (32-bit)
Database Cache: 500M
Entry Cache: 1.5G
Max File Descriptors: 8192
Max # Threads: 1000
Maximum # of Persistent Searches: 30
All_IDs Index Threshold: 10,000
# of Entries Currently in the Directory: 1.2 million
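If it helps, I can also capture guest-side I/O statistics while the modify test runs, to see whether the virtual disk is actually being driven hard or the individual writes are simply slow (sysstat and procps ship with RHEL 5):

# extended per-device statistics, refreshed every 5 seconds during the test
iostat -xk 5
# run queue, memory, and swap overview
vmstat 5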
Have you tried lowering the number of vCPUs? Is four a requirement? I have heard that there can be virtualization overhead causing performance degradation if the application is single-threaded and the virtual machine is assigned multiple vCPUs.
Yes, that was the first thing VMware support had us try. They have run the test with 1, 2, and 4 vCPUs, and the results actually got worse with fewer vCPUs.
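For what it's worth, if SMP co-scheduling overhead were the problem I would expect to see high CPU ready (%RDY) values for the VM in esxtop's CPU view while the test runs. A batch capture from the Remote CLI (or esxtop from the host's tech support mode shell) would look roughly like this, with host name and sample counts purely illustrative:

resxtop --server esxhost01 -b -d 5 -n 60 > ldapvm-esxtop.csv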
Could it be the JVM heap or runtime settings, such as is described here? It looks like you have set the DB cache size.
...
For example, if you were importing 200K entries, you might specify 2 Gbytes for the JVM heap size, then allocate at least 1 Gbyte for the directory server runtime environment and the rest for the DB cache.