Dear VMware community,
I would like some feedback on a VMware local vs. shared disk performance experience. This may turn out not to be a problem at all, but rather an unusual performance increase. My environment is the following:
Dell Power Edge 2900 Server (Quad Core, 16 GB Mem, 10 x 300GB 15k SAS with Perc 6i Controller (LSI) 256MB Cache) (Qlogic 2460 4GB)
Local Storage Disk (Disk 0,1 - Raid 1) (Disk 2,3,4,5 - Raid5 - A) (Disk 6,7,8,9 - Raid5 - B)
SAN (CX-10) - (Disk 0,1 - Raid 1) (Disk 2,3,4,5 - Raid5 - A) (Disk 6,7,8,9 - Raid5 - B) on 4GB 15k 3.5" Fibre Channel disks - Second Tray FC (4GB Interface)
SAN (SAN Melody Box Ver 3.0/Win 2k8 64Bit) - (Disk 0,1 - Raid 1) (Disk 2,3,4,5 - Raid5 - A) (Disk 6,7,8,9 - Raid5 - B) on SAS 2.5" Drives 10k - Second Tray FC (4GB Interface)
ESX 3.5 patched to 153875 (update 4)
Created a Windows 2003 SP2 virtual machine (1 x CPU and 2GB mem) with a C drive on Raid 5A and a D drive on Raid 5B. Used eagerzeroedthick to zero out the vmdk files so test results would be correct. Cloned the server to local storage.
Test: Copy our SQL backup files (15GB each) via Windows Explorer from the C drive to the D drive and observe speed / time. No other VMs are present on the ESX server, the local/shared storage or the FC fabric.
Result: Local storage copies a file in 2 and a half minutes at 100-120 MB/s, while the SAN (EMC CX-310 or San Melody) copies at 65 MB/s in roughly 5 minutes. MB/s figures are from esxtop.
You're probably saying right now (like the other engineers/technicians who have seen this issue) "that's not possible, local storage is slower than shared storage. You don't know what you're doing". Maybe I don't, but no technician has been able to increase the performance of the shared storage. Below are my many attempts to explore this performance issue further.
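As a rough sanity check on these figures, the copy time and the reported rate can be converted into each other. This is only a sketch: it assumes one 15GB file per copy and decimal units (1 GB = 1000 MB), and the function names are made up for illustration. esxtop may also account for reads and writes differently than a stopwatch does, so the two views need not match exactly.

```python
def throughput_mb_s(size_gb: float, seconds: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 GB = 1000 MB)."""
    return size_gb * 1000 / seconds


def copy_time_s(size_gb: float, rate_mb_s: float) -> float:
    """Seconds needed to move size_gb at a steady rate_mb_s."""
    return size_gb * 1000 / rate_mb_s


if __name__ == "__main__":
    # Local storage: 15 GB in 2.5 minutes (150 s)
    print("local :", throughput_mb_s(15, 150), "MB/s")
    # Shared storage: 15 GB in roughly 5 minutes (300 s)
    print("shared:", throughput_mb_s(15, 300), "MB/s")
    # Time a 15 GB copy would take at a steady 65 MB/s
    print("at 65 MB/s:", round(copy_time_s(15, 65)), "s")
```

By this arithmetic a 5 minute copy averages about 50 MB/s, so the 65 MB/s seen in esxtop is presumably a peak or typical rate rather than the average over the whole copy.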
Exploration of Performance Issues
Changed queue depth / execution throttle settings to 8, 32, 64, 100, 128 (also changed Disk.SchedNumReqOutstanding even though there are no competing VMs) - no difference to the 5 minute file copy result.
Tried both the EMC CX-10 and the San Melody shared storage box, with both giving similar 5 minute results (the San Melody box indicated that its 12GB cache is not being pushed).
Ran IOmeter (max throughput, 100% write) against the Raid 5B LUN on both local and shared storage. Local storage: 520MB/s write, 8000 Cmds. San Melody: 220MB/s write, 4000 Cmds (San Melody was outperforming the CX-10). Results from esxtop.
Transferred a 15GB backup file to the San Melody server itself, copied it from the C drive to the D drive there and observed the time taken - 2 and a half minutes. This is the same as the virtual machine local storage test (assumption - the San Melody disks can perform as well as ESX local storage).
Added disks to both SANs (Raid 5 with 6 disks), but performance did not increase.
Upgraded the MS Storport driver and LSI SCSI driver on the virtual machine (Windows 2003), with no performance increase.
Set shares to high on the VM's shared storage disk, even though no other VMs are competing for resources - no difference.
* Performance increase for both VMs, on local and shared storage - Created a Windows 2008 VM with the same disk configuration. Local storage performance increased from 100-120 MB/s to 160 MB/s. Shared storage increased from 65 MB/s to 80 MB/s.
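For the IOmeter run above, dividing throughput by the command rate gives the average I/O size each path was handling. This is a sketch only: it assumes the "Cmds" figure is commands per second as shown by esxtop and uses decimal units, and the function name is hypothetical.

```python
def avg_io_size_kb(mb_per_s: float, cmds_per_s: float) -> float:
    """Average I/O size in KB, assuming cmds_per_s is commands per second."""
    return mb_per_s * 1000 / cmds_per_s


if __name__ == "__main__":
    # IOmeter 100% write figures quoted in the list above
    print("local     :", avg_io_size_kb(520, 8000), "KB per command")
    print("San Melody:", avg_io_size_kb(220, 4000), "KB per command")
```

Under those assumptions both paths are moving I/Os of a similar size, so the gap in MB/s would come mostly from how many commands per second get through rather than from smaller I/Os on the SAN path.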
Still to explore
LUN alignment - not done yet, will do on the Windows 2008 / 2003 servers soon.
I will soon install Windows 2008 directly on a Power Edge 1950 (no VMware), hook it up to the same LUNs, copy the 15GB file and monitor the result. Then I will insert an ESX 3i USB key in the same PE 1950 and conduct the same VMware tests as above. I will even try ESX 4 from USB and see if there are any differences. This test should isolate the performance bottleneck to either storage or VMware.
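For the alignment check, the idea is simply to test whether a partition's starting offset is a multiple of the array's stripe element size. A minimal sketch follows; the 64 KB boundary is an assumption for illustration (check the real value for each array), and the function name is made up. Windows 2003 by default starts the first partition at sector 63, while Windows 2008 defaults to a 1 MB offset. On Windows the starting offset can usually be read with `wmic partition get Name, StartingOffset`.

```python
SECTOR_BYTES = 512


def is_aligned(offset_bytes: int, boundary_bytes: int = 64 * 1024) -> bool:
    """True if the partition start falls exactly on the given boundary.

    boundary_bytes defaults to 64 KB, which is an assumed stripe element
    size for illustration, not a value taken from either SAN.
    """
    return offset_bytes % boundary_bytes == 0


if __name__ == "__main__":
    win2003_default = 63 * SECTOR_BYTES   # 32256 bytes
    win2008_default = 1024 * 1024         # 1 MB
    print("Windows 2003 default offset aligned:", is_aligned(win2003_default))
    print("Windows 2008 default offset aligned:", is_aligned(win2008_default))
```

If the Windows 2003 partitions turn out to be at the default 63 sector offset, that may be part of why the Windows 2008 VM was faster, but that remains to be tested.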
My current opinion on this exploration so far:
First up, you may say the shared storage results are acceptable for four disks working in a Raid 5 pack and there is no problem (please tell me if that is your conclusion). Just keep in mind that the virtual guest OS sees an LSI controller and it is talking to a real LSI (PERC) controller, so there may be a natural performance boost because both devices are similar (it could be the 1% explanation for this issue). Also, because the local disk resource is not being shared with other ESX servers, some of the shared storage processes/protocols are not used, which makes local storage very efficient, hence a performance increase.
Secondly, the virtual machine on shared storage is not pushing either SAN (San Melody or EMC) hard enough when copying the file with Windows Explorer. The disk queues are not being pushed (monitored from esxtop) because the data is not being pumped hard enough from the VM to the queues, unlike local storage, where the queue goes to 128 under esxtop. Changing the OS did have some impact on performance, as opposed to all the other ESX changes, which did nothing (an unexpected performance increase for all to digest). Adding disks to shared storage did not improve performance, because the work is not arriving quickly enough at either SAN from Windows Explorer. Could this problem be caused by Windows not being aware of what hardware is actually under the hood, and that having some performance impact on the server?
I am happy to try any further testing / recommendations. If you have local storage, try this test and tell me your result.
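The "not pushing the queues" theory can be put into numbers with Little's law: sustained throughput is roughly the number of outstanding I/Os times the I/O size, divided by the per-I/O latency. The sketch below uses made-up values (64 KB I/Os, 1 ms and 0.5 ms latency) purely for illustration; none of them were measured in these tests, and the function name is hypothetical.

```python
def max_throughput_mb_s(outstanding_ios: int, io_size_kb: float, latency_ms: float) -> float:
    """Little's law estimate: throughput = concurrency * I/O size / latency."""
    ios_per_s = outstanding_ios * 1000 / latency_ms
    return ios_per_s * io_size_kb / 1000  # KB/s to MB/s, decimal units


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the relationship
    for outstanding in (1, 2, 8, 32):
        mb_s = max_throughput_mb_s(outstanding, io_size_kb=64, latency_ms=1.0)
        print(f"{outstanding:>2} outstanding I/Os at 1 ms: {mb_s:.0f} MB/s")
    # Same single outstanding I/O, but half the latency
    print("1 outstanding I/O at 0.5 ms:", max_throughput_mb_s(1, 64, 0.5), "MB/s")
```

If Windows Explorer keeps only a few I/Os in flight, a path with higher round-trip latency (FC fabric plus SAN controller) would top out at a lower MB/s than local disks no matter how many spindles are behind it, which would fit with extra disks and larger queue depth limits making no difference.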
Cheers,
Jason
Tags: performance, local, storage, shared, disk