The application, Gupta SQL 9.0.1, is now running in a VM. The datastores are on a NetApp 6070 (RAID-DP, 41 FC disks of 300 GB each).
The VM uses 5 LUNs: C: Windows 2003, D: Data, E: Gupta SQL, F: Gupta SQL logs, G: Pagefile.
Before, it ran on an HP DL360 G3 (Smart Array 5i RAID controller, 64 MB cache) with 6 x 36 GB disks and 3 logical volumes on RAID 5.
On the HP DL360 G3 the database update statistics run takes 1 hour 3 minutes.
The VM on the NetApp needs 2 hours 14 minutes.
I analyzed the Gupta SQL application with perfmon: the average transfer size is 64 KB for reads and writes on the SQL volume (E:) and 4 KB for reads and writes on the SQL log.
Performance tweaks:
1.) DataCore UpTempo: only 5 minutes faster. I think the DB, which is 30 GB, is too big.
2.) Partition alignment for the E: and F: partitions: 1 hour 45 minutes. Improvement: 29 minutes.
3.) OS filesystem block size (cluster size) for E: changed to 64 KB (default 4 KB): no improvement
4.) OS filesystem block size for F: changed to 32 KB (default 4 KB): no improvement
5.) Separate LUN for the pagefile (G:): no improvement
6.) VMware VC: changed queue depth to 128: no improvement
Arrgh!!!! What the hell is the problem? Is it the cache for random I/O?
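To put the run times quoted above side by side, here is a minimal sketch that converts them to minutes and works out the slowdown relative to the old physical box. The helper and variable names are made up for illustration; only the three timings come from the post.

```python
def minutes(hours, mins):
    """Convert an 'H hours M minutes' run time to minutes."""
    return hours * 60 + mins

physical = minutes(1, 3)     # HP DL360 G3, update statistics
vm_before = minutes(2, 14)   # VM on the NetApp, before any tweak
vm_aligned = minutes(1, 45)  # VM after partition alignment

saved = vm_before - vm_aligned            # minutes gained by alignment
slowdown_before = vm_before / physical    # VM vs. physical, before
slowdown_after = vm_aligned / physical    # VM vs. physical, after

print(f"alignment saved {saved} min")
print(f"VM vs. physical: {slowdown_before:.2f}x before, {slowdown_after:.2f}x after")
```

So even after alignment the VM still needs well over one and a half times as long as the old server.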
Two other quick things:
General "rule of thumb" is not to increase queue depth above 64 on ESX (not that it is particularly relevant here, since you had performance problems at the defaults, but in a shared SAN environment there is always a concern about overrunning the SAN storage controller).
Have another look at the "storage analysis" section, and particularly at using ESXTOP to look at %USD (amount of queue depth being used), DAVG/cmd (latency in the SAN), KAVG/cmd (latency in the kernel) and ABRTS/s (the number of commands being aborted, basically timeouts).
You can capture esxtop data to a CSV file and import it into perfmon to make it more readable; http://communities.vmware.com/docs/DOC-5100 goes through how to get it into perfmon. Looking at things in perfmon with all the features available to you (esp. with Vista/Windows 2008 perfmon) is much better than trying to figure anything out from the esxtop "table" display...
There is a huge difference between single-threaded and multithreaded applications.
With multithreaded Iometer testing (25 outstanding I/Os) on the NetApp 6070 I can get 180 MB/s for 100% random read.
But for a single-threaded application (1 outstanding I/O) it is only 7 MB/s or slower for random read.
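A rough way to see why the outstanding I/O count matters so much is Little's law: throughput is roughly outstanding I/Os times block size divided by per-I/O latency. The sketch below turns the two Iometer figures above into an implied per-I/O latency. The 64 KB block size is an assumption (the Iometer transfer size is not stated here), and the function name is made up; the ratio between the two results does not depend on the block size chosen.

```python
def implied_latency_ms(throughput_mb_s, outstanding_io, block_kb):
    """Per-I/O latency implied by Little's law: latency = outstanding / IOPS."""
    iops = throughput_mb_s * 1024 / block_kb
    return outstanding_io / iops * 1000

BLOCK_KB = 64  # assumed transfer size, not taken from the test setup

single = implied_latency_ms(7, 1, BLOCK_KB)     # 1 outstanding I/O
multi = implied_latency_ms(180, 25, BLOCK_KB)   # 25 outstanding I/Os

print(f"single-threaded: {single:.2f} ms per I/O")
print(f"multithreaded:   {multi:.2f} ms per I/O")
print(f"latency ratio:   {single / multi:.2f}")
```

Under that assumption both tests see almost the same latency per I/O. The array is not slower in the single-threaded case; there is simply only one request in flight at a time, so the latency of each request caps the throughput.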
Read the following document, pages 20 and 21.
There are two paging files: one on C: and one on G:.
We use QLogic cards (PCI-X). Look at the SNIA document for single-threaded applications.
I did the following tests, all running DataCore UpTempo:
HP DL360 with Smart Array 5i RAID controller without a BBWC (64 MB cache, default 100% read-ahead), 6 x 36 GB disks, 3 logical volumes, all RAID 5.
Iometer result: 20 MB/s for 100% random read and 1 outstanding I/O
IBM DS8000 with RAID 5 (7+1)
Iometer result: 16 MB/s for 100% random read and 1 outstanding I/O
NetApp 6070 with RAID-DP, 41 disks.
Iometer result: 7 MB/s for 100% random read and 1 outstanding I/O
I read about RAID levels 1, 10, 5 and 6 in SQL and Oracle documents, and they all say that RAID 10 is used for SQL logs and that RAID 10 should also be used for the SQL DB. A Sun document also says that RAID level 6 (RAID-DP) can only achieve 66% of RAID 5 performance.
I am also astonished by the new NetApp Performance Acceleration Module.
They say the following:
"For read intensive random I/O applications that are latency sensitive this requires the configuration of a high number disks in order to provide user and application acceptable latencies."
And if you look at the SNIA document on page 20, at the note "Single thread applications are extremely sensitive to latency ...", it points in the same direction.
For me, applications that need high random read/write I/O should be placed on solid state disks or use a lot of cache.
I am very interested in the new Intel solid state disks. Look at the iozone results. (Sorry, the text is German, but the random values speak for themselves.)
iozone (KByte/sec) | Intel SSD 80 GB SATA II | Seagate Cheetah 73.4 GB 15k rpm SAS | 2 x WD Raptor 36 GB 10k rpm RAID 0 SATA | Maxtor DiamondMax 9 Plus 160 GB 7.2k rpm SATA |
sequential write 4 KB | 75'471 | 74'730 | 47'840 | 44'720 |
sequential write 16 KB | 76'178 | 87'147 | 50'381 | 45'463 |
sequential write 32 KB | 77'325 | 94'126 | 66'401 | 44'700 |
sequential write 64 KB | 76'484 | 104'642 | 68'319 | 41'458 |
sequential write 128 KB | 77'204 | 78'664 | 84'472 | 43'977 |
sequential write 256 KB | 77'257 | 87'703 | 89'999 | 49'007 |
sequential read 4 KB | 240'626 | 61'252 | 20'062 | 39'566 |
sequential read 16 KB | 240'505 | 69'315 | 21'810 | 43'703 |
sequential read 32 KB | 239'662 | 74'521 | 32'627 | 37'019 |
sequential read 64 KB | 239'235 | 81'718 | 39'331 | 39'904 |
sequential read 128 KB | 242'603 | 36'893 | 53'627 | 40'527 |
sequential read 256 KB | 240'404 | 71'301 | 49'105 | 43'344 |
random write 4 KB | 48'194 | 4'521 | 3'005 | 1'964 |
random write 16 KB | 75'127 | 7'356 | 7'573 | 5'640 |
random write 32 KB | 79'788 | 12'493 | 13'012 | 8'853 |
random write 64 KB | 81'474 | 22'158 | 20'132 | 12'756 |
random write 128 KB | 79'147 | 33'757 | 24'931 | 17'808 |
random write 256 KB | 80'460 | 47'147 | 35'806 | 23'862 |
random read 4 KB | 30'287 | 2'617 | 1'203 | 362 |
random read 16 KB | 95'552 | 4'504 | 4'331 | 1'451 |
random read 32 KB | 149'029 | 7'291 | 7'057 | 3'094 |
random read 64 KB | 197'591 | 12'840 | 11'432 | 5'775 |
random read 128 KB | 267'814 | 24'534 | 14'976 | 10'169 |
random read 256 KB | 205'981 | 26'387 | 18'218 | 12'364 |
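To make the gap in the random rows easier to read, here is a small sketch that computes how many times faster the Intel SSD is than each of the other drives for 4 KB random reads. The numbers are copied from the "random read 4 KB" row above; the dictionary keys are shortened labels chosen for this example.

```python
# "random read 4 KB" row from the iozone table, in KByte/sec
random_read_4k = {
    "Intel SSD": 30287,
    "Seagate Cheetah 15k SAS": 2617,
    "2 x WD Raptor RAID 0": 1203,
    "Maxtor DiamondMax 7.2k": 362,
}

ssd = random_read_4k["Intel SSD"]

# Speed-up of the SSD over every other drive in the row
speedup = {
    name: ssd / value
    for name, value in random_read_4k.items()
    if name != "Intel SSD"
}

for name, factor in speedup.items():
    print(f"SSD vs. {name}: {factor:.1f}x")
```

Even against the 15k rpm SAS disk the SSD is more than ten times faster at small random reads, which is the access pattern that hurts a single-threaded database most.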
I think 3.583 means "three thousand, five hundred and eighty three".
I don't think that Spotlight, perfmon and Sysinternals Process Monitor show a false memory usage. But I am not sure.
What if only 2 GB of the 4 GB can be used?
4 GB in Windows means that 2 GB is used for user mode and 2 GB for kernel mode.
This means to me that in 32-bit Windows Standard Edition an application cannot use more than 2 GB in user mode.
Kernel mode can use user-mode memory, but user mode cannot use kernel memory!!!
The only way to use more user-mode memory is the /3GB switch in boot.ini. I do not know if this works for the Standard Edition of 32-bit Windows.
The SAN controllers typically have a bunch of cache already on them, however "more is better" holds true when it comes to cache.
Single threaded, sequential IO is definitely latency sensitive, and direct attached disk will typically outperform anything else just because of the physics (the signal has to travel less than 1 meter and go through 1 controller, rather than travelling 10s of meters and going through multiple controllers). That being said, it is extremely rare to find any multi-user application (or any application at all, for that matter) that is single threaded nowadays.
The thing that sticks out for me looking at your perfmon disk results is that reads look reasonable, but the writes are horrible. That usually points back to some kind of write cache issue (cache disabled, not being used, or the application is requesting a synchronous "write through" that is being honoured).
I don't think that perfmon and sysinternals are "wrong" per se, I just think they're looking at counters that don't correspond exactly to the ESX/VC counters.
Spotlight's estimation of "hard" page rates just isn't supported by the perfmon disk numbers. If you're hard paging 3582 pages of memory per second, then it has to be going to the page file, which is either on C: or G:. The whole G: drive is showing no IO at all, so it's not going there, and the C: drive is showing 4000 bytes/sec. The OS is certainly not paging 1 byte at a time, so the 3582 pages/sec number, as well as the "cache hit ratio" that Spotlight is coming up with, is suspect.
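The mismatch is easy to check with a little arithmetic. A minimal sketch, assuming the standard 4 KB page size of 32-bit x86 Windows (the page size is not stated in the thread); the variable names are made up:

```python
PAGE_SIZE = 4096          # bytes per page, assumed (x86 default)

reported_pages_per_s = 3582   # Spotlight's "hard" page rate
observed_bytes_per_s = 4000   # perfmon disk throughput on C:

# Disk traffic that Spotlight's page rate would require
required_bytes_per_s = reported_pages_per_s * PAGE_SIZE

# Page rate that the observed disk traffic could actually sustain
observed_pages_per_s = observed_bytes_per_s / PAGE_SIZE

print(f"required: {required_bytes_per_s / 2**20:.1f} MiB/s")
print(f"observed: {observed_pages_per_s:.2f} pages/s")
```

The reported page rate would need roughly 14 MiB/s of disk traffic, while the disk counters leave room for less than one page per second, so the two tools cannot both be describing hard page faults served from disk.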
I removed the pagefile that was on the G: partition and also removed the G: partition.
Now there is only one pagefile, on the C: drive.
Then I changed the registry key DisablePagingExecutive in HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management.
Over the day only 10 MB of pagefile (before: 90 MB) was needed.
Then, when the Gupta DB starts at night, Spotlight showed that 2000-3000 pages/s were read from the 10 MB pagefile.
Then I asked myself what is using the pagefile, so I started Sysinternals Process Monitor while Gupta was running and set the path filter to c:\pagefile.sys.
Boom!!! A part of the dbntsrv process (Gupta SQL) was paged out to the pagefile. So is it badly programmed?
Only Physical Power will help: Fusion IO
So assuming the same behaviour applies to the physical servers you used to have .... are you saying that this usage pattern (well designed or badly designed - whatever it is) is very much penalized when running on ESX ?
Massimo.
We tried a new solution for the Gupta application, because the customer was not satisfied.
It was implemented on a physical server.
We used an HP DL360 G5 with 8 GB memory, a P400i with 256 MB battery-backed cache and 3 x 36 GB disks (one spare) in RAID 1, and a second RAID controller, a P800i with 512 MB cache, with an external DAS with 10 x 36 GB 15K disks in RAID 5.
Now we get 130 MB/s on the DAS!!! So at the moment DAS is faster than any SAN!!!
I think it definitely has to do with the cache on the P800i controller and the different RAID levels (10, 5, 6).
So in January 2009 we will test with the 4 x 16 GB NetApp performance acceleration cards, to see if we can get the same or better performance on a SAN!!!
I've had problems with HP gear and ESX for years...
I have proven over and over again that if you take ESX running against a NetApp NFS store and create a VM, you get meager performance..
Remove ESX from said machine and install Ubuntu and you get blazing speed.. Same hardware, same NFS store.
The answer.... ESX binds the USB IRQs to the console, which only uses CPU 0. Because the IRQs are shared with your storage controller and NICs, it also restricts all of those to CPU 0.
We disabled USB and made sure the driver didn't start in ESX. We got a 5x throughput increase.
We went from 70MB/s on local VMFS storage to 333MB/s and were able to saturate 1GbE at 100MB/s to the NetApp 3070.
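For what it's worth, the quoted figures can be checked quickly. A small sketch, assuming the raw 1GbE line rate of 125 MB/s (1 Gbit/s divided by 8, protocol overhead ignored); the variable names are made up for illustration:

```python
local_before = 70    # MB/s on local VMFS with USB enabled
local_after = 333    # MB/s on local VMFS with USB disabled
nfs_rate = 100       # MB/s to the NetApp 3070 over 1GbE

gbe_line_rate = 1_000_000_000 / 8 / 1_000_000  # 125 MB/s, assumed raw rate

local_gain = local_after / local_before   # speed-up on local storage
link_use = nfs_rate / gbe_line_rate       # share of the raw 1GbE rate

print(f"local VMFS gain: {local_gain:.1f}x")
print(f"1GbE utilisation: {link_use:.0%}")
```

That works out to a bit under 5x on local storage, and 100MB/s is about 80% of the raw link rate, which is about as "saturated" as a single 1GbE link gets once protocol overhead is counted.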
This has been true on any HP servers from like 5 G3+ that I've used.
See this thread of my suggestion being a success.