I have two Exchange 2007 SP2 Rollup 4 mailbox-role-only servers with under 500 users apiece. They are configured identically in the following way:
OS: Windows Server 2008 Datacenter x64 (R1)
60 GB OS partition on a VMFS datastore (roughly 30 GB free)
4-6 mail stores spread across 4 vRDMs
- Both the DBs and logs are on the same vRDM.
Both VMs are running on vSphere 4.0 Update 1.
The vRDMs receive what I would consider a healthy amount of I/O traffic (measured by our I/O monitoring tool), but the VMFS partition that the OS partition resides on (along with 10 other VMs) receives a LOT more I/O than the vRDMs do. Reviewing disk stats in esxtop, it's definitely my Exchange servers that are generating the bulk of that I/O.
The Exchange servers have 6 GB of RAM, and current memory utilization is 95%. We also have ESET NOD32 running on the servers, but we exclude the following from scanning:
- C:\program files\microsoft\Exchange Server\*.*
- The root of each of the vRDMs.
Is the amount of I/O my OS partition is seeing normal for Exchange? How would I best diagnose the issue to see what is generating the I/O on the OS partition?
Thanks for any help you can offer.
Where did you place the OS page file? For this configuration I think you have too little RAM; for my customers with 500 mailboxes, I recommend 16 GB of RAM.
In my experience with antivirus on Exchange, I use Trend Micro AV; it is the best for both performance and protection.
Most likely a shortage of RAM and excessive page file use.
Exchange 2007 prefers to use memory rather than disk access (it reduced the IOPS requirement considerably compared to 2003).
On the Server 2008 OS, check for a high rate of hard page faults in Performance Monitor. That will indicate a shortage of RAM.
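The hard-page-fault check can be done from the command line as well as in the Perfmon GUI. A minimal sketch of the counters to watch (assuming `typeperf`, the counter-collection CLI that ships with Server 2008, is available; the counter paths are the standard English names):

```shell
# Counters that separate hard page faults (which hit disk) from soft faults
# (resolved in RAM). \Memory\Page Reads/sec is the disk-backed one;
# \Memory\Page Faults/sec alone counts soft faults too, so a high number
# there can look alarming while causing no disk I/O at all.
counters='\Memory\Pages/sec \Memory\Page Reads/sec \Memory\Page Writes/sec \Memory\Available MBytes'

# On the Exchange VM itself you would sample these every 5 seconds with, e.g.:
#   typeperf "\Memory\Page Reads/sec" "\Memory\Available MBytes" -si 5
printf '%s\n' "$counters"
```

If Page Reads/sec stays near zero while Page Faults/sec is high, the faults are soft and RAM is not the source of the disk I/O.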
This link will explain it better:
I would bump it up by at least 2 GB of RAM and work from there. Exchange 2007 is tricky with RAM allocation because it will consume most of the available RAM whether it needs it or not, so whether you add 2 GB more or 16 GB more, you will still see memory usage above 90%. That does NOT mean it's short on RAM.
If you cannot add RAM, I would recommend creating a separate LUN and drive on high-speed disk (Fibre Channel) and relocating your system page file to that drive.
You should check at the Windows level whether it is paging or not. The AV scanning could also still be causing it; I would check its logs/monitoring to see what it is accessing, or just disable it for 10 minutes and see whether the behavior changes.
Here is some additional information.
Current Virtual Memory Settings.
I let Perfmon run for a few minutes during a typical workload and saw the following:
Page Faults / sec: 300-400 (average)
Page Reads / sec: 1.2 (average)
Page Writes / sec: 0 (average)
From what I'm reading, it does not appear that my system is doing a lot of actual Page Reads that require disk i/o.
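The averages above can be sanity-checked with quick arithmetic; a sketch using the figures quoted in this post (350 taken as the midpoint of the 300-400 range):

```shell
# Of roughly 350 page faults/sec, only ~1.2/sec are Page Reads, i.e. hard
# faults that actually hit disk. The remainder are soft faults resolved in
# RAM and generate no disk I/O.
awk 'BEGIN {
  faults = 350    # Page Faults/sec (midpoint of observed 300-400)
  reads  = 1.2    # Page Reads/sec observed
  printf "hard fault share: %.2f%%\n", reads / faults * 100
}'
# prints: hard fault share: 0.34%
```

Well under 1% of the faults are disk-backed, which supports the conclusion that paging is not the source of the VMFS I/O.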
I suppose I should have mentioned this from the get-go, as it makes a lot more sense to me now. We use Veeam 5.x for backups, and we have very aggressive RPOs for our Exchange servers, so much so that backups are literally always running against them; hence they always have a snapshot open. Would it make sense that the heavy I/O the VMFS volume is seeing is actually all of the data being written to the snapshot?
I have confirmed that the high I/O seen by the VMFS partition was indeed due to the Veeam backups running against it. While a job runs, all of the snapshot data is written to the VMFS volume, which amounted to around 12 GB over a 4-hour period. Thank you all for your recommendations.
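For scale, the snapshot traffic works out to a modest sustained average, even though snapshot writes and consolidation are bursty in practice; a quick sketch with the figures from this thread:

```shell
# ~12 GB of snapshot delta written to the VMFS volume over a 4-hour backup window
awk 'BEGIN {
  gb = 12; hours = 4
  mbps = gb * 1024 / (hours * 3600)   # sustained average in MB/s
  printf "avg snapshot write rate: %.2f MB/s\n", mbps
}'
# prints: avg snapshot write rate: 0.85 MB/s
```

A sub-1 MB/s average is easy for an array to absorb, but because the delta files live on the VMFS datastore rather than the vRDMs, all of that I/O shows up against the OS volume, matching what esxtop reported.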