Hello everyone, I currently have an ESX pool of VMs, all on my SAN. I'll preface this by saying there are things I know I can tune on my SAN to make things better, and that is going to happen. However, I have a problem where at night there is disk contention among the VMs. They all share a set of disks, so if a small number of VMs are doing heavy I/O it hurts them all.
Are there any tools, bundled or third-party, that would help me narrow down which VM guests have a high demand for I/O?
To find this we need to take a look at what is going on at night. Some questions:
- Are any processes scheduled for this time? This includes backups, antivirus scans, and any sort of data indexing.
- Does the problem always occur at the same time? What time is that?
- Can you paste here a sample of the output of "cat /var/log/vmkernel | grep StorageMonitor"?
Well, there are a couple of things. There are backups that run, but the problem shows up "randomly", and obviously there are various other jobs that run at night. What we've found is that in the morning, sometimes whatever the servers are doing is still going. The problem is that the production environment (the one affected) has about 90 VMs running, so going to the guest level of each one to look at its scheduled jobs is a pretty big task, as god knows what the developers do sometimes.
I'll try to paste you that /var/log/vmkernel output.
Are the backups being done with VCB? If yes, over a SAN connection or NBD?
No, currently it's done at the guest level. When this issue runs into the morning, though, I have checked that there are no backups running. So I know backups can be a major contributor, but in this case I'm 99.9% sure they're not the cause of my current symptom.
It would probably be easier to look at the I/O stats on the hosts. You can also log these overnight and get a general idea of which VM is using the most.
In esxtop, press v to show the I/O stats for the VMs by name, and press f to choose which fields you want to see. I'll look up the batch mode options tomorrow for you so you can look at the statistics long term.
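For the overnight logging, esxtop can run in batch mode and write CSV, e.g. "esxtop -b -d 30 -n 1200 > overnight.csv" (-b batch, -d seconds between samples, -n number of samples). A rough Python sketch for ranking VMs from such a file follows. It assumes the CSV uses perfmon-style column names of the form \\host\Category(instance)\Counter, and the default category "Virtual Disk" and counter "Commands/sec" are my assumptions, so check them against the header of your own file. The function name rank_instances is just illustrative.

```python
import csv
import re
from collections import defaultdict

# Assumed perfmon-style column name: \\host\Category(instance)\Counter
COLUMN_RE = re.compile(
    r"^\\\\[^\\]+\\(?P<category>[^(\\]+)\((?P<instance>.*)\)\\(?P<counter>.+)$"
)


def rank_instances(csv_file, category="Virtual Disk", counter="Commands/sec"):
    """Return (instance, average value) pairs, highest average first.

    csv_file is an open text file holding esxtop batch output. The default
    category and counter names are assumptions; adjust them to whatever
    your CSV header actually contains.
    """
    reader = csv.reader(csv_file)
    header = next(reader)

    # Map column index -> instance name for the columns we care about.
    wanted = {}
    for idx, name in enumerate(header):
        match = COLUMN_RE.match(name)
        if (match and match.group("category") == category
                and match.group("counter") == counter):
            wanted[idx] = match.group("instance")

    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in reader:
        for idx, instance in wanted.items():
            try:
                value = float(row[idx])
            except (IndexError, ValueError):
                continue  # skip short rows and non-numeric cells
            totals[instance] += value
            counts[instance] += 1

    averages = {inst: totals[inst] / counts[inst] for inst in counts}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

Open the CSV with open("overnight.csv", newline="") and pass the file object in; the first few entries of the result are the VMs to look at first.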
For esxtop, check the following article: http://communities.vmware.com/docs/DOC-9279
Especially look at: DAVG / KAVG / GAVG
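As a quick illustration of how those three counters relate: DAVG is the latency at the device (HBA, fabric, array), KAVG is the time spent in the VMkernel, and GAVG, what the guest sees, is roughly DAVG + KAVG. A small Python sketch of that check follows. The 25 ms and 2 ms limits are commonly quoted rules of thumb that I am assuming here, not values from this thread, and the function name diagnose is made up for the example.

```python
def diagnose(davg_ms, kavg_ms, davg_limit_ms=25.0, kavg_limit_ms=2.0):
    """Flag which layer looks slow for one esxtop latency sample.

    The limits are assumed rule-of-thumb values; tune them for your
    own array and workload.
    """
    slow_layers = []
    if davg_ms > davg_limit_ms:
        slow_layers.append("device")  # storage side: HBA, fabric, array
    if kavg_ms > kavg_limit_ms:
        slow_layers.append("kernel")  # host side: typically queuing in the VMkernel
    # Guest-observed latency is approximately the sum of the two.
    return {"gavg_ms": davg_ms + kavg_ms, "slow_layers": slow_layers}
```

For example, diagnose(40.0, 0.5) flags only "device", which points at the SAN rather than the host.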
Duncan
VMware Communities User Moderator | VCP | VCDX
Blogging: http://www.yellow-bricks.com