timalexanderINV
Enthusiast

Finding IO hog in vSAN

We have recently built a 10-node vSAN 6.2 cluster for our development and test environment. We have deployed SexiGraf and so far everything in the cluster looks OK. That said, I have one host that is showing very strange IO:

[Attached screenshot: pastedImage_0.png]

Is there any way to see what VM is behind this IO, or what objects this ESXi host is the owner of? From my understanding it does not have to be a VM running on this host, as the data could reside on any host in the cluster.

2 Replies
TheBobkin
Champion

Hello Tim,

Correct, it is not necessarily a VM running on this host; it could be any VM that has data components residing on this host's disk-groups.
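
If you want to see exactly which objects have components on this host's disk-groups, one rough way to check (a sketch, assuming you have RVC access on the vCenter Server Appliance; the cluster path and disk UUID below are example placeholders, not real values) is:

# From RVC, list each host's disks with their component counts and note the UUID of a busy disk
vsan.disks_stats ~/computers/YourCluster

# Then list the objects that have components on that disk
vsan.disk_object_info ~/computers/YourCluster 52e1d1a3-xxxx-xxxx-xxxx-xxxxxxxxxxxx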

So, as with any performance issue, a few things need to be established:

- Do these readings vary greatly from a longer-term workload baseline?

- Is this increased load negatively impacting the performance of other VMs, or does it just stand out on a graph?

- Do these high readings correlate with any specific times or activities? (e.g. backup jobs, provisioning/creating VMs, 9AM login/boot storms, large resyncs, huge file-server transfers). A quick way to rule out resyncs is sketched below.
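
For the resync question in particular, a quick check (a sketch, again assuming RVC access; the cluster path is an example) is the resync dashboard:

# Shows any objects currently resyncing and the bytes left to sync
vsan.resync_dashboard ~/computers/YourCluster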

You could start identifying the VMs that are causing the increased load by looking at:

- VM metrics in vCenter:

pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-60-monitoring-performance-guide.pdf

- vSAN Observer (or SexiGraf, if it gives a drill-down of each disk) - specifically, if you can identify single disks/disk-groups that see huge IO and at the same time see similarly high usage on another host's disk/disk-group, then the data components of the VM responsible will be on both of these (assuming FTT=1 Objects). An example of launching vSAN Observer from RVC follows this list.

- esxtop (for both disk IO and VM IO)
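
For vSAN Observer, a minimal way to launch it (a sketch from RVC on the vCenter Server Appliance; the cluster path is an example, adjust to your inventory) would be something like:

# Start the Observer web UI (by default on https://<vcenter>:8010) and collect live stats
vsan.observer ~/computers/YourCluster --run-webserver --force

# Or collect for a fixed period and generate an offline HTML bundle instead
vsan.observer ~/computers/YourCluster --generate-html-bundle /tmp --max-runtime 1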

A great resource on esxtop from depping:

http://www.yellow-bricks.com/esxtop/

You can also set up a cron job to measure over a period of time:

kb.vmware.com/kb/1033346
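
For example, a one-off batch capture from the ESXi shell (a sketch; the flags and duration are just an example - this grabs roughly 10 minutes at 10-second intervals, and the output path should point at a datastore with free space):

# -b = batch mode, -a = all counters, -d = delay in seconds, -n = number of samples
esxtop -b -a -d 10 -n 60 > /vmfs/volumes/YourDatastore/esxtop-capture.csv

The resulting CSV can then be opened in Windows perfmon or esxplot to find the offending VM/world.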

Bob

-o- If you found this comment useful, please click the 'Helpful' button and/or select it as 'Answer' if you consider it so; please ask follow-up questions if you have any -o-

timalexanderINV
Enthusiast

So I had to use a combination of SexiGraf and vSAN Observer. I was able to show the "Top N vmdks" in SexiGraf and then collate those with the VM tab in vSAN Observer to pinpoint that the objects were indeed on the host in question. I asked the app owner what the server was doing, and the IO has miraculously stopped... Thanks for the pointers.
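
For anyone who finds this thread later: RVC can also confirm which hosts and disks a given VM's components sit on (a sketch; the VM path below is just an example from my inventory):

# Prints the object layout of the VM, including the host/disk each component lives on
vsan.vm_object_info ~/vms/SuspectVM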
