Hello forums.
I could use some expert insight on a problem I am having with some VMs. Resource utilization is nowhere near capacity and there is no CPU contention, but the VMs are running slow: oftentimes it is bad, sometimes just slow. The VMs did not behave like this before, but we have been adding more and more VMs to our lab for a while now, so that could explain the latency.
I would like some assistance with using vscsiStats to troubleshoot virtual disk related problems. On the VM I was looking at, I noticed a somewhat long average disk queue length in perfmon. I ran vscsiStats and got the following after 30 minutes. Please let me know if I understand this correctly.
I ran vscsiStats with the latency option and have included only the overall latency of IOs, not the separate read/write histograms. Both disks are on the same datastore.
VMDISK 1
Histogram: latency of IOs in Microseconds (us) virtual machine worldGroupID
min 121
max 14440236
mean 327165
count 15524
Frequency Histogram
Bucket Limit
0 1
0 10
0 100
1488 500
1929 1000
1453 5000
1618 15000
1688 30000
1070 50000
1546 100000
4732 > 100000
VMDISK 2
Histogram: latency of IOs in Microseconds (us) virtual machine worldGroupID
min 159
max 14245418
mean 478629
count 3446
Frequency Histogram
Bucket Limit
0 1
0 10
0 100
1722 500
146 1000
136 5000
65 15000
59 30000
46 50000
88 100000
1184 > 100000
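For anyone who wants to work with these numbers rather than eyeball them, here is a minimal sketch of a parser for the histogram text as pasted in this post. The function name `parse_latency_histogram` is made up for illustration, and the layout it expects (a `min`/`max`/`mean`/`count` header followed by "count limit" rows) is an assumption based only on the output above, not on any documented vscsiStats format.

```python
import math

def parse_latency_histogram(text):
    """Parse a pasted latency histogram into summary stats plus buckets."""
    stats = {}
    buckets = []          # list of (upper_limit_us, count)
    seen_limits = set()
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) == 2 and tokens[0] in ("min", "max", "mean", "count"):
            stats[tokens[0]] = int(tokens[1])
        elif len(tokens) in (2, 3) and tokens[0].isdigit() and tokens[-1].isdigit():
            count, limit = int(tokens[0]), int(tokens[-1])
            # The last row is the overflow bucket: either marked with ">" or,
            # if the ">" was lost in copy/paste, a repeat of the previous limit.
            overflow = ">" in tokens or limit in seen_limits
            buckets.append((math.inf if overflow else limit, count))
            seen_limits.add(limit)
    stats["buckets"] = buckets
    return stats

# VMDISK 2 data copied from the post.
sample = """min 159
max 14245418
mean 478629
count 3446
Frequency Histogram
Bucket Limit
0 1
0 10
0 100
1722 500
146 1000
136 5000
65 15000
59 30000
46 50000
88 100000
1184 > 100000"""

disk2 = parse_latency_histogram(sample)
total = sum(count for _, count in disk2["buckets"])
print(total == disk2["count"])  # sanity check: bucket counts add up to the IO count
```

The sanity check at the end is worth keeping: if the bucket counts do not add up to the reported count, a row was probably lost when the output was copied.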
I really don't understand our storage layout very well, and I apologize for that; I am trying to learn it. We do use NetApp, but I don't think all of our storage is NetApp. These disks are SATA and are attached through 4 Gbps HBAs.
My understanding of the above (for VMDISK 1) is the following:
4732 IOs took longer than 0.1 second (100 ms)
1546 IOs took between 0.05 and 0.1 second (50 to 100 ms)
1070 IOs took between 0.03 and 0.05 second (30 to 50 ms)
My math may be a bit off, but is this a lot of latency for disk I/O? Your help is appreciated.
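As a rough check on the math, the sketch below converts each bucket limit from microseconds to milliseconds and shows what share of all IOs fell into each range. The counts and limits are the VMDISK 1 numbers from this post; the names `LIMITS_US`, `COUNTS` and `bucket_rows` are just illustrative, and it assumes each limit is an inclusive upper bound with a final bucket for everything slower.

```python
# Bucket upper limits (microseconds) and IO counts for VMDISK 1, copied from the post.
# The final count is the overflow bucket: every IO slower than 100000 us.
LIMITS_US = [1, 10, 100, 500, 1000, 5000, 15000, 30000, 50000, 100000]
COUNTS = [0, 0, 0, 1488, 1929, 1453, 1618, 1688, 1070, 1546, 4732]

def bucket_rows(limits_us, counts):
    """Return (range label in ms, count, percent of all IOs) per bucket."""
    total = sum(counts)
    rows = []
    lower_ms = 0.0
    for i, count in enumerate(counts):
        if i < len(limits_us):
            upper_ms = limits_us[i] / 1000.0
            label = f"{lower_ms:g} to {upper_ms:g} ms"
            lower_ms = upper_ms
        else:
            label = f"over {lower_ms:g} ms"
        rows.append((label, count, 100.0 * count / total))
    return rows

rows = bucket_rows(LIMITS_US, COUNTS)
for label, count, pct in rows:
    print(f"{label:>18}  {count:6d}  {pct:5.1f}%")
```

Read this way, roughly 30% of the IOs on VMDISK 1 took longer than 100 ms, and the reported mean of 327165 us works out to about 327 ms per IO.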
Cheers,
Chad King
VCP-410 | Server+
If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
You are correct, this is the latency per I/O. It means that a large portion of your IOs have a latency higher than 100 milliseconds. That is fairly high, to be honest, and I am not surprised that you notice overall sluggishness. Have you checked whether your disks are aligned? Misalignment could contribute to this level of latency. I also wonder how many links you have going back to your filer.
Another thing worth investigating is the processor utilization of the filer.
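To put a number on "a large portion", the sketch below computes the share of IOs above 100 ms for both disks and a lower bound on how slow those IOs were on average. It is only an estimate: it assumes the reported mean is the arithmetic mean over all counted IOs, and it gives every IO in a bounded bucket the maximum latency that bucket allows, so the real figure for the slow IOs can only be higher. The names `DISKS` and `overflow_summary` are illustrative.

```python
# Histograms from the post: (upper limit in us, count) pairs for the bounded
# buckets, the overflow bucket count (> 100000 us), and the reported mean in us.
# Buckets with a count of zero are left out because they contribute nothing.
DISKS = {
    "VMDISK 1": {
        "bounded": [(500, 1488), (1000, 1929), (5000, 1453), (15000, 1618),
                    (30000, 1688), (50000, 1070), (100000, 1546)],
        "overflow": 4732,
        "mean_us": 327165,
    },
    "VMDISK 2": {
        "bounded": [(500, 1722), (1000, 146), (5000, 136), (15000, 65),
                    (30000, 59), (50000, 46), (100000, 88)],
        "overflow": 1184,
        "mean_us": 478629,
    },
}

def overflow_summary(disk):
    """Return (share of IOs over 100 ms, lower bound on their mean latency in us)."""
    bounded_count = sum(count for _, count in disk["bounded"])
    total_count = bounded_count + disk["overflow"]
    share = disk["overflow"] / total_count
    # Total latency across all IOs, reconstructed from the reported mean.
    total_latency_us = disk["mean_us"] * total_count
    # Upper bound on what the bounded buckets can contribute to that total.
    max_bounded_us = sum(limit * count for limit, count in disk["bounded"])
    min_overflow_mean_us = (total_latency_us - max_bounded_us) / disk["overflow"]
    return share, min_overflow_mean_us

for name, disk in DISKS.items():
    share, min_mean_us = overflow_summary(disk)
    print(f"{name}: {share:.1%} of IOs over 100 ms, "
          f"averaging at least {min_mean_us / 1e6:.2f} s each")
```

Under those assumptions, roughly 30% of IOs on VMDISK 1 and 34% on VMDISK 2 exceed 100 ms, and those slow IOs average at least about one second each. That is consistent with the maximum latencies of over 14 seconds reported for both disks.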
Duncan
VMware Communities User Moderator | VCDX
-
Now available: Paper - vSphere 4.0 Quick Start Guide (via amazon.com): http://www.amazon.com/gp/product/1439263450?ie=UTF8&tag=yellowbricks-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=1439263450 | PDF (via lulu.com): http://www.lulu.com/product/download/vsphere-40-quick-start-guide/6169778
Blogging: http://www.yellow-bricks.com | Twitter: http://www.twitter.com/DuncanYB
If anyone could help it would be appreciated! thanks.
Cheers,
Chad King
Thanks Duncan,
Sorry for getting back to you so late. I have been very busy at work and have had little time for VM troubleshooting. I suppose by links you mean the number of paths? I know we use two 4 Gbps cards per host connected back to the SAN. This is also in our lab, but I am glad to see that I am reading the vscsiStats output correctly.
Apparently they added more memory to the VM and are not having the problem anymore. I don't think that is what fixed it, though, because I expect it to come back when they get busy again. They also rebooted the VM. Either way, I am going to do some further research in the lab and see what I can gather from other hosts.
Thanks again Duncan!
Cheers,
Chad King
Twitter: http://twitter.com/cwjking
Considering this is just one VM, I suppose I could collect a larger sample and see what it looks like. Do you have any other recommendations?
Cheers,
Chad King
They have two different types of storage. Both are connected by FC, but the one that is getting hammered is the SATA NetApp storage. It appears they have turned off deduplication because of the high I/O problems, though I would like more detail on this and would love to read more about it. They had a lot of VMs hitting that storage, moved them over to SAS-FC storage, and have not had nearly as many issues since. You would think that after going through this hell in their VDI environment they would know this. Thanks again.
Cheers,
Chad King