VMware Cloud Community
BTB2809
Contributor

vSAN Disk Latency

We just set up a 3-node cluster to pilot vSAN for an upcoming project. We are seeing disk latency routinely spike to 70 ms with no real I/O load on the box, and it spikes to 300 ms when we apply an I/O load of about 500 IOPS.

We are using 3 Dell R720s, each with 7 x 280GB HDDs and 1 x 179 GB SSD. We are using dedicated 10Gb/s network cards connected to Cisco Nexus switches for the vSAN connectivity.

Has anyone else run into this latency issue or found a solution?

vsan.jpg

10 Replies
vThinkBeyondVM
VMware Employee

Hi Friend,

Which disk controller do you use?

The blog post below gives insight into queue depth across different vendors from a disk controller perspective.

http://www.yellow-bricks.com/2014/04/17/disk-controller-features-and-queue-depth/


----------------------------------------------------------------
Thanks & Regards
Vikas, VCP70, MCTS on AD, SCJP6.0, VCF, vSphere with Tanzu specialist.
https://vThinkBeyondVM.com/about
-----------------------------------------------------------------
Disclaimer: Any views or opinions expressed here are strictly my own. I am solely responsible for all content published here. Content published here is not read, reviewed or approved in advance by VMware and does not necessarily represent or reflect the views or opinions of VMware.

BTB2809
Contributor

It's a PERC H310.

cmiller78
Enthusiast

Are the drives SATA?

I'm running a lab on a C6105 w/ whatever terrible SATA controller comes with it and SATAII drives. I have a queue depth of 31 for the entire onboard HBA, and none of the gear is on the HCL.

However, the specs may be similar to yours. I'm nowhere near fully utilizing the NICs (and I'm only running 1GbE), and I'm seeing the same as you: latency spikes when I exceed 500 IOPS. I've looked at everything and I'm fairly sure I'm saturating the controllers.

I think you should look at upgrading your HBA. Keep in mind SATA drives with native command queuing only have a queue depth of 32 per drive as well.
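
To make the queue depth argument concrete, here is a rough back-of-the-envelope sketch using Little's Law (outstanding I/Os = throughput x time in the system). It is only an approximation under the assumption that the queue stays full; the function names are made up for illustration, and the 10 ms service time in the usage lines is a hypothetical figure, not something measured in this thread.

```python
def latency_ms_when_queue_full(outstanding_ios, iops):
    """Little's Law: average time in the system (ms) when `outstanding_ios`
    requests are in flight and the device completes `iops` requests/second."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return 1000.0 * outstanding_ios / iops


def iops_ceiling(queue_depth, service_time_ms):
    """Upper bound on IOPS for a queue of `queue_depth` slots when each
    request takes `service_time_ms` to complete (assumed constant)."""
    if service_time_ms <= 0:
        raise ValueError("service_time_ms must be positive")
    return queue_depth * 1000.0 / service_time_ms


# Figures from this thread: a controller-wide queue depth of 31 and ~500 IOPS.
print(latency_ms_when_queue_full(31, 500))

# Hypothetical 10 ms per-request service time, for illustration only.
print(iops_ceiling(31, 10))
```

With a queue that shallow, a full queue at 500 IOPS already implies tens of milliseconds of average latency, which is in the same range as the spikes described above.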

depping
Leadership

You should open up esxtop and look at the queues of your devices to see if they are filling up... I think it is safe to assume that that is your problem.
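
If watching esxtop interactively is impractical, one option is to capture it in batch mode (for example `esxtop -b -d 5 -n 60 > capture.csv`) and scan the CSV afterwards. The sketch below is a minimal, assumption-laden example: `peak_values` is a hypothetical helper, and the header fragment you pass in (such as "Queued") is an assumption that needs checking against the column names in your own capture. The sample data is invented purely to show the call.

```python
import csv
import io


def peak_values(csv_text, name_fragment):
    """Return {column_name: peak value} for every column whose header
    contains `name_fragment`. Non-numeric cells are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name_fragment in name]
    peaks = {}
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # short row or non-numeric cell
            name = header[i]
            peaks[name] = max(peaks.get(name, value), value)
    return peaks


# Invented sample in the same shape as a batch capture (header + samples).
sample = (
    "Time,disk0 Queued,disk1 Queued,disk0 Reads\n"
    "t1,0,3,120\n"
    "t2,5,1,140\n"
)
for column, peak in peak_values(sample, "Queued").items():
    print(column, peak)
```

A peak that sits at or near the device's queue depth for sustained periods would support the saturation theory; brief peaks are harder to interpret.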

BTB2809
Contributor

We ended up trying the PERC H710P, which improved performance, but we still saw the latency spikes. Unfortunately, the SSDs were still hitting queue saturation. VMware support felt that the saturation periods were too brief to cause concern, but we didn't have time to continue testing. Our project timeline was at risk, so we ended up using a VNXe instead... so no happy ending to this story. I assume we will see some good reference architectures in the coming year, and we can try again on a future project. We had this in mind for our remote office use cases.

admin
Immortal

@BTB2809 - did you create an SR? If so, can you share the SR #?

Thanks.
Kiran

JohnNicholson
Enthusiast

If you don't care about Dell support, you can flash this controller with updated LSI 2008 IT-mode firmware, which will increase the queue depth to 600.

I saw the same problems in my lab testing with my LSI 2008 based PIKE cards until we updated them.

LSI 2008 Dell H310 VSAN rebuild performance concerns - Virtual Ramblings

JohnNicholson
Enthusiast

For some reason, half of Dell's reference VSAN nodes use this worthless controller. Since their salespeople are telling customers that VMware will not support production use of VSAN, I suspect it's some misguided desire to convince people not to use it...

cmiller78
Enthusiast

Reference configs seem to be out of date already.

Dell has both the PERC H710 and H710P on the HCL, and both support a queue depth of 975.

As for your last comment, I doubt Dell is highly motivated to create a strong reference config that will ultimately cannibalize their EqualLogic and Compellent revenue, so you may be correct....

joergriether
Hot Shot

Starting in July 2014, the VSAN HCL was modified. All controllers with QD < 256 were removed, including the H310 / LSI 2004/2008.

VMware KB: Storage Controllers previously supported for VSAN that are no longer supported

Best regards,
Joerg
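
As a quick sanity check, the queue depths quoted in this thread can be compared against the QD < 256 cutoff mentioned above. This is only a sketch: the dictionary holds the figures posters reported here, not an authoritative list, and `below_threshold` is a hypothetical helper. Clearing the cutoff does not by itself mean a controller (or a reflashed one) is on the HCL.

```python
MIN_QUEUE_DEPTH = 256  # cutoff quoted in this thread for the July 2014 HCL change

# Queue depths as reported by posters in this thread.
reported_queue_depths = {
    "C6105 onboard HBA": 31,
    "H310 flashed to LSI IT mode": 600,
    "PERC H710": 975,
    "PERC H710P": 975,
}


def below_threshold(depths, minimum=MIN_QUEUE_DEPTH):
    """Return the names (sorted) of controllers whose queue depth is below `minimum`."""
    return sorted(name for name, qd in depths.items() if qd < minimum)


print(below_threshold(reported_queue_depths))
```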
