SanderHogewerf
Enthusiast

Low Random Write Throughput

Hi,

We just implemented a new vSAN cluster (6.7) based on VxRail. It is an all-flash cluster with NVMe cache disks.

Now I see random read throughput about four times as high as the random write throughput.

My first question is, is this normal?

Second question: are there standard steps to increase the random write throughput?

I'd like to hear from you. Thank you!

Accepted solution: srodenburg's reply below.

9 Replies
srodenburg
Expert

Hi Sander,

You are saying "random read throughput about four times as high as the random write throughput" but are also asking "are there standard steps to increase the random write throughput?"

That does not add up. With your first question, did you mean to ask why writes are 4 times as high as reads?

Are you running a normal cluster or a stretched cluster?

Is read-locality active or not?

SanderHogewerf
Enthusiast

Hi,

The reads are 4 times higher than the writes, not the other way around.

It is a stretched cluster.

[Screenshot of the storage policy settings (pastedImage_0.png)]

The storage policy shown above is used, for example.

TheBobkin
Champion

Hello SanderHogewerf,

Are you perchance looking at VMs registered and running on the Non-Preferred/Secondary site while their data resides ONLY on the Preferred site?

Bob

SanderHogewerf
Enthusiast

Hi Bob,

No, the VM is also on the preferred site, just like the data.

Sander

srodenburg
Expert

Then it's simple. It's a stretched cluster, and assuming read-locality is active (it should be), write I/Os have to go to both sites while read I/Os are served only from the site the VM is running in.
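
To put rough, purely illustrative numbers on that (the latencies below are assumptions for the sake of the example, not measurements from your cluster):

# A toy latency model: a mirrored write must be acknowledged by BOTH sites,
# while a read served under read-locality stays inside the local site.
LOCAL_LATENCY_MS = 0.5     # assumed device/intra-site latency
INTER_SITE_RTT_MS = 2.0    # assumed round-trip time between the two sites

def write_latency_ms() -> float:
    # The write only completes once the slowest replica has acknowledged it,
    # which is the remote copy (local device latency + inter-site round trip).
    return max(LOCAL_LATENCY_MS, LOCAL_LATENCY_MS + INTER_SITE_RTT_MS)

def read_latency_ms(read_locality: bool = True) -> float:
    # With read-locality, reads never cross the inter-site link.
    return LOCAL_LATENCY_MS if read_locality else LOCAL_LATENCY_MS + INTER_SITE_RTT_MS

print(write_latency_ms(), read_latency_ms())   # 2.5 vs 0.5 ms in this toy model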

If everything is set up correctly, that is. I seriously question your setup, because why would you use a stretched cluster while the data is stored in only one site?

So assuming I'm right, your storage policy is all wrong. You are not mirroring across both sites now (Site disaster tolerance = None); you are only mirroring locally within the site. It should be at least the other way around:

Site disaster tolerance = "Dual Site Mirroring" meaning Geo protection.

Failures to tolerate = None (for "Geo only", aka stretched-cluster mirroring only) or 1 failure (for geo AND local mirroring).

In the case of geo mirroring AND local mirroring, 100 GB worth of data will be stored 4 times, twice in each site, so be aware of the storage cost. Such policies should be driven by the application workload's availability requirements.
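
As a back-of-the-envelope sketch of the raw capacity cost (assuming RAID-1 mirroring at both the site level and the local level; the helper below is just for illustration):

def raw_capacity_gb(vm_data_gb: float, dual_site_mirroring: bool, local_ftt: int) -> float:
    site_copies = 2 if dual_site_mirroring else 1   # one full copy per site
    local_copies = local_ftt + 1                    # RAID-1: FTT + 1 copies within a site
    return vm_data_gb * site_copies * local_copies

print(raw_capacity_gb(100, dual_site_mirroring=True, local_ftt=0))  # 200.0 GB, geo only
print(raw_capacity_gb(100, dual_site_mirroring=True, local_ftt=1))  # 400.0 GB, geo AND local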

Fix your policy first. Make a new one and apply it to one VM at a time (or at least not to all of them at the same time, as that is IO suicide). After all VMs are correctly protected, re-evaluate. What you will always see, though, is that writes are much slower than reads, because writes have to be committed in both sites while reads are local and thus much, much faster.
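
If you want to script that one-VM-at-a-time rollout, something along these lines would do it. Note that apply_policy and resync_finished are hypothetical placeholders, not real vSAN API calls; in practice you would fill them in with your own tooling and check the vSAN resync status before moving on.

import time

def apply_policy(vm_name: str, policy_name: str) -> None:
    # Hypothetical placeholder for re-assigning the storage policy to one VM.
    print(f"Applying policy '{policy_name}' to {vm_name}")

def resync_finished() -> bool:
    # Hypothetical placeholder for checking that the resync triggered by the
    # previous policy change has completed.
    return True

def rollout(vms, policy_name, poll_seconds=60):
    for vm in vms:
        apply_policy(vm, policy_name)
        # Wait for this VM's resync to finish before touching the next one,
        # so the cluster never rebuilds everything at once.
        while not resync_finished():
            time.sleep(poll_seconds)

rollout(["vm01", "vm02"], "Stretched-Mirror-FTT1")  # hypothetical names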

After all is done, make the new policy the default policy for the datastore so that mistakes cannot happen again.

depping
Leadership

Also, keep in mind that vSAN has a per-host local memory cache of 1 GB, so depending on the workload it could be that you are hitting that cache a lot.

SanderHogewerf
Enthusiast

Hi srodenburg,

Thanks for your answer! I understand it.

To give some more information on why the storage policy was chosen this way:

The server is part of a cluster that fails over at the application level, so the stretched component is not needed for this server; that is why I keep the data in one site.

I increased the stripe width in the policy and then I saw the writes go up a bit. Does this mean I'm at the limit of the disk group?

srodenburg
Expert

Not per se, but adding stripes is counterproductive, especially on all-flash. Normally, in a mirror, data needs to be fetched from only two capacity devices: one on, say, node 3 disk 2, and the other mirror copy on node 6 disk 4 (just an example). So the data only needs to be written to and read from 2 devices.

If you start striping, say with a stripe width of 2, each mirror copy is now actually made up of 2 parts, so 4 parts in total. Now reads and writes are spread across 4 devices. This adds latency: the higher the stripe width, the higher the latency becomes.
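
A quick sketch of that component math (illustrative only, assuming plain RAID-1 mirroring as described above):

def component_count(failures_to_tolerate: int, stripe_width: int) -> int:
    mirrors = failures_to_tolerate + 1      # RAID-1: FTT + 1 full copies
    return mirrors * stripe_width           # each copy is split into stripe_width parts

print(component_count(1, 1))  # 2 devices involved per mirror set
print(component_count(1, 2))  # 4
print(component_count(1, 3))  # 6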

Striping has only one use case, and that is on slow rotational drives where sequential performance (streaming etc.) is important and latency is not.

In other words, you just made the latency worse for yourself 😉

Also, from a pure data point of view, it sounds like you have no use case for a stretched cluster at all if both sites keep their data to themselves anyway. Why not have two regular clusters? Much simpler, no messing about with a Witness Appliance etc.

In case of an emergency, you won't have the time or means to move all the data to the other side that quickly anyway, because you never replicated anything to the other site.

Or is it just a small number of VMs that are, policy-wise, not stretched, while all other VMs are mirrored across both sites?

My advice: stripe width of 1. Flash is so fast that you gain nothing with striping; it just slows everything down, as data needs to be written to and fetched from more devices than necessary.

SanderHogewerf
Enthusiast

Thank you very much for your answers! I now know why the writes are so much slower than the reads.

The number of machines that have failover at the application level is not large.

For now, thank you!
