VMware Cloud Community
averylarry2
Contributor

How do IOPS, bytes, and block size all relate?

I'm setting up round-robin multipathing with software iSCSI, and it occurred to me that it would be a good idea to know how IOPS, block size, chunk size, and stripe size all relate (some of those obviously come from the SAN).

Is 1 IOP always equal to 1 block, based on the datastore block size?

Oh -- and changing the IOPS setting for multipathing still doesn't stick after a reboot . . ? This is ESXi 4.

2 Replies
jbWiseMo
Enthusiast

Most of the time there is one IOP per block read or written, except when one of the following applies (there is a worked example after this list):

  • The situation requires I/O of less than a block; this is especially common on large-block file systems such as VMFS.

  • A guest issues a virtual disk read/write smaller than a block on the host datastore; this is very common.

  • The block size is larger than the largest IOP that ESX or ESXi thinks it can issue in one request. This limit obviously depends on the version of ESX.

  • A lucky coincidence allows ESX to read or write multiple blocks in one IOP. This would only occur if the block size is smaller than the largest iSCSI IOP supported by the version of ESX or ESXi, and only if the ESX or ESXi version contains code to do this.
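
If it helps to see the arithmetic, here is a rough Python sketch of the mapping described above. The block size and the per-IOP cap are made-up example numbers, not values taken from any particular ESX or ESXi version:

    import math

    def iops_for_request(request_bytes, block_bytes, max_io_bytes, can_merge_blocks=False):
        """Estimate how many host IOPs a single guest read/write turns into."""
        if request_bytes < block_bytes:
            # Sub-block I/O still costs (at least) one IOP on the host.
            return 1
        # Round the request up to whole blocks.
        blocks = math.ceil(request_bytes / block_bytes)
        if can_merge_blocks and block_bytes < max_io_bytes:
            # Adjacent blocks may be merged, but each IOP is capped at max_io_bytes.
            blocks_per_io = max(1, max_io_bytes // block_bytes)
            return math.ceil(blocks / blocks_per_io)
        # Otherwise one IOP per block, split further if a block exceeds the per-IOP cap.
        ios_per_block = max(1, math.ceil(block_bytes / max_io_bytes))
        return blocks * ios_per_block

    # Examples with assumed sizes: a 64 KB guest write against a 4 KB host block
    # size and a hypothetical 128 KB cap per IOP, then a sub-block write.
    print(iops_for_request(64 * 1024, 4 * 1024, 128 * 1024))                         # 16 (one IOP per block)
    print(iops_for_request(64 * 1024, 4 * 1024, 128 * 1024, can_merge_blocks=True))  # 1  (blocks merged)
    print(iops_for_request(16 * 1024, 1024 * 1024, 128 * 1024))                      # 1  (sub-block I/O)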

RAID chunk size matters for the I/O size that remains after the RAID controller cache on your SAN device has had a chance to combine multiple writes into a single logical write.

  • For all RAID levels, there is a controller performance overhead proportional to the number of chunks the combined request needs to be split into. So depending on the efficiency of the RAID controller in your SAN, there may be a penalty for too small a chunk size.

  • For RAID 3, 4, 5 and 6: if this combined write corresponds to an aligned "raid block" of (chunk size) times (number of drives in the array, not counting parity, mirror and spare drives), then the RAID controller can perform that write more efficiently, because it does not need to read back unchanged data blocks or old parity blocks from the physical drives in order to compute the new parity values (see the sketch after this list).

  • For RAID 0, the chunk size determines whether a single combined read or write will occur all on one disk or will be distributed evenly over multiple disks, so in theory a small value should be best, except for the splitting overhead mentioned above.

  • For RAID 1 and JBOD, the chunk size shouldn't really matter, but some RAID controllers perform unnecessary I/O based on RAID 3/4/5/6 assumptions anyway, in which case the rules above still apply.

  • For combined RAID levels such as 10, 50, 60, etc., you should probably apply the rules for RAID 3/4/5/6 above.
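
To make the full-stripe rule for RAID 3/4/5/6 concrete, here is a small Python sketch. The drive count and chunk size are just assumed example numbers; check what your SAN actually uses:

    def full_stripe_bytes(chunk_bytes, drives_total, parity_drives, spare_drives=0):
        """One aligned full stripe = chunk size times the number of data drives."""
        data_drives = drives_total - parity_drives - spare_drives
        return chunk_bytes * data_drives

    def is_full_stripe_write(offset_bytes, length_bytes, stripe_bytes):
        """True if the combined write covers whole, aligned stripes (no read-modify-write needed)."""
        return length_bytes > 0 and offset_bytes % stripe_bytes == 0 and length_bytes % stripe_bytes == 0

    # Example: 8-drive RAID 5 (7 data + 1 parity) with a 64 KB chunk size -- assumed numbers.
    stripe = full_stripe_bytes(64 * 1024, drives_total=8, parity_drives=1)
    print(stripe)                                      # 458752 bytes (448 KB) per full stripe
    print(is_full_stripe_write(0, stripe, stripe))     # True  -> parity computed from the write alone
    print(is_full_stripe_write(0, 64 * 1024, stripe))  # False -> old data/parity must be read back first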

On any iSCSI device, the requests are also split into multiple Ethernet packets for transmission. This overhead can be reduced if the SAN box, any physical network devices (such as switches), and the ESX network stack are all configured to use "jumbo frames" larger than the default 1500 bytes per packet (a rough frame-count comparison is sketched below). I have not yet figured out how to do this for ESX/ESXi, but it is probably either trivial or impossible.
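
As a rough illustration of why the frame size matters, the Python sketch below counts the frames needed for a single 256 KB transfer at the default MTU versus a jumbo MTU. The per-frame header overhead is an assumed approximation (real IP/TCP/iSCSI overhead varies), so treat the results as order-of-magnitude only:

    import math

    FRAME_OVERHEAD = 40 + 48   # assumed rough per-frame IP/TCP plus iSCSI PDU overhead, in bytes

    def frames_per_transfer(payload_bytes, mtu):
        usable = mtu - FRAME_OVERHEAD
        return math.ceil(payload_bytes / usable)

    for mtu in (1500, 9000):
        print(mtu, frames_per_transfer(256 * 1024, mtu))
    # Roughly 186 frames at the default 1500-byte MTU vs. about 30 at a 9000-byte jumbo MTU.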

J1mbo
Virtuoso

Re IOPS sticking between host reboots: http://www.techhead.co.uk/dell-equallogic-ps4000-hands-on-review-part-2 (scroll down to 'round robin IOPS')

Re storage benchmarking in general: http://blog.peacon.co.uk/benchmarking-storage-for-vmware/

http://blog.peacon.co.uk

Please award points to any useful answer.
