

Server, storage and IO metrics that matter need context

There is an old saying that the best I/O (Input/Output) is the one that you do not have to do.

In the meantime, let's get a side of context to go with them IOPS that vendors, marketers and their pundits are tossing around as server, storage and IO metrics that matter.

 

Expanding the conversation, the need for more context

The good news is that people are beginning to discuss storage beyond space capacity and cost per GByte, TByte or PByte for DRAM and nand flash Solid State Devices (SSD), Hard Disk Drives (HDD), along with Hybrid HDD (HHDD) and Solid State Hybrid Drive (SSHD) based solutions. This applies to traditional enterprise or SMB IT data centers with physical, virtual or cloud based infrastructures.


 

This is good because it expands the conversation beyond just cost per space capacity into other aspects including performance (IOPS, latency, bandwidth) for various workload scenarios, along with availability, energy effectiveness and management.

Adding a side of context

The catch is that IOPS, while part of the equation, are just one aspect of performance. By themselves, without context, they may have little meaning and can even be misleading in some situations.

 

Granted, a million IOPS can be entertaining, fun to talk about, or simply make good press copy. IOPS vary in size depending on the type of work being done, not to mention reads vs. writes and random vs. sequential access, all of which also have a bearing on data throughput or bandwidth (MBytes per second) along with response time.

 

However, are those million IOPS applicable to your environment or needs?

 

Likewise, what do those million or more IOPS represent about the type of work being done? For example, are they small 64 byte or large 64 KByte sized, random or sequential, cached reads or lazy writes (deferred or buffered), on a SSD or HDD?
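
To put some rough numbers on why IO size matters, here is a minimal sketch (the IOPS figure and sizes are illustrative only, not from any specific vendor):

    # Bandwidth implied by an IOPS number depends entirely on the IO size.
    def bandwidth_mbps(iops, io_size_bytes):
        return iops * io_size_bytes / 1_000_000  # decimal MBytes per second

    print(bandwidth_mbps(1_000_000, 64))        # 1M x 64 byte IOs = 64 MBytes/sec
    print(bandwidth_mbps(1_000_000, 64 * 1024)) # 1M x 64 KByte IOs = 65,536 MBytes/sec

Same headline IOPS number, three orders of magnitude difference in the amount of data actually being moved.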

 

How about the response time or latency for achieving them IOPS?

 

In other words, what is the context of those metrics and why do they matter?

 

[Image: metrics that matter, including IOPS for HDDs and SSDs]

 

Metrics that matter give context: for example, IO sizes closer to what your real needs are, reads and writes, mixed workloads, random or sequential, sustained or bursty; in other words, reflective of the real world.

 

As with any benchmark, take them with a grain (or more) of salt; the key is to use them as an indicator, then align them to your needs. The tool or technology should work for you, not the other way around.

 

Here are some examples of context that can be added to help make IOPS and other metrics matter:

  • What is the IO size: are they 512 bytes (or smaller) vs. 4 KBytes (or larger)?
  • Are they reads, writes, random, sequential or mixed, and in what percentages?
  • How was the storage configured, including RAID, replication, erasure or dispersal codes?
  • Then there is the latency or response time and IO queue depth for the given number of IOPS (see the sketch after this list).
  • Let us not forget whether the storage systems (and servers) were busy with other work or not.
  • If there is a cost per IOP, is that list price or discounted (hint: if discounted, start negotiations from there)?
  • What was the number of threads or workers, along with how many servers?
  • What tool was used, its configuration, as well as raw or cooked (aka file system) IO?
  • Was the IOPS number achieved with one worker or multiple workers, on a single server or multiple servers?
  • Did the IOPS number come from a single storage system or a total across multiple systems?
  • Fast storage needs fast servers and networks; what was their configuration?
  • Was the performance a short burst or a long sustained period?
  • What was the size of the test data used; did it all fit into cache?
  • Were short stroking (for IOPS) or long stroking (for bandwidth) techniques used?
  • Were data footprint reduction (DFR) techniques (thin provisioning, compression or dedupe) used?
  • Were writes committed synchronously to storage or deferred (aka lazy writes)?
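
As a sanity check on the latency and queue depth bullet above, Little's Law ties IOPS, response time and outstanding IOs together: concurrency is roughly IOPS times response time. A minimal sketch with made-up numbers:

    # Little's Law: outstanding IOs = arrival rate (IOPS) x response time (sec).
    def implied_outstanding_ios(iops, response_time_ms):
        return iops * response_time_ms / 1000

    # A claimed 100,000 IOPS at 5 ms average response time implies about 500
    # IOs in flight, a hint at how many workers or queue slots were needed.
    print(implied_outstanding_ios(100_000, 5))  # 500.0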

 

The above are just a sampling, and not all may be relevant to your particular needs; however, they help to put IOPS into more context. Another consideration around IOPS is the configuration of the environment: are they from an actual running application using some measurement tool, or are they generated from a workload tool such as IOmeter, IOrate or VDbench, among others?
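
For a feel of what such tools do under the covers, here is a toy Python random read generator (a minimal sketch, not a substitute for IOmeter or VDbench; the file path is hypothetical, and since it does not bypass the file system cache, expect inflated numbers):

    import os, random, time

    PATH = "/tmp/testfile.bin"  # hypothetical pre-created test file
    IO_SIZE = 4096              # 4 KByte random reads
    DURATION = 10               # seconds to run

    fd = os.open(PATH, os.O_RDONLY)
    blocks = os.fstat(fd).st_size // IO_SIZE

    ios, start = 0, time.time()
    while time.time() - start < DURATION:
        # Read one random 4 KByte block, aligned on an IO_SIZE boundary.
        os.pread(fd, IO_SIZE, random.randrange(blocks) * IO_SIZE)
        ios += 1
    elapsed = time.time() - start
    os.close(fd)

    print(f"{ios / elapsed:,.0f} IOPS, {ios / elapsed * IO_SIZE / 1e6:,.1f} MBytes/sec, "
          f"{1000 * elapsed / ios:.3f} ms avg response time (single worker)")

Note how even this toy has to declare its context: IO size, read vs. write, random vs. sequential, duration, worker count and cache behavior.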

 

Sure, there are more contexts and information that would be interesting as well; however, learning to walk before running will help prevent falling down.

 


Does size or age of vendors make a difference when it comes to context?

Some vendors are doing a good job of going for out-of-this-world, record-setting marketing hero numbers.

 

Meanwhile, other vendors are doing a good job of adding context to their IOPS, response time, bandwidth and other metrics that matter. There is a mix of startup and established vendors that give context with their IOPS and other metrics; likewise, size or age does not seem to matter for those who lack context.

 

Some vendors may not offer metrics or information publicly; fine, go under NDA to learn more and see if the results are applicable to your environments.

 

Likewise, if they do not want to provide the context, then ask some tough yet fair questions to decide if their solution is applicable for your needs.

 


Putting this all into context

What this means is: let us start providing, and asking for, metrics that matter, such as IOPS with context.

 

If you have a great IOPS metric and you want it to matter, then include some context such as the IO size (e.g. 4K, 8K, 16K, 32K, etc.), percentage of reads vs. writes, latency or response time, and random or sequential access.

 

IMHO the most interesting or applicable metrics that matter are those relevant to your environment and application. For example, if your main application that needs SSD does about 75% reads (random) and 25% writes (sequential) with an average size of 32K, then while fun to hear about, how relevant is a million 64 byte read IOPS? Likewise, when looking at IOPS, pay attention to the latency, particularly if SSD or performance is your main concern.
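
Putting rough numbers on that example (a sketch; the 10,000 IOPS figure for the matching workload is an assumption for illustration):

    # Compare the hero number (1M x 64 byte reads) to a modest IOPS number
    # that actually matches the workload above (32K average IO size).
    hero     = 1_000_000 * 64 / 1_000_000        # = 64 MBytes/sec
    matching = 10_000 * 32 * 1024 / 1_000_000    # = ~328 MBytes/sec
    print(f"hero: {hero:.0f} MB/sec vs. matching workload: {matching:.0f} MB/sec")

The modest, workload-matched number moves about five times the data of the headline number, which is why context beats hero IOPS.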

 

Get in the habit of asking or telling vendors or their surrogates to provide some context with them metrics if you want them to matter.

 

So how about some context around them IOPS (or latency and bandwidth, or availability for that matter)?

 

Ok, nuff said (for now).

Cheers gs


HDDs for cloud, virtual and traditional storage environments

This is a follow-up to a recent series of posts on Hard Disk Drives (HDDs), along with some posts about how many IOPS HDDs can do.

 

HDD and storage trends and directions include, among others:

HDDs will continue to be declared dead into the next decade, just as they have been for over a decade; meanwhile they are being enhanced and continue to be used in evolving roles.


SSDs will continue to coexist with HDDs, either as separate devices or converged as HHDDs. When, where and how they are used will also continue to evolve. High IO (IOPS) or low latency activity will continue to move to some form of nand flash SSD (with PCM around the corner), while storage capacity, including some of what has been on tape, stays on disk. Instead of more HDD capacity in a server, it moves to a SAN or NAS, or to a cloud or service provider. This includes backup/restore, BC, DR, archive and online reference, or what some call active archives.

The need for storage spindle speed and more

The need for faster revolutions per minute (RPM) performance of drives (e.g. platter spin speed) is being replaced by SSDs and more robust smaller form factor (SFF) drives. For example, some of today's 2.5" SFF 10,000 RPM (e.g. 10K) SAS HDDs can do as well as or better than their larger 3.5" 15K predecessors for both IOPS and bandwidth. This is also an example of where the RPM speed of a drive may not be the only determinant of performance, as it has been in the past.


[Image: performance comparison of four different drive types]
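
To see why a slower-spinning SFF drive can keep up, consider the mechanical math; average rotational latency is the time for half a revolution. A quick sketch:

    # Average rotational latency = time for half a platter revolution.
    def rotational_latency_ms(rpm):
        return 0.5 * 60_000 / rpm  # 60,000 ms per minute

    print(rotational_latency_ms(15_000))  # 3.5" 15K: 2.0 ms
    print(rotational_latency_ms(10_000))  # 2.5" 10K: 3.0 ms

    # The ~1 ms rotational gap can be offset by the shorter head travel of a
    # smaller 2.5" platter (lower seek time) plus higher areal density, i.e.
    # more data passing under the heads per revolution for better bandwidth.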

The need for storage space capacity and areal density

In terms of storage enhancements, watch for the appearance of Shingled Magnetic Recording (SMR) enabled HDDs to help further boost space capacity in the same footprint. Using SMR, HDD manufacturers can put more bits (e.g. areal density) into the same physical space on a platter.


[Image: traditional vs. SMR recording to increase storage areal density capacity]

 

The generic idea with SMR is to increase the areal density (how many bits can be safely stored per square inch) of data placed on spinning disk platter media. In the above image, on the left is a representative example of how traditional magnetic disk media lays down tracks next to each other. With traditional magnetic recording approaches, the tracks are placed as close together as possible for the write heads to safely write data.

 

With new recording formats such as SMR, along with improvements to read/write heads, the tracks can be more closely grouped together in an overlapping way. This overlapping (used in a generic sense) is like how the shingles on a roof overlap, hence Shingled Magnetic Recording. Other magnetic recording or storage enhancements in the works include Heat Assisted Magnetic Recording (HAMR) and helium-filled drives. Thus, there is still plenty of bits and bytes room for growth in HDDs well into the next decade to co-exist with and complement SSDs.

DIF and AF (Advanced Format), or software defining the drives

Another evolving storage feature that ties into HDDs is the Data Integrity Feature (DIF), which has a couple of different types. Depending on which type of DIF (0, 1, 2 or 3) is used, there can be added data integrity checks from the application down to the storage medium or drive, beyond normal functionality. Here is something to keep in mind: as there are different types or levels of DIF, when somebody says they support or need DIF, ask them which type or level, as well as why.
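
For a rough picture of what DIF adds, each 512 byte block carries 8 extra bytes of protection information (guard, application and reference tags). A simplified sketch; the guard computation below is a placeholder, as the actual T10 PI spec uses a specific CRC-16 and per-type checking rules:

    import struct

    def protection_info(data, app_tag, lba):
        # 8 bytes appended per 512 byte block: 2 byte guard tag (a checksum
        # of the data), 2 byte application tag, 4 byte reference tag (e.g.
        # the low 32 bits of the LBA, checked in Type 1).
        guard = sum(data) & 0xFFFF  # placeholder, NOT the real T10 CRC-16
        return struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)

    sector = bytes(512)
    print(len(sector + protection_info(sector, 0, 1234)))  # 520 byte block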

 

Are you familiar with Advanced Format (AF)? If not, you should be. Traditionally, outside of special formats for some operating systems or controllers, the standard open system data storage block, page or sector has been 512 bytes. This served well in the past; however, with the advent of TByte and larger sized drives, a new mechanism is needed. The need is to support both larger average data allocation sizes from operating systems and storage systems, as well as to cut the overhead of managing all the small sectors. Operating systems and file systems have added new partitioning features such as the GUID Partition Table (GPT) to support 1TB and larger SSD, HDD and storage system LUNs.

 

These enhancements enable larger devices to be used in place of traditional Master Boot Record (MBR) or other operating system partition and allocation schemes. The next step, however, is to teach the operating systems, file systems and hypervisors, along with their associated tools and drivers, how to work with 4,096 byte or 4 KByte sectors. The advantage will be to cut the overhead of tracking all of those smaller sectors or file system extents and clusters. Today many HDDs support AF; however, by default they may have 512-byte emulation mode enabled due to lack of operating system or other support.
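
On Linux you can check whether a given drive is 512 native, AF with 512 byte emulation (512e) or 4K native by comparing logical and physical block sizes via sysfs; a minimal sketch (the device name sda is an assumption, adjust for your system):

    dev = "sda"  # assumed device name
    base = f"/sys/block/{dev}/queue/"
    logical = int(open(base + "logical_block_size").read())
    physical = int(open(base + "physical_block_size").read())

    if logical == 512 and physical == 4096:
        print("Advanced Format drive in 512 byte emulation mode (512e)")
    elif logical == physical == 4096:
        print("4K native (4Kn) drive")
    else:
        print(f"logical={logical}, physical={physical} byte sectors")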

Intelligent Power Management, moving beyond drive spin down

Intelligent Power Management (IPM) is a collection of techniques that can be applied to vary the amount of energy consumed by a drive, controller or processor to do its work. In the case of an HDD, these include slowing the spin rate of the platters; however, keep in mind that mass in motion tends to stay in motion. This means that HDDs, once up and spinning, do not need as much relative power, as they function like a flywheel. Where their power draw comes in is during reads and writes, in part due to the movement of the read/write heads, but also for running the processors and electronics that control the device. Another big power consumer is when drives spin up; thus if they can be kept moving, albeit at a lower rate, along with disabling energy used by the read/write heads and their electronics, you can see a drop in power consumption. Btw, a current generation 3.5" 4TB 6Gbs SATA HDD consumes about 6-7 watts of power while in active use, or less when in idle mode. Likewise, a current generation high performance 2.5" 1.2TB HDD consumes about 4.8 watts of energy, a far cry from the 12-16 plus watts of energy some use as HDD fud.
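
Those wattage figures translate directly into operating cost; a quick sketch (the $0.10 per kWh electricity rate is an assumption, substitute your own):

    # Annual energy cost of a drive at a constant power draw.
    def annual_cost_usd(watts, usd_per_kwh=0.10):  # assumed rate
        return watts * 24 * 365 / 1000 * usd_per_kwh

    print(f"{annual_cost_usd(6.5):.2f}")  # 3.5" 4TB at ~6.5 W: ~$5.69/year
    print(f"{annual_cost_usd(4.8):.2f}")  # 2.5" 1.2TB at ~4.8 W: ~$4.20/year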

Hybrid Hard Disk Drives (HHDD) and Solid State Hybrid Drives (SSHD)

Hybrid HDDs (HHDDs), also known as Solid State Hybrid Drives (SSHDs), have been around for a while, and if you have read my earlier posts, you know that I have been a user and fan of them for several years. However, one of the drawbacks of HHDDs has been a lack of write acceleration (e.g. they only optimize for reads) in some models. Current and emerging HHDDs are appearing with a mix of nand flash SLC (used in earlier versions), MLC and eMLC along with DRAM, while enabling write optimization. There are also more drive options available as HHDDs from different manufacturers, for both desktop and enterprise class scenarios.

 

The challenge with HHDDs is that many vendors either do not understand how they fit with and complement their tiering or storage management software tools, or simply do not see the value proposition. I have had vendors and others tell me that HHDDs do not make sense as they are too simple; how can they be a fit without requiring tiering software, controllers, SSDs and HDDs to be viable?

 


 

I also see a trend similar to when desktop high-capacity SATA drives appeared in enterprise class storage systems in the early 2000s. Some of the same people did not see where or how a desktop class product or technology could ever be used in an enterprise solution.

 

Hmm, hey wait a minute, I seem to recall similar thinking when SCSI drives appeared in the early 90s. Funny how some things do not change; DejaVu, anybody?

 

Does that mean HHDDs will be used everywhere?

Not necessarily; however, there will be places where they make sense, and others where either a HDD or SSD will be more practical.

Networking with your server and storage

Drive native interfaces near-term will remain 6Gbs (going to 12Gbs) SAS and SATA, with some FC (you might still find a parallel SCSI drive out there). Likewise, with bridges or interface cards, those drives may appear as USB or something else.

 

What about SCSI over PCIe; will that catch on as a drive interface? Tough to say; however, I am sure we can find some people who will gladly try to convince you of that. FC based drives operating at 4Gbs FC (4GFC) are still being used in some environments; however, most activity is shifting over to SAS and SATA. SAS and SATA are switching over from 3Gbs to 6Gbs, with 12Gbs SAS on the roadmap.

So which drive is best for you?

That depends: do you need bandwidth or IOPS, low latency or high capacity, a small low profile thin form factor, or feature functions? Do you need a hybrid or all SSD, or a self-encrypting device (SED), also known as Instant Secure Erase (ISE)? These are among your various options.

[Image: a diverse collection of disk drives]

Why the storage diversity?

 

Simple: some are legacy, soon to be replaced and disposed of, while others are newer. I also have a collection, so to speak, that gets used for various testing, research, learning and trying things out. Click here and here to read about some of the ways I use various drives in my VMware environment, including creating Raw Device Mapped (RDM) local SAS and SATA devices.

 

Other capabilities and functionality existing or being added to HDDs include RAID and data copy assist, secure erase, self-encryption, and vibration dampening, among other abilities for supporting dense data environments.

Wrapup, for now

Do not judge a drive by its interface, space capacity, cost or RPM alone. Look under the cover a bit to see what is inside in terms of functionality, performance and reliability, among other options, to fit your needs. After all, in the data center or information factory, not everything is the same.

 

From a marketing and fun-to-talk-about new technology perspective, HDDs might be dead for some. The reality is that they are very much alive in physical, virtual and cloud environments; granted, their role is changing.

 

Ok, nuff said (for now).

Cheers gs