VMware Cloud Community

Guest Disk Defragmentation Topic again - Windows Logical NTFS Fragmentation Woes


This question is specific to SAN users with SSD, tiering and thin provisioning in place, as is the case in many enterprises.

Should you run disk defragmentation software on your guest virtual machines?  The old rule of thumb was no.  It causes issues with thin provisioning, CBT, backup deduplication and auto tiering, and it generates excessive SSD writes and excessive I/O operations.

But, with all that being said, the below two articles make a strong case that a heavily "logically" fragmented file system will also cause excessive I/O operations and potentially excessive SSD writes as well.



So the question is...  Will running a disk defragmenter to minimize "logical" NTFS fragmentation be beneficial?  When does it become beneficial?  How often would you run it?  And finally, if you were going to use a logical defragmentation tool, would you use Diskeeper or PerfectDisk?

It seems to me that it would be safe to run a defragmentation tool 1-2 times per year per virtual machine.  I think anything more frequent is probably overkill.  In our case, with SSD and auto tiering and the potential additional backup work and deduplication space, we would need to stagger this so that we are only defragmenting 1/6 or 1/12 of the virtual machines per month.
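To make the staggering idea concrete, here is a minimal sketch of how the VMs could be split into monthly batches. The function name `monthly_batches`, the `cycle_months` parameter and the VM names are made up for illustration; nothing here comes from a VMware or defragmenter API.

```python
def monthly_batches(vm_names, cycle_months=6):
    """Split VMs into `cycle_months` roughly equal batches, one batch per month.

    Hypothetical helper: with cycle_months=6 each VM is defragmented twice a
    year, with cycle_months=12 once a year.
    """
    if cycle_months < 1:
        raise ValueError("cycle_months must be at least 1")
    ordered = sorted(vm_names)
    # Round-robin assignment keeps batch sizes within one VM of each other.
    return [ordered[i::cycle_months] for i in range(cycle_months)]


vms = [f"vm{n:02d}" for n in range(1, 14)]  # 13 made-up VM names
schedule = monthly_batches(vms, cycle_months=6)
for month, batch in enumerate(schedule, start=1):
    print(f"month {month}: {batch}")
```

In practice you would probably also want to group by datastore or backup job so that one month's batch does not hit the same LUN or the same backup window all at once.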

I have read many articles giving compelling reasons not to run a defragmentation utility, but I have yet to see one of those articles address the issue of NTFS logical fragmentation creating additional I/O requests on your SAN infrastructure.  Is this issue of additional I/O requests more problematic with iSCSI than with Fibre Channel?

We are leaning towards testing some products and considering implementation, although we are not quite sure yet how to do a good case study on the effects.  I would love to hear feedback and other opinions on this topic.

One final note for reference: we attempted to validate this with our SAN provider, and the above articles went over the heads of the engineers we were talking with.  They continued to say that the SAN does its own physical defragmentation, so you don't need to defragment.  When I presented the above reference articles, they went into depth on how the SAN defragments but ignored the topic of logical NTFS fragmentation and the additional I/O operations caused by logical fragmentation.

2 Replies

I have read those articles, and while both are quite correct in their arguments, neither of them takes SSD into account (and as you wrote, "...This question is specific to SAN users with SSD...").

For such a scenario I can tell you that (de)fragmentation does not make any sense with SSD, since there is no "seek time" in the traditional sense (dependent on the physical location of sectors/cylinders). And even if you "defragmented" your filesystem and the OS showed you a nice picture of a defragmented disk with no free space between allocated sectors, it does not mean the blocks would be contiguous on the SSD. On the contrary, after "defragmentation" your SSD could be even more "fragmented" than before.

The SSD controller is an opaque abstraction layer, so the OS does not get true information about the underlying SSD geometry (at least not with the standard ATA/SCSI protocols). Moreover, any write operation is much more complex than with traditional disks and can consist of multiple read/write operations on larger chunks of data. If the OS does not change a file, the file is assumed to stay in place, but this assumption is not valid with SSD. If the file shares a block with another file that has been changed, the whole block is read, recalculated and written again (and very probably to a different memory cell, due to the wear-leveling mechanism).

Another problem is the SAN (and the RAID layer), where the question is: if the storage server (or RAID controller) presents some allocated space to the OS as "contiguous" (or "fragmented"), is it truly contiguous or fragmented on the underlying hardware? I do not know all those NAS protocols well enough to know the answer, but I suppose this might not always be true (unless that storage space is mapped as raw/pass-through) or may be protocol- or vendor-dependent.

So to summarise, I would say: with traditional disk-based storage it might always be worth trying defragmentation (and evaluating its impact on performance), but with SSD definitely not. It could actually do more harm than good...

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉

I think you missed the point, as most people do when talking about "logical" fragmentation.

As I understand the topic, a logically fragmented NTFS file may require 2+ I/O operations to retrieve the same file from the SAN, whereas a non-fragmented file will only require 1 I/O operation.  As SAN and VMware admins, our goal is to maximize IOPS and minimize latency, so more I/O operations to retrieve the same file is bad.  The underlying SAN hardware (SSD, physical disks, whatever) doesn't matter.  These are operations issued at the operating system layer, above the storage.
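Here is a rough model of what I mean, as a sketch only. It assumes each non-adjacent extent of a file needs its own read request and that a single request cannot exceed some maximum transfer size. The function name, the extent lists and the 256-cluster limit are all made up for illustration; real NTFS and guest I/O stack behavior is more involved.

```python
import math


def read_requests(extents, max_transfer_clusters=256):
    """Count read requests for a file given as (start_cluster, length) extents.

    Assumptions (illustrative only): extents that touch each other are merged
    into one run, every remaining run needs at least one request, and a run
    longer than max_transfer_clusters is split into several requests.
    """
    merged = []
    for start, length in sorted(extents):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This extent begins exactly where the previous run ends.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return sum(math.ceil(length / max_transfer_clusters) for _, length in merged)


contiguous = [(1000, 64)]                                        # one 64-cluster run
fragmented = [(1000, 16), (5000, 16), (9000, 16), (20000, 16)]   # same 64 clusters

print(read_requests(contiguous))  # 1
print(read_requests(fragmented))  # 4
```

Same file size, four times the requests hitting the SAN, and none of that depends on what the array does with the blocks underneath.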

My real question is: how many additional I/O operations from fragmentation are acceptable before those I/O penalties become more of a problem than the wasted I/O operations, tiering churn, extra SSD writes, etc. that occur from performing a defragmentation?

The answer might be never, meaning the cost of defragmenting is always higher than the cost of the additional I/O operations from logical fragmentation, or it might be worthwhile to defragment on a weekly, monthly or less frequent basis.
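A back-of-the-envelope way to frame that trade-off, as a sketch only: treat one defragmentation pass as a fixed I/O cost and the fragmentation penalty as a steady number of extra I/O operations per day. The function name and the numbers are invented for illustration, and the model deliberately ignores tiering, CBT and deduplication side effects, which would have to be measured separately.

```python
def break_even_days(defrag_io_cost, extra_io_per_day):
    """Days of fragmented operation whose extra I/O equals one defrag pass.

    Simplifying assumptions: the penalty is constant per day and the defrag
    cost is a single lump sum of I/O operations.
    """
    if extra_io_per_day <= 0:
        return float("inf")  # no measurable penalty: defragmenting never pays off
    return defrag_io_cost / extra_io_per_day


# Hypothetical numbers: a pass costs 2,000,000 I/Os, fragmentation adds 10,000 per day.
print(break_even_days(2_000_000, 10_000))  # 200.0
print(break_even_days(2_000_000, 0))       # inf
```

If the break-even comes out longer than the interval you would defragment at, the answer for that VM is effectively "never"; if it is much shorter, a more frequent schedule might be justified.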

And keep in mind, when I refer to defragmentation, I mean virtualization-aware defragmentation that "supposedly" minimizes the number of underlying physical block moves needed to perform a logical defragmentation.