VMware Cloud Community
Groundbeef79
Enthusiast

To defrag or not?

We have a particular server that hosts several Synergex databases.  There are also a number of flat type files that get deleted/copied/added daily.  According to Windows Server 2008 R2, the disk is very fragmented.  This server exists on an Equallogic PS6000XV SAN.  It is part of RAID 50 where it has its own dedicated volume.  The VMDK is thick-provisioned.

When this was a physical server and the data was hosted on Windows Server 2003 with a local RAID-5, we had Diskeeper installed and would run a defrag weekly.  Would this be a good idea now?  In looking through the forums, I've seen arguments both ways.  Most are pretty dated, so I was wondering what the current consensus is.  I see three ways to accomplish this:

1.  Defrag the SAN, can that even be done?  Does anyone know if Equallogic has some tool for this?

2.  Defrag the VMDK?  Again, is this like voodoo?

3.  Defrag the file system within the VMDK - using either built-in disk tools or something like Diskeeper?

or

4.  Just leave well enough alone because Windows doesn't know what it's talking about.

When it was physical, we saw noticeable performance increases after the first defrag.  In either case, this thing runs way better in a virtual environment with fewer resources than it did as a physical server.  The virtual stuff never ceases to amaze me.

1 Solution

Accepted Solutions
J1mbo
Virtuoso

"We have a particular server that hosts several Synergex databases... This server exists on an Equallogic PS6000XV... part of RAID 50 where it has its own dedicated volume.  The VMDK is thick-provisioned."

The expected outcome of the defrag is what needs to be considered. As Andre says, there is no certainty about what the EqualLogic is doing with blocks, and in any case, even if there were, since the volume is spread across 14 spindles, what exactly is a sequential access anyway?  Does the EqualLogic provide storage for anything else?  Most databases tend to show a random IO workload as they gather (and scatter) data across the tables, so how the data is organised physically just doesn't really matter.

The data placement of the EqualLogic system is quite clever. For example, if an SSD array were added in future, the system could migrate hot blocks to that array.

Replication is also incredibly important.  IIRC, EqualLogic uses changed-block tracking with a 64MB block size.  So at the end of each replication interval, any 64MB block that has been touched by a write will be transmitted across the WAN. Clearly a defrag will touch a massive number of blocks and could make this unworkable.

"There are also a number of flat type files that get deleted/copied/added daily."

Along the same lines, this might need some thought.  For example, you might consider running such temporary space in folders with NTFS compression enabled; if these are text files, this can be highly effective at reducing disk space requirements (and hence the number of blocks touched for WAN replication).  Another option could be to run these to/from temporary space provided by a file server on a non-replicated volume.

Hope that helps!
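
To put rough numbers on the 64MB change-tracking point, here is a minimal Python sketch. It assumes the "IIRC" figure of 64MB is right and that every tracking block touched by a write is sent whole; the function name, the 100 GB volume size, and the write pattern are all hypothetical and only for illustration.

```python
import random

MB = 1024 * 1024
TRACK_BLOCK = 64 * MB  # assumed tracking granularity (the "IIRC" figure, not verified)


def replicated_bytes(write_offsets, write_size, track_block=TRACK_BLOCK):
    """Bytes shipped if every tracking block touched by any write is sent whole."""
    dirty = set()
    for off in write_offsets:
        first = off // track_block
        last = (off + write_size - 1) // track_block
        dirty.update(range(first, last + 1))
    return len(dirty) * track_block


def demo():
    """Compare scattered vs contiguous 4 KB writes on a hypothetical 100 GB volume."""
    volume = 100 * 1024 * MB
    cluster = 4096  # common NTFS cluster size
    rng = random.Random(0)
    # 10,000 small writes scattered across the volume, as a defrag pass might do.
    scattered = [rng.randrange(0, volume // cluster) * cluster for _ in range(10_000)]
    # The same amount of data written in one contiguous run from offset 0.
    contiguous = [i * cluster for i in range(10_000)]
    return replicated_bytes(scattered, cluster), replicated_bytes(contiguous, cluster)


if __name__ == "__main__":
    s, c = demo()
    print(f"scattered: {s / MB:.0f} MB to replicate, contiguous: {c / MB:.0f} MB")
```

Both patterns write about 39 MB of real data, but the scattered one dirties far more tracking blocks, which is the effect a defrag would have on the next replication cycle.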

9 Replies
DSTAVERT
Immortal

I would defrag the disk from within Windows. Since the disk is already thick it won't grow like a thin disk will.

-- David -- VMware Communities Moderator
vmroyale
Immortal

Hello.

The best "general" advice I have heard on this is to do some performance testing before and after.  These numbers should tell you whether or not it is worth your time.
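
One rough way to collect such before/after numbers is sketched below in Python. This is only an assumption about how one might measure it, not a recommended benchmark: the function name is hypothetical, it only times a sequential read of a single file, and OS or array caching can easily skew the result, so run it several times against a file larger than the cache.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return the observed throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
```

Run it against the same large data file before and after the defrag and compare the figures, ideally alongside the application's own response times.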

Good Luck!

Brian Atkinson | vExpert | VMTN Moderator | Author of "VCP5-DCV VMware Certified Professional-Data Center Virtualization on vSphere 5.5 Study Guide: VCP-550" | @vmroyale | http://vmroyale.com
Groundbeef79
Enthusiast

Thanks for the responses guys.

After some further Googling, I ran across another post that has me a bit worried.

http://communities.vmware.com/message/1720812

Sketchy00's comment "Also worth mentioning is that if you do any SAN to SAN replication, disk defrags will make the number of changed blocks skyrocket."

While we aren't doing that right now, we will be replicating to another Equallogic SAN offsite in the next 6 months.  I'm no disk guru, so forgive my ignorance.  Does there exist such a solution that doesn't mess with the changed blocks?  I'd hate to have to replicate 100 gigs of data where there's really only 10 that may have changed.

Maybe I should just talk to Diskeeper or some other vendor?

GlenS
Contributor

Hi Groundbeef79

Defrag Windows if it is saying that it needs it.

But I'd use a third-party defrag program such as MyDefrag (http://www.mydefrag.com/) to do it.  It works much better than the one built into Windows.

Hope it works!

Regards

Glen

AndreTheGiant
Immortal

Defragmenting on a SAN (and in some cases also on a NAS) may not make sense.

You do not really know how the storage handles the blocks, and some storage systems can optimize the allocation themselves.

On local storage this could be different.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Groundbeef79
Enthusiast

That's about the best explanation I've seen on the subject.  Do you work for Equallogic? :) I was just wondering if it was even worth looking into, and it sounds like I ought to just let Equallogic do its thing.  Performance isn't an issue right now, but there was some question about the merit of doing the defrag.  I still wonder, though, whether it would be worthwhile to do the Windows defrag.

"Along the same lines, this might need some thought.  For example you might consider running such temporary space in folders set with NTFS compression enabled; if these are text files this can be highly effective at reducing disk space requirements (and hence the number of blocks touched for WAN replication).  Another option could be to run these to/from temporary space provided by a file server on a non-replicated volume."

Unfortunately, this is out of our hands as this is a vendor tailored software solution.  I won't say what it is for fear of retribution, but it is very poorly written.  The other sad part is we are pretty much married to it.

Our storage guy at Dell has been trying to sell us on Certeon WAN acceleration for our Equallogic replication project.  I suppose with the 64 MB chunks you're talking about that makes total sense.

Josh26
Virtuoso

VMware aside, it's never made sense to defrag a SAN.

rickardnobel
Champion

Josh26 wrote:

VMware aside, it's never made sense to defrag a SAN.

Why would it never make sense? If NTFS fragmentation happens at 4 KB granularity and the SAN RAID stripe size is, say, 128 KB, wouldn't it be an advantage for the SAN if a disk I/O could hit one (or a few) disks rather than having to be spread over all (or many) spindles?
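
A small Python sketch of that arithmetic, under assumptions that may not hold on a real array: the 128 KB stripe unit is only the guessed figure, guest offsets are assumed to map linearly onto stripe units, and the function name and the 1 MB example file are hypothetical.

```python
KB = 1024


def stripe_units_touched(extents, stripe_unit=128 * KB):
    """Count distinct stripe units covered by a list of (offset, length) extents."""
    units = set()
    for offset, length in extents:
        if length <= 0:
            continue
        first = offset // stripe_unit
        last = (offset + length - 1) // stripe_unit
        units.update(range(first, last + 1))
    return len(units)


# A 1 MB file stored as one contiguous extent at offset 0.
contiguous = [(0, 1024 * KB)]
# The same 1 MB split into 256 fragments of 4 KB, each placed 1 MB apart.
fragmented = [(i * 1024 * KB, 4 * KB) for i in range(256)]

print(stripe_units_touched(contiguous))  # 8
print(stripe_units_touched(fragmented))  # 256
```

Under those assumptions, reading the fragmented file touches 256 stripe units instead of 8. Whether that matters in practice depends on how the array really places blocks, which the guest cannot see.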

My VMware blog: www.rickardnobel.se