VMware Cloud Community
XxTRAINEExX
Contributor

Defrag an ESX Guest?

I've been reading multiple threads on this topic but I think I am missing something here. There are several threads encouraging people to defrag their ESX guests. How does this improve performance?

My understanding of defrag is that it moves blocks around on your disk in order to increase performance, i.e., my file sits in 9 blocks scattered across the disk in random locations. Defrag would move these 9 blocks to create 9 contiguous blocks on the spindle. This allows the disk arm to move less, thus improving performance on the disk.

On ESX this is highly abstracted from the guest. The guest thinks the files are sitting all over the disk, so it moves the blocks around to increase performance. The problem is that the guest is mapping data to blocks on a virtual SCSI disk (VMDK). This has nothing to do with the actual block placement on your SAN. There is no way for the virtual guest to see contiguous blocks on physical spindles sitting on a SAN, is there? If not, isn't defrag within a guest pointless?

55 Replies
paulo_meireles

I agree with XxTRAINEExX that the VM has no way of knowing anything about the physical location of data blocks, so defragmentation shouldn't improve performance due to block placement. However, I also agree with RParker: defragmentation does, indeed, improve performance.

I think there's another factor, besides block placement, that everyone has been missing: NTFS metadata. I'm going to assume that we're all talking about Windows machines and NTFS filesystems; correct me if I'm wrong. Files on NTFS are referenced, in metadata (the Master File Table), by a linked list of records. Every record stores the starting cluster, the length (in clusters), and the address of the next block of metadata. So, a 10-cluster contiguous file has a single reference in metadata: it starts on cluster X, occupies 10 clusters, and there is no "next fragment". If we append 2 clusters to that file and it fragments, we'll now have another record in the MFT. I think you get the idea.

A heavily fragmented file will be made of a long list of MFT records, and processing that list not only involves more I/O (reading several MFT records instead of just one) but also takes more CPU. So, this is where I think most performance improvements come from: NTFS metadata optimization, by reducing the length of the linked list that records where the file's fragments reside. And yes, we should defragment our VMs' NTFS volumes - even if only the most fragmented files.
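
To make the bookkeeping argument concrete, here is a minimal sketch in Java. It is a toy model only: the `ExtentModel` and `Extent` names are made up for illustration, and it does not reproduce the real NTFS on-disk format. It simply collapses a file's cluster numbers into (start, length) runs, so you can see the number of records grow as the file fragments.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of extent bookkeeping; NOT the real NTFS on-disk layout. */
final class ExtentModel {

    /** One contiguous run of clusters: first cluster and run length. */
    static final class Extent {
        final long startCluster;
        final long length;

        Extent(long startCluster, long length) {
            this.startCluster = startCluster;
            this.length = length;
        }

        @Override
        public String toString() {
            return "(start=" + startCluster + ", length=" + length + ")";
        }
    }

    /** Collapses a file's cluster numbers, given in file order, into contiguous extents. */
    static List<Extent> toExtents(long[] clusters) {
        List<Extent> extents = new ArrayList<>();
        int i = 0;
        while (i < clusters.length) {
            long start = clusters[i];
            long length = 1;
            // Extend the current run while the next cluster directly follows this one.
            while (i + 1 < clusters.length && clusters[i + 1] == clusters[i] + 1) {
                length++;
                i++;
            }
            extents.add(new Extent(start, length));
            i++;
        }
        return extents;
    }

    public static void main(String[] args) {
        // A 10-cluster contiguous file (cluster numbers are arbitrary).
        long[] contiguous = {100, 101, 102, 103, 104, 105, 106, 107, 108, 109};
        // The same file after 2 appended clusters landed elsewhere on the volume.
        long[] appended = {100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 500, 501};
        System.out.println("contiguous:   " + toExtents(contiguous));
        System.out.println("after append: " + toExtents(appended));
    }
}
```

In this model the contiguous file needs one record and the appended file needs two; a file scattered over N separate runs needs N, wherever those runs physically live on the SAN.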

Paulo Meireles

TomHowarth
Leadership

No, you do not always defrag, especially now that you can use thin-provisioned disks and linked clones (View). A defrag touches every sector on a disk; the net result is that all your guests would suddenly grow to the size of the disk. I.e., your thin-provisioned 40GB VMDK which used to take up 15GB would suddenly take up 40GB, and all your linked clones taking up 1.5GB of space would suddenly be the size of your master replica.

So I would say no, a defrag is not necessary. It is horses for courses here.
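
The growth mechanism can be sketched with a toy model in Java. Everything here is an assumption for illustration: `ThinDiskModel` is a hypothetical class, one "block" stands in for 1GB, and the model assumes nothing ever reclaims a block once the guest has written to it. Under those assumptions the backing file grows by however many previously untouched blocks the defragmenter writes, which may be less than the full disk.

```java
import java.util.BitSet;

/** Toy thin-provisioned disk: backing space is consumed the first time a block is written. */
final class ThinDiskModel {
    private final int capacityBlocks;
    private final BitSet everWritten; // blocks the guest has written at least once

    ThinDiskModel(int capacityBlocks) {
        this.capacityBlocks = capacityBlocks;
        this.everWritten = new BitSet(capacityBlocks);
    }

    /** Guest writes a block; a never-written block makes the backing file grow. */
    void write(int block) {
        if (block < 0 || block >= capacityBlocks) {
            throw new IllegalArgumentException("block out of range: " + block);
        }
        everWritten.set(block);
    }

    /** Guest moves data; the source block stays allocated because this model never reclaims space. */
    void move(int from, int to) {
        write(to);
    }

    int allocatedBlocks() {
        return everWritten.cardinality();
    }

    public static void main(String[] args) {
        ThinDiskModel disk = new ThinDiskModel(40); // 40 blocks, think 1GB each
        for (int b = 0; b <= 28; b += 2) {
            disk.write(b); // 15 blocks in use, scattered over the even positions
        }
        System.out.println("allocated before defrag: " + disk.allocatedBlocks());

        // Compact the data into blocks 0..14: sources 16,18,..,28 go to free slots 1,3,..,13.
        for (int from = 16, to = 1; from <= 28; from += 2, to += 2) {
            disk.move(from, to);
        }
        System.out.println("allocated after defrag:  " + disk.allocatedBlocks());
    }
}
```

In this run the guest still holds 15 blocks of data, but the backing file has grown from 15 to 22 blocks because 7 moves landed on blocks that had never been written.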

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on "VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment" (http://my.safaribooksonline.com/9780136083214)
Contributing author on "VCP VMware Certified Professional on vSphere 4 Study Guide: Exam VCP-410"
Texiwill
Leadership

Hello,

Moved to the Virtual Machine and Guest OS forum.

I would only defrag if A) you are not using a linked clone, B) you are not using a snapshot, and C) you are not using thin provisioning. A defrag, as Tom stated, would make a hash of thin provisioning, linked clones, and snapshots. The disks would grow to 'allocated' size, eliminating the benefits of these technologies.


Best regards, Edward L. Haletky, VMware Communities User Moderator, VMware vExpert 2009, DABCC Analyst
Now available on Rough Cuts: 'VMware vSphere(TM) and Virtual Infrastructure Security: Securing ESX and the Virtual Environment'
Also available: 'VMware ESX Server in the Enterprise'

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
RParker
Immortal

You just proved my point, without realizing it... You are suggesting that defragging the file will somehow move all of the file into the first quadrant on the SAN. And we are telling you this is IMPOSSIBLE

I didn't say the SAN, I said the VM. 1 block of data in a VM occupies 4 different blocks of data on the SAN. How can the SAN know the contents of that block of data? It CANNOT! That's impossible. SANs manage DISKS, not files.

OSes manage files, not disks. The quadrant I was referring to is inside a block of data in the VM. If you split a file among different blocks of data INSIDE the VM, and those blocks of DATA correlate to DIFFERENT blocks of DATA on the SAN, you are reading multiple blocks on the SAN to get at one file, true or false?

I am saying that if you simply defrag the VM, and move that 1 file to the SAME block INSIDE the VM, that would mean that 1 BLOCK of data MUST then only reside in one place on the SAN, and by reading ONLY 1 block of DATA from that SAN that MUST be more efficient than reading MULTIPLE areas of the SAN.

And SANs do not defrag; you need to talk to a programmer who writes file systems. They will explain how it works. A SAN cannot see past a block of data. So how does it know which blocks go with other blocks if the contents of the blocks are foreign? Explain that.

And you can't simply ignore the fact that technology changes every single day. To say that defrag offers ZERO benefit across the board, without taking into account different machines, different SANs, and different controllers, is a complete disregard of technology. You don't know EVERY software component and every system well enough to say unequivocally that defrag offers NO benefit. Everyone has a different situation.

Someone also said that defragging causes more performance issues. Yes, that's BECAUSE (think outside the box here) those VMs NEED to be defragged. Yes, it will cause more activity on disks.... AT FIRST. Once those VMs are aligned, defragged or whatever, THEN activity will be minimal.

That's like saying it's too much work to move the furniture in the house to clean the carpet, let's just leave it. There is going to be MORE work to do things right, but if you do it RIGHT things will be better.

RParker
Immortal

Ok I did the defrag / disk benchmark test. The results? Drum roll please ...... better performance. Linear read before the defrag was 68 MB/s after it was 86.5 MB/s. That's a big improvement. I'm doing it on another VM now. It has 2 disks. I'd do a write test but it's destructive and these are production VMs I'm testing on.

Finally the voice of reason! Thanks Ej, I owe you a lunch!

Thread close, have a nice day!

RParker
Immortal

The librarian says "Got it! I will place the book on shelf 1." Your work is done, so you go home. But the librarian has other plans... The librarian has a much better way of organizing books, so she decides to place it on shelf 83. But what if you come back and want that book again?

OK, continuing with your great analogy then, the pages in the book, assume that librarian can't read. She is organizing based upon what..... How would she know which books belong to what shelf UNLESS she can read the books and tell HOW they should be organized. Do you put books out there by type, what if it's a mystery thriller? Does it belong in mystery or thriller? This can go on forever...

Do we organize by subject, author? I know libraries are organized for human readability and alphabet, and a computer doesn't need it, but at what level does a computer organize data, if that DATA isn't known. It treats a block like a book, and a computer can't READ that book so how is it going to know how to DEFRAG (which is a file level, operating system function not a SAN function) a block of data. ALL a SAN does is move blocks based upon activity, such as a 14 disk array and 1 disk is getting more IO than the rest, so it splits the data off.

But just because that book moved, the contents the pages, the words INSIDE that book CAN be organized better, and YOU just proved my point. You moved my block of data and that's fine, for YOUR benefit, but within that block I want to better organize things for MY benefit, hence that's a VM defrag!

So just because you moved that book and the librarian has to go and find it for me, that STILL doesn't mean I can't reorganize the pages inside that book for better readability. That's what I am saying.

If I am a teacher, and I have MANY books my students read, I want to reorganize ALL my books on your shelf to make them better for my students, and you are simply stating that just because my books are scattered, I somehow can't benefit from making it easier for my students to read?

I think not.







pdrace
Hot Shot

I've seen real-world results on VMs that were heavily fragmented.

There was no measurable increase in performance after a defrag.

The other thing to consider with defragmenting VMs is that it will cause a significant increase in snapshot deltas and replication times if you are using hardware snapshots and replicating them off site.

We use NetApp and stopped running VM defrags over 6 months ago because of this.

I have my doubts about Diskeeper's claims, as they have a vested interest in saying you should defrag VMs.

RParker
Immortal

We use NetApp and stopped running VM defrags over 6 months ago because of this.

I have my doubts about Diskeeper's claims, as they have a vested interest in saying you should defrag VMs.

Yeah, and that's why Diskeeper has a disclaimer in their instructions that you should not use defrag with snapshots..... That's part of the reading material and things to do BEFORE you do anything with a computer: talk to the Spanish guy, Manual!

The point to this entire discussion isn't whether or not Defrag is good for every environment, obviously it's not, but it DOES have merit, and that's all I was trying to illustrate. To say that it DOESN'T offer any benefit to anyone is false.

RParker
Immortal

No, you do not always defrag, especially now that you can use thin-provisioned disks and linked clones

And yes I did say ALWAYS, but I also said recommended, and just because it is a recommendation doesn't mean it works for every situation.

Any encouragement to do something to improve performance is a good idea, but some people do need to take into account impact.

murphyslaw1978a
Contributor

Everyone,

I hope I don't get yelled at for this being my second post & opening up an old thread to boot, but...I must say that I thoroughly enjoyed reading this from start to finish. I do, however, also wish to contribute as follows:

My SAN engineer said that there is a mapping table that knows which physical disks hold which logically addressed sectors on the LUN. So, if I defragment my VM guest, I will be affecting performance on the guest (most likely favorably, as RParker suggested), and I can also defragment the host (which would be the ESX server). In any event, the physical location on the disks that make up the RAID group is a non-dynamic mapping to the logical address of the LUN. So defragmenting on the host layer would have similar effects as it would on the guest layer. In other words, in general, defragmentation is good and provides benefit in both cases.

Now, what makes this possible on our side is that the disks are not thinly provisioned, and the SAN isn't dynamically growing/moving/shrinking disks onto other RAID groups or disks.

(ESX 3.5U2, FC-connected, VMDKs, Hitachi (HP) SAN)
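
For anyone who wants to picture what a "non-dynamic mapping" could look like, here is a minimal sketch in Java. It assumes plain striping with no parity, and the `StripeMap` class, disk count, and stripe size are all made up for illustration; it is not how any particular Hitachi or HP array actually lays out data.

```java
/** Toy static striping map: logical block address on the LUN -> (disk, block on that disk). */
final class StripeMap {

    /** Index of the disk that holds the given logical block. */
    static int diskFor(long lba, int disks, int stripeBlocks) {
        checkArgs(lba, disks, stripeBlocks);
        return (int) ((lba / stripeBlocks) % disks);
    }

    /** Block offset on that disk for the given logical block. */
    static long offsetOnDisk(long lba, int disks, int stripeBlocks) {
        checkArgs(lba, disks, stripeBlocks);
        long stripe = lba / stripeBlocks;          // which stripe unit the block falls in
        long row = stripe / disks;                 // how many full rows of stripe units precede it
        return row * stripeBlocks + (lba % stripeBlocks);
    }

    private static void checkArgs(long lba, int disks, int stripeBlocks) {
        if (lba < 0 || disks <= 0 || stripeBlocks <= 0) {
            throw new IllegalArgumentException("lba must be >= 0, disks and stripeBlocks > 0");
        }
    }

    public static void main(String[] args) {
        int disks = 4;        // hypothetical RAID group size
        int stripeBlocks = 8; // hypothetical stripe unit, in blocks
        for (long lba : new long[] {0, 7, 8, 32, 35}) {
            System.out.println("LBA " + lba + " -> disk " + diskFor(lba, disks, stripeBlocks)
                    + ", offset " + offsetOnDisk(lba, disks, stripeBlocks));
        }
    }
}
```

The point of the sketch is only that the same logical address always resolves to the same physical place, so logically contiguous blocks land in a predictable pattern. That stops being true once thin provisioning or automatic tiering is moving data around underneath.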

continuum
Immortal

just a little side-note ...

whenever you have to recover files from a corrupted VMFS volume, you have much better chances if the guests have been defragmented recently

___________________________________

VMX-parameters- VMware-liveCD - VM-Sickbay


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

JaySMX
Hot Shot

Great read. I'm still confused, though what Trainee is saying is what I believe to be true.

I'm having this exact argument regarding a VM that a user is running an application on. The VM is not very fragmented and there is no real performance issue with it, but the user sees 'defragmentation' within Windows, so I'm being asked to take it down for maintenance to perform a defrag. I don't believe this is necessary and frankly I think it's a waste of my time, so I'm looking for documentation to prove or disprove my opinion.

Anyway, thanks for the great debate!

-Justin
RaameshKeerthi
Contributor

Hi,

Frequently defragmenting the disk will increase the life of our disks and keep the data blocks together, so that space will be consumed...

Regards

Raamesh Keerthi N.J

Raamesh Keerthi N.J.| http://www.twitter.com/raameshkeerthi
RaameshKeerthi
Contributor

hi Continuum

I am developing code to defragment the disk, but it is giving me an error:


Error occured while defragmenting the disks...com.vmware.vim25.NotSupported

code as follows

try
{
    System.out.println("Going to defragment the disk...");
    service.defragmentAllDisks(mor);
    System.out.println("Defragmentation process is completed...");
}
catch (Exception ex)
{
    System.out.println("Error occured while defragmenting the disks..." + ex.toString());
}

but an exception has been thrown. Any idea?

Regards

Raamesh Keerthi N.J

Raamesh Keerthi N.J.| http://www.twitter.com/raameshkeerthi
EdWilts
Expert

RParker wrote:

Defrag is ALWAYS recommended.

NetApp does not recommend defragmenting.

The answer, as always, is "it depends". There are a lot of factors involved, from NFS vs block, to thin versus thick provisioning, to the types of files the applications are writing out and how often.

I can give you a lot of cases - and have some here - where defragmentation is a horrible waste of time. I can also give you a lot of cases - and have some here - where defragmentation is absolutely a benefit.

.../Ed (VCP4, VCP5)
CQuartetti
Hot Shot

Here's a related thread with a question about defragmenting virtual HDs residing on SSDs: Workstation 10 and SSD awareness?

See the reply from @jameslin dated Feb 8, 2017, which links to the SuperUser question Will fragmentation of a virtual machine's disk also cause host OS disk fragmentation?

One aspect which I don't think is covered here is that reading fragmented files from a virtual HD requires more I/O operations which add overhead regardless of where the file blocks lie.
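
A rough way to put numbers on that, as a sketch only: the Java below counts read requests for a file given the lengths of its fragments. The `ReadRequestCount` class is hypothetical, and it assumes each fragment needs at least one request of its own, that one request can cover at most `maxClustersPerRequest` contiguous clusters, and that no lower layer merges requests. Real stacks may behave differently.

```java
/** Toy count of read requests needed to read a file, given its fragment lengths in clusters. */
final class ReadRequestCount {

    static long requestsFor(long[] fragmentLengths, long maxClustersPerRequest) {
        if (maxClustersPerRequest <= 0) {
            throw new IllegalArgumentException("maxClustersPerRequest must be > 0");
        }
        long requests = 0;
        for (long length : fragmentLengths) {
            if (length <= 0) {
                throw new IllegalArgumentException("fragment length must be > 0");
            }
            // Ceiling division: a fragment longer than one request is split into several.
            requests += (length + maxClustersPerRequest - 1) / maxClustersPerRequest;
        }
        return requests;
    }

    public static void main(String[] args) {
        long max = 16; // hypothetical per-request limit, in clusters
        long[] contiguous = {9};                          // one 9-cluster run
        long[] fragmented = {1, 1, 1, 1, 1, 1, 1, 1, 1};  // same 9 clusters in 9 pieces
        System.out.println("contiguous: " + requestsFor(contiguous, max) + " request(s)");
        System.out.println("fragmented: " + requestsFor(fragmented, max) + " request(s)");
    }
}
```

Under these assumptions the same 9 clusters cost one request when contiguous and nine when fully fragmented, and each request pays the per-I/O overhead of the virtual disk path regardless of where the blocks sit on the physical spindles.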
