8 Replies Last post: May 7, 2009 10:14 AM by RParker  

How to improve NetApp dedup efficiency with VMDKs
Posted: May 7, 2009 8:49 AM

Kevin Gao (Hot Shot, 206 posts since Mar 27, 2008)

Relatively new NetApp user here. We slapped 2 VMs totalling about 700GB into an NFS volume and deduped it. The VMs are both Windows 2003 servers containing user files and also media files (tons of pictures / videos). After deduping we got 34-35% space savings. NetApp always claims they can do 50% minimum and usually they see 70%+ for deduping VMware VMDKs. Are there any other NetApp shops out there that can shed some light on why ours just doesn't dedup well? Is there anything I can do on our VMware side to increase the efficiency?

Thanks a bunch in advance!

PS: 2 small dev VMs I chucked into an NFS dev volume only managed 7% dedup savings... so I'm obviously missing something here. :(

RParker (Champion, vExpert, 5,787 posts since Dec 6, 2006)
NetApp always claims they can do 50% minimum and usually they see 70%+ for deduping VMware VMDKs. Are there any other NetApp shops out there that can shed some light on why ours just doesn't dedup well? Is there anything I can do on our VMware side to increase the efficiency?

Keep in mind those are NOT guarantees. The big savings only really show up when your VMs are CLONEs, that's one thing. If they were built separately, then maybe there isn't enough duplicated data.

Also, one more thing: only 2 VMs is hardly enough data to de-duplicate. The more VMs you have, the more duplication you will get. When you clone these, THEN you will see more duplication AND move closer to 50% or greater savings.
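
To put rough numbers on that, here is a toy model (my own simplification, not a NetApp formula). It assumes every VM carries the same shared base image plus its own unique data, and it only counts blocks shared through that base image; duplicates inside each VM's own data are ignored. The function name and the GB figures are made up for illustration.

```python
def clone_savings(n_vms: int, shared_gb: float, unique_gb: float) -> float:
    """Fraction of space saved when n_vms each hold the same shared_gb
    of base-image data plus unique_gb of data nobody else has."""
    logical = n_vms * (shared_gb + unique_gb)   # space before dedup
    physical = shared_gb + n_vms * unique_gb    # one shared copy + all unique data
    return 1 - physical / logical


# Hypothetical sizes: 10 GB shared OS image, 5 GB unique data per VM.
for n in (2, 10, 50):
    print(n, "VMs:", round(clone_savings(n, shared_gb=10, unique_gb=5), 3))

# Two big file servers that share little: 10 GB shared, 340 GB unique each.
print("2 file servers:", round(clone_savings(2, shared_gb=10, unique_gb=340), 3))
```

In this model the savings climb as more clones are added, and two large servers full of unique user data barely benefit from the shared base at all.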


mcowger (Virtuoso, vExpert, 2,200 posts since Aug 22, 2007)
Well, here's the thing. Videos and pictures are usually already compressed, so they already have most of their redundancy removed (that's really what compression is). Then NetApp dedupe comes along and tries to remove redundancy again, but it's already gone, so there's very little to regain.

I believe that if you don't achieve their 50% number they buy you the disk to make up for it, yes?
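
A quick way to see this for yourself (a rough sketch, nothing NetApp-specific): compress some redundant data once, then compress the result again. The vocabulary, sizes, and the use of zlib are my own assumptions for illustration. Real video codecs work differently, but the effect on leftover redundancy is similar.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "uncompressed" data: text built from a small vocabulary,
# so it contains plenty of repeated patterns.
vocabulary = [f"word{n}" for n in range(50)]
raw = " ".join(rng.choice(vocabulary) for _ in range(50_000)).encode()

once = zlib.compress(raw)    # first pass removes most of the redundancy
twice = zlib.compress(once)  # second pass has almost nothing left to remove

print(f"raw:   {len(raw)} bytes")
print(f"once:  {len(once)} bytes")
print(f"twice: {len(twice)} bytes")
```

The first pass shrinks the data a lot; the second pass stays at roughly the same size. Dedupe looking at already-compressed media is in the same position as that second pass.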






--Matt
VCP, vExpert, Unix Geek

RParker (Champion, vExpert, 5,787 posts since Dec 6, 2006)
Sorry I thought they guaranteed it:

Ah well, if they have a guarantee in writing... well then, as mcowger says, they should buy you some new disks!

Also the guarantee says this:

You must also use the following NetApp features:

* Deduplication
* RAID-DP®
* Thin provisioning
* NetApp Snapshot™

Thin provisioning is another caveat, and it implies cloning... So you first have to set up your cloned VMs as thin-provisioned disks :). That's how those sneaky Petes can guarantee 50% or greater de-dupe. They cheat!


mcowger (Virtuoso, vExpert, 2,200 posts since Aug 22, 2007)
Only if the data within that video begins and ends on the block boundary.

So say you take a 100 MB video, and break it up into 1MB blocks. You now have 100 blocks of 1MB. Copy that file. It is 100% identical, so you will get great dedupe. Now, take your copied file and add 10 extra bytes of data to the beginning. This shifts everything 'down' by 10 bytes. Now all the blocks of the copied file are 'offset' by 10 bytes from the original, and not one of the blocks is the same as in the original. By adding 10 bytes, you've killed the dedupe. Further, because it's compressed video, none of the blocks WITHIN the file are the same (that was the point of the compression), so you gain nothing there either.
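
The shift effect is easy to reproduce with a small sketch. This is only an illustration of fixed-block hashing in general, not NetApp's actual implementation: the 4096-byte block size, the SHA-256 hashing, and the function names are assumptions, and seeded random bytes stand in for compressed video.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Hash each fixed-size block of data (the last block may be short)."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def shared_fraction(original: bytes, copy: bytes) -> float:
    """Fraction of the copy's blocks whose hash also appears in the original."""
    known = set(block_hashes(original))
    copy_blocks = block_hashes(copy)
    return sum(h in known for h in copy_blocks) / len(copy_blocks)


# Random bytes stand in for compressed video: no internal redundancy.
video = random.Random(42).randbytes(256 * BLOCK_SIZE)
exact_copy = bytes(video)
shifted_copy = b"\x00" * 10 + video  # 10 extra bytes at the front

print("exact copy:  ", shared_fraction(video, exact_copy))
print("shifted copy:", shared_fraction(video, shifted_copy))
```

The exact copy shares every block with the original, while the copy shifted by 10 bytes shares none, even though it contains all of the same data.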

Now, it's not quite as bad as the example I just gave, but this is a fundamental problem with all fixed-block deduplication efforts (like NetApp's). Other implementations don't have this problem because they use sliding windows for their block sizes (DataDomain, and shortly ZFS).
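
For comparison, here is a minimal sketch of the sliding-window idea (often called content-defined chunking). It is a generic textbook-style version, not DataDomain's or anyone else's real algorithm: the window size, mask, hash constants, and function names are all assumptions, and it skips the minimum/maximum chunk sizes a real system would enforce.

```python
import hashlib
import random

WINDOW = 16            # bytes looked at when deciding on a chunk boundary
MASK = (1 << 10) - 1   # cut when the low 10 hash bits are zero (~1 KiB chunks)
BASE = 257
MOD = (1 << 31) - 1


def chunks(data: bytes) -> list[bytes]:
    """Cut data wherever a rolling hash of the last WINDOW bytes matches MASK."""
    out = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # add the newest byte
        if i + 1 >= WINDOW and (h & MASK) == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing partial chunk
    return out


def shared_fraction(original: bytes, copy: bytes) -> float:
    """Fraction of the copy's chunks that also occur in the original."""
    known = {hashlib.sha256(c).digest() for c in chunks(original)}
    copy_chunks = chunks(copy)
    hits = sum(hashlib.sha256(c).digest() in known for c in copy_chunks)
    return hits / len(copy_chunks)


video = random.Random(42).randbytes(256 * 1024)  # stand-in for compressed video
shifted_copy = b"\x00" * 10 + video              # 10 extra bytes at the front

print("shifted copy:", shared_fraction(video, shifted_copy))
```

Because the cut points depend on the content rather than on the byte offset, only the chunks near the inserted bytes differ, and the rest of the shifted copy lines up with the original again.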






RParker (Champion, vExpert, 5,787 posts since Dec 6, 2006)
Also, to go along with what mcowger says, it's almost impossible to de-dupe a picture. If you copy a picture and modify it, ALL the blocks in that picture have changed, because how can you tell the difference in the way a picture looks unless you visually look at it? So pictures are about the hardest thing to find duplication in: compression is one reason, but pictures are also very hard to compare.

