I used vmkfstools to clone a virtual disk via an SSH connection to one of our ESXi servers. Unfortunately I lost my network connection sometime thereafter, before the clone completed. Is there a way I can check the completion status and/or whether it completed successfully? According to the vmkfstools documentation, there doesn't appear to be a way to do so with that command specifically.
Probably going to get me shot for this, but you could use the "md5sum" utility to compare the file hashes - that will tell you whether the copy was successful and complete.
Not pretty, but it should work.
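A minimal sketch of that check - the paths here are dummy files standing in for the real source and clone *-flat.vmdk files on the datastore (an assumption, adjust to your environment):

```shell
# Compare two files by MD5 hash. The dummy files below stand in for the
# real source and clone *-flat.vmdk paths (placeholders for illustration).
src=/tmp/source-flat.vmdk
dst=/tmp/clone-flat.vmdk

printf 'dummy vmdk payload' > "$src"   # placeholder for the source disk
cp "$src" "$dst"                       # placeholder for the finished clone

src_sum=$(md5sum "$src" | awk '{print $1}')
dst_sum=$(md5sum "$dst" | awk '{print $1}')

if [ "$src_sum" = "$dst_sum" ]; then
    echo "hashes match - copy looks complete"
else
    echo "hashes differ - copy incomplete or files not identical"
fi
```

The same pattern works in the ESXi shell, where md5sum is available as noted below.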
md5sum is OK to run in the shell, but will only work if the files are completely identical.
FWIW, I have seen vmkfstools commands complete after losing the SSH connection, but I only knew that because I had commands running after the vmkfstools command, which was called from within a script. So I would expect it to complete. I don't think there's a log file you can inspect to see if it completed successfully.
I am not aware of any vmkfstools command that you could test with md5sum.
vmkfstools is not supposed to create identical copies - so a check with md5sum would typically result in a mismatch.
Most of the time you would run vmkfstools via an SSH connection, so it is a good idea to use
nohup vmkfstools <options> &
just to rule out any bad effects of a disconnected SSH session.
With nohup command & the command keeps running even if the session gets interrupted.
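A minimal sketch of that pattern - here a short sleep/echo stands in for the real vmkfstools invocation (an assumption), and output is redirected to a log you can inspect from any later session:

```shell
# nohup detaches the job from the SSH session; redirecting stdout/stderr
# gives a log file to check after reconnecting. The 'sleep 1; echo ...'
# is a placeholder for the real vmkfstools <options> command.
nohup sh -c 'sleep 1; echo "clone finished"' > clone.log 2>&1 &
pid=$!
echo "started as PID $pid"

# From a later session: is the process still running?
kill -0 "$pid" 2>/dev/null && echo "still running"

# Once it is gone, the log shows how the command ended.
wait "$pid"
cat clone.log
```

After reconnecting you would typically just check for the process (e.g. with ps) and then look at the log file.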
Not really sure I want to pick at this thread but I think it would be a great learning experience for me
While I would assume the files would be different (a hash mismatch in this case) if you changed the disk format when "cloning" the VMDK, I would have thought the files would be identical if you don't change the disk format. I'm gathering this is a wrong assumption - and yes, it was an assumption.
In the times I've tested, the file hash has been the same as long as I don't change the disk format. Obviously they would be different if the disk format changed, and yes, I did assume that a disk format change hadn't happened in this case. Sorry about that.
Didn't know about "nohup", so this is definitely a much better solution! Something new learnt and certainly useful.
Thanks and kind regards.
> Not really sure I want to pick at this thread but I think it would be a great learning experience for me.
I am glad you did - and I have to admit that I have to review my statements.
Actually my statements are based even more on assumptions - assumptions which really should be tested before I post stuff like that again.
Let me explain my background ... I spend most of my VMware time on recoveries and on trying to figure out how VMFS and VMDKs actually work.
In my world a VMDK is a list of fragments on a VMFS-volume.
These fragments are quantized in size: each fragment is one MB, or a multiple of one MB.
The VMFS metadata offers four options for each one-MB fragment:
1. fragment X is allocated on the specified VMFS-volume at offset Y. The fragment has already been written to.
2. fragment X is allocated on the specified VMFS-volume at offset Y. The fragment has been allocated, but so far the guestOS has not written to it.
3. fragment X is an explicit link to /dev/zero (the guestOS has not written to it)
4. fragment X is assumed to be a link to /dev/zero (the guestOS has not written to it)
In my practical recovery work I have to assume that all four of these options are valid.
According to the VMware documentation, it rather looks like only the first three types are used.
Let's look at the three provisioning types now:
# Eager zeroed: according to the documentation all of the fragments are specified as type 1
# Lazy zeroed: all fragments use either type 1 or type 2
# Thin: all fragments use either type 3 or type 1
According to the documentation we would expect that a VMDK is either eagerzeroed, lazyzeroed or thin.
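For reference, these three types map onto the -d option of a vmkfstools -i clone. A sketch, with placeholder datastore paths (run on the ESXi host):

```
# Paths are placeholders - substitute your own datastore and VM folder.
# -d selects the provisioning type of the clone:
#   zeroedthick      (lazy zeroed, the usual default on VMFS)
#   eagerzeroedthick (eager zeroed)
#   thin
vmkfstools -i /vmfs/volumes/datastore1/vm/vm.vmdk \
              /vmfs/volumes/datastore1/vm/vm-clone.vmdk \
              -d thin
```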
Following this assumption, we would expect the VMFS-volume to set a flag for each flat VMDK that specifies its type as eagerzeroed, lazyzeroed or thin.
According to my experience and the practical results I have, I think this flag can be ignored.
In fact I assume that a VMDK does not have to use exclusively one of those three types - rather, I assume the types can change fluidly from one to another.
Example: once you write to all one-MB blocks of a thin VMDK, it automatically turns into an eagerzeroed one.
In the world of a guest a VMDK HAS to look different.
The guest sees a single fragment of the complete allocated size.
It may use a different blocksize and uses its own metadata to keep track of used and unused blocks. Windows, for example, uses NTFS and the Master File Table.
I hope I explained this well enough that you can follow the next step.
A flat.vmdk on a VMFS-volume can be viewed in 3 different ways:
1. View of the guestOS
2. View of vmkfstools (ESXi host)
3. View of a forensic investigator
Now let's look at a newly created, completely unused VMDK.
The guestOS sees a completely blank disk - basically only zeroes. No difference between eagerzeroed, lazyzeroed and thin.
The ESXi host sees a bunch of fragments - and if the VMDK has been wiped at creation time (eager zeroed) all fragments contain zeroes only.
If the vmdk uses lazy zeroed provisioning, some or all of the fragments can be dirty - they may still contain data from VMDKs that were deleted in the past.
The ESXi host can decide whether it wants to read a lazyzeroed fragment as dirty or clean.
A forensic investigator does not have this option. Without the additional info in the VMFS-metadata, a lazyzeroed fragment is always more or less dirty.
Hope you can actually follow me ???
Now let's look at my first statement:
vmkfstools is not supposed to create identical copies.
Identical copies would include all the dirt that is caused by lazyzeroed provisioning.
I do not think that vmkfstools operations include this "dirt" - that would be a waste of diskspace and resources.
Instead I believe that a vmkfstools -i operation always optimizes the complete vmdk.
So instead of creating a 1:1 copy of an unused lazyzeroed fragment, which is supposed to be dirty, it makes way more sense to create a clean fragment - either by zeroing it on the fly or by using a link to /dev/zero.
This reasoning results in my assumption that using md5sum to compare source and cloned vmdk is unreliable.
This assumption is flawed as I realize now.
I assumed that md5sum reads a flat.vmdk in raw mode - like a forensic investigator - but actually never looked at this more closely.
If md5sum reads the flat.vmdk in the interpreted mode that includes the info from the VMFS metadata, then all my statements regarding md5sum as a tool to check whether a clone worked are wrong.
Glen - please let me know whether I make myself clear enough.
If you can follow, let's figure out some real-life tests and see whether md5sum creates a hash of the raw flat.vmdk or of the interpreted flat.vmdk.
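A sketch of such a test, assuming a small expendable test VM on the datastore (the paths are placeholders, run on the ESXi host):

```
# 1. Clone without changing the provisioning type:
vmkfstools -i /vmfs/volumes/ds1/test/test.vmdk \
              /vmfs/volumes/ds1/test/test-clone.vmdk

# 2. Hash both flat files:
md5sum /vmfs/volumes/ds1/test/test-flat.vmdk \
       /vmfs/volumes/ds1/test/test-clone-flat.vmdk

# Matching hashes on a lazyzeroed source would suggest md5sum sees the
# interpreted (zero-filled) view; a mismatch would suggest it reads the
# raw, possibly dirty blocks.
```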
And thanks for bringing this up - I will learn a lot in this post.
Whew - I did go cross-eyed at points, but I managed to follow what you were saying. Not because you were unclear, but because you are starting to tax my brain capacity.
I agree that vmkfstools would not (should not) be including the dirty blocks. I'm obviously not operating at the level you are, but if we look at what happens with a lazyzeroed VMDK as you have mentioned: when a block is about to be allocated by the guest OS, the block is zeroed before being written to. The guest doesn't do this (confirm if I'm wrong on this) - rather, ESXi does. The actual state of the physical block is abstracted from the guest OS, otherwise you could provision a VMDK and then use forensic tools inside a VM to extract blocks from the underlying storage.
Until it is written to, the VMDK considers all the blocks associated with it as effectively zeroed - or perhaps better to say, available to be written to. It is possible on a new blank device that the physical blocks actually are zero, or they could contain dirty data from previously deleted files; however, from the VMDK's perspective all it knows is that it has been allocated those blocks, and when a write needs to occur the block is zeroed. (This is more to confirm my understanding is okay than to teach you anything new.)
Based on this, I also agree that it is likely that vmkfstools either zeroes the destination block as it copies it or links it to /dev/zero - I think /dev/zero is far more likely, but I have no proof. The end result at a file level would be the same, i.e. if the VMDK metadata believed a block was deallocated (not yet used by the VM for data), then it is ready to be zeroed and have data written to it if required.
If we break this apart a little more, then it makes more sense for md5sum to operate in interpreted mode (I'd almost be tempted to call it file mode), i.e. only reading the actual data associated with the VMDK, not the "dirty" data at the block level - even if the actual data is a virtually zeroed block.
Even in a file system like NTFS, there will be slack space between sections of a file which doesn't contain data from the file. Like you said, this is what forensic investigators love (and the bad guys as well), as you can find some amazing stuff in this supposedly wasted part of the file system. If this slack space were included when calculating file hashes, we would likely never be able to get a matching hash.
I'm happy to pick this apart a little more and get some real life tests going. Filling in missing spots in knowledge is always good!
Hope I haven't muddled things with my statements above!