We have detected that on ESXi 7.0.0 build-15843807 -flat.vmdk files are locked for reading even though the VM is running on top of a snapshot.
[root@wxpmk:~] cat /vmfs/volumes/data/CentOS-7.2/CentOS-7.2.vmsd
.encoding = "UTF-8"
snapshot.lastUID = "13"
snapshot.current = "13"
snapshot0.uid = "13"
snapshot0.filename = "CentOS-7.2-Snapshot13.vmsn"
snapshot0.displayName = "1"
snapshot0.createTimeHigh = "369657"
snapshot0.createTimeLow = "387876330"
snapshot0.numDisks = "1"
snapshot0.disk0.fileName = "CentOS-7.2.vmdk"
snapshot0.disk0.node = "scsi0:0"
snapshot.numSnapshots = "1"
[root@wxpmk:~] dd if=/vmfs/volumes/data/CentOS-7.2/CentOS-7.2-flat.vmdk ibs=1 skip=0 count=1
dd: can't open '/vmfs/volumes/data/CentOS-7.2/CentOS-7.2-flat.vmdk': Device or resource busy
Nonetheless, when we query the file lock info, the file is reported as lockMode: Read-Only:
vmfsfilelockinfo -p /vmfs/volumes/data/CentOS-7.2/CentOS-7.2-flat.vmdk
vmfsfilelockinfo Version 2.0
Looking for lock owners on "CentOS-7.2-flat.vmdk"
"CentOS-7.2-flat.vmdk" is locked in Read-Only mode by host having mac address ['00:0c:29:d5:ea:36']
Trying to use information from VMFS Heartbeat
Host owning the lock on file is 192.168.3.240, lockMode : Read-Only
Total time taken : 4.1909195680054836 seconds.
We don't know whether this is the intended behaviour or a bug, as there seem to be some contradictions in what we have found so far.
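In case anyone wants to repeat this check across several VMs, here is a minimal sketch of the same test as a script. It assumes a POSIX shell with sed and dd, as in the ESXi shell. The function names are made up for illustration and are not part of any VMware tool. It reads snapshot.numSnapshots from the .vmsd and then attempts the same one-byte dd read on every -flat.vmdk in the directory.

```shell
#!/bin/sh
# Illustrative helpers; the names are hypothetical, not VMware tooling.

# Print snapshot.numSnapshots from a .vmsd file, or 0 if the key is absent.
vmsd_num_snapshots() {
    n=$(sed -n 's/^snapshot\.numSnapshots *= *"\([0-9]*\)".*$/\1/p' "$1" | head -n 1)
    echo "${n:-0}"
}

# Try to read one byte from a file, like the dd test above.
# Prints "readable <path>" or "blocked <path>".
probe_read() {
    if dd if="$1" of=/dev/null bs=1 count=1 2>/dev/null; then
        echo "readable $1"
    else
        echo "blocked $1"
    fi
}

# Report the snapshot count and probe every -flat.vmdk in a VM directory.
probe_vm_dir() {
    for vmsd in "$1"/*.vmsd; do
        if [ -f "$vmsd" ]; then
            echo "snapshots $(vmsd_num_snapshots "$vmsd") $vmsd"
        fi
    done
    for flat in "$1"/*-flat.vmdk; do
        if [ -f "$flat" ]; then
            probe_read "$flat"
        fi
    done
    return 0
}
```

Called as probe_vm_dir /vmfs/volumes/data/CentOS-7.2, it prints one line per .vmsd and one per -flat.vmdk. A "blocked" result only says that dd could not read the file; it does not distinguish a lock from any other read error.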
Thank you for your answer Scott.
As stated in the initial opening question, the disk does have a snapshot applied to it.
So placing the matter the other way around:
Should the fact that the -flat.vmdk file is not readable, even though the virtual disk has a snapshot, be considered a bug?
Or is blocking reads on -flat.vmdk files that have a snapshot above them a new design approach on VMware's part?
Thanks in advance.
I have not seen or heard of anything specific to vSphere 7 about the snapshot behaviour, but given that the host running the VM only needs read access to the parent disk (flat.vmdk) after a snapshot has created a child disk, I see no reason to be concerned.
Create a second-level snapshot and see if the child disk from the first-level snapshot also has a read lock applied to it by the host.
My guess is that the lock is a preventative measure which reduces the chance of snapshot corruption.
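To see which files make up the chain after a second snapshot, the descriptors can be followed from the current child back to the base disk. The following is only a sketch: it assumes the usual text descriptor layout, with a parentFileNameHint="..." line in each child and extent lines such as RW <sectors> VMFS "<name>-flat.vmdk". The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: walk a snapshot chain by following parentFileNameHint in each
# descriptor and print every descriptor together with its extent file.
# Function names are illustrative, not part of any VMware tool.

# Print the parentFileNameHint value of a descriptor, or nothing.
parent_hint() {
    sed -n 's/^parentFileNameHint *= *"\(.*\)"[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Print the file names from extent lines such as:
#   RW 41943040 VMFS "CentOS-7.2-flat.vmdk"
extent_files() {
    sed -n 's/^R[A-Z]* [0-9][0-9]* [A-Z]* "\([^"]*\)".*$/\1/p' "$1"
}

# Print "<descriptor><TAB><extent path>" for each link, child first.
walk_chain() {
    desc=$1
    depth=0
    # The depth limit guards against a descriptor that points at itself.
    while [ -f "$desc" ] && [ "$depth" -lt 64 ]; do
        dir=$(dirname "$desc")
        extent_files "$desc" | while IFS= read -r ext; do
            printf '%s\t%s\n' "$desc" "$dir/$ext"
        done
        parent=$(parent_hint "$desc")
        [ -n "$parent" ] || break
        case "$parent" in
            /*) desc=$parent ;;
            *)  desc=$dir/$parent ;;
        esac
        depth=$((depth + 1))
    done
}
```

Each extent path in the second column can then be tested with the same one-byte dd read used in the first post, which shows at which level of the chain reads start to fail.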
flat.vmdks should not be locked when the VM has an active snapshot.
But the locking procedures don't work as reliably as they are supposed to, so I assume you hit a flaw.
I really must apologize, Scott. I believe my English (it's not my mother tongue) is not good enough to get the idea across.
As posed in my two previous posts, the VM has indeed created a snapshot and therefore a child disk.
I have also tried the same operation (a short read) on the -flat.vmdk file after taking a second snapshot, and the same behaviour is still observed.
I am not talking about the snapshot files, which, of course, must be protected to avoid corruption. I am referring specifically to the -flat.vmdk file.
Thank you, Continuum.
As far as we can tell, the -flat.vmdk lock has been released quite reliably for us in all previous versions. I believe this is a bug. Beyond that, there are other concerns related to this possible bug: keeping the -flat.vmdk file locked would prevent users from backing up their -flat.vmdk files from the shell while the VM is on.
Forget my last post. I spoke too early and should have looked into it first.
You are right.
This is the second radical change in behaviour that I have noticed in the new VMFS version that comes with ESXi 7.
The first one is "expanding system files on the fly".
Both behaviour changes are completely unexpected, and both of them make zero sense.
I reported the first one before the release but did not notice this one until now.
VMware should explain the new functionality.
Sorry to bother you again, Continuum, but I believe this to be relevant for the thread.
As per our tests and the feedback provided by our users, these locks on the -flat.vmdk files (always while the VM is running on a snapshot) observed in ESXi 7.0.0 seem to be random, which is doubly puzzling. What I mean is that some -flat.vmdk files are read just fine while others are blocked, on the same host and under the exact same circumstances.
This reopens the possibility that it is a bug rather than intentional. An intentional change would still make sense, but the apparently random behaviour points to a possible bug.
(*) I deleted my previous post, as it doesn't make sense to offer my opinion on something that I still can't fully comprehend.
I'm afraid that it's unfortunately not a bug, but by design.
From this week's Veeam newsletter:
... The issue is caused by a change in vSphere 7: it now locks ALL delta files in the snapshot chain for exclusive access. Previous vSphere versions did not lock read-only delta files exclusively, ...
From that, I assume that all .vmdk files (flat, delta, sesparse) in the snapshot chain are locked now.
Nice find... but if this is intentional behaviour, how would backup products behave then? Is it different when the read is requested through VADP?
As a backup admin, I get a little bit concerned even though we're not running vSphere 7.... yet.
Well, Veeam probably has first-hand information, or maybe they just observed the same thing.
The fact that vmkfstools can access those -flat.vmdk files without issue is an indication that the new behaviour is there by design.
But why would VMware do such a nasty thing? ESXi is, in the end, an OS.
The new locking procedures also disable several of the workarounds we used to repair VMs with previous versions.
This does not only affect users of the free ESXi but it seriously affects all ESXi users.
Another example: in previous versions we could create linked clones manually - that will no longer work.
Just some more good arguments on the table. I believe whoever is making these decisions has not thought about it twice. Then again, they are probably smarter than me and might have thought about it thrice; the thing is that the vast majority of sysadmins will only do it once.
Just a curious thing to add: some of our customers who are trying our software on ESXi 7.0.0 report that the blocking doesn't happen every time. In some cases they are indeed able to access the -flat.vmdk file and back it up.
Looks like vmkfstools -i is one way to work around the lock.
Did you find a second way to work around the lock?
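For reference, a minimal sketch of what the vmkfstools -i route could look like. It assumes the clone is pointed at the base descriptor (CentOS-7.2.vmdk, not the -flat.vmdk itself) and that a thin-provisioned copy is wanted. The destination path and the wrapper function are hypothetical. The wrapper only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: clone_base_disk is a made-up wrapper around vmkfstools -i.
# By default it prints the command instead of running it (DRY_RUN=1).
clone_base_disk() {
    src=$1   # source descriptor, e.g. .../CentOS-7.2/CentOS-7.2.vmdk
    dst=$2   # destination descriptor (hypothetical backup location)
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "vmkfstools -i $src $dst -d thin"
    else
        vmkfstools -i "$src" "$dst" -d thin
    fi
}
```

For example, clone_base_disk /vmfs/volumes/data/CentOS-7.2/CentOS-7.2.vmdk /vmfs/volumes/data/backup/CentOS-7.2.vmdk prints the command that would be run. Whether the clone succeeds while the VM is running on a snapshot is exactly what is being discussed here, so treat it as something to test rather than a guarantee.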
That sounds promising - please investigate those cases and keep us updated.
I am looking into releasing the locks by editing the heartbeat-section. But that will be a workaround which is not suitable for regular backup jobs.
Not yet. Running vmkfstools on some .vmdk file and then sending a SIGSTOP to the process seems too awkward to be taken into consideration, at least to me (update: tried, it didn't work).
I will try to find out why those files are blocked in a seemingly random way. I will share all that I can find here, although you seem to have a deeper knowledge of ESXi than I do.
> I am looking into releasing the locks by editing the heartbeat-section. But that will be a workaround which is not suitable for regular backup jobs.
No hope on that road: there is no chance to manipulate the locks while the VM is running. I tried it, but both the device and the .vh.sf are read-only for commands like dd.
What about releasing the locks before switching the VM on? If this is to be done in the heartbeat section, I believe it won't be valid for clustered scenarios, but it could very well serve standalone ESXi hosts.
Also it's worth noting that the physical disks can be read one byte at a time, which is useless for backups, but may be useful for other situations.
I was able to get some .vmx files from the VMs that are being backed up as well as the ones that are not.
I suspect that some variable in the .vmx file can be the reason why some -flat.vmdk files are not being locked.
I will perform a comparative study on this.
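A minimal sketch of how such a comparison could be done on the host, assuming awk and sort are available in the shell. The function name vmx_diff is made up for illustration. It reports keys that appear in only one of the two .vmx files and keys whose values differ.

```shell
#!/bin/sh
# Sketch: compare the key = "value" pairs of two .vmx files.
# vmx_diff is a hypothetical helper name.
vmx_diff() {
    awk '
    {
        # Split each line at the first "=" into key and value.
        i = index($0, "=")
        if (i == 0) next
        k = substr($0, 1, i - 1)
        v = substr($0, i + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", k)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        if (FILENAME == ARGV[1]) a[k] = v; else b[k] = v
    }
    END {
        for (k in a) {
            if (!(k in b)) print "only-first\t" k " = " a[k]
            else if (a[k] != b[k]) print "differs\t" k ": " a[k] " vs " b[k]
        }
        for (k in b) if (!(k in a)) print "only-second\t" k " = " b[k]
    }' "$1" "$2" | sort
}
```

Usage would be vmx_diff locked-vm.vmx readable-vm.vmx (both file names are placeholders). Per-VM keys such as displayName or uuid.bios will always differ and can be ignored; the interesting lines are the ones that separate the locked group from the readable group consistently.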