UPDATE : Mistake corrected with red color
I use VDR 2.0.0 with ESXi 5 HA cluster created with 3 hosts and FC SAN acting as datastore.
I've tried to backup VM with two vmdks - first 102GB and second 1,6TB. VDR does machine snapshot but strangely two delta files for each disk appear (00001-delta, 00002-delta) rather the only one (?). After that, it starts copying data from second disk then stops at 102GB and hungs up for a two/three minutes. Then for second disk new delta file appears (00003-delta) for a moment and right after VDR remove snapshot. ESX host complains that VM disks need consolidation but it gives an error that <unspecified file> is locked and can't finish it.
Running lsof on the machine hosting that particular VM says that deltas are being used by vpxa-worker process, so finally after killing it, consolidation is possible.
Backing up the VM is all the same wherever it's running - on the same ESXi host as VDR or the other. Another machines, e.g. GNU/Linux with 1,6TB vmdk backs up successfully.
Could somebody have any idea what's going on ?
Any idea highly appreciated.
In the VDR logs message "Trouble reading files, error -3956 (operation failed)" appears and in vcbAPI-xx.log it claims that :
2012-07-11T08:16:22.967+02:00 [4C805940 warning 'Default'] The change associated with a property collector field parent is NULL
2012-07-11T08:16:22.968+02:00 [4C805940 info 'Default'] vcbAPI::propCollWrap::doHostSystem: Detected a availability state change for host 192.168.x.1: The state is now available.
2012-07-11T08:16:22.979+02:00 [4C805940 info 'Default'] vcbAPI::propCollWrap::doHostSystem: Detected a availability state change for host 192.168.x.2: The state is now available.
2012-07-11T08:16:22.988+02:00 [4C805940 info 'Default'] vcbAPI::propCollWrap::doHostSystem: Detected a availability state change for host 192.168.x.3: The state is now available.
2012-07-11T08:16:23.030+02:00 [4C805940 info 'blocklist'] Creating snapshot
2012-07-11T08:16:53.394+02:00 [4C805940 info 'blocklist'] Snapshot created, ID: snapshot-3490
2012-07-11T08:16:53.471+02:00 [4C805940 info 'Default'] Establishing NFC connection to host 192.168.x.3 on port 902, service vpxa-nfc
2012-07-11T08:16:53.890+02:00 [4C805940 info 'Default'] VcbAPI::Snapshot::Open: Attempting to access hot-add mode, in parallel mode.
2012-07-11T08:16:53.890+02:00 [4C805940 info 'blocklist'] PrepareForAccess: getting a hot-add instance in parallel mode
2012-07-11T08:16:53.903+02:00 [4C805940 error 'blocklist'] This VM [N3Vim18VirtualMachineStubE:0x61e4810] is not licensed for HotAdd
2012-07-11T08:16:53.903+02:00 [4C805940 info 'Default'] VcbAPI::Snapshot::Open: Exception trying to HotAdd: HotAdd is not licensed
2012-07-11T08:16:53.929+02:00 [4C805940 info 'blocklist'] Will remove all disks belonging to VM "VIRTUAL MACHINE"(vm-3274), Snapshot "snapshot-3490".
2012-07-11T08:16:53.948+02:00 [4C805940 info 'blocklist'] Remove clone disks successful.
2012-07-11T08:17:01.554+02:00 [4C805940 info 'Default'] VcbAPI::Snapshot::GetUnusedFileInfo: Could not obtain file blocks for "pagefile.sys" on one of Partitions of path: "[datastore] VIRTUAL MACHINE/VIRTUAL MACHINE-000002.vmdk", error: 32769
2012-07-11T08:17:01.555+02:00 [4C805940 info 'Default'] VcbAPI::Snapshot::GetUnusedFileInfo: "pagefile.sys" removed from volume "[datastore] VIRTUAL MACHINE/VIRTUAL MACHINE-000002.vmdk"
2012-07-11T08:17:01.748+02:00 [4C805940 info 'Default'] VcbAPI::Snapshot::GetChangeInfo: Processing 1045 changes
2012-07-11T08:17:02.117+02:00 [4C805940 warning 'Default'] VcbAPI::Snapshot::GetChangeInfo: Cannot look for changes past the end of the disk, disk size: 109889715200, offset: 109889715712.
2012-07-11T08:21:30.033+02:00 [4C805940 error 'Default'] vcbAPI::Snapshot::Read: Main read failed for disk [datastore] VIRTUAL MACHINE/VIRTUAL MACHINE_2-disk-000002.vmdk, error: 2338
2012-07-11T08:21:30.041+02:00 [4C805940 error 'Default'] RealPartitionReader: File i/o error ($1)
A few new facts about the issue :
- VDR has been upgraded from 2.0.0.x to the latest 126.96.36.1994
- Tried to disable and re-enable change block tracking for VM
I almost sure that the mentioned stuck in VM backup after 102GB it's connected with successfully finished backup of first disk (the disk alone backups successfully) and then VDR tries to copy the second one but fails.
It's strange that VDR creates one snapshot at start (at least one message appears in vSphere client) but two delta/ctk files appear for every disk. Moreover the data are really being written to first deltas only (modify time changes) and second delta/ctk files for second disk only are opened/locked by vpxa-worker process. Why ?
Then just before fail, VDR creates third delta/ctk files but second ctk file disappears and finally consolidation can't be done.
Could somebody tell me why does it create two delta/ctk and what does vpxa-worker process do with second delta/ctk for second disk ?
Altough this post is one year old, ssh to VDR appliance and editing /var/vmware/datarecovery/datarecovery.ini and setting BackupUnusedData=0 under [Options] could help on this error, you should stop datarecovery service before make the changes ( /etc/init.d/datarecovery stop )