cfsullivan
Contributor

Disk Consolidation Fails with File Too Large

We are running ESXi 6.0.0 7967664 and our backup solution is Avamar. Recently during backups we started getting the error "....disk consolidation failed..." on one VM. If I try to manually consolidate the snapshots (all created by the backup solution) the error is "Consolidation failed for disk node 'scsi0:5': 27 (File Too Large)".

The relevant VMDK is 400 GB and there is over 1 TB of free space on the data store. There are several xxxxx-00000x.vmdk and respective ctk.vmdk files totaling about 200 GB in that disk's folder in the data store. (I assume new ones get created with each backup attempt.)
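To put a number on how much space the snapshot files take per disk, a small tally like the one below can help. It is only a sketch: the file names and sizes are made up, and the naming pattern (a six-digit sequence number, optionally followed by -delta, -sesparse or -ctk) is an assumption based on the file names described above, not something confirmed for this datastore.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <disk>-00000N.vmdk, optionally with a
# -delta, -sesparse or -ctk suffix before the extension.
SNAPSHOT_FILE = re.compile(
    r"^(?P<base>.+)-(?P<seq>\d{6})(?:-(?P<kind>delta|sesparse|ctk))?\.vmdk$"
)

def summarize_snapshot_files(listing):
    """Group snapshot-related files by base disk and total their sizes.

    `listing` is an iterable of (file_name, size_in_bytes) pairs, for
    example copied by hand from the datastore browser.
    """
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for name, size in listing:
        match = SNAPSHOT_FILE.match(name)
        if match is None:
            continue  # base disk, its own ctk file, or an unrelated file
        entry = totals[match.group("base")]
        entry["files"] += 1
        entry["bytes"] += size
    return dict(totals)

if __name__ == "__main__":
    GIB = 1024 ** 3
    sample = [  # hypothetical names and sizes, for illustration only
        ("example_2.vmdk", 500),
        ("example_2-ctk.vmdk", 26 * 1024 ** 2),
        ("example_2-000001.vmdk", 400),
        ("example_2-000001-delta.vmdk", 40 * GIB),
        ("example_2-000002-delta.vmdk", 60 * GIB),
    ]
    for base, info in summarize_snapshot_files(sample).items():
        print(f"{base}: {info['files']} files, {info['bytes'] / GIB:.1f} GiB")
```

The base disk and its own -ctk.vmdk file are skipped on purpose, so the total reflects only what the leftover snapshots occupy.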

When I research the issue, I get a lot of results about locked files, but I'm not seeing anything about locked files in the vCenter Tasks log.

Any ideas for resolving this? Do I need to look at other logs to get a better idea? Is the "file too large" error misleading?
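One small data point on the error text: the number and message, "27 (File Too Large)", match the POSIX errno EFBIG. Whether ESXi is passing that errno through unchanged here is an assumption, but the pairing can be checked with Python's standard library on a Linux machine:

```python
import errno
import os

def describe_errno(code):
    """Return the symbolic name and system message for an errno value."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)

if __name__ == "__main__":
    name, message = describe_errno(27)
    print(name, "-", message)
```

If it is EFBIG, that usually means a file-size limit was hit rather than the volume running out of free space, which would fit the datastore having over 1 TB free.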

Thanks.

7 Replies
continuum
Immortal

It would be useful to see the latest vmware.log,
preferably from a run in which you received the error you mentioned.
Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

cfsullivan
Contributor

Thanks for the reply. I've attached a snip of the log from the relevant time period.

continuum
Immortal

I asked for the vmware.log because I want to see the content of the vmx file plus the latest snapshot operations.
Please attach the full vmware.log.



cfsullivan
Contributor

Okay, it's attached.

Thanks.

continuum
Immortal

Your vmdks live in two different datastores. Do you have enough free space in both of them?



a_p_
Leadership

IMO this looks like a changed block tracking (CBT) issue that started during the backup on Aug 27th. Since then, snapshots have been created for each backup but not deleted for that specific virtual disk.

2018-08-27T05:05:48.276Z| vcpu-0| I125: DISKLIB-LINK  : Opened '/vmfs/volumes/5609391a-a0c50700-3f06-0025b5a1101f/codd2/codd2_2.vmdk' (0x20a): vmfs, 838860800 sectors / 400 GB.

2018-08-27T05:05:48.276Z| vcpu-0| I125: DISKLIB-LIB_BLOCKTRACK   : Resuming change tracking.

2018-08-27T05:05:48.278Z| vcpu-0| I125: DISKLIB-CTK   : Could not open change tracking file "/vmfs/volumes/5609391a-a0c50700-3f06-0025b5a1101f/codd2/codd2_2-ctk.vmdk": Change tracking invalid or disk in use.

2018-08-27T05:05:48.280Z| vcpu-0| I125: DISKLIB-CTK   : Re-initializing change tracking.

2018-08-27T05:05:48.280Z| vcpu-0| I125: DISKLIB-CTK   : Auto blocksize for size 838860800 is 512.

2018-08-27T05:05:48.280Z| vcpu-0| I125: OBJLIB-FILEBE : Error creating file '/vmfs/volumes/5609391a-a0c50700-3f06-0025b5a1101f/codd2/codd2_2-ctk.vmdk': 3 (The file already exists).

It could as well be an issue with e.g. a stale Avamar process!?
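For anyone who wants to pull the change-tracking entries out of a full vmware.log without reading it line by line, here is a rough filter. It is a sketch only: the regular expression is inferred from the log lines quoted above rather than from any documented log format, and the function name is made up for illustration.

```python
import re

# Layout inferred from the quoted lines:
#   <timestamp>| <thread>| <level>: <COMPONENT> : <message>
LOG_LINE = re.compile(
    r"^(?P<time>[^|]+)\|\s*(?P<thread>[^|]+)\|\s*(?P<level>\w+):\s*"
    r"(?P<component>[A-Z0-9_-]+)\s*:\s*(?P<message>.*)$"
)

def change_tracking_events(lines):
    """Return (time, component, message) for lines that mention CBT."""
    events = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # blank line or a format this sketch does not know
        text = (match["component"] + " " + match["message"]).lower()
        if "ctk" in text or "change tracking" in text:
            events.append((match["time"], match["component"], match["message"]))
    return events

if __name__ == "__main__":
    # Assumes a copy of vmware.log in the current directory.
    with open("vmware.log", encoding="utf-8", errors="replace") as handle:
        for time, component, message in change_tracking_events(handle):
            print(time, component, message)
```

Run against the excerpt above, it would keep the DISKLIB-CTK, DISKLIB-LIB_BLOCKTRACK and OBJLIB-FILEBE lines and drop the DISKLIB-LINK line, since only the former mention change tracking or a -ctk.vmdk file.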

Resetting CBT (https://kb.vmware.com/s/article/2139574) would be worth a try, but here comes the catch-22: in order to disable CBT, the VM must have no snapshots!
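To see which disks currently have CBT switched on before deciding how to proceed, the .vmx file can be read as plain text. The sketch below assumes CBT shows up as a VM-wide ctkEnabled entry plus per-disk entries such as scsi0:5.ctkEnabled; those key names are an assumption on my part and should be checked against the KB article. It only reads text and changes nothing.

```python
import re

# A .vmx file is assumed to be a list of lines of the form: key = "value"
VMX_ENTRY = re.compile(r'^\s*(?P<key>[^=\s]+)\s*=\s*"(?P<value>.*)"\s*$')

def cbt_settings(vmx_text):
    """Map every *ctkEnabled key found in the text to True or False."""
    settings = {}
    for line in vmx_text.splitlines():
        match = VMX_ENTRY.match(line)
        if match and match["key"].lower().endswith("ctkenabled"):
            settings[match["key"]] = match["value"].strip().upper() == "TRUE"
    return settings

if __name__ == "__main__":
    # Hypothetical excerpt, not taken from the VM in this thread.
    sample = '''
    ctkEnabled = "TRUE"
    scsi0:5.present = "TRUE"
    scsi0:5.ctkEnabled = "TRUE"
    '''
    for key, enabled in cbt_settings(sample).items():
        print(key, "->", "enabled" if enabled else "disabled")
```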

Although I can think of steps that may help, I'd recommend that you open a support case with VMware (or with the backup vendor, who will most likely point the finger at VMware) and ask for a supported solution, to ensure that future backups are consistent.


André

cfsullivan
Contributor

Thanks for the replies.

I ended up fixing this at the OS/application level instead. As it turned out, the disk mostly contained the SQL tempdb files and logs. Once I got the downtime approval, I stopped SQL services and copied the data to a new disk, then switched drive letters between old and new volumes and rebooted.

This was easy because of the type of data, but the same could be applied to almost any kind of data as long as you know the consequences. It's very simple, but solved the problem and was quick.

Since then we've had two brand-new, very small VMs on the same datastore that we could not migrate to different storage and change disk type without shutting them down first. We got a "file too large" error. It seems something is up with that particular datastore, and I've let the responsible people know.