We are running ESXi 7.0 Update 2 on a Dell server. We have a hardware RAID10 datastore which ESXi says has a reported capacity of 14.43 TB, of which 9.79 TB is provisioned and 4.64 TB is free.
This datastore contains a single VM (used for making backups) with 2 virtual disks: disk #1 is 16 GB thick provisioned and disk #2 is 14 TB thin provisioned. Under "resource consumption" the VM reports that 9.56 TB of the provisioned 14.02 TB is used.
So plenty of space on the datastore, right? Wrong. Every day, within a pretty specific time frame, when this VM makes new backups, the VM locks up completely with this error message/question:
"There is no more space for virtual disk 'xxxxx_1.vmdk'. You might be able to continue this session by freeing disk space on the relevant volume, and clicking Retry. Click Cancel to terminate this session."
If you click "Retry" the VM comes back up again for a while and the question will reappear. Sometimes immediately, sometimes later. If you do nothing or are not around to click anything, the VM will come online again as well and unfreeze itself after a few seconds or minutes. After the backups are finished (and the VM is not writing anything to it's virtual disk) this problem is also over.
I have read other posts suggesting the problem might have to do with thin provisioning and that the virtual disk commitment might, in theory, be too big for the datastore. But it still seems very buggy that I would be allowed to overcommit a thin provisioned disk and that this error appears with an actual 4.64 TB free on the datastore. Also, I don't think it's really overcommitted: the datastore is 14.43 TB while the provisioned space of the VM is 14.02 TB.
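For what it's worth, the margin is thinner than it looks once you count the files on a datastore beyond the VMDKs themselves. A quick back-of-the-envelope check (a sketch using the figures from this post; the overhead items named in the comments are typical examples, not measured values):

```python
# Back-of-the-envelope headroom check for the datastore described above.
# All capacity figures come from the post; the overhead items mentioned
# in the comments are illustrative assumptions, not measured values.

capacity_tb = 14.43          # datastore capacity reported by ESXi
disk1_tb = 16 / 1024         # 16 GB thick-provisioned disk, in TB
disk2_tb = 14.0              # 14 TB thin-provisioned disk, at full size

provisioned_tb = disk1_tb + disk2_tb
headroom_tb = capacity_tb - provisioned_tb
print(f"headroom if the thin disk fills completely: {headroom_tb:.2f} TB")

# That ~0.41 TB also has to absorb the VM swap file (sized to the VM's
# unreserved RAM), VMFS metadata, log files, and any snapshot deltas,
# so a thin disk provisioned this close to the datastore size leaves
# little real margin even though the datastore looks far from full.
```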
This started happening immediately after the setup of this VM and ESXi host which had brand new disks. So I'm kind of doubting it's a hardware issue.
Since this VM is too large to move, my only option is to start over with a thick provisioned disk and see what happens, unless on the off chance somebody can solve this issue for us.
Also posting this in case it may help somebody else dealing with this issue in the future, because I have a suspicion this is actually a bug in ESXi itself.
@benkeprashant
The KB you linked does not apply to VMFS 6, which ESXi 7 uses.
stat -f /vmfs/volumes/datastorename
lists the total and free inode counts, but the numbers it shows are pure fiction.
In other words: DO NOT TRUST stat -f for VMFS volumes.
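For actual VMFS capacity and free-space figures, these commands (run in an ESXi shell; "datastorename" is a placeholder for the real volume label) query the VMFS driver itself rather than the POSIX emulation layer that stat -f goes through:

```shell
# Run in the ESXi shell; replace "datastorename" with the real label.

# Human-readable capacity and free space as reported by the VMFS driver:
vmkfstools -P -h /vmfs/volumes/datastorename

# Quick per-volume overview of all mounted filesystems:
df -h

# esxcli view of the same data, including the VMFS version per volume:
esxcli storage filesystem list
```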
I did check the KB article, checked inode usage, and posted it in this thread. I don't think that's the issue... Also, inode usage on the VM itself is fine.
So unless somebody knows a definite solution to this issue that involves a quick and easy fix on the ESXi host, I'm going to:
- purge this VM from the datastore as soon as I can (will take about a month)
- set up 2 or 3 VMs with THICK provisioning on this host at some point after that, again totaling around 14 TB.
- wait and see if this issue happens again.
If it doesn't, I'm going to assume a bug with thin provisioning and maybe overcommitting virtual disk space, cutting it too close to the actual datastore size.
If it does happen again, I'm going to have to look further into it, but then it might be a weird hardware issue or something.
> Also inodes usage on the VM itself is fine
Correct. ESXi 7 actually expands the number of available inodes whenever it is running short of free inodes.
And it does so without any mercy and without using even a bit of common sense.
