Space Utilization Graphs in Datastore Performance Charts:
Space consumption of a VMFS datastore shows high usage in the OTHER category (50 GB), and it spikes every 2 hours from 10 GB to 50 GB. How can I identify which files fall under the OTHER category, and why is there a sudden spike in usage/size?
A vCenter alarm is triggered when datastore usage reaches 90% for one LUN (datastore_lun210), which is attached to 3 ESXi hosts and contains 10 VMs. Please refer to the VMware KB below for more understanding.
The attached image is an example; in my case, usage reaches 50 GB in the OTHER category.
I would check two things:
Are there any zombie VMDKs or ISO images sitting on the datastore?
Do you have 3rd-party storage or backup tools in place that use this datastore for any processes?
Thank you for the response…
I have checked and did not find any zombie VMDKs, and there are no ISO images on the datastore. Even if ISO images were loaded, there should not be a spike from 10 GB to 50 GB, because an ISO maintains a constant size.
Let me check with the backup team and confirm whether any 3rd-party storage or backup tools are in place for this datastore. That's a really good point!
Are you seeing the utilization spike for a brief period of time and then a return to normal? We are running into the same problem but do not yet have a handle on the cause. For us there are no backups, snapshots, or thin-provisioned VMs on the datastores. The only pattern we are seeing is that the two problem datastores are 2 TB with very large VMs residing on each. VMware support wasn't able to figure it out either; they guessed the HBA driver is sometimes returning bad data to vCenter.
I would check the following: from the command line, record the space usage before the spike; when the spike is underway, check the sizes again.
Verify which file is consuming more -- do a comparison.
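The before/after comparison above can be sketched as a small shell script. This is only a minimal sketch, not an official VMware tool: the datastore path (/vmfs/volumes/datastore_lun210) and the snapshot file names are assumptions you would adjust for your environment.

```shell
#!/bin/sh
# Sketch: list per-file sizes on a datastore twice and diff the listings
# to spot files that appeared or changed size during a spike.
# The default path below is an assumption -- pass your own as the first argument.
DS="${1:-/vmfs/volumes/datastore_lun210}"

take_snapshot() {
  out="$1"
  # size-in-bytes and path for every file, sorted so the diff lines up cleanly
  find "$DS" -type f -exec ls -l {} \; 2>/dev/null \
    | awk '{print $5, $NF}' | sort > "$out"
}

take_snapshot /tmp/before.txt       # run before the spike
sleep "${2:-0}"                     # wait for the spike window (seconds)
take_snapshot /tmp/during.txt       # run while usage is high

# Lines starting with '>' are files that are new or grew during the spike
diff /tmp/before.txt /tmp/during.txt
```

Running it once before the spike and once during it leaves the two listings in /tmp for comparison.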
Exactly, the symptoms are the same, and it reoccurs every day. The utilization spike lasts 10-20 minutes, then it returns to normal. Apart from guessing, did VMware support provide any troubleshooting steps or a way to nail down the issue? Thanks for your inputs.
Yes, I thought the same; it makes sense. But do you have any script to collect file sizes by bucket (2 GB, 5 GB, 10 GB, 20 GB, and >30 GB) in CSV format, so that we can easily compare them? Or is there a better way to figure it out? In some cases, there might be new files on the datastore.
I do not have a script that I can provide you with; however, you can export the data using this command: ls -l > print.csv
You can also export the data for each directory in the datastore and then compare them.
Thanks, buddy, for your suggestion. We can use the find command to locate all files matching a given criteria. For example, to find files that are larger than 1024 MB (1 GB) without traversing mount points:
find /vmfs/ -size +1024M -exec du -h {} \; | less
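For the size-bucketed CSV asked about earlier in the thread, a small wrapper around the same find command could group files into the 2/5/10/20/>30 GB buckets and write one CSV row per file. This is a hypothetical sketch, not a vetted script: the function name, datastore path, and output file are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: bucket every file on a datastore by size and emit CSV, so two
# runs taken before and during a spike can be compared side by side.
size_report() {
  ds="$1"      # datastore mount point to scan
  out="$2"     # CSV file to write
  echo "size_bytes,bucket,path" > "$out"
  # ls -l field 5 is the size in bytes; bucket thresholds follow the thread
  find "$ds" -type f -exec ls -l {} \; 2>/dev/null | awk '
  {
    gb = $5 / (1024 * 1024 * 1024)
    if      (gb > 30) bucket = ">30GB"
    else if (gb > 20) bucket = "20-30GB"
    else if (gb > 10) bucket = "10-20GB"
    else if (gb > 5)  bucket = "5-10GB"
    else if (gb > 2)  bucket = "2-5GB"
    else              bucket = "<2GB"
    printf "%s,%s,%s\n", $5, bucket, $NF
  }' >> "$out"
}

# Defaults are assumptions -- override via the two positional arguments
size_report "${1:-/vmfs/volumes/datastore_lun210}" "${2:-/tmp/datastore_sizes.csv}"
```

Two CSVs produced this way can then be compared with diff or in a spreadsheet to see which bucket the OTHER growth lands in.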
Let me try again with all the suggestions in the post.
They had me set up a script plus a cron job to try to capture information on the datastore. The cron job runs every 5 minutes and outputs to a file. So far the output has not shown any indication that unknown files are being stored on the datastore or that files are expanding/shrinking mysteriously.
Example cron job:
[root ~]# crontab -l
*/5 * * * * /checkLun.bash 500eb009-b0e52168-ea3e-b499baa6452a >>/var/log/500eb009-b0e52168-ea3e-b499baa6452a.log 2>&1
*/5 * * * * /checkLun.bash 500eb045-978a8ccd-db61-b499baa6452a >>/var/log/500eb045-978a8ccd-db61-b499baa6452a.log 2>&1
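The actual checkLun.bash from VMware support wasn't posted, so what follows is only a hypothetical reconstruction of the kind of checks such a script might make: log a timestamp, the volume's overall usage, and the largest files on each run, so a later spike in the performance chart can be matched against specific files. The function name and the VMFS_ROOT override are illustrative assumptions, not the real script.

```shell
#!/bin/sh
# Hypothetical sketch of a per-LUN monitoring script run from cron.
check_lun() {
  # VMFS_ROOT is overridable only so the sketch can be exercised outside ESXi
  vol="${VMFS_ROOT:-/vmfs/volumes}/$1"
  echo "=== $(date) ==="
  df -k "$vol" 2>/dev/null           # overall datastore usage in 1K blocks
  # the ten largest files right now: size in bytes, then path
  find "$vol" -type f -exec ls -l {} \; 2>/dev/null \
    | sort -k5 -rn | head -n 10 | awk '{print $5, $NF}'
}

# UUID taken from the cron example above; pass your own as the first argument
check_lun "${1:-500eb009-b0e52168-ea3e-b499baa6452a}"
```

Appending this output to a log every 5 minutes, as in the crontab above, gives a timeline of file sizes that can be lined up against the spike windows.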
Aaron,
Thank you for your prompt response. Appreciate your time and consideration 🙂
Finally Issue is resolved 🙂
We migrated all VMs to another datastore so we could monitor the LUN, but we still got alerts on the datastore. So the only option left was to destroy the LUN and recreate the datastore. Now the alerts have stopped and there are no issues. But the root cause remains UNKNOWN; something was triggered from the storage side.