VMware Cloud Community
anthonie_rozari
Contributor

Space Utilization Graphs in Datastore Performance Charts


Space consumption in the OTHER category of a VMFS datastore shows high usage (50 GB), spiking from 10 GB to 50 GB every 2 hours. How can I identify which files are counted in the OTHER category, and why does the usage spike suddenly?

A vCenter alarm is triggered when datastore usage reaches 90% on one LUN (datastore_lun210), which is attached to 3 ESXi hosts and contains 10 VMs. Please refer to the VMware KB below for background.

  1. I need a quick way to find the cause of the sudden spike in the OTHER category.
  2. Which file is consuming the most space, and how can I find its size?

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200309...

The attached image is only an example; in my case, the OTHER category reaches 50 GB.

Anthonie Rozario M.Sc., M.Phil., M.A., B.Ed
11 Replies
jrmunday
Commander

I would do two things:

  1. Run RVTools (http://www.robware.net/) against your environment and check the Health tab to see if there are any zombie VMDKs on this datastore
  2. Manually browse the datastore to check for any ISO images etc.

Do you have 3rd party storage or backup tools in place that use this datastore for any processes?

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
anthonie_rozari
Contributor

Thank you for the response…

I have checked and found no zombie VMDKs, and there are no ISO images on the datastore. Even if ISO images were present, they should not cause a spike from 10 GB to 50 GB, because an ISO keeps a fixed size, doesn't it?

Let me check with the backup team and confirm whether any third-party storage or backup tools are in place for this datastore. That is a really good point.

aaronwsmith
Enthusiast

Are you seeing the utilization spike for a brief period of time and then return to normal? We are running into the same problem but do not yet have a handle on the cause. In our case there are no backups, snapshots, or thin-provisioned VMs on the datastores. The only pattern we see is that the two problem datastores are 2 TB each, with very large VMs residing on them. VMware support wasn't able to figure it out either; their guess was that the HBA driver sometimes returns bad data to vCenter.

lvaibhavt
Hot Shot

I would check the following: from the command line, record the space usage before the spike, then check the sizes again while the spike is in progress.

Compare the two listings to verify which file is consuming more space.
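
A minimal sketch of that before/after comparison, assuming a POSIX shell with find, du, sort and diff available on the host. The function name snapshot_sizes and the /tmp file names are hypothetical, not something from this thread:

```shell
#!/bin/sh
# snapshot_sizes DIR OUT: record "size_in_KB<TAB>path" for every file under DIR,
# sorted by path so that two snapshots can be compared line by line.
snapshot_sizes() {
    find "$1" -type f -exec du -k {} \; | sort -t "$(printf '\t')" -k 2 > "$2"
}

# Example (hypothetical paths; keep the output files outside the datastore):
#   snapshot_sizes /vmfs/volumes/datastore_lun210 /tmp/before.txt   # before the spike
#   snapshot_sizes /vmfs/volumes/datastore_lun210 /tmp/during.txt   # during the spike
#   diff /tmp/before.txt /tmp/during.txt
```

Any file that appears only in the second snapshot, or whose size changed between the two, shows up in the diff output.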

anthonie_rozari
Contributor

Exactly, it's the same symptom, and it recurs every day. The utilization spike lasts 10 to 20 minutes and then returns to normal. Apart from guessing, did VMware support provide any troubleshooting steps or a way to nail down the issue? Thanks for your input.

anthonie_rozari
Contributor

Yes, I thought the same; it makes sense. Do you have a script that collects the sizes of files larger than 2 GB, 5 GB, 10 GB, 20 GB, and 30 GB in CSV format, so that we can easily compare them? Or is there a better way to figure it out? In some cases there might also be new files on the datastore.

lvaibhavt
Hot Shot

I do not have a script that I can provide; however, you can export the listing with this command: ls -l > print.csv (despite the extension, the file will contain plain ls output rather than comma-separated values).

You can also export the listings of the directories in the datastore and then compare them.
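
If a real CSV is wanted, something along these lines might work. It is only a sketch, assuming the host's find accepts the "k" size suffix and that du and awk are available; the function name sizes_to_csv is made up for illustration:

```shell
#!/bin/sh
# sizes_to_csv DIR MIN_KB: print a CSV (header plus one row per file) of all
# files under DIR that are larger than MIN_KB kilobytes. The size column is
# what du reports, i.e. the space used on disk in KB.
# Note: paths that themselves contain double quotes are not escaped.
sizes_to_csv() {
    echo "size_kb,path"
    find "$1" -type f -size +"$2"k -exec du -k {} \; |
        awk -F '\t' '{ printf "%s,\"%s\"\n", $1, $2 }'
}

# Example (hypothetical path): all files over 2 GB (2097152 KB)
#   sizes_to_csv /vmfs/volumes/datastore_lun210 2097152 > /tmp/sizes.csv
```

Running it once per threshold (2 GB, 5 GB, 10 GB and so on) gives files that can be opened in a spreadsheet and compared between runs.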

anthonie_rozari
Contributor

Thanks for your suggestion. We can use the find command to locate all files matching given criteria. For example, to find files larger than 1024 MB (1 GB) under /vmfs/ (as written, the command has no -xdev or -mount option, so it does descend into mounted file systems):

find /vmfs/ -size +1024M -exec du -h {} \; | less
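
The command above descends into every file system mounted under /vmfs/. A variant that stays on one file system and lists the largest files first could look like the sketch below. It assumes the host's find supports -xdev and the "k" size suffix, which may not hold for every ESX/ESXi build; the function name largest_files is hypothetical:

```shell
#!/bin/sh
# largest_files DIR MIN_KB [COUNT]: list the COUNT largest files (default 20)
# bigger than MIN_KB kilobytes under DIR, largest first, without crossing
# into other mounted file systems (-xdev). Sizes are in KB as reported by du.
largest_files() {
    find "$1" -xdev -type f -size +"$2"k -exec du -k {} \; |
        sort -rn | head -n "${3:-20}"
}

# Example (hypothetical path): the ten largest files over 1 GB (1048576 KB)
#   largest_files /vmfs/volumes/datastore_lun210 1048576 10
```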

Let me try again with all the suggestions in this thread.

aaronwsmith
Enthusiast

They had me set up a script and a cron job to try to capture information on the datastore. The cron job runs every 5 minutes and appends its output to a file. So far the output has not shown any indication that unknown files are being stored on the datastore or that files are expanding or shrinking mysteriously.

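#!/bin/bash
# Usage: checkLun.bash <datastore-uuid>
guid="${1}"
echo "========== BEGIN ============"
date
# Query file system attributes and capacity (-P), human-readable (-h)
echo '** vmkfstools **'
/usr/sbin/vmkfstools -v 11 -Ph "/vmfs/volumes/${guid}/"
# Recursive listing of every file, including hidden ones
echo '** ls **'
ls -alRh "/vmfs/volumes/${guid}/"
# Disk usage per directory
echo '** du **'
du -h "/vmfs/volumes/${guid}/"
echo "=========== END ============="
echo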

Example cron job:

[root ~]# crontab -l

*/5     *       *       *       *       /checkLun.bash 500eb009-b0e52168-ea3e-b499baa6452a >>/var/log/500eb009-b0e52168-ea3e-b499baa6452a.log 2>&1

*/5     *       *       *       *       /checkLun.bash 500eb045-978a8ccd-db61-b499baa6452a >>/var/log/500eb045-978a8ccd-db61-b499baa6452a.log 2>&1

Are your datastores VMFS 3.33 with an 8 MB block size? That's how ours are currently configured. We'll be upgrading to VMFS 5.x soon, so I've been holding out to see whether the problem persists after we complete our vSphere 5.1 upgrade.

Another possibly helpful command shows the space each file actually occupies next to its provisioned size:

[root ~]# ls -lsh

The "-s" switch adds a first column with the space allocated to each file, "-l" shows the provisioned (logical) size, and "-h" prints both in human-readable format.

This may reveal any non-VM files with an unusual configuration where the provisioned storage differs from the storage consumed.
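
To put both numbers side by side for every file, a rough sketch follows. It assumes GNU stat (the %s, %b and %B format flags), as found on a Linux-based service console; a BusyBox stat may not support them. The function name size_report is hypothetical:

```shell
#!/bin/sh
# size_report DIR: print "apparent_bytes,allocated_bytes,path" for every file
# under DIR. %s is the logical file size, %b the number of allocated blocks,
# and %B the size in bytes of each of those blocks.
size_report() {
    echo "apparent_bytes,allocated_bytes,path"
    find "$1" -type f -exec stat -c '%s %b %B %n' {} \; |
        while read -r apparent blocks blocksize path; do
            echo "$apparent,$((blocks * blocksize)),$path"
        done
}

# Example (hypothetical path):
#   size_report /vmfs/volumes/datastore_lun210 > /tmp/size_report.csv
```

Rows where the two size columns differ widely are the files worth a closer look.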

anthonie_rozari
Contributor

Aaron,

Thank you for your prompt response. I appreciate your time and consideration 😀

anthonie_rozari
Contributor

Finally, the issue is resolved 🙂

We migrated all VMs to another datastore so that we could monitor the LUN, but we still got alerts on the datastore. As the only remaining option, we destroyed the LUN and recreated the datastore. The alerts have now stopped and there are no issues. The cause is still unknown, but something appears to have been triggered from the storage side.
