I've got an ESXi 6.5 (vSphere Essentials kit) server that mounts 3 datastores over iSCSI from a FreeNAS 11 box using 10G Ethernet.
Even when all VMs (including VCSA) are paused or shut down, the ESXi host appears to be performing small writes to each datastore every 5 seconds. This activity doesn't stop until I shut down the ESXi host. When I start the ESXi host again, the write activity starts about mid-way through the "yellow screen" boot cycle.
I've confirmed that all 3 datastores have Storage I/O Control disabled.
Any idea what's causing this?
Thanks in advance!
There are a few things that could be writing to the datastores:
- VDS (it writes the .dvsdata folder).
- HA, if you have configured a datastore for heartbeating.
- Scratch partitions, if you configured them.
It is possible that I have missed something.
==========================
VirtualRay: Thanks for the quick response!
VDS: I assume you mean vSphere Distributed Switch....I'm not using that.
HA: Not in use since I've only got a single host.
Scratch partitions: I configured a scratch directory on one of the datastores but not the other 2.
Any other ideas?
It is probably the VMFS heartbeat from the host. The host updates the heartbeat metadata region every 3 or 5 seconds (off the top of my head), so you will see this activity.
It's gonna be ATS heartbeating (see the VMware Knowledge Base).
However, where are you seeing the small writes?
Can you check them from esxtop? Type 'u', expand one of the storage devices, and see which process generates the writes/reads.
Good read by the way on heartbeating: https://cormachogan.com/2017/08/24/ats-miscompare-revisited-vsphere-6-5/
Thanks depping & Finikiez! Some quick follow-ups:
1) Interestingly enough, this morning I'm seeing the activity continue on 2 of the datastores like clockwork but much more rarely on the 3rd datastore (the only one that's all-SSD). I'm not sure why.
2) I am monitoring the disk usage from FreeNAS using "zpool iostat -v 1" (with all VMs and VCSA paused). I can also hear the activity as the 2 HDD arrays make a subtle but distinctive rat-tat-tat pattern.
3) With regards to esxtop:
a) I was intrigued about using esxtop to see which process is generating the writes but couldn't find a way to do this. The esxtop "u" view only shows devices and none of the sub-commands appears to show process info. Am I missing something here?
b) I did see in the "u" view some activity under CMDS/s (but not WRITE/s) on the 2 datastores that seems to correlate with the activity I'm seeing on the storage side.
c) The SSD datastore is strange in that I rarely see any of the esxtop "u" mode stats reading anything other than 0.00 even when FreeNAS is showing some activity there.
4) It appears my storage is fully VAAI capable. When I run "esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxxxxxxxxxxxx" on my 3 devices, I see the following:
VAAI Plugin Name:
ATS Status: supported
Clone Status: supported
Zero Status: supported
Delete Status: supported
5) I read up on VMFS heartbeating and ATS heartbeating. It looks like the latter is an enhanced version of the former? Anyway, when I run "esxcli storage vmfs lockmode list", it shows all of my datastores using the ATS locking mode.
6) Because of #4 and #5, it seems clear that ATS heartbeating is enabled. However, since I use Veeam to back up the VMs and since it accesses the datastores directly over iSCSI, I'm guessing disabling the ATS heartbeat would be bad, right?
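To put numbers on the cadence seen with "zpool iostat" in item 2, here is a rough sketch of a filter that prints only the samples in which a pool saw writes. The function name write_ticks and the pool name tank are placeholders, not anything from this setup. It assumes the usual column order (pool, alloc, free, read ops, write ops, read bandwidth, write bandwidth).

```shell
# Sketch only: print "sample-number write-ops" for each interval in which
# the named pool saw writes. Reads `zpool iostat` output on stdin.
# write_ticks and "tank" are hypothetical names used for illustration.
write_ticks() {
    pool="$1"
    awk -v pool="$pool" '
        $1 == pool {
            n++
            if (n == 1) next             # first sample is the since-boot average
            if ($5 + 0 > 0) print n, $5  # column 5 = write operations
        }'
}

# Usage on the FreeNAS box (60 one-second samples, so sample number ~ seconds):
#   zpool iostat tank 1 60 | write_ticks tank
```

If the writes really come every 5 seconds, the printed sample numbers should be spaced about 5 apart.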
When you are on the disk screen ('u'), press 'e' to expand a device and type its naa ID (or copy/paste it). This expands the statistics for that disk, and you will see multiple lines, one per process (the path/world column).
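If the activity is intermittent, it may be easier to capture the counters with esxtop in batch mode and look at them afterwards. The sketch below only helps find the CSV columns that mention a given device. The helper name esxtop_columns and the /tmp path are made up for illustration, and it assumes no column header contains a comma.

```shell
# Sketch only: list the columns of an esxtop batch-mode CSV whose header
# mentions a given string (for example a device's naa ID).
# Prints "column-number:header" for each match.
# esxtop_columns is a hypothetical helper name.
esxtop_columns() {
    csv="$1"
    pattern="$2"
    # The first CSV line holds the headers; put one header on each line,
    # then number the lines that contain the pattern as a fixed string.
    head -n 1 "$csv" | tr ',' '\n' | grep -n -F -- "$pattern"
}

# Usage on the ESXi host (12 samples, 5 seconds apart):
#   esxtop -b -d 5 -n 12 > /tmp/esxtop.csv
#   esxtop_columns /tmp/esxtop.csv naa.xxxxxxxxxxxxxxxxxxxxxxxxxx
```

The column numbers can then be used to pull just those fields out of the capture.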
You can disable ATS heartbeating only and check further.
Thanks Finikiez....I didn't realize that world IDs are like process IDs but now I get it.
Unfortunately (or fortunately depending on your perspective) the behavior has now stopped on all datastores. Next time I see it, I'll try to dig deeper with esxtop.
With regards to disabling ATS heartbeating, is this safe to do in a single node scenario where the datastore can also be read directly by Veeam?
ATS heartbeating is how the host checks the liveness of its heartbeat on the datastore, and it doesn't affect Veeam backup.
So from my perspective you can safely disable it and check. I haven't done this myself on 6.5, but I did it many times on previous ESXi versions because of a very well-known issue described in a KB article: storage arrays couldn't process that many ATS commands.
Check Cormac's article which Duncan Epping posted earlier. It describes ATS heartbeating very well.
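For reference, a minimal sketch of checking and toggling the setting. It assumes VMFS5 datastores and the advanced option name /VMFS3/UseATSForHBOnVMFS5 from the VMware KB on ATS heartbeating, so confirm the option exists on your build first. The helper name ats_hb_value is made up. As far as I know, turning this off makes the host fall back to plain SCSI reads/writes for the heartbeat, so the periodic I/O changes form rather than disappearing.

```shell
# Sketch only: read the current ATS heartbeat setting from the output of
#   esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5
# supplied on stdin. Prints 1 (ATS heartbeat on) or 0 (off).
# ats_hb_value is a hypothetical helper name.
ats_hb_value() {
    # Match "Int Value:" but not "Default Int Value:".
    awk '/^ *Int Value:/ { print $3 }'
}

# Usage on the ESXi host:
#   esxcli system settings advanced list -o /VMFS3/UseATSForHBOnVMFS5 | ats_hb_value
#   esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5   # disable
#   esxcli system settings advanced set -i 1 -o /VMFS3/UseATSForHBOnVMFS5   # re-enable
```

This only touches the heartbeat mechanism; the ATS lock mode shown by "esxcli storage vmfs lockmode list" is a separate setting.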