VMware Cloud Community
vSeanTHereisNoW
Contributor

.vmware_ha folder on non-heartbeat datastores

I'm noticing the .vmware_ha folder structure being created on datastores that are not configured as preferred HA heartbeat datastores.

Since this is a test environment, I end up adding and removing datastores often. It appears that every time I add a datastore, it gets this folder. This in turn results in the error message "vSphere HA agent on host 'x' failed to quiesce file activity on datastore" every time I try to dismount the datastore.

Is there a reason this folder is being created on non-preferred datastores? Is there a way to prevent this? What is the correct procedure for removing it so I can cleanly remove my datastore?

Thanks!

24 Replies
vMario156
Expert

HA uses this folder for much more than datastore heartbeating, so it is normal to find it on every datastore in an HA-enabled cluster.

The folder contains, for example, lists of which host has powered on which VM.

Just take a look into it :)

Regards,

Mario

Blog: http://vKnowledge.net
vmroyale
Immortal

Note: Discussion successfully moved from VMware ESXi 5 to Availability: HA & FT

vSeanTHereisNoW wrote:

every time I try to dismount the datastore.

What steps are you using to do this?

Brian Atkinson | vExpert | VMTN Moderator | Author of "VCP5-DCV VMware Certified Professional-Data Center Virtualization on vSphere 5.5 Study Guide: VCP-550" | @vmroyale | http://vmroyale.com
vSeanTHereisNoW
Contributor

Thanks vMario. I know what the folder is for. My question is: what is the correct way to get HA to stop using this datastore so that I can dismount it without errors?

vSeanTHereisNoW
Contributor

vmroyale, I see this error both when right-clicking the datastore and selecting dismount in the vSphere Client, and when calling host.getHostStorageSystem().unmountVmfsVolume(uuid) through the SDK in some of my automation.

I've seen the recommended storage removal steps:

  • Unregister all objects from the datastore including VMs and Templates
  • Ensure that no 3rd party tools are accessing the datastore
  • Ensure that no vSphere features, such as Storage I/O Control or Storage DRS, are using the device
  • Detach the device from the ESX host; this will also initiate an unmount operation
  • Physically unpresent the LUN from the ESX host using the appropriate array tools
  • Rescan the SAN
  • I've also seen recommendations to check the cluster HA settings and remove the datastore from the preferred list. In this case, though, the cluster HA settings do say to use only preferred datastores for datastore heartbeating, and the datastore in question is not in the list of preferred datastores.
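For automation, the ordering of the removal steps listed above could be tracked with something like the following. This is only a minimal sketch: the step strings and the helper names `next_step` and `can_run` are hypothetical, and nothing here calls a real vSphere API.

```python
# Hypothetical pre-removal checklist. The step names paraphrase the list
# above; nothing in this sketch talks to vCenter or an ESX host.
REMOVAL_STEPS = [
    "unregister VMs and templates from the datastore",
    "confirm no third-party tools are accessing the datastore",
    "confirm Storage I/O Control and Storage DRS are not using the device",
    "detach the device from the host (this also unmounts it)",
    "unpresent the LUN on the array",
    "rescan the SAN",
]


def next_step(completed):
    """Return the first step not yet completed, or None when all are done."""
    for step in REMOVAL_STEPS:
        if step not in completed:
            return step
    return None


def can_run(step, completed):
    """A step may run only if every earlier step has already been completed."""
    index = REMOVAL_STEPS.index(step)
    return all(earlier in completed for earlier in REMOVAL_STEPS[:index])
```

A script could call `can_run` before each action and refuse to detach the device while an earlier step is still outstanding.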

    vmroyale
    Immortal

    The .vSphere-HA directory is used by HA for things other than datastore heartbeating, so its presence is expected on all datastores. This is normal. Check out depping's blog entry http://www.yellow-bricks.com/2012/11/23/what-is-that-poweron-file-in-my-vsphere-ha-folder/ for one of the uses of this directory.

    http://kb.vmware.com/kb/1033634 has possible solutions for this error, but definitely follow the correct procedure for removing LUNs/storage devices you listed earlier as well.

    I guess you could always disable HA and then remove the datastores, even though that is more work...
    depping
    Leadership

    HA uses this folder for various other reasons, for instance to track which virtual machines were powered on / off. So you cannot force HA not to use it. If this causes trouble you will need to disable HA before you take the unmount action.

    SeanMcGinnis
    Contributor

    Thank you Duncan! Just to be sure I understand fully: the only way to get rid of the "failed to quiesce file activity on datastore" error message would be to disable HA completely, dismount the datastore, and then re-enable HA, correct? Thanks!
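In a script, the disable / dismount / re-enable sequence described here might be wrapped so that HA is always switched back on. The sketch below assumes three hypothetical zero-argument callables (`disable_ha`, `unmount`, `enable_ha`) supplied by the caller; it does not use any real SDK call.

```python
def unmount_with_ha_disabled(disable_ha, unmount, enable_ha):
    """Disable HA, run the unmount, and always re-enable HA afterwards.

    All three arguments are hypothetical zero-argument callables provided
    by the caller; this function does not talk to vCenter itself.
    """
    disable_ha()
    try:
        return unmount()
    finally:
        # Re-enable HA even if the unmount raised, so the cluster is not
        # left unprotected after a failed attempt.
        enable_ha()
```

The `try`/`finally` is the point of the sketch: a failed unmount should not leave the cluster with HA turned off.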

    admin
    Immortal

    HA should quiesce I/O to the datastore without being disabled, so it seems like there is a bug here. I would appreciate it if you could file an SR so we can take a look at the logs and figure out what went wrong in your environment.

    Thanks,

    Elisha

    CorporIT
    Contributor

    We are getting the same message when trying to unmount an NFS export on a NAS that has failed. Hopefully there is a fix for this.

    Thanks.

    depping
    Leadership

    CorporIT wrote:

    We are getting the same message when trying to unmount an NFS export on a NAS that has failed. Hopefully there is a fix for this.

    Thanks.

    As Elisha mentions above, please file a bug if you have errors like these. That way engineering can research it and potentially solve it. I mount/unmount datastores regularly and haven't hit this yet...

    jchampton
    Contributor

    I was having this same issue and I found the following post:

    http://adminotes.blogspot.com/2012/11/tips-vmware-cluster-remove-datastore.html

    I had been trying to remove the datastore via "Inventory->Datastores and Datastore clusters" (which popped the error for both hosts to which the datastore is attached) and also through "Inventory->Hosts and Clusters->Configuration->Storage" for the individual hosts. I tried unmounting from the "Summary" tab per the article above, and it worked. The interesting part was that when I tried using the Summary tab for the second host, it failed with the error again. I then went to the Configuration->Storage page and tried removing it from there, and it worked. Weird.

    (Sorry for not logging a support incident, vmware)

    wallbreakr
    Contributor

    Same problem here, with vSphere 5.1 and... SRM.

    So my LUN, replicated to the spare datacenter, has a .vSphere_HA folder with data from the primary datacenter.

    I can't find any setting in HA to change this.

    As you can see in the following picture, only two datastores are selected, but all the unselected ones are also mounted on the hosts of my cluster.

    [Image: ha cluster settings.png]

    Is this a bug?

    depping
    Leadership

    This is standard HA behavior and works as designed. As mentioned above, the activity should be quiesced, so you shouldn't have any problems.

    Can I ask you to file a bug with the exact issues you are experiencing and post the SR number here so I can have a look and point the right engineers at this problem?

    thanks,

    ddio
    Contributor

    Hi Duncan,

    I have opened an SR for this, as we have also seen this multiple times since upgrading to vCenter 5.1. Screenshots and VMware support bundles are attached to the ticket. A resolution to this issue would be much appreciated. The issue has been highly reproducible since upgrading from vCenter 5.0 -> vCenter 5.1 and ESXi 5.1U1 and 5.0U2.

    Thanks

    Support Request Confirmation Number: 13339664806

    Support Request Status: Open (Inbound message received)

    Date and Time Created: 2013-06-21 09:23

    Target Response Time: 2013-06-24 09:23 (GMT-08:00) Pacific Standard Time

    Configuration:

    Two ESXI hosts in a VMware cluster with HA enabled

    vCenter 5.1

    ESXi 5.1U1, 5.0U2

    Summary:

    Unmounting an inactive NFS datastore results in VMware returning an error: ‘failed to quiesce file activity on datastore’

    • This behavior has been seen since upgrading to vCenter 5.1.
    • This behavior was not seen in the same configuration under vCenter 5.0

    While attempting to unmount an inactive datastore in the “Configuration -> Storage” pane, the following error is thrown. Subsequent attempts can be successful.

    ERROR:

    Call "HostDatastoreSystem.RemoveDatastore" for object "datastoreSystem-76" on vCenter Server "VC-129-50-27.labx.simplivt.local" failed.

    Remove datastore

    ds1

    The vSphere HA agent on host '10.129.3.110' failed to quiesce file activity on datastore '/vmfs/volumes/c533f62f-7afa7af7'. To proceed with the operation to unmount or remove a datastore, ensure that the datastore is accessible, the host is reachable and its vSphere HA agent is running.

    LABX\Administrator

    6/21/2013 11:52:21 AM

    6/21/2013 11:52:21 AM

    6/21/2013 11:52:22 AM
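When this error shows up in automation logs, the host and datastore path can be pulled out of the message text. The following is a minimal sketch based only on the wording quoted in this post; the pattern and the function name `parse_quiesce_error` are assumptions, and other vSphere versions may word the message differently.

```python
import re

# Pattern derived only from the message quoted in this post; it is an
# assumption that other occurrences of the error use the same wording.
QUIESCE_ERROR = re.compile(
    r"vSphere HA agent on host '(?P<host>[^']+)' failed to quiesce "
    r"file activity\s+on datastore '(?P<datastore>[^']+)'"
)


def parse_quiesce_error(message):
    """Return (host, datastore path) from the error text, or None if no match."""
    # Collapse all whitespace first in case the message is wrapped across lines.
    match = QUIESCE_ERROR.search(" ".join(message.split()))
    if match is None:
        return None
    return match.group("host"), match.group("datastore")
```

Any message that does not match the pattern yields `None`, so callers can fall back to logging the raw text.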

    depping
    Leadership

    By the way, I have tried to reproduce the issue, so far without success.

    ddio
    Contributor

    Thanks. Every time I have seen this behavior, it was while attempting to unmount an NFS datastore that had just gone inactive under vCenter 5.1. It seems to happen quite frequently and has been seen in multiple configurations.

    ddio
    Contributor

    Update on the SR ticket: this is a known issue, with a related catch-all KB article.

    I am surprised that this has been uncovered with more users adopting 5.1.  We will use "Reconfigure for vSphere HA" or perform multiple attempts when encountering this behavior.
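The "perform multiple attempts" workaround could be scripted roughly as below. This is a sketch under stated assumptions: `unmount` is a hypothetical zero-argument callable that raises on failure, and the attempt count and delay are arbitrary values, not recommendations from VMware.

```python
import time


def unmount_with_retries(unmount, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call `unmount` until it succeeds or `attempts` is exhausted.

    `unmount` is a hypothetical zero-argument callable that raises on
    failure. The defaults are arbitrary, not recommended values.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return unmount()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay_seconds)
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.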

    Thanks


    Hello Darryl,

    As discussed on our call, this is a known behavior in vCenter Server 4.1.x, 5.0.x and 5.1.x.

    I am sending you the KB article:

    vSphere HA and FT Error Messages (1033634)

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=103363...

    Since this is a known issue, we hope to resolve it in an upcoming version of vCenter Server; we are still working on it.

    I am going to close the case temporarily. If you need help, please reply to this email within twenty-one days and I will schedule another call for you.

    depping
    Leadership

    I have reported this to engineering, ddio; they will take a look at it. May I suggest making a dump of your log files after experiencing the issue? (I am guessing support will request this soon anyway.)

    ddio
    Contributor

    Thank you,

    It sounds like this SR will be closed as a known issue based on the KB referenced above. I did provide screenshots along with system logs and uploaded them to the SR, following the outlined procedure for collecting the diagnostic information needed for technical support.
