tijz
Contributor

Backup of VMs fails because .VMX file is locked

Hi all,

We have an environment of 5 ESXi hosts, all connected to an iSCSI SAN.

We use Veeam Backup and Replication 6 to back up the VMs.

Most of the VMs get backed up OK, but some fail with an error that the .VMX file is locked.

The .VMX file is indeed locked. The VM is powered on. From vCenter, when I browse the datastore and try to download the .VMX file I get an error saying the file is locked.

I'm able to download the VMX files of other VMs without a problem.

When I power off the specific VM, the .VMX file is unlocked. After power-on the file is still unlocked.

But "after some time" the file is locked again, so the VM cannot be backed up. Why is this happening to some VMs and not to others?

Thanks.

BharatR
Hot Shot

Hi,

The steps in http://kb.vmware.com/kb/10051 show how to find the MAC address of the host locking the file(s).

Best regards, BharatR--VCP4-Certification #: 79230, If you find this information useful, please award points for "correct" or "helpful".
tijz
Contributor

Hi Bharat,

Thanks for your reply, but I think you misunderstood my question.

I know how to unlock the VM; that's not the issue.

The issue is: why is the VMX file locked in the first place? It locks again after a while, so I cannot back it up.

Just a handful of VMs have this problem. But why?

EDIT:

The locking of the VMX file itself doesn't pose any problems. When I power down the VM, the file is unlocked.

VMs with a locked VMX file are running normally.

Rubeck
Virtuoso

Are the VMs which have this issue located on different VMFS datastores?

What VMFS version are you using? v5 or v3?

I really have no idea what the issue might be, other than that the VMFS metadata seems to be playing tricks on you by adding locks where they are not needed.

/Rubeck

tijz
Contributor

Ok, update:

It seems the .VMX files are not actually locked.

When I connect directly to the host the VM is registered on, I am able to download the .VMX file to my desktop.

So it seems that the problem is with vCenter.

Our backup solution, Veeam Backup and Replication, also connects to vCenter, so this makes sense.

I assumed it was a locked file because of the error Veeam logs (it states the file does not exist or is locked).

When I try to download the .VMX file from a datastore browser (using the vSphere client connected to vCenter) I get the following error:

Expected put message. Got: ERROR.

There is a KB article explaining this, but it doesn't offer a solution.

I need to know why some VMX files aren't accessible from vCenter and others are.

We use 4 datastores, all new and fresh VMFS 5 (so not upgraded from VMFS 3).

All datastores store both affected and non-affected VMs.

colinmcadam
Contributor

I am having the same issue with Veeam connecting via vCenter. Did you resolve your issue?

I have a support case open with Veeam support, so I will see what they say.

Cheers

Colin

tijz
Contributor

No, I'm still having the same issue.

My post on the Veeam forum never got approved.

colinmcadam
Contributor

Response from Veeam support below.

The issue is infrastructure related.

Check the following KBs:

Using the vCenter Server datastore browser to download or copy a powered-on virtual machine's .vmx a...

http://www.veeam.com/kb_articles.html/KB1198/

For me, the issue turned out to be the DNS/default route configuration on my host.

Colin

tijz
Contributor

Thank you for your pointers. I found this VMware KB also, but as I said, it doesn't offer a solution.

In my environment I cannot find anything wrong with routing or DNS.

What I didn't notice before is that all affected VMs are running on the same host; let's call it ESX5.

As described in the KB "vCenter chooses an ESX host to service the download or copy operation. Problems occur if this ESX host is not the host on which the powered-on virtual machine is running".

That's exactly what happens: in my vpxd.log I see that a SOAP ticket is created for the ESX3 host, not for ESX5.

But why? The KB doesn't say.

All my ESX hosts have the same network settings (gateway and DNS servers), and all are reachable (I tested using SSH). ESX5 can resolve all other hosts and the vCenter server.
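
For anyone who wants to repeat the name-resolution check in a scripted way, here is a minimal sketch. It is not from the KB or from Veeam; it only checks that each host name resolves to an address and that the address resolves back to the same name, as seen from the machine the script runs on (for example the vCenter or backup server). The function name and the example host names are hypothetical.

```python
import socket
from typing import Callable, Dict, List


def check_forward_reverse(
    hostnames: List[str],
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = lambda ip: socket.gethostbyaddr(ip)[0],
) -> Dict[str, str]:
    """Return a {hostname: problem} map; an empty dict means no mismatch was found."""
    problems: Dict[str, str] = {}
    for name in hostnames:
        try:
            ip = forward(name)
        except OSError as exc:  # covers socket.gaierror and socket.herror
            problems[name] = f"forward lookup failed: {exc}"
            continue
        try:
            back = reverse(ip)
        except OSError as exc:
            problems[name] = f"reverse lookup of {ip} failed: {exc}"
            continue
        # Compare case-insensitively and ignore a trailing dot.
        if back.rstrip(".").lower() != name.rstrip(".").lower():
            problems[name] = f"{name} -> {ip} -> {back}"
    return problems


# Example call (hypothetical names, performs real lookups):
# print(check_forward_reverse(["esx3.example.local", "esx5.example.local"]))
```

The lookups are passed in as parameters so the logic can be tried against a fake table first. A clean result only says that this one machine resolves the names consistently; it does not rule out a different answer on another host or DNS server.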

PepeFragoso
Contributor

Hi Tijz,

I was having the same problem as you. When I tried to back up a specific host with Veeam 6, I wasn't able to download the .vmx file from vCenter, but I was able to download it from the host directly.

I restarted VMware management on the affected host, disconnected and reconnected the host, and restarted the vCenter service, but none of that fixed the problem. I was using ESXi 4.1.

Right now we are upgrading all our hosts from ESXi 4.1 to 5.0 (build 515841). Since the upgrade I am not having that problem anymore: Veeam did all my backups from the problematic host without problems. I'm not sure whether it was the reboot of the host after the upgrade that fixed the problem.

We definitely need to keep an eye on this problem, because I received the error a few times in the past two months. Did you figure out the problem, Tijz?

Thanks

tijz
Contributor

Hi PepeFragoso,

We are already running ESXi 5 with all current patches.

However, we too seem to have "solved" the problem. For me, recreating the entire backup job in Veeam seems to have fixed it, for now.

I also opened a case with Veeam support, but they did not seem to understand the problem at all (they thought it had something to do with the credentials of the guest OS?!). I closed the case for now, as I'm not able to reproduce the problem anymore.

And I really think it's a problem with VMware and not so much with Veeam. But yes, we certainly have to keep an eye on this.

BrendanMarmont
Enthusiast

Hi, we were also having this issue.

*************************************************************************

After running a few tests, we are now successfully backing up via Veeam.

Our management server was not presented any of the LUNs when it was originally configured. So I created a test LUN, presented it to the hosts and the management server, and created a couple of VMs, and the backup went through fine.

I was also able to install a virtualized vCenter 5 server and all jobs processed fine.

It makes sense that the management server needs to be ‘LUN aware’.

**************************************************************************

"For us the problem only occurs when running the job via the management server; it runs fine when pointing it directly at the host, but that isn't going to help us if the VM is vMotioned elsewhere."

If Veeam works while directly connected to the host, and not through the VC, then that would indicate something is wrong with the VC, which points the finger at VMware.

"3/20/2012 9:26:34 AM :: Error: Client error: File does not exist or locked. VMFS path: [[akl_l7] nzaklapp01/nzaklapp01.vmx].
Please, try to download specified file using connection to the ESX server where the VM registered.
Failed to create NFC download stream. NFC path: [nfc://conn:10.160.0.168,nfchost:host-36,stg:datastore-12779@nzaklapp01/nzaklapp01.vmx]."
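
The NFC path in that error shows which connection, host and datastore the download was attempted through, which is useful when comparing against the host the VM is actually registered on. Below is a small sketch that pulls those fields apart. The pattern is inferred only from the error messages quoted in this thread, not from any documented format, and the field names are simply the labels that appear in the path. The `host-…` and `datastore-…` values look like vCenter managed object IDs, but that is an assumption.

```python
import re
from typing import Dict

# Pattern inferred from the error text in this thread (assumption, not a spec).
NFC_PATH = re.compile(
    r"nfc://conn:(?P<conn>[^,]+),"
    r"nfchost:(?P<nfchost>[^,]+),"
    r"stg:(?P<stg>[^@]+)@"
    r"(?P<path>[^\]\r\n]+)"
)


def parse_nfc_path(text: str) -> Dict[str, str]:
    """Extract conn, nfchost, stg and file path from a Veeam NFC error line."""
    match = NFC_PATH.search(text)
    if match is None:
        raise ValueError("no NFC path found")
    return match.groupdict()


line = (
    "Failed to create NFC download stream. NFC path: "
    "[nfc://conn:10.160.0.168,nfchost:host-36,"
    "stg:datastore-12779@nzaklapp01/nzaklapp01.vmx]."
)
fields = parse_nfc_path(line)
print(fields["nfchost"], fields["path"])
```

Collecting the `nfchost` value across failing and working jobs would show whether the failures always go through the same host.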


Typically NFC issues could be 1 of 4 things or a combination thereof.

1. Permissions - Not the case, as I can back up directly from the host

2. Ports (902) - Are open under security profiles

3. DNS - /etc/resolv.conf has correct DNS addresses

4. VMware
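
For item 2, the security profile only shows what the host allows; it does not prove the port is reachable from the backup or management server. A minimal reachability sketch, assuming plain TCP to port 902 as mentioned above (the function name and the example host name are hypothetical):

```python
import socket


def tcp_port_open(host: str, port: int = 902, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolvable
        return False


# Example call (hypothetical host name, opens a real connection):
# print(tcp_port_open("esx5.example.local"))
```

Run it from the machine that performs the download. A `False` result can also mean the name did not resolve, so it overlaps with the DNS check in item 3.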

Has anyone found a solution other than upgrading to v5 of the management server?

Thanks

Brendan

JLFG
Contributor

I just had this problem again in my environment. I am using ESXi 5.0, vCenter 5.0, and my backup solution is Veeam 6. I narrowed this issue down to specific datastores using the old VMFS-3 format. I have around 20 datastores in my environment, and two were having issues creating backups.

The error that I was receiving from Veeam 6 was:

Error: Client error: Cannot get service content. Soap fault. TimeoutDetail: 'connect failed in tcp_connect()', endpoint: 'https://vcenter:443/sdk' SOAP connection is not available. Connection ID: [vcenter]. Failed to create NFC download stream. NFC path: [nfc://conn:vcenter,nfchost:host-16105,stg:datastore-19571@servertobackup/servertobackup.vmx].

When I tried to download files from vCenter, I received the error "vmware Expected put message got error".

I checked all my datastores and narrowed the problem down to two of them. When I connected directly to the host and tried to open an affected datastore, I received a bunch of Windows errors: "Unable to read data from transport connections. The connection was closed" and "The request failed because the server 'hostserver.domain.com' closed the connection". I don't know how, but after I clicked OK on one datastore I was able to download files from vCenter; on the second one I still wasn't. This Wednesday I will reformat the datastore from VMFS-3 to VMFS-5, and I am pretty sure that will fix the problem. I think there is a problem with the datastore, but I am not sure how the issue started.

I'll keep you posted on what happens. I know I can fix the problem by moving to a different datastore, but we are in the process of updating our datastores to VMFS-5.

Pepe F. VCP
nielse
Expert

What does the hosts file on the Veeam server look like?

I have seen this error caused by DNS problems due to a wrong hosts file on the Veeam server itself.

@nielsengelen - http://foonet.be - VCP4/5
JLFG
Contributor

Hi everyone,

I checked the DNS settings on my hosts (3): one was pointing to my main DNS server and the other two were pointing to the backup DNS server. As soon as I changed the DNS from the backup server to the main server, Veeam didn't complain and ran the backup without problems.

Why does this happen? Isn't it supposed to work with any DNS server? My backup DNS servers are replicated from the main DNS server.

Thanks!

Pepe F. VCP
huyhuyt
Contributor

I found a VMware KB that explains this issue with downloading the .vmx file. As I understand it, the issue occurs in this situation:

- Shared storage for multiple ESX hosts.

- VM on shared storage.

When downloading the .vmx file, vCenter can take different paths to get the file through different ESX hosts, so if one ESX host holds the lock and vCenter goes through a different ESX host, the issue will occur.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101928...

JesperAnd
Contributor

Same issue here. It turned out that the hosts file on the Veeam server had a wrong entry for the ESXi DNS names.

Corrected the hosts file and reran the backup with success 🙂
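
A stale hosts-file entry like this is easy to miss by eye. Here is a rough sketch of how the file could be compared against the addresses you expect. The function names, host names and addresses are hypothetical, and it assumes the first matching line in the file takes precedence. On a Windows server the file is normally C:\Windows\System32\drivers\etc\hosts.

```python
from typing import Dict, List, Tuple


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each lowercase name or alias in hosts-file text to its address."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Assumption: the first matching line wins, so keep the first one.
            entries.setdefault(name.lower(), address)
    return entries


def find_wrong_entries(
    hosts_text: str, expected: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (name, address_in_hosts_file, expected_address) per mismatch."""
    entries = parse_hosts(hosts_text)
    return [
        (name, entries[name.lower()], ip)
        for name, ip in expected.items()
        if name.lower() in entries and entries[name.lower()] != ip
    ]


sample = """
# hypothetical hosts file content
10.0.0.3   esx3.example.local esx3
10.0.0.99  esx5.example.local   # stale entry
"""
expected = {"esx3.example.local": "10.0.0.3", "esx5.example.local": "10.0.0.5"}
print(find_wrong_entries(sample, expected))
```

Names that are not in the hosts file at all are skipped on purpose, since those fall through to DNS.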

resteves
Enthusiast

huyhuyt wrote:

I found a VMware KB that explains this issue with downloading the .vmx file. As I understand it, the issue occurs in this situation:

- Shared storage for multiple ESX hosts.

- VM on shared storage.

When downloading the .vmx file, vCenter can take different paths to get the file through different ESX hosts, so if one ESX host holds the lock and vCenter goes through a different ESX host, the issue will occur.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101928...

I've got the same issue, but it only started after I updated 2 of my hosts from 5.5 to 5.5U1 and vCenter from 5.5 to 5.5U1, and it only happens for some VMs. I can't back them up with VDP: it fails because VDP uses vCenter to download the .vmx file, and that download fails.

Basically, I can't back up any VM that is hosted on the hosts that I updated to 5.5U1.

I can download the .vmx file of the affected VMs if I connect directly to the host that runs them, but if I try while connected to vCenter it fails.

I can't find any problem with DNS or routes.

Does anyone have any idea?

resteves
Enthusiast

In my case, removing the updated hosts from vCenter and adding them again fixed the issue. VDP now backs up the VMs without problems.

JJAA
Contributor

The problem was solved here by upgrading vCenter to the latest update.
