VMware Cloud Community
FarazR
Contributor

Cannot deploy from template -- error caused by file

Hello all,

I have a fairly simple setup with four ESX servers connected to the same NFS datastore on NetApp storage. All hosts run ESX 4.0 and are managed from a vSphere server. I have a Debian template that I deploy virtual machines from across all hosts. Every host deploys the VMs normally except for one; let's call it host A. On host A, when I try to deploy a VM from the template, it takes a long time and finally errors out with the following message:

"Error caused by file Linux90/Linux90.vmdk"

Couple of observations:

  • The file in question, Linux90.vmdk, actually belongs to the template.

  • When I monitor the progress of the deployment by watching my /vmfs/volumes/.../ directory, I can see files being created for the new VM, but something goes wrong towards the end of the procedure that causes the deployment to fail.

What files can I look at to find the possible cause of this failure? Has anyone seen this before? Please note that the template deploys without any problems on the other hosts, so I suspect the problem is local to host A.

11 Replies
admin
Immortal

Here's a good KB article - 1004050 Troubleshooting template deployment or cloning when it fails

Rick Blythe

Social Media Specialist

VMware Inc.

FarazR
Contributor

Thanks for the reply, but I am beginning to think that this is purely an ESX issue with nothing to do with template deployment, since I am also having problems creating VMs from scratch on this particular host.

mrsolo
Contributor

I have the same error while trying to create a new VM on an NFS datastore on NetApp.

The NFS datastore is created OK and you can browse it.

I tried but failed to create a new VM on this NFS datastore. The error is

"Error caused by file. /vmfs/volumes/xxxxxxxx/New Virtual Machine

FarazR
Contributor

How is your ESX performance in general? Have you tried deploying a VM to local storage (assuming some is available)? I think my issue is close to yours, because I too tried to create a VM directly on my NFS datastore without using a template and got the same error. Moreover, the performance on this ESX box is horrible even though the resources (4GB RAM, 2G CPU) aren't tied up at all. I suspect a bad NIC and am trying to troubleshoot the problem now...

mrsolo
Contributor

I'm just evaluating NFS datastores with different NAS boxes (a Sun 7000 is the other one) and am not yet concerned about performance.

admin
Immortal

Do you see any NFS-related messages in the vmkernel logs?

Jeremy

mrsolo
Contributor

I'm not familiar with reading ESX logs. The two messages below are the only ones that seem relevant to what I was doing.

Jul 8 09:03:53 fox vmkernel: 10:23:19:51.718 cpu1:4105)NFS: 107: Command: (mount) Server: (fox) IP: (10.80.89.4) Path: (/vol/datavol) Label: (nas01nfs) Options: (None)

..

Jul 8 09:04:23 fox vmkernel: 10:23:20:21.966 cpu0:4105)WARNING: NFS: 898: RPC error 13 (RPC was aborted due to timeout) trying to get port for Mount Program (100005) Version (3) Protocol (TCP) on Server (10.80.89.4)
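
For anyone who wants to pick such warnings out of a longer log, here is a minimal Python sketch. It assumes the log lines follow the format quoted above; the function name find_nfs_warnings and the three patterns are hypothetical helpers written for this illustration, not part of any VMware tool. Run it on a copy of the log on an admin workstation, passing in the lines of whichever file you copied off the host.

```python
import re

# Patterns written against the two log lines quoted in this thread.
# They are assumptions about the format, not an official log grammar.
NFS_WARNING = re.compile(r"WARNING: NFS: \d+: (?P<message>.*)")
RPC_ERROR = re.compile(r"RPC error (?P<code>\d+) \((?P<reason>[^)]*)\)")
SERVER = re.compile(r"on Server \((?P<server>[^)]+)\)")


def find_nfs_warnings(lines):
    """Return one dict per NFS warning found in an iterable of log lines."""
    findings = []
    for line in lines:
        warning = NFS_WARNING.search(line)
        if warning is None:
            continue  # informational NFS lines and unrelated lines are skipped
        message = warning.group("message")
        rpc = RPC_ERROR.search(message)
        server = SERVER.search(message)
        findings.append({
            "message": message,
            "rpc_code": int(rpc.group("code")) if rpc else None,
            "rpc_reason": rpc.group("reason") if rpc else None,
            "server": server.group("server") if server else None,
        })
    return findings


# The two lines posted above, used as sample input.
SAMPLE_LOG = [
    "Jul 8 09:03:53 fox vmkernel: 10:23:19:51.718 cpu1:4105)NFS: 107: "
    "Command: (mount) Server: (fox) IP: (10.80.89.4) Path: (/vol/datavol) "
    "Label: (nas01nfs) Options: (None)",
    "Jul 8 09:04:23 fox vmkernel: 10:23:20:21.966 cpu0:4105)WARNING: NFS: 898: "
    "RPC error 13 (RPC was aborted due to timeout) trying to get port for "
    "Mount Program (100005) Version (3) Protocol (TCP) on Server (10.80.89.4)",
]

if __name__ == "__main__":
    for finding in find_nfs_warnings(SAMPLE_LOG):
        print(finding["server"], finding["rpc_code"], finding["rpc_reason"])
```

Only the second sample line is reported, since the first is an ordinary mount command record rather than a warning.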

mrsolo
Contributor

It turned out to be an NFS volume permission issue.

The NFS permissions looked right in the NetApp GUI (i.e., R/W access for all hosts, and root access for the ESX 4.0 host and for a Solaris 10 (SPARC) box that I use for quick NFS admin tasks), but somehow the NFS volume had been created with 555 permissions. After I ran 'chmod 777' from my Solaris box, new VMs could be created on it from the vSphere 4.0 client. I don't think I should have to do this, but...
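
A quick way to spot this kind of mismatch is to read the mode bits of the exported directory from a machine that has it mounted. The following is a minimal Python sketch of that check; describe_mode is a hypothetical helper name, and the demo uses a throwaway temporary directory rather than a real NFS export. Which write bit actually matters depends on how the export maps the ESX host's root user, which this thread does not establish.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return (octal mode string, owner_can_write, others_can_write) for path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (
        format(mode, "03o"),
        bool(mode & stat.S_IWUSR),
        bool(mode & stat.S_IWOTH),
    )


if __name__ == "__main__":
    # Demonstrate on a temporary directory (assumption: a POSIX filesystem).
    with tempfile.TemporaryDirectory() as demo_dir:
        os.chmod(demo_dir, 0o555)  # reproduce the read-only mode from the post
        print(describe_mode(demo_dir))
        os.chmod(demo_dir, 0o755)  # restore write access so cleanup succeeds
```

If the owner and "other" write flags are both false, as with mode 555, nobody can create files in the directory through ordinary permission checks, whatever the export settings in the storage GUI say.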

UW-MattW
Contributor

In my case, I got this error message when I tried to copy a VM from an iSCSI target on a Sun Storage 7110 down to the local disk on my ESX4 server. I looked at a lot of complex factors, but it turns out I didn't have enough local disk space. Duh. :) Posting this in the hope that it saves someone time wasted looking for a more complicated explanation. Simplest is best. :)

-Matt
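
The free-space check described above can be sketched in a few lines of Python, assuming both the source VM directory and the destination are reachable as ordinary paths from the machine running the script. The names tree_size and has_room_for, and the 10% headroom default, are hypothetical choices for this illustration. The sketch compares apparent file sizes, so thin-provisioned or sparse disks may need more or less space than it estimates.

```python
import os
import shutil


def tree_size(path):
    """Total apparent size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def has_room_for(source_dir, dest_dir, headroom=0.10):
    """True if dest_dir's filesystem can hold source_dir plus a safety margin."""
    needed = int(tree_size(source_dir) * (1 + headroom))
    free = shutil.disk_usage(dest_dir).free
    return free >= needed
```

Calling has_room_for with the VM's directory and the target directory before starting a copy gives a quick yes or no answer, which is cheaper than waiting for a long copy to fail near the end.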

sums611
Contributor

Thanks, and kudos to you for finding your own way out. I was facing the same issue, and your post had the exact solution I was hunting for.

kevlev201110141
Contributor

I was using iSCSI storage and had to click "Rescan All" under Configuration -> Storage within the vSphere host configuration screen, and it works fine now.
