Ran into an odd issue cloning a VM to an NFS datastore, figured I'd float it out here for the masses and see if anyone else has encountered a similar situation.
- Running ESXi 4.1 U1 with 10Gbps access to iSCSI volumes and a single 1Gbps link to an NFS mount.
- The NFS datastore is on a physical Windows Server 2008 R2 host that's sharing out its disks via file services for Windows.
The link between these two is a 1Gbps link.
I've got a PowerShell script I've written that copies VMs to the NFS share for backup purposes. It worked fine, and we've been using this as a backup strategy for months. No problems until a few weeks ago.
We added new storage to allow more VMs to run on the host. So the changes were 1) increased storage on the iSCSI volumes and 2) more VMs.
When I clone a 10GB VM from local to the NFS mount, it works fine. The clone finishes in 2.5 minutes and takes up about 50% of the network link. I can see the data moving from the ESXi host to the Windows host: Task Manager on the Windows side shows 50% utilization on the NIC to the NFS share. This is exactly how it worked for months and is the behavior I expect.
When I clone a 50GB VM from local to the NFS mount, the bottom drops out. Task Manager shows a <2.5% load on the link, and data seems to be coming from the ESXi host INCREDIBLY slowly. The problem seems to be confined to the recently added VMs.
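To put numbers on the two cases above, here's a quick back-of-the-envelope sketch in Python. It assumes decimal gigabytes (10^9 bytes), a raw 1Gbps line rate with no protocol overhead, and that the Task Manager percentages map directly to link throughput. The function names are just for illustration.

```python
# Back-of-the-envelope link math; sizes, times and percentages come from the post.
LINK_BPS = 1_000_000_000  # assumed raw rate of the 1Gbps link, ignoring overhead


def utilization(size_gb: float, minutes: float) -> float:
    """Fraction of the link used to move size_gb (decimal GB) in the given minutes."""
    bits = size_gb * 1e9 * 8
    return bits / (minutes * 60) / LINK_BPS


def hours_to_copy(size_gb: float, link_fraction: float) -> float:
    """Hours needed to move size_gb when only link_fraction of the link is used."""
    bits = size_gb * 1e9 * 8
    return bits / (LINK_BPS * link_fraction) / 3600


print(f"10GB in 2.5 min: {utilization(10, 2.5):.0%} of the link")
print(f"50GB at 2.5% of the link: {hours_to_copy(50, 0.025):.1f} hours")
print(f"50GB at 50% of the link: {hours_to_copy(50, 0.5) * 60:.1f} minutes")
```

Under those assumptions the 10GB clone works out to roughly 53% of the link, which lines up with what Task Manager shows. A 50GB clone at the healthy rate should take around 13 minutes, while at 2.5% it would need about 4.4 hours, and since the observed load is below 2.5% that is a best case.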
I've never seen a cloning problem that affects some VMs but not others. I can svMotion the problem VMs from iSCSI volume to iSCSI volume just fine. I can clone the problem VM from iSCSI to iSCSI just fine. The failure seems to be specific to writing the clone to the NFS volume, and even there it's not consistent across VMs.
It's like we've hit some kind of performance threshold. We used to be able to clone 50GB VMs just fine, but no longer can.
I'm at a loss. If anyone has any insight to offer, I'd be most appreciative.
Thanks for your time.