VMware Cloud Community
Cynomus
Contributor

ESXi 4.1 on NetApp NFS 0 byte vmx files

Hi All

We are running an HA/DRS/vMotion/Storage vMotion ESXi 4.1 cluster with 10 host servers and about 300 guests. Storage is NFS over a private 10G network across 2 NetApp controllers.

We recently had a situation where one host wrote 0 byte vmx files for all of its guest VMs (about 43 of them), and as a result they all eventually hung and failed. The host went to 100% CPU and stayed there until we pulled the physical plug.
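For anyone wanting to check for the same symptom, here is a rough sketch of how the empty vmx files could be listed from a shell. The function name `find_empty_vmx` is made up for illustration, and it assumes the usual `<datastore>/<vm directory>/<name>.vmx` layout under the root passed in; it uses only plain POSIX shell tests, so treat it as a starting point rather than a tested ESXi tool.

```shell
#!/bin/sh
# Hypothetical helper (not a VMware tool): print every .vmx file that is
# exactly 0 bytes, assuming the layout <root>/<datastore>/<vm dir>/<name>.vmx.
find_empty_vmx() {
    root="$1"
    for f in "$root"/*/*/*.vmx; do
        # -f skips the literal pattern when nothing matches;
        # ! -s is true when the file exists but is empty.
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            echo "$f"
        fi
    done
}
```

On a host this would presumably be called as `find_empty_vmx /vmfs/volumes`; any path it prints is a vmx file that needs restoring.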

Prior to power down, we SSHed in and found that we could:

cd /vmfs/volumes/<sharename x>/<machine name xx>

touch testfile

ls -l

     (revealed the new file 0 bytes)

vi testfile

     (entered gibberish text and saved file, exit vi)

ls -l

     (file size still 0 bytes)

So we could create a file but not edit it.
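The manual test above could be scripted roughly as follows, so it can be repeated against each datastore. This is only a sketch: `check_write` and the `.write_probe` file name are invented for illustration, and it replaces the `vi` edit with a simple `echo` redirect, which exercises the same create-then-write sequence.

```shell
#!/bin/sh
# Hypothetical helper: create a probe file in directory $1, write to it,
# and report whether the data actually landed (file size > 0).
# Returns 0 = write OK, 1 = file stayed 0 bytes, 2 = could not create file.
check_write() {
    dir="$1"
    probe="$dir/.write_probe.$$"
    # Same as the `touch testfile` step: can we create a file at all?
    touch "$probe" 2>/dev/null || { echo "CREATE_FAILED $dir"; return 2; }
    # Stand-in for the `vi` step: write a few bytes into the file.
    echo "write probe" > "$probe" 2>/dev/null
    # Same as the second `ls -l`: how big is the file now?
    size=$(wc -c < "$probe" 2>/dev/null | tr -d ' ')
    rm -f "$probe"
    if [ "${size:-0}" -gt 0 ]; then
        echo "OK $dir"
        return 0
    fi
    echo "ZERO_BYTE $dir"
    return 1
}
```

Something like `for d in /vmfs/volumes/*; do check_write "$d"; done` would then run it against every mounted datastore. In our case the affected host would presumably have reported ZERO_BYTE while the other nine reported OK.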

All other host servers could access the same NFS datastores with proper read/write access, across both NetApp storage systems, so the problem appears to be limited to this one server. After a reboot, and after restoring the vmx files from snapshots, all the systems returned to service and the host can communicate properly with the NFS datastores again.

Has anyone else ever seen anything like this? VMware support claims they have never seen it. We ran VMware on FC for 8 years and never had a problem like this, so it has left us with a bad taste for NFS.
