VMware Cloud Community
jpreou
Contributor

Delete missing vmdk file?

I had a power failure at home the other day, so my test ESX environment (all one server of it) went down suddenly when the UPS battery went flat (no agent; another story, another time). Anyway...

When it came back up, one of the servers failed to load because of a missing VMDK file (the config file, not the flat file). Long story short: I moved the flat file, renamed the original folder, removed the machine from VirtualCenter, re-created the machine, put back just the vmdk flat file, and everything is back on track.

EXCEPT: I now have the renamed folder "MarshalVM.000" with no files in it. The trouble is, I cannot remove the folder because it isn't empty. When I do an "ls" command it says I have a file called "MarshalVM.vmdk" (yes, the 'lost' file) which does not exist! I can't rename or delete the file because the service console says it doesn't exist. Therefore I cannot delete the parent folder. How do I resolve this issue? Under Windows I'd be looking to do a CHKDSK or something like that. I see there is a Linux command called "fsck", but it won't run directly from the VMFS volume, and when run from the root of the Service Console it wants to check all volumes and complains that they are mounted. I don't really know enough about it to proceed comfortably. I'm also not certain whether this command will hose the VMFS file system or not. Further googling suggests that it may not be the right answer. Any ideas?

10 Replies
Dean_Holland
Enthusiast

Have you tried rebooting the host or bringing it up in Service Console only mode so the VMkernel doesn't start?

It almost sounds like that file is still being used by a VM.

jpreou
Contributor

Reboot: yes. Start in Service Console mode: no. I don't believe the file is in use. I believe corruption of the VMFS file system causes ESX to believe the file exists when it does not. That is why a directory listing names the file yet reports that it cannot be found, if you see what I mean (see below).

# ls
ls: MarshalVM.vmdk: No such file or directory

Attempts to create a file with the same name fail with the error:-

# echo "test file" > MarshalVM.vmdk
-bash: MarshalVM.vmdk: File exists

Attempts to delete the file fail with the error:-

# rm MarshalVM.vmdk
rm: cannot lstat `MarshalVM.vmdk': No such file or directory
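
A rough way to check for this kind of mismatch across a whole folder, offered as a sketch only: let the shell expand the names the directory reports, then ask whether each name can actually be looked up. The helper name find_ghosts is made up for illustration and is not a VMware tool; it assumes a Bourne-style shell such as the bash on the service console, and it only reports the problem without repairing anything.

```shell
#!/bin/sh
# find_ghosts DIR: print names that DIR lists but that cannot be looked up.
# Hypothetical helper for illustration; it reads only and changes nothing.
find_ghosts() (
    dir="${1:-.}"
    # The three patterns cover ordinary names, dot-files, and names that
    # start with two dots; "." and ".." themselves are never matched.
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # A pattern with no match is handed back literally; skip it.
        # (Side effect: a file literally named "*" would be skipped too.)
        case "$entry" in
            "$dir/*" | "$dir/.[!.]*" | "$dir/..?*") continue ;;
        esac
        # -e follows symlinks and -L accepts dangling symlinks. If both
        # fail, the name is in the listing but cannot be stat'ed.
        if [ ! -e "$entry" ] && [ ! -L "$entry" ]; then
            echo "ghost entry: $entry"
        fi
    done
)
```

Run as, for example, find_ghosts /vmfs/volumes/Disk1/MarshalVM.000. On a healthy folder it prints nothing.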

Lightbulb
Virtuoso

So you have a ghost file that is in a directory on your VMFS filesystem, correct? This file is in a directory that you cannot remove because the file is in that directory, yes?

Have you tried rm -rf directoryname, where directoryname is the name of the directory where the ghost file is located? This command forces removal of a directory and all its contents.

jpreou
Contributor

You have the situation exactly correct, but I'm afraid your solution did not work:-

# rm -rf MarshalVM.old
rm: cannot remove directory `MarshalVM.old': Directory not empty

Lightbulb
Virtuoso

What are the results of

ls -la directoryname

Have you tried

chown -R root directoryname

chgrp -R root directoryname

then

rm -rf directoryname

Lightbulb
Virtuoso

Just for a lark, you could try

vmkfstools -U /vmfs/volumes/<yourdatastorename>/MarshalVM.000/MarshalVM.vmdk

jpreou
Contributor

Hi Lightbulb. Thanks for your continued suggestions, but no further progress I'm afraid. Results were as follows in each case.

# ls -la MarshalVM.000
ls: MarshalVM.000/MarshalVM.vmdk: No such file or directory
total 1088
drwxr-xr-x 1 root root 420 Feb 16 21:20 .
drwxr-xr-t 1 root root 1400 Feb 16 21:28 ..

# chown -R root MarshalVM.000
chown: failed to get attributes of `MarshalVM.000/MarshalVM.vmdk': No such file or directory

# chgrp -R root MarshalVM.000
chgrp: failed to get attributes of `MarshalVM.000/MarshalVM.vmdk': No such file or directory

# rm -rf MarshalVM.000
rm: cannot remove directory `MarshalVM.000': Directory not empty

# vmkfstools -U /vmfs/volumes/Disk1/MarshalVM.000/MarshalVM.vmdk
Failed to delete virtual disk: The system cannot find the file specified (25).

Lightbulb
Virtuoso

Well, if this were a production system I would say let's get the VMs somewhere else and recreate the VMFS volume.

A previous poster suggested going into SC-only mode, which you could try, but I think you may be right and there is something a little whacked on your datastore. Since I assume you have a single store, you could use scp or VMware Converter to get your VMs off, reinstall, and then import the VMs back in.
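
If you go the scp route, something like the following could help plan it. It is a sketch only: it prints one scp command per VM folder and runs nothing, plan_evacuation is a made-up helper name, and the destination (root@otherhost:/backup below) is a placeholder you would replace with wherever you have space.

```shell
#!/bin/sh
# plan_evacuation SRC DEST: print (do not run) one scp command per folder
# found directly under the datastore SRC. Hypothetical helper for planning.
plan_evacuation() (
    src="$1"    # datastore path, e.g. /vmfs/volumes/Disk1
    dest="$2"   # scp target, e.g. root@otherhost:/backup (placeholder)
    # The trailing slash makes the pattern match directories only.
    for vm in "$src"/*/; do
        # An unmatched pattern comes back literally; skip it.
        [ -d "$vm" ] || continue
        # ${vm%/} strips the trailing slash before printing the command.
        echo "scp -r '${vm%/}' '$dest/'"
    done
)
```

Review the printed list before running any of it. The damaged folder will show up in the list as well, and copying it will most likely fail, so leave that one out.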

Or you could play with it some more because that is a great way to learn.

Good luck and let me know how it comes out.

jpreou
Contributor

I have more than one store, so I could of course move the VMs, wipe and re-format. But that is the easy option and I prefer to try to understand and fix. After all, this may be a perfectly acceptable approach at home, but what if this was a customer system and they had no space to move VMs, or no vMotion and no outage windows, etc, etc. I'll sit on it for a while, see if I get any other ideas or responses then decide what to do. Thanks for your help so far.

jai64
Contributor

Did you ever figure this out? I have a similar issue from when a migration failed. I believe it was an iSCSI issue.

~ Joe
