Good morning. I have a support ticket open with VMware: 15834814212
I was doing some routine host maintenance so I was using vMotion to evacuate VMs. I got an error on one particular host:
Migrate virtual machine GVMon Failed waiting for data. Error 195887167. Connection closed by remote host, possibly due to timeout.
Most of the information I found on this error points to a vMotion networking problem, but since the other VMs migrated without issues, I decided this was not the cause.
I powered the VM down and was able to move it to another host. However, when powering it on I got this error:
Power On virtual machine GVMon File GVvCenter_2.vmdk was not found.
Sure enough, when browsing the datastore, that file does not exist.
Support says that the file has been missing for a few weeks. Crazy that the VM continued to run. They do not have an explanation for why it happened. No orphaned objects were found and all troubleshooting points to everything being OK. I'm told that the .vmdk descriptor file needs to be re-created, but that would require a UUID that is not available because the .vmdk descriptor file is not accessible. Catch-22!
Any ideas? Thank you, Zach.
Since the .vmdk descriptor file is loaded into memory when the VM is powered on, it's possible that a missing descriptor file doesn't cause any issues while the VM keeps running.
Anyway, I can't tell for sure whether vSAN requires a special UUID in the descriptor file. Please attach a sample .vmdk descriptor file (e.g. GVvCenter_1.vmdk) from a vSAN datastore so we can see what it looks like, and also provide the exact size (in bytes) of the "GVvCenter_2-flat.vmdk" file, i.e. the output of ls -lisa for this file.
André
I have attached an example of an existing .vmdk file.
RW 524288000 VMFS "vsan://46e8b055-1939-1f98-5c93-ecf4bbd70598"
I was told that the vsan:// UUID in the line above is what's needed to re-create the .vmdk, but the only place it exists is in the .vmdk file itself. That's the Catch-22.
My understanding is that there are no flat files in vSAN. The .vmdk file is the pointer to all the objects where the data is stored. The output of an ls -lisa is below. Thank you, Zach.
total 802824
4 1024 drwxr-xr-t 1 root root 2940 Dec 17 10:32 .
73686607611040 0 drwxr-xr-x 1 root root 512 Dec 17 18:15 ..
8452 0 -rw------- 1 root root 0 May 24 2015 .f5c26155-5e58-a555-bc8d-ecf4bbcfca10.lck
4194308 2048 -r-------- 1 root root 1441792 May 24 2015 .fbb.sf
8388612 261120 -r-------- 1 root root 267026432 May 24 2015 .fdc.sf
25165828 2048 -r-------- 1 root root 1179648 May 24 2015 .pb2.sf
12582916 262144 -r-------- 1 root root 268435456 May 24 2015 .pbc.sf
16777220 257024 -r-------- 1 root root 262733824 May 24 2015 .sbc.sf
29360132 1024 drwx------ 1 root root 280 May 24 2015 .sdd.sf
20971524 4096 -r-------- 1 root root 4194304 May 24 2015 .vh.sf
33559172 0 -rw-r--r-- 1 root root 100 Dec 16 18:37 GVvCenter-dfce8268.hlog
8397060 1024 -rw------- 1 root root 8684 Dec 16 18:51 GVvCenter.nvram
562045188 0 -rw-r--r-- 1 root root 0 Dec 16 18:35 GVvCenter.vmsd
41951492 8 -rwxr-xr-x 1 root root 4033 Dec 16 18:51 GVvCenter.vmx
29364868 5120 -rw------- 1 root root 4588032 Dec 16 18:51 GVvCenter_2-ctk.vmdk
448795268 1024 -rw-r--r-- 1 root root 786707 Dec 16 18:51 vmware-1.log
578822404 1024 -rw-r--r-- 1 root root 97452 Dec 16 18:35 vmware-2.log
595599620 1024 -rw-r--r-- 1 root root 97450 Dec 16 18:37 vmware-3.log
612376836 1024 -rw-r--r-- 1 root root 96112 Dec 16 18:51 vmware-4.log
629154052 1024 -rw-r--r-- 1 root root 95880 Dec 16 18:52 vmware-5.log
645931268 1024 -rw-r--r-- 1 root root 95996 Dec 17 10:32 vmware.log
Sorry, my bad, in vSAN the vmdk is an object. Unfortunately I don't have access to a vSAN environment right now, but it may be worth checking out http://www.virten.net/2014/01/manage-vsan-with-rvc-part-4-troubleshooting/ which shows how to query the vSAN objects using the Ruby vSphere Console. I haven't used this myself yet, but it looks like it could be helpful in your case.
André
Good morning, we ran through most of that troubleshooting with support. Unfortunately, nothing shows up as wrong. The .vmdk is just gone. Thank you, Zach.
Support got the .vmdk file back and the VM is running like normal. Thank you Brian from Ireland!
I managed to retrieve the UUID of the disk from the vmware logs of the VM:
- vmware-1.log:2015-11-10T13:24:36.609Z| vmx| I120: DISKLIB-VMFS : "vsan://f5c26155-5e58-a555-bc8d-ecf4bbcfca10" : open successful (21) size = 75161927680, hd = 0. Type 3
- vmware-1.log:2015-11-10T13:24:36.612Z| vmx| I120: DISKLIB-VMFS : "vsan://f5c26155-5e58-a555-bc8d-ecf4bbcfca10" : closed.
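For anyone who has to dig the same information out of a larger log, here is a minimal Python sketch that pulls the vSAN object UUID and size out of "open successful" lines like the two above. The function name and the regular expression are my own assumptions based only on the format of these two log lines; other ESXi builds may log this differently.

```python
import re

# Assumed pattern, modelled on the DISKLIB-VMFS "open successful" line quoted above.
OPEN_RE = re.compile(
    r'DISKLIB-VMFS\s*:\s*"vsan://(?P<uuid>[0-9a-f-]{36})"\s*:\s*open successful'
    r'.*?size\s*=\s*(?P<size>\d+)'
)

def find_vsan_disks(log_text):
    """Return {uuid: size_in_bytes} for each vSAN object the log shows being opened."""
    disks = {}
    for line in log_text.splitlines():
        match = OPEN_RE.search(line)
        if match:
            disks[match.group("uuid")] = int(match.group("size"))
    return disks

# The two log lines from this thread, used as sample input.
sample = (
    '2015-11-10T13:24:36.609Z| vmx| I120: DISKLIB-VMFS : '
    '"vsan://f5c26155-5e58-a555-bc8d-ecf4bbcfca10" : open successful (21) '
    'size = 75161927680, hd = 0. Type 3\n'
    '2015-11-10T13:24:36.612Z| vmx| I120: DISKLIB-VMFS : '
    '"vsan://f5c26155-5e58-a555-bc8d-ecf4bbcfca10" : closed.'
)

if __name__ == "__main__":
    print(find_vsan_disks(sample))
```

The "closed" line is ignored on purpose, since only the "open successful" line carries the size.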
We confirmed that this was the correct UUID with the following command and associated output:
/usr/lib/vmware/osfs/bin/objtool getAttr -u f5c26155-5e58-a555-bc8d-ecf4bbcfca10
Object Attributes --
UUID:f5c26155-5e58-a555-bc8d-ecf4bbcfca10
Object type:vsan
Object size:75161927680
User friendly name:(null)
HA metadata:(null)
Allocation type:Zeroed thick
Policy:((\"stripeWidth\" i1) (\"cacheReservation\" i0) (\"proportionalCapacity\" i0) (\"hostFailuresToTolerate\" i1) (\"forceProvisioning\" i0) (\"spbmProfileId\" \"aa6d5a82-1c88-45da-85d3-3d74b91a5bad\") (\"spbmProfileGenerationNumber\" l+0))
Object class: vdisk
Object path: /vmfs/volumes/vsan:52ce5c856108f1cb-fffcad0808c892b3/f1c26155-5ae8-5013-c1fb-ecf4bbcfca10/GVvCenter_2.vmdk
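The check we did by eye (same UUID, same size, and an object path ending in the missing file name) could be sketched in Python roughly as follows. This only parses text that has already been captured from objtool; the helper name and the "key:value" parsing are my assumptions based on the output shown above, not a documented format.

```python
def parse_objtool(output):
    """Split captured objtool getAttr output into a {key: value} dict.

    Each line is split on its first colon, so values that contain colons
    (such as the object path) are kept intact.
    """
    attrs = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            attrs[key.strip()] = value.strip()
    return attrs

# A subset of the output shown above, pasted in as sample text.
captured = """Object Attributes --
UUID:f5c26155-5e58-a555-bc8d-ecf4bbcfca10
Object type:vsan
Object size:75161927680
Object class: vdisk
Object path: /vmfs/volumes/vsan:52ce5c856108f1cb-fffcad0808c892b3/f1c26155-5ae8-5013-c1fb-ecf4bbcfca10/GVvCenter_2.vmdk"""

attrs = parse_objtool(captured)

# Values taken from the vmware-1.log line and the power-on error.
uuid_from_log = "f5c26155-5e58-a555-bc8d-ecf4bbcfca10"
size_from_log = 75161927680

looks_right = (
    attrs["UUID"] == uuid_from_log
    and int(attrs["Object size"]) == size_from_log
    and attrs["Object path"].endswith("/GVvCenter_2.vmdk")
)

if __name__ == "__main__":
    print("object matches the missing disk:", looks_right)
```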
We then created a temp VM and copied its descriptor file (temp.vmdk) into the VM's directory as GVvCenter_2.vmdk.
I edited the newly created GVvCenter_2.vmdk so that its extent description contained the following:
# Extent description
RW 146800640 VMFS "vsan://f5c26155-5e58-a555-bc8d-ecf4bbcfca10"
The RW value is the disk size in 512-byte sectors, calculated by dividing the object size above (75161927680 bytes) by 512.
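As a quick sanity check of that arithmetic, here is a small Python sketch that rebuilds the extent line from the object size and UUID. The function name is hypothetical, and the line format is simply copied from the descriptor shown above.

```python
SECTOR_SIZE = 512  # the RW value in the descriptor counts 512-byte sectors

def extent_line(size_bytes, uuid):
    """Build the 'RW <sectors> VMFS "vsan://<uuid>"' extent line for a descriptor."""
    sectors, remainder = divmod(size_bytes, SECTOR_SIZE)
    if remainder:
        # A size that is not a whole number of sectors suggests a wrong input.
        raise ValueError(f"{size_bytes} is not a multiple of {SECTOR_SIZE}")
    return f'RW {sectors} VMFS "vsan://{uuid}"'

if __name__ == "__main__":
    print(extent_line(75161927680, "f5c26155-5e58-a555-bc8d-ecf4bbcfca10"))
```

With the values from this thread, 75161927680 / 512 = 146800640, which matches the RW value used in the descriptor.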
I got the VMID of the VM by running:
vim-cmd vmsvc/getallvms
Then reloaded the .vmx file by running:
vim-cmd vmsvc/reload <VMID>