VMware Cloud Community
Vacadeluna
Contributor

Data Store missing/RAID Issues

We have a host that lost a drive in the RAID. I was unable to get a new drive to sync, so I moved what I could off the host. I was about 90% successful; however, 2 VMs are stuck on there, and migrating them always fails after a couple of minutes. I have tried using the converter and everything else I could think of to get it to ignore the error, with no luck. At this point I can't even get the DS to show back up, and when I try to re-add it, it sits for a long time and nothing ever happens. Are these 2 VMs hosed, so that we need to start over, or is there some kind of magical genius out there who can help me save them?

We are running ESXi 6.0.0-2494585 Standard. If you need any more information, please let me know!

21 Replies
a_p_
Leadership

The only "magical genius" here on VMTN is continuum.

Try to contact him via Skype (see his profile).


André

Vacadeluna
Contributor

Thank you, I tried calling him, but it looks like he may be in another country and because of the time difference I believe he may be asleep or something along those lines.

a_p_
Leadership

It's ~7:30 PM where he lives, and from what I know, he rarely goes to sleep before midnight 😉


André

Vacadeluna
Contributor

OK, I shot him a PM on Skype as well, so we shall see!

continuum
Immortal

Read "Create a VMFS-Header-dump using an ESXi-Host in production" on VM-Sickbay.

With a dump like that I may be able to extract your VMs.


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

Vacadeluna
Contributor

I think this thing is hosed... The file I generated is only 4MB, which is not as big as the site suggests it should be. I have included the file anyway, so feel free to take a look if you want.

continuum
Immortal

That's not a dump of the VMFS volume.

Check the dd command: the partition number you used was wrong.


Vacadeluna
Contributor

This file is much larger, around 250MB, so let's see if I got it right this time!

https://drive.google.com/open?id=1mC5h-ofTpUGH_9lD4hLP_AdiVdWbLLPv

continuum
Immortal

This time the dump is incomplete again.

Show us the result of

ls -lisah /dev/disks/*

please


Vacadeluna
Contributor

Here you go. To be honest, I am confused about which one it is, but after running the command you gave me, I see several larger devices listed.

    148 7733248 -rw-------    1 root     root        7.4G May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0
    136   4064 -rw-------    1 root     root        4.0M May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:1
    138 255984 -rw-------    1 root     root      250.0M May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:5
    140 255984 -rw-------    1 root     root      250.0M May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:6
    142 112624 -rw-------    1 root     root      110.0M May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:7
    144 292848 -rw-------    1 root     root      286.0M May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:8
    146 2621440 -rw-------    1 root     root        2.5G May  5 19:49 /dev/disks/mpx.vmhba32:C0:T0:L0:9
    151 3905945600 -rw-------    1 root     root        3.6T May  5 19:49 /dev/disks/naa.6d4ae520b1938000207d8ddb074f920c
    149      0 lrwxrwxrwx    1 root     root          20 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30 -> mpx.vmhba32:C0:T0:L0
    137      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:1 -> mpx.vmhba32:C0:T0:L0:1
    139      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:5 -> mpx.vmhba32:C0:T0:L0:5
    141      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:6 -> mpx.vmhba32:C0:T0:L0:6
    143      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:7 -> mpx.vmhba32:C0:T0:L0:7
    145      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:8 -> mpx.vmhba32:C0:T0:L0:8
    147      0 lrwxrwxrwx    1 root     root          22 May  5 19:49 /dev/disks/vml.0000000000766d68626133323a303a30:9 -> mpx.vmhba32:C0:T0:L0:9
    152      0 lrwxrwxrwx    1 root     root          36 May  5 19:49 /dev/disks/vml.02000000006d4ae520b1938000207d8ddb074f920c504552432036 -> naa.6d4ae520b1938000207d8ddb074f920c
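
For anyone following along, a listing like this is easier to read once the whole-disk entries are separated from their partition nodes (names ending in ":<number>") and the vml.* symlinks are dropped. The sketch below does that for saved or piped `ls -lisah /dev/disks/*` output. The function name summarize_disks is made up for illustration, and it assumes the 11-column layout shown in this post.

```shell
#!/bin/sh
# Sketch: summarize `ls -lisah /dev/disks/*` output.
# Prints one line per whole disk: name, size, and number of partition nodes.
# summarize_disks is a hypothetical helper; the column layout is assumed
# to match the listing in this thread (size in column 7, name in the last column).
# Usage: ls -lisah /dev/disks/* | summarize_disks
summarize_disks() {
    awk '
        NF < 11    { next }                 # skip blank or truncated lines
        $3 ~ /^l/  { next }                 # skip symlink rows (vml.* aliases)
        {
            name = $NF; size = $7
            sub(/^\/dev\/disks\//, "", name)
            if (name ~ /:[0-9]+$/) {        # partition node, e.g. ...:L0:5
                base = name
                sub(/:[0-9]+$/, "", base)
                parts[base]++
            } else {                        # whole-disk entry
                sizes[name] = size
                if (!(name in parts)) parts[name] = 0
            }
        }
        END {
            for (d in sizes)
                printf "%s %s partitions=%d\n", d, sizes[d], parts[d]
        }
    ' "$@" | sort
}
```

Run against the listing above, this should report six partition nodes for the 7.4G mpx device and none for the 3.6T naa device, which is the detail the next replies pick up on.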

continuum
Immortal

Where is partition 3?

Please check with partedUtil.


Vacadeluna
Contributor

This RAID may be so messed up that it is gone, as it says that the file or directory does not exist... I have never used partedUtil before or actually looked at the file structure of VMware, so I may need a little extra guidance on that part.

a_p_
Leadership

Ulli, if I interpret the numbers correctly, there's no partition 3 (VMFS) on the installation device, because it's most likely an 8GB USB/SD device.

I guess that the RAID volume is the 3.6TB one.


André

Vacadeluna
Contributor

That is correct, we boot off of a USB drive (8GB) and we use the RAID for the DS on the host.

continuum
Immortal

Can you show the command you used to create the second dump?

By the way, how do you guys manage to post here at the moment?

I can only reach it from a phone at the moment.


Vacadeluna
Contributor

I honestly did not see your reply yesterday, but I have had no troubles posting on here from a desktop.

This is the command that I used. I only changed the name and the device ID, picking the biggest device I could see before I ran the command you gave me yesterday.

dd if=/dev/disks/Device:1 bs=1M count=1536 of=/tmp/replace with your name.1536
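
A quick sanity check on a dump made this way: bs=1M with count=1536 asks dd for 1536 MiB, so a result of about 250MB suggests dd reached the end of its input early. That would fit a source that was one of the 250.0M partitions on the USB stick rather than the RAID volume, though that is an inference from the listing, not something confirmed in the thread. The sketch below compares a dump file's size with the size the count implies; check_dump is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a dd dump has the size that `bs=1M count=<n>` implies.
# check_dump is a hypothetical helper, not an ESXi tool.
# Usage: check_dump /tmp/mydump.1536 1536
check_dump() {
    file=$1
    count_mib=$2
    expected=$((count_mib * 1024 * 1024))       # bytes dd was asked to copy
    actual=$(wc -c < "$file" | tr -d ' ')       # bytes actually written
    if [ "$actual" -eq "$expected" ]; then
        echo "complete: $actual bytes"
    else
        echo "short: $actual of $expected bytes"
    fi
}
```

A "short" result does not say why the copy stopped. It only shows that the source was smaller than expected or that reading it failed part way through.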

a_p_
Leadership

That large disk doesn't show partitions anymore.

Unless you're already in contact with continuum, would you mind running the offset ...; done command from step 1 at https://kb.vmware.com/s/article/2046610 and posting the output?

André

Vacadeluna
Contributor

I am trying to get it for you, but it's taking some time, so I may have to let it run and post the results tomorrow morning (morning where I am).

Vacadeluna
Contributor

[root@localhost:~] offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done

/vmfs/devices/cdrom/mpx.vmhba1:C0:T0:L0
Error: The device /dev/cdrom/mpx.vmhba1:C0:T0:L0 has zero length, and can't possibly store a file system or partition table.  Perhaps you selected the wrong device?
Unable to get device /vmfs/devices/cdrom/mpx.vmhba1:C0:T0:L0
---------------------
/vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
gpt
962 255 63 15466496
1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128
5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
---------------------
/vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c
unknown
486267 255 63 7811891200
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
hexdump: /vmfs/devices/disks/naa.6d4ae520b1938000207d8ddb074f920c: Connection timed out
---------------------
/vmfs/devices/genscsi/t10.DP______BACKPLANE000000
Error: The device /dev/genscsi/t10.DP______BACKPLANE000000 has zero length, and can't possibly store a file system or partition table.  Perhaps you selected the wrong device?
Unable to get device /vmfs/devices/genscsi/t10.DP______BACKPLANE000000
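
To make the partedUtil part of this output easier to interpret, here is a rough sketch that summarizes the getptbl lines for one disk: the label from the first line, the total sector count from the fourth field of the geometry line, and a count of the partition rows that follow. It assumes 512-byte sectors and that VMFS partitions carry the type name "vmfs" in the fifth column. summarize_ptbl is a made-up helper name, and the input should be the getptbl lines only, without the hexdump messages.

```shell
#!/bin/sh
# Sketch: summarize `partedUtil getptbl <disk>` output read from stdin.
# Assumptions: 512-byte sectors; line 1 is the label; line 2 is
# "cylinders heads sectors total_sectors"; each following line is
# "number start end type_guid type_name attribute".
# summarize_ptbl is a hypothetical helper, not an ESXi tool.
# Usage: partedUtil getptbl /vmfs/devices/disks/<device> | summarize_ptbl
summarize_ptbl() {
    awk '
        NF == 0 { next }                         # ignore blank lines
        !label  { label = $1; next }             # first line: gpt, msdos, unknown
        !total  { total = $4; next }             # second line: geometry
        { n++; if ($5 == "vmfs") vmfs++ }        # partition rows
        END {
            printf "label=%s size_gib=%.1f partitions=%d vmfs=%d\n",
                   label, total * 512 / 1073741824, n, vmfs
        }
    '
}
```

For the naa device above, the two lines "unknown" and "486267 255 63 7811891200" work out to 3725.0 GiB with no partition rows. That matches the 3.6T shown earlier and the observation that the large disk no longer shows a partition table.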
