datastore across 3 disks and one dies

nubbbbb · ‎06-27-2022

continuum save me 🙂 ESXi 6.7.0 Update 3 (Build 15160138)

home lab, datastore spanned across 3 identical intel ssds. seems one of the ssds is rekt. the datastore is still browsable but i can't cp any data off it without I/O errors and death.

obviously i am probably screwed, but i am hoping for some YOLO style suggestions. ideally it is just some light corruption that is preventing me from saving the data.

the disk still appears under storage adapters and i can read its size, but the partition table is unknown.

Some useful info on the disk after running this

`offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done`

/vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__
unknown
97281 255 63 1562824368
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
hexdump: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error
---------------------
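
For readers following along, the arithmetic inside that one-liner can be pulled out into a small sketch. It only computes the byte positions that the three hexdump calls seek to for each candidate partition start sector (128 and 2048 in the command above). `probe_offsets` is a made-up helper name, not an ESXi tool, and the sketch says nothing about what lives at those offsets beyond the fact that the command greps the dumps for `d00d`.

```shell
#!/bin/sh
# Byte offsets the one-liner probes for a candidate partition start sector.
# probe_offsets is a hypothetical helper name, not an ESXi command.
probe_offsets() {
    start=$1                  # candidate partition start, in 512-byte sectors
    base=$((512 * start))     # the same start expressed in bytes
    # The three relative offsets are copied from the hexdump calls above.
    for rel in 0x100000 0x1300000 0x130001d; do
        echo $((${rel} + base))
    done
}

probe_offsets 128
probe_offsets 2048
```

Each printed number is what `hexdump -s` receives. On the failing disk every one of those reads came back with an Input/output error, so the dumps never got as far as the grep.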

continuum · ‎06-27-2022

What is YOLO style ???
Anyway - so you expanded the datastore on the parent SSD twice, adding two more extents ? That's a brave move - are you a Klingon ?
Anyways - are all three disks still readable ?
Show first full putty screens when running

hexdump -C /dev/disks/parent-ssd-partNr | less
hexdump -C /dev/disks/child1-ssd-partNr | less
hexdump -C /dev/disks/child2-ssd-partNr | less

make a large putty screenshot for all three commands.
To warn you in advance - this is not trivial and can become a nightmare.
Show those screenshots .... (what is putty - is not an acceptable answer)

Ulli

________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

nubbbbb · ‎06-27-2022

YOLO == You Only Live Once. My grandfather was klingon so i am 1/4 klingon. I hope SecureCRT output is just as good as PuTTY.

i have attached the requested screenshots. realistically here is our issue xD

hexdump: /dev/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__: Input/output error

so the drive is completely dead... any incantations you know of?

nubbbbb · ‎06-27-2022

i don't think i replied properly on this thread, but the info is above.

continuum · ‎06-27-2022

The screenshots are still in quarantine ? - have not seen that before ....

nubbbbb · ‎06-27-2022

here is an imgur link, they are just PNG from windows snipping tool.

https://imgur.com/a/1KElWqE

continuum · ‎06-27-2022

Is the disk with the I/O error the first or second child ?
You did not give a partition number for the last screenshot - why not ?
Is the partition table broken ?

nubbbbb · ‎06-27-2022

apologies, the filenames did not come through on the imgur link. they are listed extent0 to extent2, top to bottom respectively.

here are the exact commands i ran, in order from extent0 to extent2

hexdump -C /dev/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL5363020G800RGN__:1 | less

hexdump -C /dev/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302GU800RGN__:1 | less

hexdump -C /dev/disks/t10.ATA_____INTEL_SSDSC2BB800G4_____________________PHWL536302FM800RGN__ | less

here is a screen shot of the datastore information from the esxi page. extent2 is missing.

nubbbbb · ‎06-27-2022

i missed your question. there is no partition listed so likely the partition table is broken.

continuum · ‎06-27-2022

Looks like we need to set a new partition table - which probably is impossible using ESXi.
Then we need to read/clone the disk with the I/O error with ddrescue.
Let's switch to skype
Ulli
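
A rough sketch of what a ddrescue clone of the failing extent could look like, for anyone landing here later. It assumes GNU ddrescue run from a Linux environment rather than from ESXi. The device name, image file and mapfile are placeholders, not paths from this thread, and the helper only prints the two passes instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: prints a two-pass GNU ddrescue plan, runs nothing.
# The arguments are placeholders for the failing disk, the image and the mapfile.
ddrescue_plan() {
    src=$1; img=$2; map=$3
    # Pass 1: direct disc access (-d), grab everything that reads cleanly
    # and skip the slow scraping phase (-n).
    echo "ddrescue -d -n $src $img $map"
    # Pass 2: reuse the same mapfile and retry the bad areas up to 3 times (-r3).
    echo "ddrescue -d -r3 $src $img $map"
}

ddrescue_plan /dev/sdX failed-extent.img failed-extent.map
```

The mapfile is what lets the second pass pick up where the first stopped, so it needs to be kept between runs. Whether any of this helps depends on the SSD still answering reads at all.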
