VMware Cloud Community
solid38
Contributor

ESXi 6.5.0 - Datastore lost

Hi,

I've lost a datastore.

In my server, I have an SSD and 3 hard drives.
Suddenly, one of the hard disks disappeared from the datastore list. Unfortunately, I cannot bring it back.
I followed this procedure: https://kb.vmware.com/s/article/2046610

The affected disk is naa.600508b1001c2b63c3e5216dbe1ba7dc in the output below:

[root@ESXi:~] offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done

/vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
msdos
974 255 63 15663104
4 32 8191 4 128
1 8192 1843199 5 0
5 8224 520191 6 0
6 520224 1032191 6 0
7 1032224 1257471 252 0
8 1257504 1843199 6 0
---------------------
/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc
gpt
364797 255 63 5860467632
1 2048 5860463804 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 2048:
0200000 d00d c001
0200004
1400000 f15e 2fab
1400004
0140001d  44 44 5f 33 00 00 00 00  00 00 00 00 00 00 00 00  |DD_3............|
0140002d  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
---------------------
/vmfs/devices/disks/naa.600508b1001c413760f7cf004a8de5ab
gpt
243197 255 63 3906963632
3 128 3906963592 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d  4a 4f 44 57 4e 00 00 00  00 00 00 00 00 00 00 00  |JODWN...........|
0131002d  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
---------------------
/vmfs/devices/disks/naa.600508b1001c71fc01427153a69ab9c7
gpt
364797 255 63 5860467632
1 128 5860467592 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d  44 44 5f 32 00 00 00 00  00 00 00 00 00 00 00 00  |DD_2............|
0131002d  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
---------------------
/vmfs/devices/disks/naa.600508b1001cacb17e7e074c24ad122d
gpt
58365 255 63 937637552
1 128 937637512 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d  56 4d 5f 44 41 54 41 53  54 4f 52 45 00 00 00 00  |VM_DATASTORE....|
0131002d  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
---------------------
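
The hex addresses in the output above follow from the arithmetic inside the one-liner: each probe adds a fixed base to the partition start sector multiplied by 512. As a minimal sketch, assuming 512-byte sectors as in the original command, the following reproduces those addresses for the two start sectors that were tested. The function name vmfs_probe_offsets is made up for illustration and is not part of any VMware tool.

```shell
#!/bin/sh
# Sketch: reproduce the byte offsets that the one-liner passes to "hexdump -s".
# Assumption: 512-byte sectors, matching the 512*$i term in the original command.

# vmfs_probe_offsets START_SECTOR
# Prints three hex offsets: the two 4-byte probes and the label probe.
vmfs_probe_offsets() {
    start=$1
    part_bytes=$((512 * start))
    printf '%07x\n' $((0x100000 + part_bytes))   # output above shows "d00d c001" here
    printf '%07x\n' $((0x1300000 + part_bytes))  # output above shows "f15e 2fab" here
    printf '%08x\n' $((0x130001d + part_bytes))  # datastore label text starts here
}

vmfs_probe_offsets 2048
vmfs_probe_offsets 128
```

For a start sector of 2048 this prints 0200000, 1400000 and 0140001d; for 128 it prints 0110000, 1310000 and 0131001d, matching the addresses shown for the datastores in the listing.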

This hard disk is not visible in the ESXi interface:

snip_20180527180908.png

Do you have an idea how to restore this hard disk?

Thanks !

Best regards.

20 Replies
continuum
Immortal

Hi
can you please run the following command:
dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536

That should create a file of 1536 MB named ba7dc.1536.
Download that file to your admin machine, compress it, and provide a download link.
Then I will have a look.
Ulli
skype = sanbarrow
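
A dump that stops early because of a read error will be smaller than bs times count. The sketch below is one possible way to check that, assuming bs=1M means 1048576 bytes and that wc is available in the shell being used. The function names expected_dump_bytes and dump_is_truncated are hypothetical helpers, not existing commands.

```shell
#!/bin/sh
# Sketch: check whether a dd dump reached the size implied by bs=1M count=N.
# Assumption: bs=1M is 1048576 bytes per block.

# expected_dump_bytes COUNT_MIB
expected_dump_bytes() {
    echo $(( $1 * 1048576 ))
}

# dump_is_truncated FILE COUNT_MIB
# Returns 0 (true) if FILE is smaller than COUNT_MIB MiB, 1 otherwise.
dump_is_truncated() {
    actual=$(( $(wc -c < "$1") ))
    [ "$actual" -lt "$(expected_dump_bytes "$2")" ]
}

# Size a complete dump from the command above should have:
expected_dump_bytes 1536

# Example call (path taken from the dd command above):
# dump_is_truncated /vmfs/volumes/VM_DATASTORE/ba7dc.1536 1536 && echo "dump is truncated"
```

A complete dump from the command above would be 1610612736 bytes.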


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

solid38
Contributor

Hi,

Thank you for your response.

I ran your command, but I got some errors:

[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Connection timed out
[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Input/output error

After the first try I retrieved the generated file (see file attached).

Have a good day.

continuum
Immortal

So you get an I/O error during the dd command?
That explains why the dump is truncated.
I only found a reference to a single directory named "ESXi_XPEnology".
Anyway, I think we need to switch to a different procedure to work around the I/O error.
Please download http://sanbarrow.com/livecds/moa64-nogui/MOA64-nogui-incl-src-111014-efi.iso
and contact me via skype.
Ulli


solid38
Contributor

I tested the command again, and here is the result in the shell:

[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Input/output error

[root@ESXi:~]

Oddly, the created file is still about 28MB ...

ESXi_XPEnology is a VM that uses this hard disk.

The disk is used only by this VM.

I downloaded the ISO file. What should I do with it?

For information, I already lost this disk once, about 80 days ago.
I reformatted it with the HP utility on my server (MicroServer Gen8), and then recreated the datastore since it was visible again in the ESXi interface.
From memory, the HP utility did not report any bad sectors or problems.

PS: Thank you for taking the time to help me; I appreciate it enormously.

continuum
Immortal

If reading the start of the VMFS volume fails with an I/O error after 28 MB, we can forget about reading the volume from ESXi.
But if we access the volume from Linux, we can probably work around the I/O error.
However, the procedure to do so is undocumented, and it would take us days to make progress if we handled this via the forum.
That's why I would suggest that we create a TeamViewer session
http://download.teamviewer.com/download/version_10x/TeamViewerQS.exe
so that I can do the next steps myself with your assistance.
Please call me via skype to continue ....

Ulli


solid38
Contributor

OK, thanks.
I am in France (GMT +2, summer time), and currently I have to take care of my children.
When would a Skype call be possible?
For information, I can read and write English, but I'm not good at speaking it ...

I just used the HP utility at server startup (Intelligent Provisioning). I looked at the RAID system, and all disks have an "OK" status.

continuum
Immortal

I am in Germany and am available now.


solid38
Contributor

I would prefer tomorrow. Is that possible for you?

continuum
Immortal

Yes - you can call after 10:00


Jichen001
Contributor

you can try this website to check the storage issue.

www.vmcheck.net

continuum
Immortal

This is a joke isn't it ???


solid38
Contributor

I'm really sorry, but at the moment I am very busy and it is difficult to find a free time slot during the week ...

For information, a friend advised me to use the following command:

==> voma -f check -d /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc

The result is :

[root@ESXi:~] voma -f check -d /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc
Module name is missing. Using "vmfs" as default
Checking if device is actively used by other hosts
Running VMFS Checker version 2.1 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
   Detected VMFS-6 file system (labeled:'DD_3') with UUID:5a8b2bde-2bc24dbb-668e-941882374788, Version 6:81
         ERROR: IO failed: Input/output error
ON-DISK ERROR: Corruption too severe in resource file [LFB]
         ERROR: Failed to check fbb.sf.
   VOMA failed to check device : IO error
Total Errors Found:           1
   Kindly Consult VMware Support for further assistance
[root@ESXi:~]

I have a bad feeling about this ....

continuum
Immortal

voma is not a useful tool for this scenario.
Anyway  - I already told you that I have a workaround for I/O errors ....


solid38
Contributor

I ran a series of tests on the hard drive, and it turns out that some blocks are defective. It does not pass the tests.
I opened an RMA with Western Digital.

dr_robot
Enthusiast

Hello @continuum ,

I have an error with a datastore that isn't accessible. The LUN is seen, but not the datastore. I get the following errors when I run:

offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done

---------------------
/vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c
unknown
935722 255 63 15032385536
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
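
For orientation, the last field of the geometry line that partedUtil getptbl prints is the total sector count, so the raw device size can be derived even when the partition table is reported as unknown. This is a minimal sketch assuming 512-byte sectors (the same assumption the one-liner makes); geometry_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert the total-sector field of a partedUtil geometry line to bytes.
# Assumption: 512-byte logical sectors.

# geometry_bytes "CYLINDERS HEADS SECTORS_PER_TRACK TOTAL_SECTORS"
geometry_bytes() {
    set -- $1                 # split the line into its four fields
    echo $(( $4 * 512 ))
}

geometry_bytes "935722 255 63 15032385536"
```

Under that assumption the device above is 7696581394432 bytes, which is exactly 7 TiB.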

And below when I run:

partedUtil getptbl "vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa726d"
unknown

How can I recover from this and bring the datastore back up?

 

continuum
Immortal

You use 2 different naa numbers. Is that a typo?
When a hexdump against a device listed in /dev/disks fails with an I/O error, the most straightforward approach is to clone the device with ddrescue.

By the way, if you want a quick reply from me, don't add a message to an old post. Call me via skype instead.

 

Ulli
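
The thread does not spell out the ddrescue invocation, so the following is only a rough sketch of a common two-pass pattern with GNU ddrescue, run from a Linux rescue system rather than from ESXi. The function ddrescue_plan and the paths /dev/sdX, /mnt/target/disk.img and /mnt/target/disk.map are placeholders chosen for illustration. The function only prints the commands so they can be reviewed before anything touches the failing disk.

```shell
#!/bin/sh
# Sketch: print a two-pass GNU ddrescue plan for imaging a failing disk.
# Assumptions: GNU ddrescue is installed on the rescue system; the source
# device, image file and map file names below are hypothetical placeholders.

# ddrescue_plan SRC IMG MAP
ddrescue_plan() {
    src=$1 img=$2 map=$3
    # Pass 1: copy everything that reads cleanly and skip the scraping phase (-n).
    echo "ddrescue -n $src $img $map"
    # Pass 2: use direct access (-d) and retry the remaining bad areas 3 times (-r3).
    echo "ddrescue -d -r3 $src $img $map"
}

ddrescue_plan /dev/sdX /mnt/target/disk.img /mnt/target/disk.map
```

Reusing the same map file in both passes lets the second pass resume where the first one stopped instead of rereading the whole disk.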


R0by100
Contributor

I have the same problem. Could you help me?

 

R0by100
Contributor

[root@localhost:~] voma -m vmfs -f check -d /vmfs/devices/disks/naa.5000c5005114f955:1
Checking if device is actively used by other hosts
Initializing VMFS Checker..|Scanning for VMFS-3/VMFS-5 host activity (512 bytes/HB, 2048 HBs).
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
   Detected VMFS file system (labeled:'3T Interno') with UUID:55965fb9-a4edac3c-c55c-d067e5fc92ae, Version 5:54
         ERROR: IO failed: Input/output error
ON-DISK ERROR: Corruption too severe in resource file [PB]
         ERROR: Failed to check pbc.sf.
   VOMA failed to check device : IO error
Total Errors Found:           1
   Kindly Consult VMware Support for further assistance
[root@localhost:~]

continuum
Immortal

Hi
If you need to deal with I/O errors on a VMFS-5 volume, you have a limited number of options.

1. hire a company like Kroll Ontrack
2. call VMware-support
3. search forums and google for recipes and instructions
4. call me via skype

In my experience #2 is hopeless, #3 is too dangerous if the data is important, #1 requires patience and deep pockets, and #4 is the fastest.

And no, trying to solve this problem within a forum discussion has such a low success rate that I don't try that anymore.

 

Ulli

 

