Hi,
I've lost a datastore.
In my server, I have an SSD and 3 hard drives.
Suddenly, the datastore on one of the hard disks disappeared. Unfortunately, I cannot bring it back up.
I followed the following procedure: https://kb.vmware.com/s/article/2046610
See the affected hard disk in the output below:
[root@ESXi:~] offset="128 2048"
for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do
  disk=$dev
  echo $disk
  partedUtil getptbl $disk
  { for i in `echo $offset`; do
      echo "Checking offset found at $i:"
      hexdump -n4 -s $((0x100000+(512*$i))) $disk
      hexdump -n4 -s $((0x1300000+(512*$i))) $disk
      hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk
    done; } | grep -B 1 -A 5 d00d
  echo "---------------------"
done
/vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
msdos
974 255 63 15663104
4 32 8191 4 128
1 8192 1843199 5 0
5 8224 520191 6 0
6 520224 1032191 6 0
7 1032224 1257471 252 0
8 1257504 1843199 6 0
---------------------
/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc
gpt
364797 255 63 5860467632
1 2048 5860463804 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 2048:
0200000 d00d c001
0200004
1400000 f15e 2fab
1400004
0140001d 44 44 5f 33 00 00 00 00 00 00 00 00 00 00 00 00 |DD_3............|
0140002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
---------------------
/vmfs/devices/disks/naa.600508b1001c413760f7cf004a8de5ab
gpt
243197 255 63 3906963632
3 128 3906963592 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d 4a 4f 44 57 4e 00 00 00 00 00 00 00 00 00 00 00 |JODWN...........|
0131002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
---------------------
/vmfs/devices/disks/naa.600508b1001c71fc01427153a69ab9c7
gpt
364797 255 63 5860467632
1 128 5860467592 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d 44 44 5f 32 00 00 00 00 00 00 00 00 00 00 00 00 |DD_2............|
0131002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
---------------------
/vmfs/devices/disks/naa.600508b1001cacb17e7e074c24ad122d
gpt
58365 255 63 937637552
1 128 937637512 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
Checking offset found at 128:
0110000 d00d c001
0110004
1310000 f15e 2fab
1310004
0131001d 56 4d 5f 44 41 54 41 53 54 4f 52 45 00 00 00 00 |VM_DATASTORE....|
0131002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
---------------------
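For anyone reading along: the loop above works by probing for the VMFS magic bytes and the volume label at fixed offsets past the partition start. Here is a minimal sketch reproducing that offset arithmetic on a scratch image instead of a real device (the image path and the label `MY_DATASTORE` are made up for the demo):

```shell
# Minimal demo of the VMFS probe arithmetic on a 20 MB scratch image.
img=/tmp/fake_vmfs.img
off=128                                   # partition start sector, as in offset="128 2048"
dd if=/dev/zero of=$img bs=1M count=20 2>/dev/null

# Plant the VMFS magic: on disk the bytes are 0d d0 01 c0, which
# hexdump's default little-endian word view prints as "d00d c001".
printf '\x0d\xd0\x01\xc0' | dd of=$img bs=1 seek=$((0x100000 + 512*off)) conv=notrunc 2>/dev/null
# Plant a volume label where the loop reads the datastore name from.
printf 'MY_DATASTORE' | dd of=$img bs=1 seek=$((0x130001d + 512*off)) conv=notrunc 2>/dev/null

# Same probes as the detection loop above:
hexdump -n4 -s $((0x100000 + 512*off)) $img        # shows: d00d c001
hexdump -C -n 16 -s $((0x130001d + 512*off)) $img  # shows: |MY_DATASTORE....|
```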
In the ESXi interface, this hard disk is not visible.
Do you have any idea how to restore this hard disk?
Thanks !
Best regards.
Hi
can you please run the following command:
dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
That should create a file of 1536 MB named ba7dc.1536.
Download that file to your admin machine, compress it, and provide a download link.
Then I will have a look
Ulli
skype = sanbarrow
Hi,
Thank you for your response.
I ran your command, but I got some errors:
[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Connection timed out
[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Input/output error
After the first try I retrieved the generated file (see file attached).
Have a good day.
So you get an I/O error during the dd command?
That explains why the dump is truncated.
I only found a reference to a single directory named "ESXi_XPEnology"
Anyway - I think we need to switch to a different procedure to work around the I/O error.
Please download http://sanbarrow.com/livecds/moa64-nogui/MOA64-nogui-incl-src-111014-efi.iso
and contact me via skype.
Ulli
I tested the command again, and here is the result in the shell:
[root@ESXi:~] dd if=/vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1 bs=1M count=1536 of=/vmfs/volumes/VM_DATASTORE/ba7dc.1536
dd: /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc:1: Input/output error
[root@ESXi:~]
Oddly, the created file is still about 28MB ...
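As a side note, the ~28 MB figure gives a rough fix on where the first unreadable area sits. A back-of-the-envelope sketch, assuming dd read linearly from the start of partition 1 (which begins at sector 2048 per the partedUtil output above):

```shell
# Estimate the LBA of the first bad sector: dd read roughly 28 MB of
# partition 1 (which starts at sector 2048) before hitting the error.
part_start=2048                       # partition start sector from partedUtil getptbl
good_bytes=$((28 * 1024 * 1024))      # approximately how much dd managed to read
bad_lba=$((part_start + good_bytes / 512))
echo "first unreadable sector is near LBA $bad_lba"   # prints: ... LBA 59392
```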
ESXi_XPEnology is a VM that uses this hard disk.
The hard disk is only used by this VM.
I downloaded the iso file, what should I do with it?
For information, I already lost this disk once, about 80 days ago.
I reformatted it with the HP utility on my server (MicroServer Gen8), and then recreated the datastore since it was visible again in the ESXi interface.
From memory, the HP utility did not report any bad sectors or problems.
PS: thank you for taking the time to help me, I appreciate it enormously.
If the start of the VMFS volume hits an I/O error after 28 MB, we can forget about reading the volume from ESXi.
But if we access the volume from Linux, we can probably work around the I/O error.
However, the procedure to do so is undocumented, and it would take us days to make progress if we handle this via the forum.
That's why I would suggest that we create a TeamViewer session
http://download.teamviewer.com/download/version_10x/TeamViewerQS.exe
so that I can do the next steps myself with your assistance.
Please call me via skype to continue ....
Ulli
OK, thanks
I am in France (GMT +2, summer time), and currently I have to take care of my children.
When is it possible to do a skype?
For information, I can read and write English, but I'm not good at speaking it ...
I just used the HP utilities at server startup (Intelligent Provisioning); I looked at the RAID system and all disks have an "OK" status.
I am in Germany and am available now.
I would prefer tomorrow, is it possible for you ?
Yes - you can call after 10:00
you can try this website to check the storage issue.
This is a joke, isn't it???
I'm really sorry, but at the moment I am very busy and it is difficult to find a free time slot during the week ...
For information, a friend advised me to use the following command:
==> voma -f check -d /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc
The result is :
[root@ESXi:~] voma -f check -d /vmfs/devices/disks/naa.600508b1001c2b63c3e5216dbe1ba7dc
Module name is missing. Using "vmfs" as default
Checking if device is actively used by other hosts
Running VMFS Checker version 2.1 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS-6 file system (labeled:'DD_3') with UUID:5a8b2bde-2bc24dbb-668e-941882374788, Version 6:81
ERROR: IO failed: Input/output error
ON-DISK ERROR: Corruption too severe in resource file [LFB]
ERROR: Failed to check fbb.sf.
VOMA failed to check device : IO error
Total Errors Found: 1
Kindly Consult VMware Support for further assistance
[root@ESXi:~]
I really have a feeling this does not look good ...
voma is not a useful tool for this scenario.
Anyway - I already told you that I have a workaround for I/O errors ....
I ran a series of tests on the hard drive, and it turns out that some blocks are defective. It does not pass the tests.
I opened an RMA with Western Digital.
Hello @continuum ,
I have an error with a datastore that isn't accessible. The LUN is seen but not the datastore. I see the following errors when I run:
offset="128 2048"
for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do
  disk=$dev
  echo $disk
  partedUtil getptbl $disk
  { for i in `echo $offset`; do
      echo "Checking offset found at $i:"
      hexdump -n4 -s $((0x100000+(512*$i))) $disk
      hexdump -n4 -s $((0x1300000+(512*$i))) $disk
      hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk
    done; } | grep -B 1 -A 5 d00d
  echo "---------------------"
done
---------------------
/vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c
unknown
935722 255 63 15032385536
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
hexdump: /vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa848c: Input/output error
And below when I run:
partedUtil getptbl "vmfs/devices/disks/naa.6006016073c04200e0d3975bc7aa726d"
unknown
How can I recover from this and bring the datastore back up?
You use 2 different naa numbers - is that a typo?
When a hexdump against a device listed in /dev/disks fails with an I/O error, the most straightforward approach is to clone the device with ddrescue.
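The ddrescue approach can be sketched roughly as follows. The device and output paths are placeholders, and the runnable part at the bottom demonstrates the plain-dd fallback on a scratch file rather than a real failing disk:

```shell
# From a Linux live system, GNU ddrescue clones a failing device and
# keeps a map file so unreadable areas can be retried later, e.g.:
#
#   ddrescue /dev/sdX /mnt/backup/disk.img /mnt/backup/disk.map
#   ddrescue -r3 /dev/sdX /mnt/backup/disk.img /mnt/backup/disk.map   # retry bad areas
#
# If only dd is available (as in the ESXi busybox shell),
# conv=noerror,sync skips read errors and zero-pads each failed block
# so the image keeps its alignment. Demonstrated on a scratch file:
src=/tmp/src.img
dst=/tmp/clone.img
printf 'some VMFS metadata' > $src
dd if=$src of=$dst bs=512 conv=noerror,sync 2>/dev/null
grep -q 'VMFS metadata' $dst && echo "clone contains the source data"
```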
By the way - if you want a quick reply from me, don't add a message to an old post; call me via skype instead.
Ulli
I have the same problem, could you help me?
[root@localhost:~] voma -m vmfs -f check -d /vmfs/devices/disks/naa.5000c5005114f955:1
Checking if device is actively used by other hosts
Initializing VMFS Checker..|Scanning for VMFS-3/VMFS-5 host activity (512 bytes/HB, 2048 HBs).
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'3T Interno') with UUID:55965fb9-a4edac3c-c55c-d067e5fc92ae, Version 5:54
ERROR: IO failed: Input/output error
ON-DISK ERROR: Corruption too severe in resource file [PB]
ERROR: Failed to check pbc.sf.
VOMA failed to check device : IO error
Total Errors Found: 1
Kindly Consult VMware Support for further assistance
[root@localhost:~]
Hi
if you need to deal with I/O errors on a VMFS-5 volume, you have a limited number of options.
1. hire a company like Kroll Ontrack
2. call VMware-support
3. search forums and google for recipes and instructions
4. call me via skype
In my experience #2 is hopeless, #3 is too dangerous if the data is important, #1 requires patience and deep pockets, and #4 is the fastest.
And no - trying to solve this problem within a forum discussion has such a low success rate that I don't try that anymore.
Ulli