Hello,
I have a datastore called "Secondary"; it was apparently extended across two different drives. Another person thought it would be a good idea to swap out the RAID controller. They did that, and then reconnected the old RAID controller when nothing worked.
Anyway, here we are. We have the DiskInternals VMFS Recovery software, but it seems to crash at around 1 TB transferred. Is there a way to rebuild the datastore without losing data, so that ESXi sees the datastore as it did before? Some information follows.
[root@vmserver01:~] esxcfg-scsidevs -m
GetUnmountedVmfsFileSystemsInt: fsUUID is null, skipping naa.6a4badb00fbbf00022583f81158e1c2b:1
GetUnmountedVmfsFileSystemsInt: fsUUID is null, skipping naa.6a4badb00fbbf00022583f81158e1c2b:1
GetUnmountedVmfsFileSystemsInt: fsUUID is null, skipping naa.6a4badb00fbbf00022583f81158e1c2b:1
[root@vmserver01:~] esxcfg-volume -l
VMFS UUID/label: 59d18cea-83868190-73f8-782bcb39f071/Secondary
Can mount: No (some extents missing)
Can resignature: No (some extents missing)
Extent name: naa.6a4badb00fbbf00026435d4d2207a1c1:1 range: 0 - 3814143 (MB)
[root@vmserver01:~] esxcli storage vmfs snapshot list
59d18cea-83868190-73f8-782bcb39f071
Volume Name: Secondary
VMFS UUID: 59d18cea-83868190-73f8-782bcb39f071
Can mount: false
Reason for un-mountability: some extents missing
Can resignature: false
Reason for non-resignaturability: some extents missing
Unresolved Extent Count: 1
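For reference: when an unresolved VMFS volume is still mountable, it can normally be force-mounted or resignatured from the ESXi shell with the commands below; in this state both would be refused with the same "some extents missing" reason.
esxcli storage vmfs snapshot mount -l Secondary
esxcli storage vmfs snapshot resignature -l Secondary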
[root@vmserver01:~] ls -lah /dev/disks
total 13795133312
drwxr-xr-x 2 root root 512 May 6 15:37 .
drwxr-xr-x 16 root root 512 May 6 15:37 ..
-rw------- 1 root root 2.7T May 6 15:37 naa.6a4badb00fbbf00022583f81158e1c2b
-rw------- 1 root root 2.7T May 6 15:37 naa.6a4badb00fbbf00022583f81158e1c2b:1
-rw------- 1 root root 3.6T May 6 15:37 naa.6a4badb00fbbf00026435d4d2207a1c1
-rw------- 1 root root 3.6T May 6 15:37 naa.6a4badb00fbbf00026435d4d2207a1c1:1
-rw------- 1 root root 116.7G May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080
-rw------- 1 root root 4.0M May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:1
-rw------- 1 root root 250.0M May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:5
-rw------- 1 root root 250.0M May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:6
-rw------- 1 root root 110.0M May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:7
-rw------- 1 root root 286.0M May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:8
-rw------- 1 root root 2.5G May 6 15:37 t10.SanDisk00Ultra00000000000000000000004C530001040413105080:9
lrwxrwxrwx 1 root root 60 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:1 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:1
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:5 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:5
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:6 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:6
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:7 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:7
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:8 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:8
lrwxrwxrwx 1 root root 62 May 6 15:37 vml.01000000003443353330303031303430343133313035303830556c74726120:9 -> t10.SanDisk00Ultra00000000000000000000004C530001040413105080:9
lrwxrwxrwx 1 root root 36 May 6 15:37 vml.02000000006a4badb00fbbf00022583f81158e1c2b504552432036 -> naa.6a4badb00fbbf00022583f81158e1c2b
lrwxrwxrwx 1 root root 38 May 6 15:37 vml.02000000006a4badb00fbbf00022583f81158e1c2b504552432036:1 -> naa.6a4badb00fbbf00022583f81158e1c2b:1
lrwxrwxrwx 1 root root 36 May 6 15:37 vml.02000000006a4badb00fbbf00026435d4d2207a1c1504552432036 -> naa.6a4badb00fbbf00026435d4d2207a1c1
lrwxrwxrwx 1 root root 38 May 6 15:37 vml.02000000006a4badb00fbbf00026435d4d2207a1c1504552432036:1 -> naa.6a4badb00fbbf00026435d4d2207a1c1:1
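One extra check worth capturing in a case like this (not included in the output above): confirm that the partition tables on both disks survived the controller swap. For example:
partedUtil getptbl /vmfs/devices/disks/naa.6a4badb00fbbf00026435d4d2207a1c1
partedUtil getptbl /vmfs/devices/disks/naa.6a4badb00fbbf00022583f81158e1c2b
Each should report a single vmfs-type partition covering the disk.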
So you say that the Secondary datastore once had a capacity of about 12 TB?
Now you see a 6.36 TB datastore with 2 extents missing?
Why is the 6.36 TB disk not listed in /dev/disks?
Can you attach the output of
vmkfstools -P -v10 /vmfs/volumes/Secondary/
Hello Continuum.
The datastore named 'Secondary' is approximately 6.3 TB. As shown in the first screenshot, it resides across two physical disks. That recovery software can see it and can mount it over SSH.
Since that program seems to crash at the 1 TB transferred mark, I'm now seeing if it's possible to get ESXi to see the datastore again and use it without losing data.
The 'Secondary' (6.3 TB) datastore is apparently missing an extent, as shown by the "esxcfg-volume -l" output.
Attached is the result of the command you asked for.
Please run
vmkfstools -P -v10 against the datastores that are listed in /vmfs/volumes/
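A quick way to do that in one pass (just a sketch; the loop runs the same command against every volume currently visible under /vmfs/volumes/):
for v in /vmfs/volumes/*/ ; do echo "== $v" ; vmkfstools -P -v10 "$v" ; done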
I still do not get the full picture ...
Do you have one readable datastore with 1 missing extent? Or one readable datastore with 2 missing extents?
The idea is ...
At the moment you detect the parent datastore as a snapshot.
You can't mount it because the info for the second (and third?) extent is no longer valid.
To fix that you have to patch two entries in the header of the parent datastore.
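To see what is meant here without changing anything yet, you can dump just the metadata region that starts 1 MB into the parent partition (read-only; device name taken from the listing above):
dd if=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 bs=1M skip=1 count=1 | hexdump -C | less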
There is only one datastore called "Secondary". I know the naming makes it seem like there are multiple, but there is only one.
How do I go about patching the parent datastore?
If you expand a datastore by adding an extent, you get one "named datastore" plus one or more VMFS volumes that don't get a name.
So again - was the datastore "Secondary" composed of one "parent datastore" plus one additional disk, or of one parent datastore plus two added extents?
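For reference, on a healthy host a spanned datastore lists all of its device partitions under "Partitions spanned" in the output of
vmkfstools -P /vmfs/volumes/<datastore-name>
but with the volume unmounted here, the question has to be answered from memory or from the screenshots.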
Okay.
datastore "secondary" had an extent added to it. "Secondary" was originally on disk "naa.6a4badb00fbbf00026435d4d2207a1c1".
As seen in the first screen shot, the added extent is "VMFS Volume 1" on "/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b".
So it spans across these two disks.
Look at these hexdumps - this is how it looks when it is healthy.
command: hexdump -C /dev/disks/parent-datastore-partition | less
The extent looks like this:
command: hexdump -C /dev/disks/missing-extent-partition | less
In your case the device names will differ - we need to know what the same dumps look like in your case.
Then we can figure out how to do the patching.
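Because these partitions are several TB, it helps to bound the dump to the metadata area instead of paging through the whole device (device names as in your listing; the 1 MB offset matches the dd commands used later, the 2 MB window is an assumption):
# start 1 MB into the partition, show 2 MB
hexdump -C -s 1048576 -n 2097152 /dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 | less
hexdump -C -s 1048576 -n 2097152 /dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 | less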
Alright. I forgot to mention that we are using VMFS6.
Not sure if that changes anything.
Edit:
(PARENT PARTITION)
hexdump -C /dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 | less
(EXTENT PARTITION)
hexdump -C /dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 | less
We need to be precise from now on:
show the command you used to create the last hexdump.
Next, create the second hexdump - again, include the full command.
I've edited my above comment with full details.
Run
dd if=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 of=/tmp/parent.bin bs=1M count=1 skip=1
dd if=/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 of=/tmp/extent.bin bs=1M count=1 skip=1
Zip both .bin files and attach them.
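If there is no zip binary in the ESXi shell, tar/gzip (which are usually present there) work just as well; the archive name is only a suggestion:
cd /tmp
tar -czf vmfs-header-dumps.tgz parent.bin extent.bin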
Attached are the requested bins.
Used both commands as follows:
[root@vmserver01:/bin] dd if=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 of=/tmp/parent.bin bs=1M count=1 skip=1
[root@vmserver01:/bin] dd if=/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 of=/tmp/extent.bin bs=1M count=1 skip=1
Got all I need.
This may take a while.
Do you feel comfortable trying something completely undocumented - which has the potential to destroy your datastore if done wrong - on your own?
When I send you the patched versions I assume that you have a full backup of both disks, or at least have created a dump of the first 2 GB of both disks.
I would prefer that we do it together ...
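A minimal way to grab those 2 GB dumps before any patching (whole-disk device names from the listing above; the output location is an assumption and should be a datastore or external target with enough free space, not /tmp):
dd if=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1 of=/vmfs/volumes/<other-datastore>/parent-disk-first2GB.bin bs=1M count=2048
dd if=/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b of=/vmfs/volumes/<other-datastore>/extent-disk-first2GB.bin bs=1M count=2048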
Yeah. We could try that.
I am shutting down the server to add another 6 TB HDD.
Okay. I think we're ready to try it, continuum.
Here are the patched files - call me via Skype if you want assistance.
Create backups of the sections to be patched first:
dd if=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 of=/tmp/backup-parent.bin bs=1M count=1 skip=1
dd if=/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 of=/tmp/backup-extent.bin bs=1M count=1 skip=1
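A quick sanity check before injecting anything (md5sum being available in the ESXi shell is an assumption): the fresh backup dumps should match the parent.bin/extent.bin files sent earlier; copy those back to /tmp first if the reboot cleared it.
md5sum /tmp/backup-parent.bin /tmp/parent.bin
md5sum /tmp/backup-extent.bin /tmp/extent.bin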
Commands to inject the patched sections:
dd of=/dev/disks/naa.6a4badb00fbbf00026435d4d2207a1c1:1 if=/tmp/patched-parent.bin bs=1M count=1 seek=1 conv=notrunc
dd of=/dev/disks/naa.6a4badb00fbbf00022583f81158e1c2b:1 if=/tmp/patched-extent.bin bs=1M count=1 seek=1 conv=notrunc
Note for anybody reading this with similar problems:
these patches were made for this specific case. They are not for you!!!
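After injecting the patches, a sensible follow-up (not part of the instructions above) is to refresh VMFS discovery and re-check the volume before touching any VMs:
vmkfstools -V
esxcfg-volume -l
ls /vmfs/volumes/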