VMware Cloud Community
Rampage
Contributor

ESXi datastore missing after swapping RAID controllers

Morning All,

Hope someone can help...

I swapped RAID controllers from an LSI 9261 to a Dell PERC H700 to test (essentially the same controller, though with different firmware). On switching back to the original controller I was prompted to import a foreign config, which I did....

ESXi 6 starts, though no persistent storage is found and as such no datastore. The previous VMs are listed under the host as /vmfs/volumes/<uid>

Under Configuration > Devices the two logical drives are shown: the 160GB drive (boot and where the datastore was) and a 5TB RDM.

Using 'Search for datastores' completes in seconds with no positive result.

Is there anything that can be done to get the datastore back?

much appreciated

-Steve

21 Replies
Techie01
Hot Shot

Try some of the following and see how this goes. Also, post the outputs.

1. Try to create a new VMFS volume by going into Configuration > Storage > Add Storage (but don't proceed with the VMFS creation). Does the 160GB LUN show up as free VMFS in the list of available devices?

2. Post the output of "partedUtil getptbl /vmfs/devices/disks/naa.xxxx" (replace xxxx with the device's NAA ID)

3. Run the command esxcfg-volume -l and check if your volume shows up as a snapshot

4. Run a rescan command
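
For the rescan, something like this from the CLI should work (adapter names vary per host; vmhba0 is only an example):

# Rescan all adapters for new devices and VMFS volumes
esxcli storage core adapter rescan --all

# Or rescan a single adapter
esxcfg-rescan vmhba0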

Rampage
Contributor

Many thanks

The results are

1) Yes, the 160GB LUN is showing: VMFS Label - Datastore (head)

2) partedUtil getptbl /vmfs/devices/disks/naa.600605b0072f3a40ff00005305266396

gpt

19389 255 63 311492608

1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128

5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

2 7086080 15472639 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

3 15472640 311492574 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

3) esxcfg-volume -l

VMFS UUID/label: 5391393b-f67ade3f-52c0-3ca82aa01f8c/Datastore

Can mount: Yes

Can resignature: Yes

Extent name: naa.600605b0072f3a40ff00005305266396:3     range: 0 - 144383 (MB)

4) Rescan for Datastore results in no change.

Many thanks

-Steve

FritzBrause
Enthusiast

Please have a look into vmkernel.log and check if you find entries like these:

LVM: 8445: Device eui.0017380012020364:1 detected to be a snapshot:
LVM: 8452: queried disk ID: <type 1, len 17, lun 36, devType 0, scsi 0, h(id) 7683208289187576905>
LVM: 8459: on-disk disk ID: <type 1, len 17, lun 17, devType 0, scsi 0, h(id) 7683208289187576905>


That would mean a VMFS volume is seen on the LUN, but the VMFS signature stored on the datastore no longer matches the device's physical path, so it is detected as a snapshot.

If you find the above in the logs, you can force-mount the datastore.


You can also run this command to check for snapshots:

esxcli storage vmfs snapshot list


If the datastore is detected as a snapshot, you can force-mount it with esxcfg-volume -m (mounts until the next reboot) or -M (mounts persistently) - see VMware KB: vSphere handling of LUNs detected as snapshot LUNs.
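
A force-mount by label should look something like this (label taken from the earlier esxcfg-volume -l output):

# List volumes detected as snapshots
esxcfg-volume -l

# Force-mount persistently, keeping the existing signature
esxcfg-volume -M "Datastore"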



Rampage
Contributor

Many thanks:

Have this:

LVM: 10060: Device naa.600605b0072f3a40ff00005305266396:3 detected to be a snapshot:

LVM: 10067:   queried disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 2738791488147918849>

LVM: 10074:   on-disk disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 15764991659952480985>

esxcli storage vmfs snapshot list

5391393b-f67ade3f-52c0-3ca82aa01f8c

   Volume Name: Datastore

   VMFS UUID: 5391393b-f67ade3f-52c0-3ca82aa01f8c

   Can mount: true

   Reason for un-mountability:

   Can resignature: true

   Reason for non-resignaturability:

   Unresolved Extent Count: 1

Rampage
Contributor

The article gives these steps:

vSphere Client

  1. Log in to the vSphere Client and select the server from the inventory panel.
  2. In the Hardware panel of Configuration tab, click Storage.
  3. Click Add Storage.
  4. Select the Disk/LUN storage type.
  5. Click Next.
  6. From the list of LUNs, select the LUN that has a datastore name displayed in the VMFS Label column.

    Note: A name present in the VMFS Label column indicates that the LUN contains a copy of an existing VMFS datastore.


  7. Click Next.
  8. Under Mount Options, these options are displayed:

    • Keep Existing Signature: Persistently mount the LUN (for example, mount LUN across reboots)
    • Assign a New Signature: Resignature the LUN
    • Format the disk: Reformat the LUN

      Notes:
      • The Format the disk option deletes any existing data on the LUN.
      • Before attempting to resignature, ensure that there are no virtual machines running off that VMFS volume on any other host, as those virtual machines will become invalid in the vCenter Server inventory and will need to be registered again on their respective hosts.

  9. Select the desired option for your volume.
  10. In the Ready to Complete page, review the datastore configuration information.
  11. Click Finish.

Should I use Keep Existing Signature or Assign a New Signature?

Thanks again

Techie01
Hot Shot

OK, looks like you have your data intact. Use the Keep Existing Signature option and mount it. Since you don't have any other VMFS volume mounted, there is no question of a UUID clash.
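
If you prefer the CLI over the Add Storage wizard, the equivalent should be something like this (using the label from your esxcfg-volume -l output):

# Mount the snapshot volume, keeping its existing signature
esxcli storage vmfs snapshot mount -l "Datastore"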

Techie01
Hot Shot

BTW, do let us know the final outcome.

Rampage
Contributor

Awesome, worked like a treat 😄

Next problem though... the VM which is linked to the RDM volume fails to power on with:

Cannot open the disk '/vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk' or one of the snapshot disks it depends on.

If I browse the datastore I can see the VMDK linked to the VM - guessing the UID/signature of the volume is different?

This contains 5TB of data so I really don't want to lose it.

cheers again!

Rampage
Contributor

This is the VMDK if it helps:

# Disk DescriptorFile

version=1

encoding="UTF-8"

CID=58a77055

parentCID=ffffffff

isNativeSnapshot="no"

createType="vmfsPassthroughRawDeviceMap"

# Extent description

RW 11718885376 VMFSRDM "RAID5-rdmp.vmdk"

# The Disk Data Base

#DDB

ddb.adapterType = "lsilogic"

ddb.geometry.cylinders = "729466"

ddb.geometry.heads = "255"

ddb.geometry.sectors = "63"

ddb.longContentID = "4867a536487b23028ed2ba3258a77055"

ddb.uuid = "60 00 C2 97 08 14 28 97-52 43 1d 3a c4 65 4d cb"

ddb.virtualHWVersion = "11"

chaithu4u
Enthusiast

Can you please go through the VMware article below:

VMware KB: Datastore missing after rebuilding the RAID disk/LUN

Techie01
Hot Shot

Are you able to see the 5TB LUN through ESXi? Make sure the LUN is visible to the host first.

If you do "ls -l /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk" are you able to see the output?
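
To confirm the LUN itself is presented to the host, you can also list the devices from the CLI, e.g.:

# List all storage devices the host can see
esxcli storage core device list

# Or just look at the raw device nodes
ls -l /vmfs/devices/disks/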

Rampage
Contributor

Thanks...

Yes:

[root@localhost:/vmfs/volumes] ls -l /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk

-rw-------    1 root     root           498 Jun 11  2014 /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk

[root@localhost:/vmfs/volumes]


Techie01
Hot Shot

Not sure if the issue is resolved. If not, try running the command below on that ESXi host and share the output.

"vmkfstools -D /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5-rdmp.vmdk"

Rampage
Contributor

Thank you again... still getting:

Failed to start the virtual machine.

Module Disk power on failed.

Cannot open the disk '/vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk' or one of the snapshot disks it depends on.

19 (No such device)

[root@localhost:/vmfs/volumes] vmkfstools -D /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5-rdmp.vmdk

Lock [type 10c00001 offset 240533504 v 161, hb offset 3690496

gen 159, mode 0, owner 00000000-00000000-0000-000000000000 mtime 654

num 0 gblnum 0 gblgen 0 gblbrk 0]

Addr <4, 570, 56>, gen 141, links 1, type rdm, flags 0, uid 0, gid 0, mode 600

len 6000069312512, nb 0 tbz 0, cow 0, newSinceEpoch 0, zla 0, bs 512

Techie01
Hot Shot

OK. Looks like the problem is related to finding the correct device file.

>> 19 (No such device)


First, check if you are able to see both the descriptor file and the mapping file:


ls -l /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk

ls -l /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5-rdmp.vmdk

If you are able to see output for both, there is a high chance that the VM is holding some sort of stale info.

Since the RDM data is on the LUN, go to Edit Settings on the VM and remove the RDM from the VM's configuration. Make sure you are selecting the RDM VMDK. Do not select the "delete the files" option.

After this, rescan the ESXi host.

Check that the 5TB LUN is still visible.

Add that VMDK again as an RDM to the same VM.
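
It may also be worth checking which raw device the existing mapping file points to before removing it, e.g.:

# Query the raw device backing the RDM pointer file
vmkfstools -q /vmfs/volumes/5391393b-f67ade3f-52c0-3ca82aa01f8c/RAID5_RDM/RAID5.vmdk

If the device it references no longer exists, that would explain the "No such device" error.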


Rampage
Contributor

Cheers... so latest is:

Checked and both files are there.

Removed HDD from the VM

Re-scanned

Re-add: VM > Edit Settings > add HDD > browse the datastore to RAID5_RDM - nothing listed

via

Host > Summary > Browse datastore > RAID5_RDM - the files are there

via

SSH - ls > files are there.

Could it be the UID for the physical drive has changed, and the RDM VMDK reference is pointing to an old UID?

[root@localhost:/dev/disks] ls

naa.600605b0072f3a40ff00005305266396 <- 160GB

naa.600605b0072f3a40ff000054052aaa63 <- 5TB

Would recreating the RDM VMDK work without losing the data, as I am guessing VMware doesn't initialize anything on the physical HDD?

vmkfstools -z /vmfs/devices/disks/naa.600605b0072f3a40ff000054052aaa63 /vmfs/volumes/RAID5_RDM/RAID5RDM.vmdk
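
My understanding is that -z only creates the pointer and -rdmp mapping files on the datastore and doesn't write to the raw LUN itself, so the data should survive. Might be worth confirming the device node is really usable first:

# Verify the 5TB device is present and reports as a disk
esxcli storage core device list -d naa.600605b0072f3a40ff000054052aaa63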

Rampage
Contributor

Have everything back up and no data lost (fingers crossed). This was achieved by trashing and re-installing ESXi 6 and starting from scratch - re-deploying the VMs from OVFs and creating a new RDM VMDK.

Not sure why, but when trying vmkfstools to create the RDM prior to the re-install it errored with "device could not be found or is not a disk", even though the UID was listed under /dev/disks...

Thanks to all

-Steve

TERIAlexK
Contributor

Totally worked! Thank you!!! ... too bad this is not integrated in v6.5 yet.

tom11011
Contributor

Hello, I wanted to add something to this post as an outsider who had the same problem.

In my case, the server was a Dell PowerEdge R720 running ESXi 6.7 U3. It had a PERC H310 controller in it. If you don't know this controller: it has no battery, so it is not capable of 'write back' caching. We decided to upgrade it to the H710P, which has cache.

Anyway, the upgrade went OK; the server booted and ESXi came up without issue. But the datastore was missing.

I stumbled across this post and tried a few of the items without luck. Ended up opening a support case with VMware, and what we found was that the datastore was in fact being seen as a snapshot.

Ran the following:

From the CLI, determined the UUID of Datastore1: esxcfg-volume -l

Mounted it using the command: esxcfg-volume -M <uuid>

After that, the virtual machines and datastore appeared.

Just to be sure it was a permanent change, the server was rebooted and came back up OK.

Hopefully this helps someone.