VMware Cloud Community
pjapk
Contributor

Main vmfs hosting VMs "disappeared"

I've had an ESX 3.5 lab box up & running quite happily for over a year now on an ML110, with a dedicated boot disk and a set of 4 x 500GB disks in a local RAID-5 array on an E200 controller for VM/template storage.

However, I recently started getting errors on a physical SBS 2003 box (which relies on a DC running as a VM on this host) and discovered that the VMFS volume hosting the VMs had "disappeared". The host still sees the storage device, but when I try to add it as storage (in the hope that it just needs re-adding and will pick up the contents again) it reports the disk as empty - i.e. no partitions on it!

Rebooting the host didn't make a difference, so after searching here I hoped that it had simply lost its partition table and needed re-creating. However, I'm having trouble doing this as it seems to only "half-see" the storage.

Following posts here I've tried the following, but fdisk won't play ball:

# esxcfg-vmhbadevs
vmhba1:0:0 /dev/cciss/c0d0
vmhba1:1:0 /dev/cciss/c0d1

(Sees both volumes - boot disk & RAID5 array)

# fdisk -lu /dev/cciss/c0d0

Disk /dev/cciss/c0d0: 160.0 GB, 160005980160 bytes
255 heads, 63 sectors/track, 19452 cylinders, total 312511680 sectors
Units = sectors of 1 * 512 = 512 bytes

           Device Boot      Start        End     Blocks  Id System
/dev/cciss/c0d0p1   *          63     208844     104391  83 Linux
/dev/cciss/c0d0p2          208845   10442249   5116702+  83 Linux
/dev/cciss/c0d0p3        10442250  307098539  148328145  fb Unknown
/dev/cciss/c0d0p4       307098540  312496379    2698920   f Win95 Ext'd (LBA)
/dev/cciss/c0d0p5       307098603  308207024     554211  82 Linux swap
/dev/cciss/c0d0p6       308207088  312287534   2040223+  83 Linux
/dev/cciss/c0d0p7       312287598  312496379     104391  fc Unknown

(Sees contents of boot disk)

# fdisk -lu /dev/cciss/c0d1
#

(Doesn't see anything on second disk)

# fdisk /dev/cciss/c0d1
Unable to read /dev/cciss/c0d1
#

(Doesn't seem to want to manage disk with fdisk)
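For reference, the usual way to recreate a lost VMFS partition entry from the service console - once fdisk can actually open the device - is a session along these lines. This is a sketch only: it assumes the original VMFS partition spanned the whole disk, and writing a wrong partition table can make recovery harder, so don't run it unless you're confident of the original layout.

# fdisk /dev/cciss/c0d1
n    (new primary partition, number 1, accept the default first/last sectors)
t    (change the partition type)
fb   (fb is the VMFS partition id - the same id as p3 on the boot disk above)
w    (write the table and exit)
# vmkfstools -V    (make ESX re-read the partition tables and rescan for VMFS volumes)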

What makes it even more frustrating is that I'm currently 3,500 miles away with VERY limited access! I can get a remote desktop to home in the evenings, but I can't get physical access for two weeks. I have no means of checking that the RAID5 array is working, but the fact that Virtual Center shows a disk of around 1.4TB available when I go to add storage tells me the array must be functioning.
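One thing that may be checkable remotely: with the standard cciss driver in the ESX 3.x service console, the controller exposes its logical-drive state under /proc, and array errors usually end up in the kernel log (the exact /proc path can vary with driver version):

# cat /proc/driver/cciss/cciss0
# dmesg | grep -i cciss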

Can anyone suggest anything else I can do (remotely) to try to get this volume recognised?

Regards,

Paul

22 Replies
kjb007
Immortal

You can try a resignature, as RParker mentioned above. This is done through the ESX Configuration tab: Advanced Settings > LVM > LVM.EnableResignature, set to 1. Then rescan your disks. If ESX can "see" your VMFS but can't read the metadata and is disabling access to it (evidence of this should be in the logs, though in your case it isn't), then a resignature may fix it.
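If driving the VI Client over the slow link is painful, the same option can (I believe) be toggled from the service console on ESX 3.x - set it, rescan, and turn it back off once the volume reappears:

# esxcfg-advcfg -s 1 /LVM/EnableResignature
# esxcfg-rescan vmhba1
# esxcfg-advcfg -s 0 /LVM/EnableResignature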

-KjB

VMware vExpert | vExpert/VCP/VCAP | vmwise.com / @vmwise
pjapk
Contributor

I had tried this previously to no avail, and have just tried it again since re-creating the partition, but still nothing shows up.

It's probably worth pointing out that when I go to the "Storage Adapters" section it does show the array as "SCSI Target 1" (the boot disk is SCSI Target 0), so, again, it knows of the disk's existence.

kjb007
Immortal

Unfortunately, I am all out of ideas. It does not appear that the disk or the partitioning is at fault, at least not any more. But it does seem that the VMFS metadata was wiped out, or at least relocated, when you were seeing the failures earlier.

-KjB

VMware vExpert | vExpert/VCP/VCAP | vmwise.com / @vmwise