VMware Cloud Community
arthuravet
Contributor

How to fix disappearing vmfs3 Datastores in ESX 3.5

Yesterday we had a problem where the datastores became invisible to all ESX 3.5 hosts in our farm, even though the LUNs were still visible from every host's HBAs.

I searched the Internet and did not find a fix for my problem. The trick of enabling LVM.EnableResignature did not help, and no errors about snapshots appeared in /var/log/vmkernel.

The solution came from the VMware support team, and I am posting it here for other people who run into this problem, because it includes a small trick that you are unlikely to find on your own.

So this is the magic sequence of commands (x346-10 is one of our ESX 3.5 hosts, and /dev/sdn is one of the disks that disappeared; it appeared not to contain any partitions):

-

# /sbin/fdisk /dev/sdn

The number of cylinders for this disk is set to 109222.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sdn: 898.3 GB, 898388459520 bytes
255 heads, 63 sectors/track, 109222 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-109222, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-109222, default 109222):
Using default value 109222

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fb
Changed system type of partition 1 to fb (Unknown)

Command (m for help): p

Disk /dev/sdn: 898.3 GB, 898388459520 bytes
255 heads, 63 sectors/track, 109222 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdn1             1    109222  877325683+  fb  Unknown

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

# /sbin/fdisk -u /dev/sdn

The number of cylinders for this disk is set to 109222.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): x

Expert command (m for help): b
Partition number (1-4): 1
New beginning of data (63-1754651429, default 63): 128

Expert command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

-


After that, rescan the storage adapters using the VMware Infrastructure Client, and the lost datastore reappears!
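
As a sanity check on the numbers in the fdisk session above, the geometry arithmetic can be reproduced with plain shell arithmetic. This is only a sketch using values copied from the transcript; the variable names are illustrative, and reading sector 128 as a byte offset is simple arithmetic on my part, not something stated by VMware support.

```shell
# Values copied from the fdisk output in this post.
heads=255
sectors_per_track=63
cylinders=109222
sector_bytes=512
start_sector=128   # value entered at "New beginning of data"

# Derived values (plain integer arithmetic).
sectors_per_cylinder=$((heads * sectors_per_track))
total_sectors=$((sectors_per_cylinder * cylinders))
last_sector=$((total_sectors - 1))
start_offset_bytes=$((start_sector * sector_bytes))

echo "sectors per cylinder: $sectors_per_cylinder"
echo "last addressable sector: $last_sector"
echo "partition data starts at byte: $start_offset_bytes"
```

The first value is the 16065 that fdisk prints in its "Units" line, the second is the upper bound fdisk offers in the "New beginning of data (63-1754651429, ...)" prompt, and 128 sectors works out to 65536 bytes, i.e. a 64 KiB offset from the start of the disk.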

Texiwill
Leadership

Hello,

Look for SCSI and related errors in the file /var/log/vmkernel. You may be experiencing LUN failover.
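
One way to do that check from the service console is a case-insensitive grep over the log. The sketch below is only illustrative: the function name find_scsi_messages and the search patterns are assumptions, not an official list of vmkernel message strings, so adjust them to whatever your log actually contains.

```shell
# Print lines from a vmkernel log that mention SCSI or failover.
# The function name and the patterns are illustrative assumptions.
find_scsi_messages() {
    log_file="${1:-/var/log/vmkernel}"   # default to the path named in this thread
    grep -i -E 'scsi|failover' "$log_file"
}
```

For example, `find_scsi_messages /var/log/vmkernel | tail -n 20` would show the twenty most recent matching lines. Note that grep exits with a non-zero status when nothing matches, which here simply means no such messages were found.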


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

SearchVMware Blog: http://itknowledgeexchange.techtarget.com/virtualization-pro/

Blue Gears Blogs - http://www.itworld.com/ and http://www.networkworld.com/community/haletky

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
arthuravet
Contributor

No, there is no problem with failover or multipathing; both were and are working fine. This was just a VMware bug that is now resolved, and the solution is above. The aim of this discussion is to make that solution public.
