frankdenneman
Expert

Unable to upgrade replicated LUN to vmfs3 filesystem

Hi folks,

I've got a problem with upgrading a datastore from vmfs2 to vmfs3.

Here's the situation.

Two virtual machines are part of an MSCS cluster.

The two virtual machines are hosted on an ESX 2.5.2 (build 21059) host.

The cluster uses three shared disks, which are attached to an LSI Logic SCSI controller using the Physical SCSI bus sharing policy.

The disks use SCSI IDs 1.0, 1.1, and 1.2.

The three shared VMDKs are placed on a Vdisk on an EVA 5000.

The Vdisk is replicated via Continuous Access (CA) to an EVA 8000.

I've cloned the two virtual machines; before cloning I removed the shared VMDKs, leaving only the system disk to be cloned.

Both cloned machines have been migrated to an ESX 3.0.1 (build 42829) host.

The shared VMDKs were left untouched.

I've broken the data replication group, creating two separate disks: one on the EVA 5000, in use by the original virtual machines.

The other is presented to the ESX 3.0.1 server.

When I try to upgrade the VMFS datastore from VMFS 2.11 to VMFS 3.21, I get an error:

Error during the configuration of the host: Unable to Upgrade Filesystem: Read-only file system.

Before presenting the disk I changed the Write Protect option from Yes to No, changing the host access to Read/Write.

Exactly the same settings as every other presented disk.
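
For completeness, a quick way to double-check the presentation from the ESX 3.0.1 side could look like the sketch below; it assumes the standard ESX 3.x esxcfg tools, and vmhba1 is only a placeholder taken from the log excerpts further down:

# Rescan the HBA so the re-presented (now read/write) Vdisk is picked up
esxcfg-rescan vmhba1
# List all paths and LUNs the host currently sees, to confirm the LUN shows up as expected
esxcfg-mpath -l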

When checking the /var/log/vmkwarning log I see the following errors:

Aug 29 13:51:20 server vmkernel: 33:23:30:06.555 cpu0:1024)VMNIX: WARNING: VmkDev: 971: failed dsk=vmhba1:2:202:0 status=0xbad0004.

Aug 29 13:56:37 server vmkernel: 33:23:35:25.080 cpu1:1037)WARNING: FSA: 247: [436756b3-11a1ee88-fa0e-00110a589485] Block evacuation failed: Read only

I know that error code status=0xbad0004 means device busy, but why is it busy? It isn't presented to any other ESX host, and there is no CA replication going on.

I cannot find any threads on "Block evacuation failed".

/var/log/vmkernel.log states:

Aug 29 14:09:57 server vmkernel: 0:00:07:27.914 cpu2:1037)Mod: 501: mainHeap avail after: 12694272

Aug 29 14:09:57 server vmkernel: 0:00:07:27.914 cpu2:1037)Mod: 509: no private ID set

Aug 29 14:09:57 server vmkernel: 0:00:07:28.208 cpu2:1037)FS2: 8162: ** FS2Check summary **

Aug 29 14:09:57 server vmkernel: 0:00:07:28.208 cpu2:1037)FS2: 8163: Return status: Success

Aug 29 14:09:57 server vmkernel: 0:00:07:28.208 cpu2:1037)FS2: 8164: Total number of errors detected: 0

Aug 29 14:09:57 server vmkernel: 0:00:07:28.272 cpu2:1037)WARNING: FSA: 247: [436756b3-11a1ee88-fa0e-00110a589485] Block evacuation failed: Read only

Aug 29 14:10:06 server vmkernel: 0:00:07:37.328 cpu2:1037)FSS: 301: Unregistering FS driver vmfs3 (moduleID 14)

Aug 29 14:10:06 server vmkernel: 0:00:07:37.418 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.424 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.434 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.440 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.449 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.460 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.473 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.574 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.590 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:06 server vmkernel: 0:00:07:37.651 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.723 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.746 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.796 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.806 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.815 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.837 cpu2:1037)FSS: 2319: Status bad005f getting reference on driver vmfs2

Aug 29 14:10:07 server vmkernel: 0:00:07:37.837 cpu2:1037)FSS: 301: Unregistering FS driver vmfs2 (moduleID 14)

Aug 29 14:10:07 server vmkernel: 0:00:07:38.186 cpu2:1037)FDS: 156: dummy

Aug 29 14:10:17 server vmkernel: 0:00:07:48.186 cpu6:1063)World: vm 1063: 3867: Killing self with status=0x0:Success

Aug 29 14:10:17 server vmkernel: 0:00:07:48.245 cpu0:1024)Loading module vmfs2 ...

Aug 29 14:10:17 server vmkernel: 0:00:07:48.245 cpu0:1024)Mod: 217: Starting load for module: vmfs2 R/O length: 0x11000 R/W length: 0x11000

Aug 29 14:10:18 server vmkernel: 0:00:07:48.839 cpu2:1037)Mod: 430: Module vmfs2: initFunc: 0x8f8950 text: 0x8ea000 data: 0x1fd1dd0 bss: 0x1fd1fd0

Aug 29 14:10:18 server vmkernel: 0:00:07:48.839 cpu2:1037)Mod: 446: mainHeap avail before: 12709832

Aug 29 14:10:18 server vmkernel: 0:00:07:48.839 cpu2:1037)FSS: 265: Registered fs vmfs2, module 15, fsTypeNum 0xf520

Aug 29 14:10:19 server vmkernel: 0:00:07:50.014 cpu2:1037)Mod: 471: Initialization for vmfs2 succeeded.

Aug 29 14:10:19 server vmkernel: 0:00:07:50.014 cpu2:1037)Module loaded successfully.

Aug 29 14:10:19 server vmkernel:

Aug 29 14:10:19 server vmkernel: 0:00:07:50.014 cpu2:1037)Mod: 501: mainHeap avail after: 12707920

Aug 29 14:10:19 server vmkernel: 0:00:07:50.014 cpu2:1037)Mod: 509: no private ID set

Aug 29 14:10:19 server vmkernel: 0:00:07:50.072 cpu0:1024)Loading module vmfs3 ...

Aug 29 14:10:19 server vmkernel: 0:00:07:50.072 cpu0:1024)Mod: 217: Starting load for module: vmfs3 R/O length: 0x23000 R/W length: 0x1000

Aug 29 14:10:20 server vmkernel: 0:00:07:51.327 cpu2:1037)Mod: 430: Module vmfs3: initFunc: 0x92b534 text: 0x90d000 data: 0x1fce740 bss: 0x1fceac0

Aug 29 14:10:20 server vmkernel: 0:00:07:51.327 cpu2:1037)Mod: 446: mainHeap avail before: 12704680

Aug 29 14:10:20 server vmkernel: 0:00:07:51.327 cpu2:1037)World: vm 1064: 693: Starting world FS3ResMgr with flags 1

Aug 29 14:10:20 server vmkernel: 0:00:07:51.328 cpu2:1037)FSS: 265: Registered fs vmfs3, module 16, fsTypeNum 0xf530

Aug 29 14:10:23 server vmkernel: 0:00:07:54.307 cpu2:1037)Mod: 471: Initialization for vmfs3 succeeded.

Aug 29 14:10:23 server vmkernel: 0:00:07:54.307 cpu2:1037)Module loaded successfully.

Aug 29 14:10:23 server vmkernel:

Aug 29 14:10:23 server vmkernel: 0:00:07:54.307 cpu2:1037)Mod: 501: mainHeap avail after: 12694192

Aug 29 14:10:23 server vmkernel: 0:00:07:54.307 cpu2:1037)Mod: 509: no private ID set

I checked whether the vmfs2 module was loaded with vmkload_mod -l, and it shows:

vmfs2 0x8ea000 0x11000 0x1fd1dd0 0x11000 15 Yes

Just to be sure I rebooted the system, but no luck. The same error occurs and the datastore isn't upgraded to the VMFS 3 filesystem.

I can, however, browse the datastore, and it shows the three VMDKs.

Do I need to resignature first, because the HSVs are different and thus a new UUID is created?

I don't think a resignature is needed; otherwise I could not browse the datastore.
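
Should a resignature turn out to be necessary after all, the ESX 3.x LVM resignaturing behaviour is controlled by an advanced setting that can be toggled from the service console; the sketch below assumes the standard /LVM/EnableResignature option and should be double-checked against the SAN configuration guide before use:

# Show the current value of the LVM resignature option (0 = disabled)
esxcfg-advcfg -g /LVM/EnableResignature
# Temporarily enable resignaturing, rescan the HBA, then disable it again
esxcfg-advcfg -s 1 /LVM/EnableResignature
esxcfg-rescan vmhba1
esxcfg-advcfg -s 0 /LVM/EnableResignature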

Did I forget anything? What am I missing here?

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
7 Replies
Texiwill
Leadership

Hello Frank,

VMFS-3 does not support the shared disk access mode, so upgrading that type of VMFS-2 does not work. MSCS clustering has changed drastically within VI3. Check out http://www.vmware.com/pdf/vi3_301_201_mscs.pdf for help on this. The basics are that your shared disks should now be VMDKs, and your C: drives need to reside on local SCSI storage if you wish to support more than two nodes.

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
frankdenneman
Expert

Hi Edward,

Thanks for replying. None of the virtual machines are accessing this datastore, which isn't possible anyway given the read-only nature of the vmfs2 module. I thought shared disk access mode was an option within the virtual machine?

I'm merely trying to upgrade the VMFS file system from VMFS 2.11 to VMFS 3.21.

I forgot to mention that the block size is 1 MB, so that rules out the problem with the 16 MB block size datastores often found in 2.x environments.

-----

Correction: you are right, the VMFS datastore is in shared (read-only) access mode. I used vmkfstools -P to check.
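
For reference, the vmkfstools -P check looks roughly like this on the 3.0.1 host (the volume path is a placeholder):

# Query the file system attributes of the replicated volume; for this VMFS-2
# volume this is where the shared (read-only) access mode shows up
vmkfstools -P /vmfs/volumes/<replicated_datastore_label>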

Message was edited by: Frank_D

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
Texiwill
Leadership

Hello,

There is the bus sharing of the vSCSI device, which is set within the VM, but also the shared access mode of the VMFS itself, which is set when you add the VMFS via the MUI.

If there are NO VMDKs on this VMFS, then I would verify the mode within the MUI prior to the upgrade and change it from shared to public mode. If there are MSCS VMDKs on the VMFS, then the VMs should be turned off, and your MSCS cluster will remain broken until it is fixed using RDMs (raw LUNs).

In ESX v2 there are lots of definitions for 'shared'.
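
A rough sketch of what that shared-to-public change could look like from the service console, assuming the ESX 2.x vmkfstools still accepts a -F/--config option and with the vmhba path and volume label as placeholders (both assumptions should be verified against the ESX 2.5 documentation):

# On the ESX 2.5.x host: switch the VMFS-2 volume from shared to public access mode
vmkfstools -F public vmhbaC:T:L:1
# Then rescan on the ESX 3.0.1 host and re-check the mode with vmkfstools -P
esxcfg-rescan vmhba1
vmkfstools -P /vmfs/volumes/<datastore_label>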

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
frankdenneman
Expert

Hi Edward,

I just made a correction to my previous post.

The datastore is in shared mode. The problem is that I cannot alter the original disk because the MS cluster is still running.

I can try to present the replicated VMFS to a 2.5 system, alter its access mode, and then try to upgrade the datastore.

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
Texiwill
Leadership

Hello,

Due to your MSCS you will not be able to upgrade the VMFS. You first need to recreate the MSCS cluster within ESX v3. That will require shutting down the cluster while you convert the VMDKs to RDMs and then migrate the C: drives to VI3.

Even if you can convert the VMFS using replication, you still have the problem that MSCS will not run in the mode you are using on VI3.
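
As an illustration of that direction only (device paths and file names below are placeholders, not values from this environment), creating the RDM mapping files on the ESX 3.0.1 host would look roughly like this:

# Pass-through (physical compatibility) RDM for a shared cluster disk
vmkfstools -z /vmfs/devices/disks/vmhbaC:T:L:0 /vmfs/volumes/<vmfs3_datastore>/<node1>/quorum_rdm.vmdk
# Non-pass-through (virtual compatibility) RDM, if snapshot support on the mapping is wanted
vmkfstools -r /vmfs/devices/disks/vmhbaC:T:L:0 /vmfs/volumes/<vmfs3_datastore>/<node1>/data_rdm.vmdk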

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill
frankdenneman
Expert

Hi

What I do not understand is why I have to use RDMs. We are talking about a cluster-in-a-box environment. When reviewing the "Setup for Microsoft Cluster Service" document by VMware, it lists virtual disks as supported shared storage for CiB. Pass-through RDMs and non-pass-through RDMs are only necessary when you want to implement CAB or N+1 clustering.

On page 24 the document shows how to create VMDKs for a cluster-in-a-box scenario, followed by configuring the controller for the Virtual SCSI bus sharing mode.
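
For illustration, a node's .vmx file for such a cluster-in-a-box setup would contain entries along the lines of the hypothetical fragment below (controller number, datastore path, and disk name are placeholders, not taken from this environment):

scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"
scsi1:0.present = "TRUE"
scsi1:0.fileName = "/vmfs/volumes/<shared_datastore>/quorum.vmdk"
scsi1:0.mode = "independent-persistent"

The sharedBus value "virtual" is what the document describes for cluster in a box; "physical" is the value used for cluster across boxes, which is the policy used in the original ESX 2.5 setup above.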

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
Texiwill
Leadership

Hello,

Yes, that is there, but you must remember that on ESX v2, CiB required you to put the VMFS into shared access mode. That mode NO LONGER exists in ESX v3, so your method of doing CiB is not valid for ESX v3. Steps you should take:

1. Create a local SCSI VMFS-3, if one does not exist, to hold the C: (boot) disks for all nodes.
2. Back up the cluster.
3. Shut down the cluster.
4. Move the boot disks to the local SCSI VMFS-3.
5. Change the VMFS from shared access mode to public.
6. Upgrade VMFS-2 to VMFS-3 (see the command sketch after this list).
7. Reconfigure the cluster per the documentation (starting at page 19).
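
A rough sketch of step 6 from the ESX 3.0.1 service console; the -T/--tovmfs3 option and the volume path are assumptions from memory, so verify them against the ESX 3 documentation before use (step 5 is sketched in an earlier reply):

# On the ESX 3.0.1 host: rescan the storage, then upgrade the VMFS-2 volume in place to VMFS-3
esxcfg-rescan vmhba1
vmkfstools -T /vmfs/volumes/<datastore_label>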

However, to be safe I would use RDMs; that way, if you ever use a VMware cluster, you can have nodes on multiple hosts and more than two nodes per cluster.

Best regards,

Edward

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill