VMware Cloud Community
atepic
Contributor

vSAN Datastore Appears Read Only.

So, I have the same issue here. Is there a solution for this? I can't create anything on the vSAN datastore.

[root@esx03:~] cmmds-tool find -t NODE_DECOM_STATE

owner=5f539fbe-1bb0-3c87-1564-0015170e87d0(Health: Healthy) uuid=5f539fbe-1bb0-3c87-1564-0015170e87d0 type=NODE_DECOM_STATE rev=0 minHostVer=0  [content = (i0 i0 UUID_NULL i0 [ ] i0 i0 i0 "" i0 i0 l0 l0)], errorStr=(null)

owner=5f36a631-7e3c-a3a0-b92b-000af71214c6(Health: Healthy) uuid=5f36a631-7e3c-a3a0-b92b-000af71214c6 type=NODE_DECOM_STATE rev=0 minHostVer=0  [content = (i0 i0 UUID_NULL i0 [ ] i0 i0 i0 "" i0 i0 l0 l0)], errorStr=(null)

owner=5f36b26f-d4a9-3ed3-f559-001018f6a140(Health: Healthy) uuid=5f36b26f-d4a9-3ed3-f559-001018f6a140 type=NODE_DECOM_STATE rev=0 minHostVer=0  [content = (i0 i0 UUID_NULL i0 [ ] i0 i0 i0 "" i0 i0 l0 l0)], errorStr=(null)

[root@esx03:~] esxcli vsan cluster get

Cluster Information

   Enabled: true

   Current Local Time: 2020-09-07T12:40:00Z

   Local Node UUID: 5f36b26f-d4a9-3ed3-f559-001018f6a140

   Local Node Type: NORMAL

   Local Node State: AGENT

   Local Node Health State: HEALTHY

   Sub-Cluster Master UUID: 5f539fbe-1bb0-3c87-1564-0015170e87d0

   Sub-Cluster Backup UUID: 5f36a631-7e3c-a3a0-b92b-000af71214c6

   Sub-Cluster UUID: 52144540-55f2-0a1e-48b5-0b140b5a89b6

   Sub-Cluster Membership Entry Revision: 4

   Sub-Cluster Member Count: 3

   Sub-Cluster Member UUIDs: 5f539fbe-1bb0-3c87-1564-0015170e87d0, 5f36a631-7e3c-a3a0-b92b-000af71214c6, 5f36b26f-d4a9-3ed3-f559-001018f6a140

   Sub-Cluster Member HostNames: 10.home, esx01.lab.local, esx03.lab.local

   Sub-Cluster Membership UUID: 7b26565f-a2e3-c6a1-0cea-0015170e87d0

   Unicast Mode Enabled: true

   Maintenance Mode State: OFF

   Config Generation: 30c4ac33-086f-4164-8bc1-a092fec269b7 10 2020-09-07T12:32:08.412

[root@esx03:~] df -h

Filesystem   Size   Used Available Use% Mounted on

VMFS-6     465.0G   3.4G    461.6G   1% /vmfs/volumes/local spindle 23

VMFS-L       2.8G   1.7G      1.0G  62% /vmfs/volumes/LOCKER-5f36b336-bf95cab0-298c-001018f6a140

vfat       499.7M 192.1M    307.6M  38% /vmfs/volumes/BOOTBANK1

vfat       499.7M 192.1M    307.6M  38% /vmfs/volumes/BOOTBANK2

vsan         2.7T  38.9G      2.7T   1% /vmfs/volumes/vsanDatastore

8 Replies
TheBobkin
Champion

Hello atepic

Welcome to Communities.

"So, I have the same issue here."

Not necessarily - there can be a range of causes for what on the surface looks like the same issue; that doesn't make the cause the same. I would suggest you raise this as its own new question and/or ask a mod to split your question into its own thread, so as to avoid confusion.

In this you should also provide more verbose information about your setup, what you have tried and the relevant error messages (you need to help us help you - we are in the dark when the information provided is insufficient).

Bob

depping
Leadership

Branched as requested

lucasbernadsky
Hot Shot

Hi there! I had a similar problem and this worked for me:

Find the vSAN cluster master host. In your case it would be the one with UUID 5f539fbe-1bb0-3c87-1564-0015170e87d0.

Enter the master host into maintenance mode and reboot it ("Ensure accessibility" would be enough in this case since it will be a quick reboot).

Exit maintenance mode and try to write something to the vSAN datastore.
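If you prefer the CLI to the UI, the steps above can be sketched as below. This is a dry run that only prints the commands rather than executing them (run them on the master host itself; "ensureObjectAccessibility" is the CLI equivalent of the UI's "Ensure accessibility" option - verify flag names on your build):

```shell
# Dry run: print the maintenance-mode reboot cycle for the vSAN master host.
cat <<'EOF'
esxcli system maintenanceMode set -e true -m ensureObjectAccessibility
reboot
# ...after the host is back up:
esxcli system maintenanceMode set -e false
esxcli vsan cluster get
EOF
```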

TheBobkin
Champion

Hello atepic,

So, to summarise what this is not:

- Hosts not in MM/vSAN Decom state.

- Cluster not partitioned (or at least it wasn't when they ran cluster get)

- Not a capacity issue as we see vsanDatastore is of size - OP please confirm ~2.7TB is the expected size for the vsanDatastore e.g. all 3 nodes have Disk-Groups comprised of ~1TB Capacity-tier devices.

What this leaves (barring something I can't think of right now or extreme corner-cases):

- Maybe only one/two nodes have Disk-Groups and they are failing to create default RAID1 Objects (as this requires a minimum of 3 nodes for component placement).

- OP is trying to create RAID5 Objects and doesn't have 4 nodes with storage (if this did have 4 nodes and was failing I would check vSAN licensing is at least Advanced and that it is indeed All-Flash and configured as such).

- There is some issue with VASA/sps/client.

- All nodes have incorrectly been placed in a single configured Fault Domain.

atepic, is the vCenter in use here on the same or higher build than the hosts? e.g. ESXi in use is 6.7 U3 and vCenter is also 6.7 U3

Do all 3 nodes have a Disk-Group configured, healthy and in CMMDS or just one/two?

Have Fault Domains been configured and if so, are all/2 hosts in one FD? If so, take them all out - configuring FDs is pointless in a 3-node cluster as each host is by default a logical FD when not configured.

What Storage Policy (SP) are you trying to create VMs/vmdk with? (e.g. what are its defined rules - NOT just its name)

What SP is assigned as the vsanDatastore default?:

datastores > vsanDatastore > Configure > Default Storage Policy

This SP is the one used during Proactive VM creation test which is step 2 (after getting Health Green) for identifying issues once the cluster has been created:

Cluster > Monitor vSAN > Proactive tests > Proactive VM creation Test > Run

If the above fails it *generally* will indicate why e.g. need 3 FDs, found 1.

Test if you can create an FTT=0 Object via the UI (by creating such an SP) or via the CLI:

# cd /vmfs/volumes/vsanDatastore

# /osfs/bin/osfs-mkdir -n test123
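For reference, the CLI test above can be wrapped into a small dry-run script that prints the full-path commands (the full path to osfs-mkdir and the datastore mount point are as they appear elsewhere in this thread; test123 is an arbitrary name):

```shell
# Dry run: print the vSAN object-creation smoke-test commands.
DS=/vmfs/volumes/vsanDatastore   # assumed datastore mount point (see the df -h output)
cat <<EOF
cd $DS
/usr/lib/vmware/osfs/bin/osfs-mkdir -n test123
ls -la $DS/test123
EOF
```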

Bob

atepic
Contributor

So, let's start:

-Do all 3 nodes have a Disk-Group configured, healthy and in CMMDS or just one/two?

  esx02.lab.local | 2 of 2 | Connected | Healthy | Group 1

    Disk group (52da07bc-ffce-4b84-65d2-67fe52482674) | 2 | Mounted | Healthy | Hybrid | 1 | 1

  esx03.lab.local | 2 of 2 | Connected | Healthy | Group 1

    Disk group (525a528e-fde9-d6f7-2b7c-d7b40d734fe7) | 2 | Mounted | Healthy | Hybrid | 1 | 1

  esx01.lab.local | 2 of 2 | Connected | Healthy | Group 1

    Disk group (52757776-ef3f-1096-aa69-482b7e525239) | 2 | Mounted | Healthy | Hybrid | 1 | 1

-Have Fault Domains been configured and if so, are all/2 hosts in one FD? If so, take them all out - configuring FDs is pointless in a 3-node cluster as each host is by default a logical FD when not configured.

--only one Fault Domain exists

-What Storage Policy (SP) are you trying to create VMs/vmdk with? (e.g. what are its defined rules - NOT just its name)

-- I tried to create a VM with the default storage policy and with a custom-made policy. The result is the same.

-What SP is assigned as the vsanDatastore default?:

-- default Policy is: vSAN Default Storage Policy

datastores > vsanDatastore > Configure > Default Storage Policy

-This SP is the one used during Proactive VM creation test which is step 2 (after getting Health Green) for identifying issues once the cluster has been created:

Cluster > Monitor vSAN > Proactive tests > Proactive VM creation Test > Run

--

esx03.lab.local

error: Operation failed, diagnostics report: vsan-healthcheck-disposable-09-08-1946-34-esx03.lab.local (Cannot Create File)

esx01.lab.local

error: Operation failed, diagnostics report: vsan-healthcheck-disposable-09-08-1946-34-esx01.lab.local (Cannot Create File)

esx02.lab.local

error: Operation failed, diagnostics report: vsan-healthcheck-disposable-09-08-1946-34-esx02.lab.local (Cannot Create File)

If the above fails it *generally* will indicate why e.g. need 3 FDs, found 1.

- Test if you can create an FTT=0 Object via the UI (by creating such an SP) or via the CLI:

# cd /vmfs/volumes/vsanDatastore

# /osfs/bin/osfs-mkdir -n test123

--

[root@esx01:/var/log] /usr/lib/vmware/osfs/bin/osfs-mkdir /vmfs/volumes/vsan:5214454055f20a1e-48b50b140b5a89b6/folderName1

Failed.  Search vmkernel log and osfsd log for opID 'osfsIpc-1599596137.359'.

[root@esx01:/var/log] cat vmkernel.log | grep osfsIpc-1599596137.359

2020-09-08T20:15:37.385Z cpu22:1204902 opID=701b8d2f)World: 12458: VC opID osfsIpc-1599596137.359 maps to vmkernel opID 701b8d2f

[root@esx01:/var/log] cat vmkernel.log | grep 701b8d2f

2020-09-08T20:15:37.385Z cpu22:1204902 opID=701b8d2f)World: 12458: VC opID osfsIpc-1599596137.359 maps to vmkernel opID 701b8d2f

2020-09-08T20:15:37.385Z cpu22:1204902 opID=701b8d2f)LVM: 13107: LVMProbeDevice failed with status "Device does not contain a logical volume".

2020-09-08T20:17:37.409Z cpu22:1204902 opID=701b8d2f)FSS: 2379: Failed to create FS on dev [69e6575f-ea45-5467-22a9-000af71214c6] fs [69e6575f-ea45-5467-22a9-000af71214c6] type [vmfs3] fbSize 1048576 => Timeout

additional info:

All three hosts have a controller which is not certified by VMware:

Host  esx03.lab.local

Device vmhba1: Broadcom PERC H710 Adapter

Current ESXi release ESXi 7.0

Release supported: yellow triangle

Certified ESXi releases ESXi 6.0 U3, ESXi 6.0 U2, ESXi 6.0 U1, ESXi 6.0, ESXi 5.5 U3, ESXi 5.5 U2, ESXi 5.5 U1

BR
Alex
TheBobkin
Champion

Hello Alex

"--only one Fault Domain exists"

None should be configured e.g. the nodes should not be in any Fault Domains.

Do you by any chance have jumbo frames configured on the vSAN-enabled vmk where the physical network doesn't support it?

If so then lower the MTU on all vSAN-enabled vmk to 1500.
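Checking and, if needed, lowering the MTU can be sketched as below - again a dry run that prints the commands, with vmk1 and vSwitch1 as placeholder names for the vSAN-enabled vmkernel port and the vSwitch carrying it:

```shell
VMK=vmk1          # placeholder: the vSAN-enabled vmkernel port
VSWITCH=vSwitch1  # placeholder: the standard vSwitch carrying vSAN traffic
# Dry run: print the commands to inspect current MTUs and drop them to 1500.
cat <<EOF
esxcfg-vmknic -l
esxcli network ip interface set -i $VMK -m 1500
esxcli network vswitch standard set -v $VSWITCH -m 1500
EOF
```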

Regarding H710 controllers - this is okay if this is a Homelab but not okay if it is a Production cluster - supported controllers are not too expensive.

Edit: what's with the weird host02 FQDN?:

Sub-Cluster Member HostNames: 10.home, esx01.lab.local, esx03.lab.local

Bob

atepic
Contributor

Sorry for the wrong information. You are right, there isn't any fault domain configured.

MTU is set to 9000 at the vmkernel port, virtual switch and physical switch level. That is OK - the vSAN MTU check (ping with large packet size) is green.

The host name is different by mistake; I was reinstalling the server and didn't set it correctly afterwards.

When I run the VM Creation Test it starts creating the VM but gets stuck at 6% ("Reserving folder on host"), and the output error is:

Cannot complete file creation operation. Operation failed, diagnostics report: vsan-healthcheck-disposable-09-08-2119-26-esx02.lab.local (Cannot Create File)

In my cluster I have only one warning, and that is "No" in "Is Stats Master":

Host               Is CMMDS Master   Is Stats Master

esx03.lab.local    No                No

esx01.lab.local    No                No

esx02.lab.local    Yes               No


BR.

Alex

TheBobkin
Champion

Hello Alex,

There is likely no Stats Master because there is no healthy stats Object for any would-be Stats Master to own and write to (as the cluster can't create Objects).

What build are the vCenter and hosts on here?

What vSAN license is applied and is it current? (e.g. not expired)

Health checks aside, if you could humour me and test the ping between each pair of nodes, in both directions:

# vmkping -I vmkX -s 8972 -d <DestIP> -c 10

(where vmkX = the vmk configured for vSAN on source node and <DestIP> being the vSAN vmk IP on the destination node - these can be retrieved from #esxcli vsan network list , #esxcli vsan cluster unicastagent list , or #esxcfg-vmknic -l)
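To run that against every peer without retyping, here is a small loop that prints the exact commands (the vmk name and IPs are placeholders - substitute the values retrieved with the commands above):

```shell
VMK=vmk1                                        # placeholder: vSAN vmkernel port on this host
PEERS="172.16.10.11 172.16.10.12 172.16.10.13"  # placeholder: peer vSAN vmk IPs
for ip in $PEERS; do
    # 8972-byte payload + 28 bytes of IP/ICMP headers = 9000-byte frame; -d sets don't-fragment
    echo "vmkping -I $VMK -s 8972 -d $ip -c 10"
done
```

Remove the `echo` (leaving just the vmkping command) to actually run the test on the host.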

If this is clean (e.g. 0% packet loss) and doesn't show anything problematic like high latency, then please try running the proactive VM creation test again, then get vmkernel.log, clomd.log and vobd.log from each host and attach them here or PM them to me to have a look.

Bob
