In my lab, I have a 2 node vSAN witness cluster (ESXi and VCSA 6.7 Update 2) using all HDDs.
The witness appliance was updated to match the same ESXi version.
I am creating 3 disk groups (1 cache, 7 capacity) per node.
The system creates all disk groups but with errors. One disk group is healthy but the other two are not.
/var/log/vsansystem.log shows that the system failed to create a Virsto file system on a drive, stating that the 'operation was not permitted'.
Running 'esxcli storage core device smart get -d naa.##############' shows no errors on the drive. The drive also shows the same partition table as all the other drives.
What would cause the system not to have permissions to write a file system on a drive? And, how do I resolve this?
The current result is that the disk groups get created (with errors) and only 1/3 of my datastore size is available for use. This is not very useful if I want to implement it in a production environment.
Your help is greatly appreciated.
This issue can be closed. I was able to resolve this on my own before any posts were made.
Per KB-65146, the underlying cause was the use of a large cache-tier drive (10TB) when creating the vSAN disk group.
To resolve the issue, I ran 'esxcfg-advcfg -s 2047 /LSOM/heapSize' on each node and rebooted them.
Afterward, I was able to create the vSAN disk groups with no errors.
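For anyone following along, here is a minimal sketch of that change as a small script. The option path and value are the ones quoted above; the RUN variable and the set_lsom_heap function name are only illustrative. By default the script prints the commands rather than running them, so it is safe to try anywhere.

```shell
#!/bin/sh
# Sketch of the heap-size change described above.
# RUN defaults to "echo" (dry run: the commands are only printed).
# Invoke with RUN= (empty) in the ESXi shell of each node to apply it.
RUN=${RUN-echo}
OPTION=/LSOM/heapSize   # advanced option named in the post
NEW_VALUE=2047          # value used in the post

set_lsom_heap() {
    $RUN esxcfg-advcfg -g "$OPTION"               # show the current value
    $RUN esxcfg-advcfg -s "$NEW_VALUE" "$OPTION"  # set the new value
    $RUN esxcfg-advcfg -g "$OPTION"               # confirm the change
}

set_lsom_heap
```

The node still has to be rebooted afterwards for the new value to take effect, as noted above.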
Hello rayquaza
"I have a 2 node vSAN witness cluster "
Do you mean a 2-node Stretched/DirectConnect cluster + a Witness?
"using all HDDs."
Not possible unless you went far into unsupported territory and marked some HDDs as flash devices. Do you mean it is a Hybrid configuration (e.g. SSD/NVMe as Cache-tier and HDDs as Capacity-tier)?
"What would cause the system not to have permissions to write a file system on a drive? And, how do I resolve this?"
The devices could have become read-only or, if this is a nested/micro setup, the hosts may have had insufficient memory to create/mount the Disk-Groups.
"The current result is that the disk groups get created (with errors) and only 1/3 of my datastore size is available for use. "
Please elaborate on the configuration of this cluster e.g. is this a nested lab or is it all metal servers with all relevant components (SSDs, HDDs and controllers) on the vSAN HCL?
"This is not very useful if I want to implement it in a production environment."
If this is a production environment then please open a case with GSS so that we can properly investigate the source of the issue.
If it is a homelab then please provide a lot more information and do the necessary due diligence of narrowing down the problem area (e.g. are the disk groups created (check with #vdq -Hi), are the disks in CMMDS, do the Disk-Groups get mounted at boot or are they failing with LSOM out of memory, what is vmkernel.log saying? etc.).
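A rough sketch of those checks as a script, to be run in the ESXi shell of each node. The RUN variable and the check_disk_groups function name are only illustrative, and searching vmkernel.log for "lsom" is an assumed starting point rather than an exact error string. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the first-pass checks listed above.
# RUN defaults to "echo" (dry run); invoke with RUN= (empty) on an ESXi host to execute.
RUN=${RUN-echo}

check_disk_groups() {
    $RUN vdq -Hi                              # disk-group to device mappings
    $RUN vdq -qH                              # per-device vSAN eligibility and state
    $RUN esxcli vsan storage list             # devices currently claimed by vSAN
    $RUN cmmds-tool find -t DISK -f json      # disk entries published in CMMDS
    $RUN grep -i lsom /var/log/vmkernel.log   # LSOM messages (search term is an assumption)
}

check_disk_groups
```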
Bob