VMware Cloud Community
vRon
Contributor

Skyline Health: HA fails when same NFS share is mounted multiple times using different IP address

vCenter/Skyline Health raises this alarm for all hosts.

* It points to "https://kb.vmware.com/s/article/81785", which is useless:

  * none of the messages mentioned there can be found in any host's vmkwarning.log file.

 

Unfortunately, Skyline Health doesn't provide any hint:

* about which datastore is the root cause of the alarm.

 

 

I removed all but one datastore from the Services/vSphere-DRS Heartbeat Datastores

=> the alarm still persists

 

Any hint?

 

Current vCenter 7; ESXi 6.7 and 7.0 hosts are affected.

 

3 Replies
depping
Leadership

What is the exact error you see?

vRon
Contributor

It's in Skyline Health:

 

[Three screenshots attached: vRon_0-1673261310238.png, vRon_1-1673261359617.png, vRon_2-1673261380821.png]

 

 

"Ask VMware" leads to the IMHO useless https://kb.vmware.com/s/article/81785

 

Each ESXi host has several NFS datastores mounted, but they are absolutely unique; there shouldn't be any overlap between their filesystems...

 

vRon
Contributor

Double-checked: there _is_ no overlap between our NFS shares.

I figured out that two groups of our shares trigger the alarm:

  • group 1 - 3x independent shares named "...something...esx01"
    • mount any 2 of those 3 and Skyline Health complains about possible HA issues
  • group 2 - 2x independent shares named "...something...esx02"
    • mount both and Skyline Health complains

 

=> renaming the datastores didn't solve the issue [good!]

 

Looking deeper into the shares:

[root@esxsrv:/var/log] esxcli storage nfs list
Volume Name                    Host          Share          Accessible  Mounted  Read-Only   isPE  Hardware Acceleration
-----------------------------  ------------  -------------  ----------  -------  ---------  -----  ---------------------
mc1nfs21_esx02                 172.16.0.21   /esx02               true     true      false  false  Not Supported
mc1nfs21_esx01                 172.16.0.21   /esx01               true     true      false  false  Not Supported
mc1nfs11_esx02                 172.16.0.11   /esx02               true     true      false  false  Not Supported
mc1nfs11_esx01                 172.16.0.11   /esx01               true     true      false  false  Not Supported
fas1nfs01_Software_Repository  172.16.0.9    /esx_softrepo        true     true      false  false  Not Supported
fas1nfs01_esx01                172.16.0.1    /esx01               true     true      false  false  Not Supported

 

So my conclusion is:

  • Skyline Health does not perform the thorough check that the KB suggests, one that would detect real conflicts.

It just looks at the "Share" name:

  • here it finds 3 mounts with the same share name "/esx01", each on a different NFS server
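
For anyone who wants to see which mounts would trip such a check, here is a minimal Python sketch. It assumes the check really keys on the share path alone, which is only my conclusion from the behaviour above and not something VMware has confirmed. The function names `parse_nfs_list` and `shares_on_multiple_hosts` are made up for illustration; the script only parses the plain-text output of `esxcli storage nfs list` pasted into it.

```python
import re
from collections import defaultdict

# Output of `esxcli storage nfs list`, pasted as plain text.
SAMPLE = """\
Volume Name                    Host          Share          Accessible  Mounted  Read-Only   isPE  Hardware Acceleration
-----------------------------  ------------  -------------  ----------  -------  ---------  -----  ---------------------
mc1nfs21_esx02                 172.16.0.21  /esx02               true     true      false  false  Not Supported
mc1nfs21_esx01                 172.16.0.21  /esx01               true     true      false  false  Not Supported
mc1nfs11_esx02                 172.16.0.11  /esx02               true     true      false  false  Not Supported
mc1nfs11_esx01                 172.16.0.11  /esx01               true     true      false  false  Not Supported
fas1nfs01_Software_Repository  172.16.0.9   /esx_softrepo        true     true      false  false  Not Supported
fas1nfs01_esx01                172.16.0.1   /esx01               true     true      false  false  Not Supported
"""


def parse_nfs_list(text):
    """Return one dict per mount with its volume name, NFS host and share path."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the header row and the dashed separator row.
        if not line or line.startswith("Volume Name") or line.startswith("---"):
            continue
        # Columns are separated by at least two spaces in this output.
        fields = re.split(r"\s{2,}", line)
        if len(fields) < 3:
            continue
        rows.append({"volume": fields[0], "host": fields[1], "share": fields[2]})
    return rows


def shares_on_multiple_hosts(rows):
    """Group mounts by share path; keep paths that appear on more than one host."""
    by_share = defaultdict(list)
    for row in rows:
        by_share[row["share"]].append(row)
    return {
        share: mounts
        for share, mounts in by_share.items()
        if len({m["host"] for m in mounts}) > 1
    }


if __name__ == "__main__":
    for share, mounts in sorted(shares_on_multiple_hosts(parse_nfs_list(SAMPLE)).items()):
        print(f"{share}: {len(mounts)} mounts")
        for m in mounts:
            print(f"    {m['volume']} on {m['host']}")
```

For the listing above this reports "/esx01" with three mounts and "/esx02" with two, which matches the two groups that trigger the alarm here; "/esx_softrepo" is not reported.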

 

In its embarrassing simplicity, that is absolutely useless here => IMHO more a bug than a feature.

 

Waste of time.
