VMware Cloud Community
Ian2498
Enthusiast

No vCLS VMs After Upgrade to 7.0 U1a

Hi, I've just upgraded my vSphere vCenter Appliances from 6.5 to 7.0 U1a and I am seeing the following error on each cluster node in the environment after the upgrade:

"vSphere DRS functionality was impacted due to unhealthy state vSphere Cluster Services caused by the unavailability of vSphere Cluster Service VMs. vSphere Cluster Service VMs are required to maintain the health of vSphere DRS"

The upgrade was successful and no error messages were shown during the upgrade process.

My environment is in linked mode with 2 vCenter servers.  Clusters on both vCenter servers are affected. 

In VMs and Templates I see an empty folder labelled vCLS.

I have tried creating a new cluster, moving a host to the new cluster and turning on DRS; however, this doesn't generate any vCLS VMs. I can't see any vCLS files in any of the existing local or iSCSI datastores.

Has anyone else come across this issue?

Thanks in advance!

depping
Leadership

What you can try is enabling retreat mode and then disabling it again on your cluster. This should trigger the provisioning of the vCLS VMs. I created a demo of how to do this and shared it here: http://www.yellow-bricks.com/2020/10/27/demo-time-how-to-delete-the-vcls-vms/

I also posted more details on vCLS here: http://www.yellow-bricks.com/2020/10/09/vmware-vsphere-clustering-services-vcls-considerations-quest... 

Do note that vCLS VMs will be provisioned on any of the available datastores when the cluster is formed, or when vCenter detects that the VMs are missing. The VMs are not visible in the "Hosts and Clusters" view, but should be visible in the "VMs and Templates" view of vCenter Server.
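
For reference, retreat mode is toggled through a vCenter Server advanced setting keyed on the cluster's managed object ID. Below is a minimal Python sketch of how the key and value fit together, assuming the `config.vcls.clusters.<cluster-id>.enabled` naming and a cluster ID of the form `domain-c<number>`. The helper name `retreat_mode_setting` is made up for illustration, and the code only builds the setting; it does not talk to vCenter.

```python
import re
from typing import Tuple

# Assumption: the advanced setting is named after the cluster's managed
# object ID, for example "domain-c8".
KEY_TEMPLATE = "config.vcls.clusters.{cluster_id}.enabled"


def retreat_mode_setting(cluster_id: str, retreat: bool) -> Tuple[str, str]:
    """Return the (key, value) pair that enables or disables retreat mode."""
    if not re.fullmatch(r"domain-c\d+", cluster_id):
        raise ValueError(f"unexpected cluster ID: {cluster_id!r}")
    # enabled=false puts the cluster into retreat mode (vCLS VMs are removed);
    # enabled=true takes it out again (vCLS VMs should be redeployed).
    value = "false" if retreat else "true"
    return KEY_TEMPLATE.format(cluster_id=cluster_id), value
```

For a cluster with the example ID `domain-c8`, enabling retreat mode means setting `config.vcls.clusters.domain-c8.enabled` to `false`, and setting it back to `true` is what should trigger the redeployment.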

Ian2498
Enthusiast

Thanks for the info Duncan.  I will raise a change request & try enabling & disabling retreat mode.

depping
Leadership

Do note that DRS temporarily does not work while you disable and re-enable it.

Ian2498
Enthusiast

Hi Duncan, I have tested retreat mode in my test lab and this worked exactly as described. Unfortunately, when enabling and disabling retreat mode in the customer environment, the vCLS VMs are not generated when changing the value to true. I don't see any activity and the vCLS folder under VMs and Templates remains empty. Nothing is generated in recent tasks when enabling or disabling retreat mode.

The only error I can see is on the cluster, under [Cluster Name] > Monitor > Events: ‘Cluster Agent VM is missing on cluster [Cluster Name] (vCLS)’. This is repeated every 30 seconds.

Looking at other community posts, it seems I'm experiencing a very similar issue to the cluster issue reported by Marcin4. The main difference is that my affected clusters are not vSAN clusters.

Just to confirm, this is a production environment and all of the hardware is on the HCL. The customer is using Gen 14 Dell servers (R640s) and I checked that all I/O devices are supported before upgrading to 7.0 U1a.

I have raised a support request and will send updated log info.

depping
Leadership

Can you provide me with the SR? I would like to keep an eye on it just to figure out what is happening and potentially write a post about it to help other customers.

depping
Leadership

Are the hosts also running 7.0 U1? Or are they still at 6.5?

Ian2498
Enthusiast

Sure, the SR is 20171581711 & all hosts are still at 6.5.

SrVMoussa
VMware Employee

Resetting STS 😉 

Regards,
Khalid Moussa
fojtp
Contributor

Same problem here, and STS is valid.

Ian2498
Enthusiast

Running the fixsts.sh script by Luciano Delorenzi resolved an STS issue in my environment which was the root cause of the vCLS issue.
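
For anyone who wants to sanity-check a certificate's validity before or after a fix like this, here is a rough Python sketch that works out how many days remain before a "Not After" timestamp. It assumes you already have that timestamp in the OpenSSL text format (for example from a certificate dump). How you extract the STS certificate from vCenter is outside the sketch, and `days_until_expiry` is just an illustrative name, not part of fixsts.sh.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's "Not After" time (negative if expired).

    `not_after` uses the OpenSSL text format, e.g. "Oct 27 12:00:00 2020 GMT".
    """
    # ssl.cert_time_to_seconds converts that format to seconds since the epoch.
    expiry = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expiry - now) / 86400.0
```

A negative result means the certificate has already expired.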

fojtp
Contributor

In our case, after running fixsts, it was also necessary to disable and re-enable the creation of vCLS VMs, as described in https://kb.vmware.com/s/article/80472

SrVMoussa
VMware Employee

Great! Thanks for sharing this 

Regards,
Khalid Moussa
kastrosmartis
Contributor

Our vCLS VMs were marked as regular VMs in the GUI. I hard-stopped them one by one and deleted each from disk. vCenter redeployed new vCLS VMs and that was it. The new vCLS VMs also had the correct icon (managed by....).

Hope it helps somebody else too.