We have a Horizon VDI infrastructure running on vSAN, with 150 VMs using NVIDIA GRID graphics and 4 desktop pools. 1 desktop pool has no NVIDIA profile configured, and the other 3 use GRID M10-1Q as the vGPU profile.
I'm noticing that even though all of the pools were created in Horizon 7.7, which was described as supporting DRS balancing, there are still times when the VMs are distributed unevenly enough to cause memory and CPU warnings. It's pretty common to have 3 hosts running at about 20% CPU/memory and 1 host running at 60-80% (screenshot attached). The VM counts are very unbalanced too: some hosts will have 50-60 VMs and others only 15. These numbers vary; they're just rough figures to illustrate the normal behavior.
My question is: are there any guides on how to configure DRS correctly for Horizon? I've never seen an automatic DRS vMotion happen, even when these warnings occur, so I don't know whether I have it configured correctly. I can live migrate these VMs manually even with the vGPU profiles, so I know it's possible for vCenter to perform these moves. I'm wondering whether my DRS settings are wrong, and whether there is any documentation of best practices for the configuration.
In case anyone else has this question: I was informed by the VMware vSphere/vCenter team that live migration of vGPU VMs is supported, but automatic DRS migration of them is not. DRS currently generates the recommendations for the needed migrations, but will not apply them automatically.
They have to be applied manually by clicking "Apply Recommendations".
I'm amazed at this answer since the functionality is obviously built in, but not supported. I'm looking into writing a script to apply these recommendations when they exist as a workaround.
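For anyone who wants to try the same workaround, here is a minimal sketch of what such a script could look like in Python. It assumes pyVmomi is installed and relies on the vSphere API's cluster `recommendation` property and its `RefreshRecommendation` / `ApplyRecommendation` methods. The function names, the `dry_run` flag, and the connection parameters are my own placeholders, and I have not verified this against a vGPU cluster, so treat it as a starting point rather than a finished tool.

```python
def apply_pending_recommendations(cluster, dry_run=False):
    """Apply every pending DRS recommendation on `cluster`.

    `cluster` is expected to behave like a vim.ClusterComputeResource.
    Returns the list of recommendation keys that were (or would be) applied.
    """
    # Ask DRS to recompute before reading the recommendation list.
    cluster.RefreshRecommendation()
    handled = []
    for rec in list(cluster.recommendation or []):
        action = "Would apply" if dry_run else "Applying"
        print(f"{action} recommendation {rec.key}: {rec.reason}")
        if not dry_run:
            cluster.ApplyRecommendation(rec.key)
        handled.append(rec.key)
    return handled


def find_cluster(content, name):
    """Look up a cluster by name (hypothetical helper; returns None if absent)."""
    from pyVmomi import vim  # imported lazily so the logic above has no dependency

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True
    )
    try:
        return next((c for c in view.view if c.name == name), None)
    finally:
        view.Destroy()


def main(host, user, password, cluster_name):
    """Connect to vCenter and apply pending recommendations on one cluster."""
    from pyVim.connect import SmartConnect, Disconnect

    si = SmartConnect(host=host, user=user, pwd=password)
    try:
        cluster = find_cluster(si.RetrieveContent(), cluster_name)
        if cluster is None:
            raise SystemExit(f"Cluster {cluster_name!r} not found")
        apply_pending_recommendations(cluster)
    finally:
        Disconnect(si)
```

Running it with `dry_run=True` first only prints what DRS is recommending, which is a reasonable check before letting a scheduled job apply anything.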
I see the same behavior in our environment.
Live migration of vGPU VMs works, but DRS does not move them. So if I put one host in the cluster into maintenance mode, all VMs except the vGPU VMs are automatically migrated to other hosts.
I have to select all the vGPU VMs and move them manually to other hosts in the same cluster on the vSAN datastore. The wizard shows a compatibility warning, but the migration completes without issues.
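To make the manual step a bit less tedious, something like the following Python sketch could list the vGPU VMs still sitting on a host before it goes into maintenance mode. It assumes the host and VM objects come from pyVmomi (a `vim.HostSystem` and its `vm` list) and that a vGPU shows up as a PCI passthrough device whose backing carries a `vgpu` profile name. The function names are hypothetical, and the actual move is still left to the migration wizard.

```python
def vgpu_profile(vm):
    """Return the vGPU profile name of `vm`, or None if it has no vGPU device."""
    config = getattr(vm, "config", None)
    if config is None:  # config can be missing, e.g. for inaccessible VMs
        return None
    for dev in config.hardware.device:
        # Only vGPU-backed passthrough devices expose a `vgpu` attribute.
        profile = getattr(getattr(dev, "backing", None), "vgpu", None)
        if profile:
            return profile
    return None


def vgpu_vms_on_host(host):
    """Return sorted (vm_name, profile) pairs for the vGPU VMs on `host`."""
    found = []
    for vm in host.vm:
        profile = vgpu_profile(vm)
        if profile:
            found.append((vm.name, profile))
    return sorted(found)
```

The returned names can then be selected in the vSphere Client and migrated by hand, as described above.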
I hope VMware addresses this in the near future.