rebelfalls
VMware Employee

vGPU VMs stuck at 19%, will not migrate when host is put in maintenance mode

When I try to put the host in maintenance mode, virtual machines with vGPU get stuck at 19%. If I manually perform a live migration, the virtual machines with vGPU move without any issue.

I am unable to locate any specific document that says DRS is not supported with vGPU. However, according to the document below, DRS supports only initial placement of VMs with vGPU: https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.vcenterhost.doc/GUID-8FE6A0DA-49E9-...

I have confirmed that, per the NVIDIA compatibility matrix, the NVIDIA Tesla M10 does support vMotion: https://docs.nvidia.com/grid/10.0/grid-vgpu-release-notes-vmware-vsphere/index.html#hardware-configu...

While troubleshooting, I checked the vgpu.hotmigrate.enabled parameter and confirmed that it is set to true. I am unsure where to go from here.

Environment: 4-node VxRail cluster, ESXi 6.7, vSAN 6.7, VMware Horizon 7.11

1 Solution

Accepted Solutions
Lalegre
Commander

I think your issue is the following:

"DRS supports initial placement of vGPU VMs running vSphere 6.7 Update 1 and later without load balancing support"

https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vcenterhost.doc/GUID-8FE6A0DA-49E9-...

3 Replies
rebelfalls
VMware Employee

Thank you for your response Lalegre.

So is my only option to manually live migrate my vGPU VMs if I want to put my host in maintenance mode?

Lalegre
Commander

I guess you have no other option at the moment. That documentation, as you can see, is for vSphere 7, so I think manual migration is the only way to do it so far.

Glad it helped!
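
For anyone who has to repeat this workaround, the manual step can be planned before the host goes into maintenance mode. The following is only a rough sketch of that planning logic in plain Python; it does not talk to vCenter. The names `Vm` and `plan_evacuation`, the host names, and the "free vGPU slots" capacity model are all hypothetical and used purely for illustration. It assumes the non-vGPU VMs are evacuated by DRS as usual once the vGPU VMs have been moved by hand.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    # Hypothetical, simplified view of a VM on the host to be evacuated.
    name: str
    has_vgpu: bool


def plan_evacuation(vms, free_vgpu_slots):
    """Return a list of (vm_name, target_host) pairs for the vGPU VMs.

    vms: VMs currently running on the host going into maintenance mode.
    free_vgpu_slots: dict of host name -> number of additional vGPU VMs
        that host can accept (an assumed capacity model, not a vSphere API).
    Raises RuntimeError if some vGPU VM cannot be placed.
    """
    remaining = dict(free_vgpu_slots)  # copy so the caller's dict is untouched
    plan = []
    for vm in vms:
        if not vm.has_vgpu:
            # Assumption: DRS evacuates non-vGPU VMs automatically.
            continue
        # Pick the host with the most free vGPU capacity.
        target = max(remaining, key=remaining.get, default=None)
        if target is None or remaining[target] <= 0:
            raise RuntimeError(f"no host with free vGPU capacity for {vm.name}")
        remaining[target] -= 1
        plan.append((vm.name, target))
    return plan


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    vms_on_host = [Vm("vdi-01", True), Vm("file-01", False), Vm("vdi-02", True)]
    print(plan_evacuation(vms_on_host, {"esx02": 1, "esx03": 2}))
```

Each pair in the resulting plan would then be migrated manually (for example from the vSphere Client), and only after the last vGPU VM has moved would the host be put into maintenance mode.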