When a user experiences a locked-up session, don't restart the problem VM immediately. First, SSH to the host the VM is running on and run "nvidia-smi vgpu". Does the impacted virtual machine show 99% GPU utilization?
If so, you are experiencing what we have been calling the "99% issue". We have fought this since mid-2022 with no real insight from VMware or NVIDIA into the root cause. We also saw cases where one VM experiencing this issue would lock up the sessions of every desktop on the same GPU card. Resetting the problem VM let users reconnect to the other desktops.
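If you would rather script the check than eyeball it, here is a minimal sketch. It assumes you have captured the text of the per-vGPU utilization report (the column layout shown by "nvidia-smi vgpu -u": gpu index, vGPU id, then sm/mem/enc/dec percentages). The layout may differ between driver releases, and the sample rows and the function name are made up for illustration, so treat this as a starting point rather than a tested tool.

```python
# Hypothetical sample in the layout of "nvidia-smi vgpu -u".
# The ids and percentages are invented for illustration.
SAMPLE_REPORT = """\
# gpu   vgpu    sm   mem   enc   dec
# Idx     Id     %     %     %     %
    0  11924     6     3     0     0
    0  11903    99    41     0     0
    1  11908    10     5     0     0
"""


def find_pegged_vgpus(report, threshold=99):
    """Return (gpu_index, vgpu_id, sm_percent) for each vGPU at or above threshold."""
    pegged = []
    for line in report.splitlines():
        line = line.strip()
        # Skip blank lines and the "#" header rows.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            gpu_index, vgpu_id, sm_percent = (int(f) for f in fields[:3])
        except ValueError:
            # Rows with non-numeric placeholders (for example "-") are ignored.
            continue
        if sm_percent >= threshold:
            pegged.append((gpu_index, vgpu_id, sm_percent))
    return pegged


if __name__ == "__main__":
    for gpu_index, vgpu_id, sm_percent in find_pegged_vgpus(SAMPLE_REPORT):
        print(f"GPU {gpu_index}: vGPU {vgpu_id} is at {sm_percent}% SM utilization")
```

The report only gives you a vGPU id, so you would still need to match that id to the VM name (the plain "nvidia-smi vgpu" listing shows both) before deciding which VM to reset.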
In the end, we switched all of our vGPU profiles from GRID T4-1B to GRID T4-2B, and the problem went away. Obviously, you may not be able to simply double your assigned GPU frame buffer if you don't have enough physical GPUs in your hosts. We were lucky in that we only had to add one additional server to our cluster to do this.
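To get a rough idea of what the profile change costs in hardware, here is a small sketch of the arithmetic. It assumes a 16 GB T4 card, 1 GB of frame buffer per T4-1B vGPU and 2 GB per T4-2B vGPU; the desktop count of 100 is a hypothetical figure, not our actual environment, and the helper names are my own.

```python
import math

# Assumed figures: a T4 has 16 GB of frame buffer; the profile name
# encodes the per-vGPU frame buffer (1B = 1 GB, 2B = 2 GB).
T4_FRAME_BUFFER_MB = 16 * 1024
PROFILE_FRAME_BUFFER_MB = {"T4-1B": 1024, "T4-2B": 2048}


def desktops_per_gpu(profile):
    """How many vGPUs of this profile fit on one T4 by frame buffer alone."""
    return T4_FRAME_BUFFER_MB // PROFILE_FRAME_BUFFER_MB[profile]


def gpus_needed(desktops, profile):
    """Physical T4 cards required to host the given number of desktops."""
    return math.ceil(desktops / desktops_per_gpu(profile))


if __name__ == "__main__":
    desktops = 100  # hypothetical pool size
    for profile in ("T4-1B", "T4-2B"):
        print(profile, desktops_per_gpu(profile), "per card,",
              gpus_needed(desktops, profile), "cards needed")
```

Under these assumptions, moving from T4-1B to T4-2B halves the desktops per card from 16 to 8, so the number of cards you need roughly doubles.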
Just curious, which Autodesk product are your users primarily working with on their virtual desktops?
Good luck.