VMware Horizon Community
jbaker45
Contributor

nVidia K2 Grid card issues

I'm getting the following error when trying to use the K260Q or K280Q grid profiles.

The amount of graphics resource available in the parent resource pool is insufficient for the operation.

Error Stack:

     An error was received from the ESX host while powering on VM.

     Failed to start the virtual machine.

     Module DevicePowerOn power on failed.

     Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU 'grid_k280q'.

     No graphics device is available for vGPU 'grid_k280q'.

We're running

Dell R730 with 384GB memory

vSphere ESXi 6.0.0-4600944

Horizon View 7.0.3

Horizon View agent 7.0.3

Grid card VIB NVIDIA-vGPU-kepler-VMware_ESXi_6.0_Host_Driver  367.92-1OEM.600.0.0.2494585           NVIDIA    VMwareAccepted    2017-04-11

Desktop nVidia driver 369.95

The desktop VM is also only currently assigned 16GB of memory.

Even if I shut down all VMs running on the host and try to start just one, I get the error. The VMs will only boot if I pick the K220Q or K240Q profiles, which are 512 MB and 1 GB respectively. I cannot get the higher-memory profiles to work no matter what I do.

***********************************************************************************************************************************************************************************************************************

6/19/2017 Update to our K2 card issues.

Finally gave up on all the suggestions that were not resolving the problem or that we had already tried on our own.

Here's what finally fixed our problem.

Updated host to 6.0.0 Update 3 Build 5050593, tested all K2 profiles, all worked with no issues.

Updated host to 6.0.0 Build 5224934, tested all K2 profiles, all worked with no issues.

At this point we felt confident that all was going well, so we continued to update the host to the latest build. Read below for what happened.

Updated host to 6.0.0 Build 5572656. No VM with a GRID card attached would start at all, no matter which K2 profile was used. At this point I was completely pissed off and decided to call for support. I spent the next four hours on the phone with two Dell support techs and one VMware support tech. Comically, they all went through the same steps and possible solutions that I had already tried multiple times. While they were debating what to do next, I continued my own research and found this KB article: After upgrading ESXi hosts to ESXi600-201706001 Hardware 3D graphics functioning fails (2150498) | V... Neither the Dell nor the VMware support techs were aware of this KB article, which is understandable since I called for help on the 15th and the article is dated the 14th. The only gotcha with this fix is that it's not permanent: if you reboot the host, you have to reapply the steps. I assume the permanent fix will be rolled into the next patch build.

*************************************************************************************************************************************************************************************************************************

8 Replies
mhampto
VMware Employee

Hello,

This seems to line up based on your symptoms so far: “Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU "profile_name"” er...

Let me know if this does not apply.

0 Kudos
jbaker45
Contributor

That KB article only works for vSphere 6.5, we’re still on 6.0.

0 Kudos
thibAP
Contributor

Hi

Your error message, "The amount of graphics resource available in the parent resource pool is insufficient for the operation.", means the 3D resources on your server are already fully assigned (the maximum number of VMs running with your grid_k280q profile has been reached). You need to add more physical 3D resources, either by adding more K2 cards in a new server or by upgrading your current server to an M6 or M60.

Do you have memory reserved on your VM?

pastedImage_1.png

0 Kudos
techguy129
Expert

When VMs with vGPU are added to the host, each VM's vGPU is assigned to the physical GPU on the K2 card with the fewest VMs. This spreads the VMs across the GPUs and fills them up, so you cannot use the bigger profiles.

You need to add this setting on your ESXi server. I had to do it for my environment.

Documentation for VMware Horizon 7 version 7.0

To improve virtual machine consolidation ratios, you can set the ESXi host to use consolidation mode. Edit the /etc/vmware/config file on the ESXi host and add the following entry:

vGPU.consolidation = "true"
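As a rough illustration of adding that entry without creating a duplicate line, here is a minimal Python sketch. The helper name `ensure_setting` is hypothetical (it is not part of any VMware tool), and it only works on the file's text, assuming simple `key = "value"` lines. Back up /etc/vmware/config before writing anything to it.

```python
def ensure_setting(config_text: str,
                   key: str = "vGPU.consolidation",
                   value: str = "true") -> str:
    """Return config_text with `key = "value"` present exactly once.

    Hypothetical helper: assumes one `key = "value"` entry per line.
    """
    wanted = f'{key} = "{value}"'
    out = []
    found = False
    for line in config_text.splitlines():
        # The text before the first "=" is treated as the setting name.
        name = line.split("=", 1)[0].strip()
        if name == key:
            if not found:
                out.append(wanted)  # replace the first matching entry
                found = True
            # any later duplicates of the key are dropped
        else:
            out.append(line)
    if not found:
        out.append(wanted)  # key was absent, so append it
    return "\n".join(out) + "\n"


# Usage on the host (assumption: run with permission to edit the file):
#   path = "/etc/vmware/config"
#   with open(path) as f:
#       text = f.read()
#   with open(path, "w") as f:
#       f.write(ensure_setting(text))
```

Running the helper a second time leaves the text unchanged, so it is safe to reapply.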

By default, the ESXi host assigns virtual machines to the physical GPU with the fewest virtual machines already assigned. This is called performance mode. If you would rather have the ESXi host assign virtual machines to the same physical GPU until the maximum number of virtual machines is reached before placing virtual machines on the next physical GPU, you can use consolidation mode.
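To see why the placement mode matters for the larger profiles, here is a small Python simulation of the two policies. It is a sketch under stated assumptions, not how ESXi is actually implemented: it assumes a GRID K2 exposes two physical GPUs, that a physical GPU can host only one vGPU profile type at a time, and that the per-GPU limits are 8, 4, 2, and 1 for K220Q, K240Q, K260Q, and K280Q (figures as I understand them from NVIDIA's GRID vGPU documentation). The names `PhysicalGpu`, `place`, and `simulate` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed maximum vGPUs per physical GPU for each GRID K2 profile.
MAX_PER_GPU = {"grid_k220q": 8, "grid_k240q": 4, "grid_k260q": 2, "grid_k280q": 1}


@dataclass
class PhysicalGpu:
    name: str
    profile: Optional[str] = None  # profile type currently hosted; None when idle
    vms: List[str] = field(default_factory=list)

    def can_host(self, profile: str) -> bool:
        if self.profile is None:
            return True  # an idle GPU can take any profile
        # Assumption: one profile type per physical GPU at a time.
        return self.profile == profile and len(self.vms) < MAX_PER_GPU[profile]


def place(gpus: List[PhysicalGpu], vm: str, profile: str,
          consolidation: bool = False) -> Optional[str]:
    """Place one VM; return the chosen GPU name, or None if nothing fits."""
    candidates = [g for g in gpus if g.can_host(profile)]
    if not candidates:
        return None
    if consolidation:
        # Consolidation mode: keep filling the busiest GPU that still fits.
        chosen = max(candidates, key=lambda g: len(g.vms))
    else:
        # Performance mode: pick the GPU with the fewest VMs.
        chosen = min(candidates, key=lambda g: len(g.vms))
    chosen.profile = profile
    chosen.vms.append(vm)
    return chosen.name


def simulate(requests: List[Tuple[str, str]], consolidation: bool) -> List[Optional[str]]:
    gpus = [PhysicalGpu("gpu0"), PhysicalGpu("gpu1")]  # one K2 card
    return [place(gpus, vm, profile, consolidation) for vm, profile in requests]


requests = [("vm1", "grid_k220q"), ("vm2", "grid_k220q"), ("vm3", "grid_k280q")]
print(simulate(requests, consolidation=False))  # ['gpu0', 'gpu1', None]
print(simulate(requests, consolidation=True))   # ['gpu0', 'gpu0', 'gpu1']
```

In this model, two small-profile VMs in performance mode land on different GPUs and tie both up, so the K280Q VM has nowhere to go. In consolidation mode they share one GPU and the second stays free for the larger profile.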

0 Kudos
iooy
Contributor

Hello

My environment is 6.5, but I hope it will be helpful.

・GRID K2

・vSphere 6.5

Please check Figure 3 Example vGPU configurations on GRID K2 at the following URL.

http://images.nvidia.com/content/grid/pdf/GRID-vGPU-User-Guide.pdf

0 Kudos
NextFinish
Contributor

I had the same issue

ESXi 6.0 5572565

Nvidia Grid K2

Driver 367.106 vGPU

I noticed that after the driver update the xorg service would not start.  I followed this link and it resolved the issue

After upgrading ESXi hosts to ESXi600-201706001 Hardware 3D graphics functioning fails (2150498) | V...

Jonathan_Filipp
Contributor

I logged a job and the tech suggested upgrading to 6.5. I ended up rolling back to 6.0 U3 since 6.5 had management issues: hosts became unresponsive to management, which meant all VMs were failing to refresh on logoff, so everything was offline! Two hosts are not 6.5 certified, but one is, and even that host showed the same issues!

My job was logged on the 9th, the fix came out on the 14th, and we closed the job on the 16th after I rolled back to U3. I would have loved to have known this instead of spending 20 hours troubleshooting and rolling back updates...

Thanks VMware! 😕 I enjoy working weekends fixing silly problems...

0 Kudos
MikleF
Enthusiast

I had the same issue

ESXi 6.0 5572565

Nvidia Grid K2

Driver 367.106 vGPU

I noticed that after the driver update the xorg service would not start.  I followed this link and it resolved the issue

After upgrading ESXi hosts to ESXi600-201706001 Hardware 3D graphics functioning fails (2150498) | V...

Did a reboot and now this doesn't seem to work anymore.

0 Kudos