digerati646
Enthusiast

Adding Tesla M6 to Cisco B200 M4, plugin error in vmware

I am trying to install my new Tesla M6 GPU card in my ESXi 6.0 host. I installed the NVIDIA software on the host and that looks OK. I can also add the GPU to my VM in vCenter, BUT when I try to power the VM on, I get the following error:

Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU 'grid_m6-8q'

I tried every other GPU profile and got the same error for each one. I updated my ESXi 6.0 host to Update 2 and that didn't seem to have any effect. I am running host driver version 352.83.

I have also checked the compatibility list for this Tesla M6 card, and my Cisco B200 M4 is listed there as compatible.

For troubleshooting, I have run the following commands:

[root@VH1:~] nvidia-smi
Tue Apr  5 21:47:42 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.83     Driver Version: 352.83         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M6            On   | 0000:81:00.0     Off |                    0 |
| N/A   47C    P8    16W / 100W |     30MiB /  7679MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

That output looks fine to me.

I have also run:

[root@VH1:~] dmesg | grep -E "NVRM|nvidia"
2016-04-04T17:05:16.417Z cpu3:33421)Loading module nvidia ...
2016-04-04T17:05:16.423Z cpu3:33421)Elf: 1865: module nvidia has license NVIDIA
NVRM: vmk_MemPoolCreate passed for 4194304 pages.
NVRM: loading NVIDIA UNIX x86_64 Kernel Module  352.83  Sun Feb  7 20:16:36 PST 2016
2016-04-04T17:05:16.754Z cpu3:33421)Device: 191: Registered driver 'nvidia' from 20
2016-04-04T17:05:16.754Z cpu3:33421)Mod: 4943: Initialization of nvidia succeeded with module ID 20.
2016-04-04T17:05:16.754Z cpu3:33421)nvidia loaded successfully.
2016-04-04T17:05:17.553Z cpu29:33420)Device: 326: Found driver nvidia for device 0x36554304c6e6377b
NVRM: nvidia_associate vmgfx0
2016-04-04T17:08:15.323Z cpu2:35277)IntrCookie: 1915: cookie 0x3d moduleID 20 <nvidia> exclusive, flags 0x1d

Any other ideas on this, or anything else to try?

Thanks in advance


Accepted Solutions
digerati646
Enthusiast

After some digging, I figured this out. It turns out the Tesla M6 and Tesla M60 ship in a "compute" mode which is not compatible with VMware or other hypervisors. So you have to download NVIDIA's gpumodeswitch utility, boot the server with it, and change the card's mode to "graphics". Hopefully this will help a few people who run into the same thing! Check out NVIDIA's website for more info on their gpumodeswitch utility.
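For anyone hitting the same error, the mode change itself is only a couple of commands once you've booted the server from the gpumodeswitch environment. This is a rough sketch based on NVIDIA's gpumodeswitch documentation; double-check the user guide for your version, since the exact options may differ:

```shell
# Run these from the gpumodeswitch boot environment
# (bootable ISO downloaded from NVIDIA).

# Show the current mode (compute vs. graphics) of each supported GPU:
gpumodeswitch --listgpumodes

# Switch all supported GPUs in the system to graphics mode,
# which is what vGPU under ESXi needs:
gpumodeswitch --gpumode graphics

# Reboot the host afterwards so the new mode takes effect.
```

After the switch and a reboot, the vGPU profiles (e.g. grid_m6-8q) should power on without the libnvidia-vgx.so plugin error.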
