VMware Cloud Community
rahulanandemc
Contributor

Suggestion on NVIDIA driver for A30 Multi-Instance GPU (MIG) test on ESXi

I am following the VMware blog post https://blogs.vmware.com/apps/2020/09/vsphere-7-0-u1-with-multi-instance-gpus-mig-on-the-nvidia-a100...
to test MIG on an NVIDIA A30 card on ESXi 7.0 U2, but it doesn't work because the driver isn't being loaded,
and running the nvidia-smi command fails with the error below:

[root@localhost:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
[root@localhost:~]


I have SR-IOV enabled.
VT is enabled in the BIOS.


2022-02-28T16:49:35.027Z cpu27:2101475)VisorFSTar: 2020: nvidia_b.v00 (4652337235344062092) as nvidia_b.v00 for 81330306 bytes
2022-02-28T16:49:35.523Z cpu31:2101482)NVIDIA: Unloading nvidia module during vib install/upgrade.
2022-02-28T16:49:35.567Z cpu8:2101485)Mod: 4889: Unloading module <nvidia> ...
2022-02-28T16:49:36.274Z cpu30:2101491)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2022-02-28T16:49:36.293Z cpu39:2101492)NVIDIA: Starting vGPU Services.
2022-02-28T16:49:36.321Z cpu0:2101495)NVIDIA: Starting Xorg service.
2022-02-28T16:49:37.490Z cpu8:2101577)NVIDIA: Starting the DCGM node engine.
2022-02-28T16:49:45.895Z cpu26:2101644)SchedVsi: 2098: Group: host/vim/vmviso
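
A minimal sketch for pulling only the NVIDIA alert lines out of a log like the excerpt above. It assumes the excerpt comes from /var/log/vmkernel.log (the usual ESXi location); the function name nvidia_alerts is hypothetical, and a different file can be passed as the first argument.

```shell
#!/bin/sh
# Print only the NVIDIA-related ALERT lines from a vmkernel-style log.
# Assumption: the log lives at /var/log/vmkernel.log unless a path is given.
nvidia_alerts() {
    log_file="${1:-/var/log/vmkernel.log}"
    # Match lines such as "ALERT: NVIDIA: module load failed ..."
    grep 'ALERT: NVIDIA' "$log_file"
}

# Example use on the host:
#   nvidia_alerts
#   nvidia_alerts /path/to/copied/vmkernel.log
```

Narrowing the log to alert lines makes it easier to see that the module load failure happens during the VIB install itself, before the vGPU services are started.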

[root@localhost:~] vmkload_mod | grep -i nvidia
[root@localhost:~]


I have tried the VIBs below, with no result:

NVIDIA_bootbank_NVIDIA-VMware_ESXi_7.0_Host_Driver_460.107-1OEM.700.0.0.15525992.vib
NVIDIA_bootbank_NVIDIA-VMware_ESXi_7.0_Host_Driver_450.142-1OEM.700.0.0.15525992.vib
NVIDIA_bootbank_NVIDIA-VMware_ESXi_7.0.2_Driver_510.47.03-1OEM.702.0.0.17630552.vib
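
The VIB file names above appear to encode both the NVIDIA driver version and the ESXi release the package was built against (for example, 702 in the last name, which lines up with its ESXi_7.0.2 label). A minimal sketch that extracts both fields from a file name so they can be compared against the host's release; the function name vib_versions is hypothetical, and the parsing assumes the naming pattern seen in the three names listed here.

```shell
#!/bin/sh
# Extract "<driver version> <ESXi release code>" from an NVIDIA VIB file name.
# Assumption: names follow the pattern ..._Driver_<version>-1OEM.<release>.<...>.vib
vib_versions() {
    name="${1##*/}"              # drop any leading directory
    rest="${name##*_Driver_}"    # e.g. 460.107-1OEM.700.0.0.15525992.vib
    driver="${rest%%-*}"         # e.g. 460.107
    target="${rest#*OEM.}"       # e.g. 700.0.0.15525992.vib
    release="${target%%.*}"      # e.g. 700
    printf '%s %s\n' "$driver" "$release"
}

# Example use:
#   vib_versions NVIDIA_bootbank_NVIDIA-VMware_ESXi_7.0.2_Driver_510.47.03-1OEM.702.0.0.17630552.vib
```

The sketch uses only POSIX parameter expansion, so it should not depend on which sed or awk features the host shell provides.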


NVIDIA support told me to use the AI driver because the A30 doesn't support vGPU. If that's the case, why does the A100, which has the same architecture as the A30, work with a vGPU license as shown in the link below? Please advise.

https://blogs.vmware.com/apps/2020/09/vsphere-7-0-u1-with-multi-instance-gpus-mig-on-the-nvidia-a100...

 

stadi13
Hot Shot

Hi @rahulanandemc 

According to the VMware Compatibility Guide, your graphics card is supported in shared passthrough mode only.

Enable the GPU for passthrough on the ESXi host.

  1. In the vSphere Client, right-click on the ESXi host and select Settings.
  2. On the Configure tab, select Hardware > PCI Devices, and click Configure Passthrough.
  3. In the Edit PCI Device Availability dialog box, in the ID column, select the check box for the GPU device.
  4. Click OK.

    The GPU is displayed on the Passthrough-enabled devices tab.

  5. Reboot the ESXi host.
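
After the reboot, it can help to confirm from the ESXi shell that the host still enumerates the GPU. A minimal sketch, assuming lspci is available in the ESXi shell; the function name nvidia_pci_entries is hypothetical, and it simply filters whatever PCI listing is piped into it.

```shell
#!/bin/sh
# Filter a PCI device listing (read from stdin) down to NVIDIA entries.
# Exit status is non-zero when no NVIDIA device appears in the input.
nvidia_pci_entries() {
    grep -i 'nvidia'
}

# Example use on the host:
#   lspci | nvidia_pci_entries
```

If nothing is printed, the device is not visible to the host at the PCI level, which would point at a BIOS or hardware issue rather than at the driver.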

https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=sptg&productid=53903&device...

You can also find the related driver version at this link.

Regards

Daniel
