VMware Cloud Community
rahulanandemc
Contributor

Failed to configure SR-IOV for NVIDIA device: it shows 0 functions while enabling

I am trying to enable SR-IOV on an NVIDIA card, but it fails with the error shown below. SR-IOV is enabled in the BIOS. Please see the attached screenshots.

 
 
Error:
Update Passthru Config
Key: haTask-ha-host-vim.host.PciPassthruSystem.updatePassthruConfig-1412871203
Description: Update PCI passthrough device configuration
State: Failed - An error occurred during host configuration.
Errors:

Screenshot 2022-02-25 at 3.19.00 PM.png

 

Screenshot 2022-02-25 at 3.20.27 PM.png

Screenshot 2022-02-25 at 3.21.02 PM.png

 

 

9 Replies
fabio1975
Commander

Hi,

Which ESXi version and NVIDIA device model are you using?

Is your goal to give a VM direct access to a GPU?

 

Fabio

Visit vmvirtual.blog
If you're satisfied give me a kudos

rahulanandemc
Contributor

The version is VMware ESXi 7.0.2 build-17867351, and the NVIDIA model is A30.

My goal is to run the nvidia-smi command on ESXi, but I am seeing an error. It seems the NVIDIA driver is not loading:

[root@localhost:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

[root@localhost:~]

 

[root@localhost:~] esxcli software vib list | grep -i nvidia
NVIDIA-VMware_ESXi_7.0.2_Driver 510.47.03-1OEM.702.0.0.17630552 NVIDIA VMwareAccepted 2022-02-17
[root@localhost:~]

[root@localhost:~] vmkload_mod -l | grep nvidia
[root@localhost:~]

Error in dmesg:

[root@localhost:~] dmesg | grep -i nvidia
2022-02-25T09:06:53.706Z cpu0:524288)Loading nvidia_b.v00...
2022-02-25T09:06:53.707Z cpu0:524288)VisorFSTar: 1871: nvidia_b.v00 for 0x6390082 bytes
2022-02-25T09:07:18.476Z cpu8:527286)SchedVsi: 2098: Group: host/vim/vmvisor/plugins/nvidia(20902): max=70 min=70 minLimit=unlimited shares=1000, units: mb
2022-02-25T09:07:18.481Z cpu15:525134)Starting service nvidia-init
2022-02-25T09:07:18.482Z cpu15:525134)Activating Jumpstart plugin nvidia-init.
2022-02-25T09:07:18.507Z cpu8:527308)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2022-02-25T09:07:18.517Z cpu8:527312)NVIDIA: Starting vGPU Services.
2022-02-25T09:07:18.530Z cpu6:527315)NVIDIA: Starting Xorg service.
2022-02-25T09:07:19.183Z cpu11:527438)NVIDIA: Starting the DCGM node engine.
2022-02-25T09:07:19.545Z cpu3:527442)SchedVsi: 2098: Group: host/vim/vmvisor/NVIDIAHost(22172): min=128 max=128 minLimit=128, units: mb
2022-02-25T09:07:26.763Z cpu0:525134)Jumpstart plugin nvidia-init activated.
2022-02-25T09:07:27.766Z cpu5:527537)SchedVsi: 1016: Group nvidia could not be created: Already exists
[root@localhost:~]
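
The checks above can be condensed into two small filters. This is only a minimal sketch: the function names `nvidia_alerts` and `nvidia_module_loaded` are hypothetical helpers, not part of ESXi or the NVIDIA VIB, and they only read text on stdin, so they rely on nothing beyond `grep`. In the log pasted above, the one line they would surface is the ALERT about the module load failing.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not ESXi or NVIDIA tools.

# Print only the NVIDIA-related ALERT lines from a kernel log read on stdin.
nvidia_alerts() {
    grep -i 'nvidia' | grep 'ALERT:'
}

# Exit 0 if a module list read on stdin mentions an nvidia module, else non-zero.
nvidia_module_loaded() {
    grep -qi 'nvidia'
}

# Intended use on the host (both commands are taken from the post above):
#   dmesg | nvidia_alerts
#   vmkload_mod -l | nvidia_module_loaded || echo "nvidia module not loaded"
```

Keeping the filters separate from the commands that produce the text means they can also be run against a saved copy of the log on another machine.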

 

rahulanandemc
Contributor

I am trying to use the MIG functionality of the A30, for which SR-IOV has to be enabled. I have enabled it in the BIOS, but when I enable it under Manage -> Hardware for the NVIDIA device, it does not get enabled, as shown in the attached screenshot.

fabio1975
Commander

Hi,

Something does not add up here.
The A30 does not seem to be compatible with vSphere in classic mode. See this compatibility matrix:

Supported Products :: NVIDIA Virtual GPU Software Documentation

fabio1975_0-1645784832501.png

 

It does seem to be used with the NVIDIA AI Enterprise + VMware solution, which I am not familiar with. Are you using this?

AI Software Suite for Enterprise IT (nvidia.com)

 

Fabio

rahulanandemc
Contributor

That matrix seems to be for GRID. For MIG, however, the A30 is supported: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus

Do you know why SR-IOV is not getting enabled on the GPU?

 

 

fabio1975
Commander

Hi,

I am sorry for the misunderstanding. I have never worked with MIG mode, and I have never run into this problem.

My only suggestion is to check that IOMMU is also enabled in the machine's BIOS, but I assume it is if you followed the VMware documentation.

Fabio

rahulanandemc
Contributor

On the vSphere ESXi 7.0 U2 licensing side, I see that I have an evaluation license, and the image profile shown is:

Image profile
(Updated) ESXi-7.0U2a-17867351-standard (VMware, Inc.)

Can you tell me whether this is supported for SR-IOV and NVIDIA?
 
Screenshot 2022-02-25 at 6.33.19 PM.png

 

 

SamBGB1
Contributor

Hello,

 

I wondered if you resolved this, @rahulanandemc?
