I am trying to enable SR-IOV on an NVIDIA card, but it fails with the error shown below. SR-IOV is enabled in the BIOS. Please see the attached screenshots.
haTask-ha-host-vim.host.PciPassthruSystem.updatePassthruConfig-1412871203
Update PCI passthrough device configuration
Failed - An error occurred during host configuration.
Ciao
Which ESXi version and NVIDIA device model are you using?
Is your goal to let a VM access a GPU directly?
The version is VMware ESXi 7.0.2 build-17867351 and the NVIDIA model is A30.
My goal is to run the nvidia-smi command on ESXi, but I am seeing an error; it seems the NVIDIA driver is not loading.
[root@localhost:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
[root@localhost:~]
[root@localhost:~] esxcli software vib list | grep -i nvidia
NVIDIA-VMware_ESXi_7.0.2_Driver 510.47.03-1OEM.702.0.0.17630552 NVIDIA VMwareAccepted 2022-02-17
[root@localhost:~]
[root@localhost:~] vmkload_mod -l | grep nvidia
[root@localhost:~]
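Taken together, the two commands above show that the driver VIB is installed but no nvidia kernel module is loaded. A minimal Python sketch of that check follows, assuming the command outputs have been captured as text. The function name `driver_state` is hypothetical, and the substring match is only a rough heuristic, not an official diagnostic.

```python
def driver_state(vib_list_output: str, module_list_output: str) -> str:
    """Classify the NVIDIA driver state from captured command output.

    vib_list_output:    text from `esxcli software vib list`
    module_list_output: text from `vmkload_mod -l`
    (Hypothetical helper; both arguments are plain captured text.)
    """
    def mentions_nvidia(text: str) -> bool:
        # Case-insensitive substring match, similar to `grep -i nvidia`.
        return any("nvidia" in line.lower() for line in text.splitlines())

    installed = mentions_nvidia(vib_list_output)
    loaded = mentions_nvidia(module_list_output)
    if installed and loaded:
        return "installed and loaded"
    if installed:
        return "installed but not loaded"
    if loaded:
        return "loaded without a listed VIB"
    return "not installed"


if __name__ == "__main__":
    # The VIB line from this thread; the module listing was empty.
    vib = ("NVIDIA-VMware_ESXi_7.0.2_Driver 510.47.03-1OEM.702.0.0.17630552 "
           "NVIDIA VMwareAccepted 2022-02-17")
    print(driver_state(vib, ""))
```

With the output posted in this thread, the helper reports the "installed but not loaded" case, which matches the nvidia-smi failure.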
Error in dmesg
[root@localhost:~] dmesg | grep -i nvidia
2022-02-25T09:06:53.706Z cpu0:524288)Loading nvidia_b.v00...
2022-02-25T09:06:53.707Z cpu0:524288)VisorFSTar: 1871: nvidia_b.v00 for 0x6390082 bytes
2022-02-25T09:07:18.476Z cpu8:527286)SchedVsi: 2098: Group: host/vim/vmvisor/plugins/nvidia(20902): max=70 min=70 minLimit=unlimited shares=1000, units: mb
2022-02-25T09:07:18.481Z cpu15:525134)Starting service nvidia-init
2022-02-25T09:07:18.482Z cpu15:525134)Activating Jumpstart plugin nvidia-init.
2022-02-25T09:07:18.507Z cpu8:527308)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2022-02-25T09:07:18.517Z cpu8:527312)NVIDIA: Starting vGPU Services.
2022-02-25T09:07:18.530Z cpu6:527315)NVIDIA: Starting Xorg service.
2022-02-25T09:07:19.183Z cpu11:527438)NVIDIA: Starting the DCGM node engine.
2022-02-25T09:07:19.545Z cpu3:527442)SchedVsi: 2098: Group: host/vim/vmvisor/NVIDIAHost(22172): min=128 max=128 minLimit=128, units: mb
2022-02-25T09:07:26.763Z cpu0:525134)Jumpstart plugin nvidia-init activated.
2022-02-25T09:07:27.766Z cpu5:527537)SchedVsi: 1016: Group nvidia could not be created: Already exists
[root@localhost:~]
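The line that matters in the dmesg output above is the ALERT entry reporting that the module load failed. As a rough sketch, assuming the log has been saved as text and that every entry follows the `<timestamp> cpuN:<id>)<message>` layout seen here, the alert lines could be pulled out like this. The name `nvidia_alerts` and the regular expression are illustrative assumptions, not part of any VMware or NVIDIA tool.

```python
import re
from typing import List, Tuple

# Assumed layout, based on the lines in this thread:
#   "<timestamp> cpuN:<id>)<message>"
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+cpu\d+:\d+\)(?P<msg>.*)$")


def nvidia_alerts(log_text: str) -> List[Tuple[str, str]]:
    """Return (timestamp, message) pairs for NVIDIA-related ALERT lines."""
    alerts = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip shell prompts and lines in another format
        msg = match.group("msg")
        if msg.startswith("ALERT:") and "nvidia" in msg.lower():
            alerts.append((match.group("ts"), msg))
    return alerts


if __name__ == "__main__":
    sample = (
        "2022-02-25T09:07:18.507Z cpu8:527308)ALERT: NVIDIA: module load "
        "failed during VIB install/upgrade.\n"
        "2022-02-25T09:07:18.517Z cpu8:527312)NVIDIA: Starting vGPU Services.\n"
    )
    for ts, msg in nvidia_alerts(sample):
        print(ts, msg)
```

Filtering this way separates the one failure line from the informational service-start messages around it.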
I am trying to use the MIG functionality of the A30, for which SR-IOV has to be enabled. I have enabled it in the BIOS, but when I try to enable it under Manage -> Hardware -> NVIDIA, it does not get enabled, as shown in the attached screenshot.
Ciao
Something does not add up for me.
The A30 does not appear to be supported with vSphere in classic mode. See this compatibility matrix:
Supported Products :: NVIDIA Virtual GPU Software Documentation
It does seem to be supported with the NVIDIA AI Enterprise + VMware solution, which I am not familiar with. Are you using that?
AI Software Suite for Enterprise IT (nvidia.com)
That matrix seems to be for GRID. For MIG, the A30 is supported: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus
Do you know why SR-IOV is not getting enabled on the GPU?
Ciao
I'm afraid there may be a misunderstanding: I have never worked with MIG mode, and I have never run into the problem you describe.
My only suggestion is to check that IOMMU is also enabled in the machine's BIOS, but I assume it is if you followed the VMware documentation.
On the licensing side for vSphere ESXi 7.0 U2: I see that I have an evaluation license (as shown in the image).
In https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/vsphere/vmw-edition-compa... I see that SR-IOV and NVIDIA GPU support are not included in the Standard edition, but https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.esxi.upgrade.doc/GUID-17862A54-C1D4-47A9-88... says that in evaluation mode we can explore the full set of ESXi features.