==============NVSMI LOG==============

Timestamp : Tue Feb 22 18:34:31 2022
Driver Version : 470.103.02
CUDA Version : Not Found

Attached GPUs : 2
GPU 00000000:17:00.0
    Product Name : NVIDIA A40
    Product Brand : NVIDIA
    Display Mode : Enabled
    Display Active : Disabled
    Persistence Mode : Enabled
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Enabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 1322221067050
    GPU UUID : GPU-0be6efaf-fd2d-3fee-0b15-539707c2af4f
    Minor Number : 0
    VBIOS Version : 94.02.5C.00.03
    MultiGPU Board : No
    Board ID : 0x1700
    GPU Part Number : 900-2G133-0100-030
    Module ID : 0
    Inforom Version
        Image Version : G133.0200.00.05
        OEM Object : 2.0
        ECC Object : 6.16
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    GSP Firmware Version : N/A
    GPU Virtualization Mode
        Virtualization Mode : Host VGPU
        Host VGPU Mode : SR-IOV
    IBMNPU
        Relaxed Ordering Mode : N/A
    PCI
        Bus : 0x17
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x223510DE
        Bus Id : 00000000:17:00.0
        Sub System Id : 0x145A10DE
        GPU Link Info
            PCIe Generation
                Max : 4
                Current : 1
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays Since Reset : 0
        Replay Number Rollovers : 0
        Tx Throughput : 0 KB/s
        Rx Throughput : 0 KB/s
    Fan Speed : 0 %
    Performance State : P8
    Clocks Throttle Reasons
        Idle : Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
            HW Thermal Slowdown : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost : Not Active
        SW Thermal Slowdown : Not Active
        Display Clock Setting : Not Active
    FB Memory Usage
        Total : 48687 MiB
        Used : 0 MiB
        Free : 48687 MiB
    BAR1 Memory Usage
        Total : 65536 MiB
        Used : 1 MiB
        Free : 65535 MiB
    Compute Mode : Default
    Utilization
        Gpu : 0 %
        Memory : 0 %
        Encoder : 0 %
        Decoder : 0 %
    Encoder Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    FBC Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    Ecc Mode
        Current : Disabled
        Pending : Disabled
    ECC Errors
        Volatile
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
        Aggregate
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending Page Blacklist : N/A
    Remapped Rows
        Correctable Error : 0
        Uncorrectable Error : 0
        Pending : No
        Remapping Failure Occurred : No
        Bank Remap Availability Histogram
            Max : 192 bank(s)
            High : 0 bank(s)
            Partial : 0 bank(s)
            Low : 0 bank(s)
            None : 0 bank(s)
    Temperature
        GPU Current Temp : 30 C
        GPU Shutdown Temp : 98 C
        GPU Slowdown Temp : 95 C
        GPU Max Operating Temp : 88 C
        GPU Target Temperature : N/A
        Memory Current Temp : N/A
        Memory Max Operating Temp : N/A
    Power Readings
        Power Management : Supported
        Power Draw : 31.25 W
        Power Limit : 300.00 W
        Default Power Limit : 300.00 W
        Enforced Power Limit : 300.00 W
        Min Power Limit : 100.00 W
        Max Power Limit : 300.00 W
    Clocks
        Graphics : 210 MHz
        SM : 210 MHz
        Memory : 405 MHz
        Video : 555 MHz
    Applications Clocks
        Graphics : 1740 MHz
        Memory : 7251 MHz
    Default Applications Clocks
        Graphics : 1740 MHz
        Memory : 7251 MHz
    Max Clocks
        Graphics : 1740 MHz
        SM : 1740 MHz
        Memory : 7251 MHz
        Video : 1530 MHz
    Max Customer Boost Clocks
        Graphics : 1740 MHz
    Clock Policy
        Auto Boost : N/A
        Auto Boost Default : N/A
    Voltage
        Graphics : 712.500 mV
    Processes : None

GPU 00000000:CA:00.0
    Product Name : NVIDIA A40
    Product Brand : NVIDIA
    Display Mode : Enabled
    Display Active : Disabled
    Persistence Mode : Enabled
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Enabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 1322221062802
    GPU UUID : GPU-bb45f5bc-a9a2-4727-6593-e676a1b5b5e4
    Minor Number : 1
    VBIOS Version : 94.02.5C.00.03
    MultiGPU Board : No
    Board ID : 0xca00
    GPU Part Number : 900-2G133-0100-030
    Module ID : 0
    Inforom Version
        Image Version : G133.0200.00.05
        OEM Object : 2.0
        ECC Object : 6.16
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    GSP Firmware Version : N/A
    GPU Virtualization Mode
        Virtualization Mode : Host VGPU
        Host VGPU Mode : SR-IOV
    IBMNPU
        Relaxed Ordering Mode : N/A
    PCI
        Bus : 0xCA
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x223510DE
        Bus Id : 00000000:CA:00.0
        Sub System Id : 0x145A10DE
        GPU Link Info
            PCIe Generation
                Max : 4
                Current : 1
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays Since Reset : 0
        Replay Number Rollovers : 0
        Tx Throughput : 0 KB/s
        Rx Throughput : 0 KB/s
    Fan Speed : 0 %
    Performance State : P8
    Clocks Throttle Reasons
        Idle : Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
            HW Thermal Slowdown : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost : Not Active
        SW Thermal Slowdown : Not Active
        Display Clock Setting : Not Active
    FB Memory Usage
        Total : 48687 MiB
        Used : 0 MiB
        Free : 48687 MiB
    BAR1 Memory Usage
        Total : 65536 MiB
        Used : 1 MiB
        Free : 65535 MiB
    Compute Mode : Default
    Utilization
        Gpu : 0 %
        Memory : 0 %
        Encoder : 0 %
        Decoder : 0 %
    Encoder Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    FBC Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    Ecc Mode
        Current : Disabled
        Pending : Disabled
    ECC Errors
        Volatile
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
        Aggregate
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending Page Blacklist : N/A
    Remapped Rows
        Correctable Error : 0
        Uncorrectable Error : 0
        Pending : No
        Remapping Failure Occurred : No
        Bank Remap Availability Histogram
            Max : 192 bank(s)
            High : 0 bank(s)
            Partial : 0 bank(s)
            Low : 0 bank(s)
            None : 0 bank(s)
    Temperature
        GPU Current Temp : 31 C
        GPU Shutdown Temp : 98 C
        GPU Slowdown Temp : 95 C
        GPU Max Operating Temp : 88 C
        GPU Target Temperature : N/A
        Memory Current Temp : N/A
        Memory Max Operating Temp : N/A
    Power Readings
        Power Management : Supported
        Power Draw : 32.84 W
        Power Limit : 300.00 W
        Default Power Limit : 300.00 W
        Enforced Power Limit : 300.00 W
        Min Power Limit : 100.00 W
        Max Power Limit : 300.00 W
    Clocks
        Graphics : 210 MHz
        SM : 210 MHz
        Memory : 405 MHz
        Video : 555 MHz
    Applications Clocks
        Graphics : 1740 MHz
        Memory : 7251 MHz
    Default Applications Clocks
        Graphics : 1740 MHz
        Memory : 7251 MHz
    Max Clocks
        Graphics : 1740 MHz
        SM : 1740 MHz
        Memory : 7251 MHz
        Video : 1530 MHz
    Max Customer Boost Clocks
        Graphics : 1740 MHz
    Clock Policy
        Auto Boost : N/A
        Auto Boost Default : N/A
    Voltage
        Graphics : 712.500 mV
    Processes : None
I've tried everything I can think of to get the vGPU VMs to power on, but I keep getting this error:

Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU 'nvidia_a40-8q'

What I've done so far:

1. Disabled ECC memory
2. Enabled SR-IOV in the BIOS on the R750 host
3. Set the graphics mode to Shared Direct in vSphere
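For anyone who wants to cross-check the ECC and SR-IOV steps against the host's `nvidia-smi -q` output rather than reading the log by eye, here is a rough Python sketch. It assumes the usual indented "key : value" layout of that output. The function names `parse_nvsmi` and `vgpu_findings` and the `SAMPLE` text are made up for illustration, and the three checks only mirror the values shown in the log in this post; they are not an official list of vGPU prerequisites. The vSphere graphics mode (Shared Direct) is not reported by `nvidia-smi`, so it is not covered.

```python
import re

# Trimmed, hypothetical sample in the layout printed by `nvidia-smi -q`.
SAMPLE = """\
Attached GPUs : 2
GPU 00000000:17:00.0
    Product Name : NVIDIA A40
    GPU Virtualization Mode
        Virtualization Mode : Host VGPU
        Host VGPU Mode : SR-IOV
    Ecc Mode
        Current : Disabled
        Pending : Disabled
"""


def parse_nvsmi(text):
    """Return {gpu_bus_id: {"Section/Key": value}} from `nvidia-smi -q` text."""
    gpus, current, stack = {}, None, []
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        line = raw.strip()
        match = re.match(r"GPU ([0-9A-Fa-f:.]+)$", line)
        if match and indent == 0:
            # A new per-GPU block starts; reset the section stack.
            current = gpus.setdefault(match.group(1), {})
            stack = []
            continue
        if current is None:
            continue  # global header lines before the first GPU block
        # Leave any section whose header is at the same or deeper indent.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        key, sep, value = line.partition(" : ")
        if sep:
            path = "/".join([name for _, name in stack] + [key.strip()])
            current[path] = value.strip()
        else:
            stack.append((indent, line))  # a section header such as "Ecc Mode"
    return gpus


def vgpu_findings(gpu):
    """List mismatches against the state described in the post (assumed checks)."""
    expected = {
        "Ecc Mode/Current": "Disabled",
        "GPU Virtualization Mode/Virtualization Mode": "Host VGPU",
        "GPU Virtualization Mode/Host VGPU Mode": "SR-IOV",
    }
    return [
        f"{path}: expected {want!r}, found {gpu.get(path)!r}"
        for path, want in expected.items()
        if gpu.get(path) != want
    ]


if __name__ == "__main__":
    for bus_id, fields in parse_nvsmi(SAMPLE).items():
        problems = vgpu_findings(fields)
        print(bus_id, "OK" if not problems else problems)
```

To run it against a real host, replace `SAMPLE` with the captured output of `nvidia-smi -q`. An empty findings list only means those three fields match the log above; it does not by itself explain the plugin initialization error.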