telecom_tesla
Contributor

Error: Module 'DevicePowerOn' power on failed when adding PCI GPU (NVIDIA Tesla V100)


Hi,

The company I work for has rented a dedicated cloud server with an NVIDIA V100 that we want to use for AI/Big Data.

In that server we have installed VMware ESXi 6.7U3, and on top of that a VM with Windows Server 2016.

The idea is to pass the V100 through to the Windows Server VM, but I am running into problems.

I made sure that passthrough shows as Active for the GPU, then created the VM and added the V100 as a PCI device:

active.PNG

+ At first I got the error "PCI passthrough devices cannot be added when Nested Hardware-Assisted Virtualization is enabled". After searching online, I disabled "Virtualization Based Security" (VBS), "Expose hardware assisted virtualization to the guest OS" and "Expose IOMMU to the guest OS" for that VM.

pci_passthrough.PNG

+ That error went away, but now I get a new one. When powering on the VM, it displays "Module 'DevicePowerOn' power on failed":

device_power_on.PNG

gpu.PNG

memory.PNG

Any idea why this is happening?

Thanks a lot in advance!

1 Solution

Accepted Solutions
telecom_tesla
Contributor

Solved! In addition to all the previous steps, I had to add two configuration parameters to the VM:

pciPassthru.use64bitMMIO="TRUE"

pciPassthru.64bitMMIOSizeGB=32 (because our V100 has 16 GB of memory)

Everything is described in the following VMware blog: Using GPUs with Virtual Machines on vSphere - Part 2: VMDirectPath I/O - Virtualize Applications
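As a rough sketch of how the value 32 can be arrived at: this assumes the sizing rule of thumb is "total memory of all GPUs passed through to the VM, rounded up to the next power of two above that total". That rule is an assumption that happens to match the 16 GB to 32 case in this thread; the blog is the authoritative source. The function name `mmio_size_gb` is made up for illustration.

```python
def mmio_size_gb(gpu_memory_gb: int, gpu_count: int = 1) -> int:
    """Return a candidate value for pciPassthru.64bitMMIOSizeGB.

    Assumed rule (only the 16 GB -> 32 case is confirmed by this thread):
    sum the memory of all passed-through GPUs, then take the smallest
    power of two strictly greater than that total.
    """
    if gpu_memory_gb <= 0 or gpu_count <= 0:
        raise ValueError("gpu_memory_gb and gpu_count must be positive")
    total = gpu_memory_gb * gpu_count
    size = 1
    while size <= total:  # keep doubling until we exceed the total
        size *= 2
    return size


print(mmio_size_gb(16))  # one 16 GB V100 -> 32
```

Under the same assumption, two 16 GB cards in one VM would give `mmio_size_gb(16, 2)`, which is 64.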

Now after installing the Windows NVIDIA driver in my VM I can see and use the NVIDIA V100:

gpu-vm.png

Thanks anyway for your help, guys 🙂


5 Replies
larstr
Champion

Telecom Tesla,

I don't know if you're using UEFI BIOS or not, but you may need to specify some advanced settings:

VMware vSphere VMDirectPath I/O: Requirements for Platforms and Devices (2142307)

Lars

telecom_tesla
Contributor

Dear Lars,

Yes, the VM uses UEFI (firmware = "efi" in the .vmx file).

I also checked that the server is compatible with the ESXi version I am using:
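For anyone who wants to check these settings without opening the .vmx by hand, here is a minimal sketch that scans .vmx text for the three entries mentioned in this thread (`firmware` plus the two `pciPassthru` parameters from the accepted solution). The helper names `parse_vmx` and `passthrough_warnings` are made up for illustration, and the simple `key = "value"` line format is an assumption about how the file is laid out.

```python
import re


def parse_vmx(text: str) -> dict:
    """Parse 'key = "value"' lines into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r'\s*([^#=\s]+)\s*=\s*"?([^"]*)"?\s*$', line)
        if match:
            settings[match.group(1).lower()] = match.group(2).strip()
    return settings


def passthrough_warnings(settings: dict) -> list:
    """List the settings from this thread that are missing or look wrong."""
    warnings = []
    if settings.get("firmware", "").lower() != "efi":
        warnings.append('firmware is not "efi"')
    if settings.get("pcipassthru.use64bitmmio", "").upper() != "TRUE":
        warnings.append("pciPassthru.use64bitMMIO is not TRUE")
    if not settings.get("pcipassthru.64bitmmiosizegb", "").isdigit():
        warnings.append("pciPassthru.64bitMMIOSizeGB is missing or not a number")
    return warnings


sample = '''
firmware = "efi"
pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "32"
'''
print(passthrough_warnings(parse_vmx(sample)))  # []
```

A value wrapped in typographic quotes (for example, pasted from a web page) is not recognized as TRUE by this check, so it shows up as a warning.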

compatibility.PNG

Best,

Pablo

tiagoademay
Contributor

Hello,

Sorry for my English.

If I'm not mistaken, the problem may be the amount of RAM assigned to the VM: it should match the V100's memory, i.e. 16 GB (16384 MB).

telecom_tesla
Contributor

I get the same error with 16 GB of RAM 😞

11111111.PNG

2222222.PNG
